The STM32F2xx has a great SD Card interface. It’s a true 4-bit parallel interface, and in general it works pretty well. I have come across a few, fairly minor but still significant, considerations when using the interface that I thought I’d pass on.

Initialisation Sequence

Proper initialization of the SD card is important, because SD cards have no reset line and a card is not going to behave if its internal state machine wanders off to somewhere you don’t expect. It’s a good idea to have some way of removing power from the card (a p-channel MOSFET for example) so you can reset it if it goes crazy on you.

At power-up it seems to be helpful to have all signals in the idle state, which is high. This can be done by first configuring the SDIO pins as GPIO outputs and driving them high. Then you can switch those pins to their alternate function for the SDIO port.

Don’t Stop The Clock

The SD Card specification allows for the stopping of the clock. This can be helpful at times. The STM32F2xx allows for this, but when I first tried it, it didn’t work: I wasn’t able to reestablish communications with the card afterwards. Soon afterwards ST published an erratum saying this feature doesn’t work. I don’t know if they plan to fix this, and I personally don’t care too much, since I don’t plan on using it anyway. But if your application needs this, you should probably check with ST what their intentions are.

Voltage Level Translators & SD Card Timing

The STM32F2xx can be run at 1.8V, but the SD Card is a 3.3V device (most of them anyway). Ideally the processor would have a Vcc pin specifically for the SDIO pins (some processors do have this) but the STM32F2xx does not, so in this example its SDIO pins will be at 1.8V. This is fine for reading from the SD card, because the STM32F2xx is 3.3V tolerant on its inputs. But it’s no good for writing to the card, because the SD card won’t recognize 1.8V as being a logic “high”. A level translator is required.

Some processors provide a “direction” pin as part of their SDIO interface; this can be used to drive an external low-cost bidirectional buffer. The STM32F2xx does not provide this pin, so an automatic-switching bidirectional buffer is required. ST has one, but the most commonly used buffer appears to be the Texas Instruments TXS0108. There are several others.

Using an external buffer substantially affects the timing for SD card reads. Some processors provide a “clock input” pin as part of their SDIO interface, which is the clock used for read cycles. The STM32F2xx does not implement this either – the SDIO clock output by the STM32F2xx is the clock used for both writes and reads.

What this means is that for an SD read cycle, the clock “arrives” at the processor much earlier than the data does. Consider the read cycle for a moment. The processor outputs a rising edge of the clock, and the processor then expects to clock in data on the next falling edge of that clock. Once the SD card sees the rising edge there will be a delay within the card before it outputs the data. So the sequence looks like this:

Rising edge output from processor -> clock delayed through level translator en route to card -> delay due to SD card response time and then data is placed on bus -> data delayed through level translator en route to processor -> processor reads data on falling edge of clock

With a 25 MHz SD clock, assuming a perfect 50% duty cycle, there’s only 20 ns between the rising and the falling edge of the clock. In those 20 ns we need 2 trips through the level translator (clock going out and data coming in) plus the delay due to the SD card, plus any setup time for the processor read (which is zero thankfully). 20 ns is not sufficient – the level translator simply isn’t fast enough, nor necessarily is the card.

You need to do the timing analysis yourself for your exact components and system, but I think you’ll find that running at 25 MHz with a level translator on the STM32F2xx simply isn’t possible. Somewhere between 15 – 20 MHz for the SDIO is probably where you’ll end up.
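As a rough sketch of that timing arithmetic: the half-period of the SD clock has to cover two passes through the translator, the card’s output delay and the processor’s setup time. The function names below are made up for illustration, and the delay figures in the usage note are placeholders rather than datasheet values, so substitute the numbers for your own parts. The divider relation SDIO_CK = SDIOCLK / (CLKDIV + 2), with SDIOCLK at 48 MHz, is taken from my reading of the reference manual and is worth checking against your revision.

```c
#include <stdint.h>

#define SDIOCLK_KHZ 48000u   /* SDIO adapter clock, assumed to be 48 MHz */
#define SDIO_CLKDIV_MAX 255u /* CLKDIV is assumed to be an 8-bit field */

/* Highest SD clock (in kHz) whose half-period still covers the round trip:
 * clock out through the translator, card output delay, data back through
 * the translator, plus the processor's input setup time.
 * All delays are in picoseconds. */
uint32_t sd_max_clock_khz(uint32_t translator_ps, uint32_t card_ps,
                          uint32_t setup_ps)
{
    uint32_t round_trip_ps = 2u * translator_ps + card_ps + setup_ps;

    if (round_trip_ps == 0u) {
        return UINT32_MAX; /* no delay at all: no limit from this analysis */
    }
    /* f = 1 / (2 * half_period); 1e9 / ps gives kHz */
    return 1000000000u / (2u * round_trip_ps);
}

/* Smallest CLKDIV for which SDIOCLK / (CLKDIV + 2) does not exceed limit_khz. */
uint32_t sdio_clkdiv_for(uint32_t limit_khz)
{
    uint32_t div = 0u;

    while (div < SDIO_CLKDIV_MAX && SDIOCLK_KHZ / (div + 2u) > limit_khz) {
        div++;
    }
    return div;
}
```

With placeholder delays of 6 ns per translator pass and 14 ns for the card, sd_max_clock_khz(6000, 14000, 0) comes out a little over 19 MHz, and sdio_clkdiv_for() then picks a divider of 1, which is 16 MHz.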

Busy Signalling and Data Transfer

The STM32F2xx SDIO port contains hardware support for the card to signal busy. If the card cannot accept data it indicates this by pulling its Data0 line low. Once it’s able to accept data again, it sets Data0 high and data transfer can continue. For the most part the STM32F2xx SD interface handles this pretty seamlessly, pausing when the card is busy and continuing when it’s able. It’s quite transparent to the programmer. One intermittent exception I’ve found is when the processor is about to start sending data to the card. If the card is signalling ‘busy’ at the time the processor wants to commence sending data, sometimes (not always) the processor attempts the data transmit, stops (when it realises the card is busy), and then generates a CRC error (SDIO_STA register bit 1). This error halts the entire SD transfer. This behaviour is intermittent: usually the processor handles a ‘busy’ at the beginning of a transfer normally, but sometimes it results in an SDIO port CRC error.

The solution to this problem, quite obviously, is to hold off initiating a data transfer until the card is not busy. There are a couple of ways of accomplishing this. One is to tie the SDIO Data0 line to a free GPIO pin, which you can then read to ensure the pin is high before kicking off a data transfer. Another is to poll the card (this is what I do). Probably the easiest thing to do is to send the card a CMD13 “SEND_STATUS” command. The response to this command is the 32-bit “R1” response, which is the card’s “card status”. Bit 8 will be high if the card is ready for data, low if the card is busy. Just sit in a loop, sending the card CMD13 commands and checking bit 8 of the response until it’s high.
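A minimal sketch of that polling loop might look like the following. The sd_send_status_fn hook is hypothetical: it stands in for whatever routine in your driver issues CMD13 and returns the R1 card status. The loop is bounded so it cannot hang forever, and fake_send_status() is only a host-side stand-in for a card, there so the loop can be exercised without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* READY_FOR_DATA is bit 8 of the card status returned in the R1 response. */
#define SD_CARD_STATUS_READY_FOR_DATA (UINT32_C(1) << 8)

/* Hypothetical hook: issues CMD13 and stores the card status.
 * Returns false if the command itself failed (timeout, CRC error, ...). */
typedef bool (*sd_send_status_fn)(uint32_t *card_status);

/* Poll CMD13 until the card reports READY_FOR_DATA.
 * Gives up after max_polls attempts or on the first command failure. */
bool sd_wait_ready(sd_send_status_fn send_status, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        uint32_t status = 0;

        if (!send_status(&status)) {
            return false;
        }
        if ((status & SD_CARD_STATUS_READY_FOR_DATA) != 0u) {
            return true;
        }
    }
    return false;
}

/* Host-side stand-in for a card that stays busy for a number of polls. */
static uint32_t fake_busy_polls;

static bool fake_send_status(uint32_t *card_status)
{
    if (fake_busy_polls > 0u) {
        fake_busy_polls--;
        *card_status = 0u; /* READY_FOR_DATA clear: still busy */
    } else {
        *card_status = SD_CARD_STATUS_READY_FOR_DATA;
    }
    return true;
}
```

On the target you would pass your own CMD13 routine in place of fake_send_status() and only start the data transfer once sd_wait_ready() returns true.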

SDIO_STA register TXACT bit

Be careful interpreting the TXACT bit (bit 12) in the STM32F2xx SDIO_STA register. The documentation says:

Bit 12 TXACT: Data transmit in progress

This would imply the bit is set while a data transfer is in progress, and clear when it’s not. You might think you can look at this bit to determine if the SDIO port has finished transmitting data to the card, so you know when you can start transmitting your next chunk of data to the card.

I’ve found this bit only behaves that way for single-block writes (CMD24 commands). For multi-block writes (CMD25 commands) I’ve seen this bit remain set even after the SDIO has sent all the data to the card and the SDIO_DCOUNT register is zero. If the card is still in its receive-data state (state 6) at the completion of the data transfer, the TXACT bit may still be set.

There are other ways to know if the SDIO port has finished transmitting its data. If you’re using DMA (and you should be) then you can check your DMA NDTR register to confirm it’s zero. The SDIO_STA register DATAEND bit (bit 8) will have been set (and generated an interrupt if you have it enabled) at the completion of the data transfer. And of course the SDIO_DCOUNT register will be zero. You don’t need to rely on the TXACT bit, and I suggest you don’t because it can be a bit misleading, at least the way it’s currently documented.
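Those three checks can be combined into one small predicate. This is only a sketch: the function and macro names are invented for illustration, and the caller is assumed to pass in freshly read values of SDIO_STA, SDIO_DCOUNT and the DMA stream’s NDTR register. TXACT is deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

#define STA_DATAEND_BIT (UINT32_C(1) << 8)  /* SDIO_STA bit 8 */
#define STA_TXACT_BIT   (UINT32_C(1) << 12) /* SDIO_STA bit 12, not used for the decision */

/* True once the SDIO port has finished sending the data block(s):
 * DATAEND is set, the data counter has run down and the DMA has
 * nothing left to move. TXACT is not consulted. */
bool sdio_tx_complete(uint32_t sta, uint32_t dcount, uint32_t dma_ndtr)
{
    return (sta & STA_DATAEND_BIT) != 0u && dcount == 0u && dma_ndtr == 0u;
}
```

Note that the predicate returns true even when TXACT is still set, which is the multi-block case described above.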

CRC Error with CMD5

The SDIO peripheral checks a CRC on every response, regardless of whether a CRC is actually present or not. This results in the SDIO hardware generating CRC errors for commands whose responses don’t contain a CRC. Be aware that when you send CMD5 to the card, the response does not contain a CRC. The SDIO hardware will generate a CRC error in this case: the CCRCFAIL bit in the SDIO_STA register will be set, and it may generate an interrupt if you have that interrupt enabled in the SDIO_MASK register. This reported CRC error is wrong, so make sure your software is prepared to accommodate this “special case” for CMD5.
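One way to handle the special case is to classify the command status in a single place and treat CCRCFAIL after CMD5 as “response received”. The sketch below is illustrative only: the enum and function names are invented, and the bit positions assumed are CCRCFAIL = bit 0, CTIMEOUT = bit 2 and CMDREND = bit 6 of SDIO_STA. The caller still has to clear the flag in hardware afterwards.

```c
#include <stdint.h>

#define STA_CCRCFAIL_BIT (UINT32_C(1) << 0) /* command response CRC failed */
#define STA_CTIMEOUT_BIT (UINT32_C(1) << 2) /* command response timeout */
#define STA_CMDREND_BIT  (UINT32_C(1) << 6) /* response received, CRC passed */

typedef enum {
    CMD_RESP_PENDING,
    CMD_RESP_OK,
    CMD_RESP_TIMEOUT,
    CMD_RESP_CRC_ERROR
} cmd_resp_result;

/* Classify the outcome of a command from a snapshot of SDIO_STA.
 * The CMD5 response carries no CRC, so a CCRCFAIL after CMD5 is
 * treated as a normally received response rather than an error. */
cmd_resp_result sdio_classify_response(uint8_t cmd_index, uint32_t sta)
{
    if ((sta & STA_CTIMEOUT_BIT) != 0u) {
        return CMD_RESP_TIMEOUT;
    }
    if ((sta & STA_CCRCFAIL_BIT) != 0u) {
        return (cmd_index == 5u) ? CMD_RESP_OK : CMD_RESP_CRC_ERROR;
    }
    if ((sta & STA_CMDREND_BIT) != 0u) {
        return CMD_RESP_OK;
    }
    return CMD_RESP_PENDING;
}
```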

Update Dec 2011: This is now mentioned in the STM32F4xx errata, however the STM32F2xx errata still does not mention this.

Standard Peripherals Library SD Card Software

The Standard Peripherals Library for the STM32F2xx is a set of example software routines that can be downloaded from the ST website. If you’re using this processor the library is very valuable. Aside from providing a bunch of examples for using different peripherals and features of the chip, it also provides a standard set of definitions, some example start-up code, and more.

However, this doesn’t mean you should blindly use this code in a production product. ST tries to make it clear this code is “example” code, and in many cases that’s all it is. Certainly this statement is true for the SD Card examples. You need to go through the code carefully and make sure it meets your requirements, or modify it to suit your needs if it doesn’t. It’s a great starting point, but don’t assume it’s anything more than just a starting point.

With regards to the SD Card example code in there, I’ve come across a few things worth noting.

Timeouts

The SD Card specification suggests timeout values for various operations, and the STM32F2xx SDIO peripheral contains a hardware timer you can use to implement this. It’s a simple clock counter. Alternatively you can use one of the many general-purpose timer/counters the processor provides.

The example SDIO code sometimes uses the SDIO “clock counter” timeout; when it does it sets it to its maximum value. That’s not very useful: yes, it will eventually time out, but not for a really long time. Usually when the example code implements a timeout, it uses a simple loop counter, for example:

static SD_Error CmdError(void)
{
  SD_Error errorstatus = SD_OK;
  uint32_t timeout;
 
  timeout = SDIO_CMD0TIMEOUT; /*!< 10000 */
 
  while ((timeout > 0) && (SDIO_GetFlagStatus(SDIO_FLAG_CMDSENT) == RESET))
  {
    timeout--;
  }

The problem with this is that you have no idea how long the timeout actually lasts. The duration depends on the core clock speed and on how the compiler optimizes the loop, so it could be over almost immediately or it could take a long time. In practice I’ve found these timeouts expiring very prematurely, resulting in the functions returning errors before the SDIO transaction has had a chance to complete. There are also many places in the example code where there’s no timeout implemented at all, meaning the code can potentially hang in those locations.
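A more predictable approach is to measure elapsed time against a free-running tick counter (SysTick or a general-purpose timer, for example) instead of counting loop iterations. The helper below is a minimal sketch with an invented name; it relies on unsigned wrap-around arithmetic so it keeps working when the counter rolls over. The get_ticks() call in the usage comment is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True once at least limit_ticks have elapsed since start_ticks.
 * The unsigned subtraction gives the right answer across a counter
 * roll-over, provided the real elapsed time is below 2^32 ticks. */
bool timeout_expired(uint32_t start_ticks, uint32_t now_ticks,
                     uint32_t limit_ticks)
{
    return (uint32_t)(now_ticks - start_ticks) >= limit_ticks;
}

/* Intended use in a wait loop, with get_ticks() as a hypothetical
 * free-running tick source:
 *
 *   uint32_t start = get_ticks();
 *   while (SDIO_GetFlagStatus(SDIO_FLAG_CMDSENT) == RESET) {
 *       if (timeout_expired(start, get_ticks(), limit)) {
 *           return a timeout error here;
 *       }
 *   }
 */
```

The timeout is now expressed in ticks of a known period, so it stays the same regardless of compiler settings.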

4 GB Maximum Card Size

The SDIO example code uses an unsigned 32-bit variable (a uint32_t) for the card address. For example:

SD_ReadBlock (uint8_t *readbuff, uint32_t ReadAddr, uint16_t BlockSize)

A little math: 2^32 bytes = 4 GB. Beyond that this address variable overflows. SD Cards can be up to 2 TB in size (2^32 x 512 bytes). Whether this limitation is a problem for you depends upon what kind of cards you intend to support. You may want to consider changing the following 5 functions to use a “sector” parameter instead of an “address” parameter. Given that modern large cards all use 512-byte sectors (or blocks) this allows the code to match up with how the card behaves.

SD_ReadBlock (uint8_t *readbuff, uint32_t ReadAddr, uint16_t BlockSize)

SD_ReadMultiBlocks (uint8_t *readbuff, uint32_t ReadAddr, uint16_t BlockSize, uint32_t NumberOfBlocks)

SD_WriteBlock(uint8_t *writebuff, uint32_t WriteAddr, uint16_t BlockSize)

SD_WriteMultiBlocks (uint8_t *writebuff, uint32_t WriteAddr, uint16_t BlockSize, uint32_t NumberOfBlocks)

SD_Erase(uint32_t startaddr, uint32_t endaddr)
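If you do switch to a sector parameter, the conversion to the command argument can live in one helper. High-capacity cards take a block number as the read/write command argument, while standard-capacity cards take a byte address. The sketch below uses invented names and assumes the caller already knows which kind of card it is talking to.

```c
#include <stdbool.h>
#include <stdint.h>

#define SD_SECTOR_SIZE 512u

/* Convert a 512-byte sector number into the 32-bit argument for a
 * read/write command. High-capacity cards are block addressed, so the
 * sector number is used directly. Standard-capacity cards are byte
 * addressed, so the sector must fit once multiplied by 512.
 * Returns false if the sector cannot be addressed on this card type. */
bool sd_sector_to_cmd_arg(uint32_t sector, bool high_capacity, uint32_t *arg)
{
    if (high_capacity) {
        *arg = sector;
        return true;
    }
    if (sector > UINT32_MAX / SD_SECTOR_SIZE) {
        return false; /* byte address would overflow 32 bits */
    }
    *arg = sector * SD_SECTOR_SIZE;
    return true;
}

/* Byte offset of a sector, for code that needs it (file system layers,
 * logging); 64 bits wide so it cannot overflow. */
uint64_t sd_sector_to_byte_offset(uint32_t sector)
{
    return (uint64_t)sector * SD_SECTOR_SIZE;
}
```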

SD Card Initialisation

This was mentioned earlier, but it bears repeating: card initialization seems to be more reliable if the SDIO pins are placed in a high “idle” state before the pins are switched to the SDIO peripheral “alternate function”. The example code does not do this.

Blocksize

I haven’t personally experienced this, but it’s been reported on the forums that some smaller cards (e.g. 2 GB) have problems because the example read and write functions do not issue a CMD16 blocksize command to the card before performing the transaction. SDHC cards have a fixed blocksize of 512 bytes and do not require the CMD16 command, but non-SDHC cards do need that command.

SDIO_SetPowerState() function

Many thanks to Brad & Andrew over at the STM32 forum for finding this one. The issue is that during SDIO port power-up, which is part of the SDIO initialisation routines, the power-up may not always succeed. If you find the function SDIO_SetPowerState() contains this:

  SDIO->POWER &= PWR_PWRCTRL_MASK;
  SDIO->POWER |= SDIO_PowerState;

then try changing those two lines to this:

if (SDIO_PowerState == SDIO_PowerState_ON)
  SDIO->POWER |= SDIO_PowerState;
else
  SDIO->POWER &= PWR_PWRCTRL_MASK;

I believe the problem with the original code is described in the documentation for the SDIO_POWER register:

Note: At least seven HCLK clock periods are needed between two write accesses to this register.
Note: After a data write, data cannot be written to this register for three SDIOCLK (48 MHz) clock periods plus two PCLK2 clock periods.

You can see the original code does two writes to the register in quick succession. That would be bad. This code change helped things for me.
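Another way to end up with a single write is to build the new register value in a local variable and store it once. This is an untested alternative sketch, not the fix described above. It assumes the power control field is the bottom two bits of SDIO_POWER, and the helper name is invented.

```c
#include <stdint.h>

#define SDIO_POWER_PWRCTRL_BITS UINT32_C(0x3) /* assumed: PWRCTRL is bits [1:0] */

/* Compute the new SDIO_POWER value from the current one, so the
 * register itself only needs to be written a single time. */
uint32_t sdio_power_value(uint32_t current, uint32_t power_state)
{
    return (current & ~SDIO_POWER_PWRCTRL_BITS)
         | (power_state & SDIO_POWER_PWRCTRL_BITS);
}

/* On the target this would be used as one read and one write:
 *   SDIO->POWER = sdio_power_value(SDIO->POWER, SDIO_PowerState);
 */
```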

SD_SendSDStatus() function

Updated 7 Dec 2011. This was a tough one to reliably reproduce and hence to find. In the STM32F2xx SD code, the function SD_Init() calls the function SD_GetCardStatus(), which in turn calls the function SD_SendSDStatus(), passing it a pointer to a buffer, like so:

errorstatus = SD_SendSDStatus((uint32_t *)SDSTATUS_Tab);

The purpose of SD_SendSDStatus() is not immediately obvious from the code (read the comment block for SD_GetCardStatus() if you need a good laugh), but what it does is send an ACMD13 command to the card. It then retrieves the 512-bit status that the card sends in reply and writes it into the buffer. The problem? This:

static uint8_t SDSTATUS_Tab[16];

By my math, 16 bytes = 128 bits. So what happens is that SD_SendSDStatus() writes 64 bytes of data into a 16-byte buffer, resulting in a big buffer overrun and a bunch of innocent SRAM locations being stomped on. Which creates all manner of flaky problems. The simple and obvious fix is to increase the size of SDSTATUS_Tab, although a more robust solution would include a rewrite of SD_SendSDStatus().
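For the simple fix, it helps to derive the buffer size from the size of the reply instead of hard-coding a number. The sketch below uses an invented buffer name (sd_status_words) rather than the library’s, and declares it as 32-bit words so it can be handed to a function expecting a uint32_t pointer without a cast. The compile-time check needs a C11 compiler.

```c
#include <stdint.h>

#define SD_STATUS_BITS  512u                  /* size of the ACMD13 reply */
#define SD_STATUS_BYTES (SD_STATUS_BITS / 8u) /* 64 bytes */

/* Word-aligned buffer large enough for the whole SD status. */
static uint32_t sd_status_words[SD_STATUS_BYTES / sizeof(uint32_t)];

/* Fails the build if the buffer is ever shrunk below the reply size. */
_Static_assert(sizeof(sd_status_words) >= SD_STATUS_BYTES,
               "SD status buffer is too small for the ACMD13 reply");
```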

Summary

It should be clear by now that the Standard Peripherals Library SDIO code cannot reliably be used as-is. It has far too many limitations, ranging from a serious lack of error handling (in places it even generates errors of its own) to outright functional restrictions. It’s a good starting point to show how things can work, but for any real product you have no choice except to grab a copy of the SD Card specification and get busy. With that said, I’ve found the SD Card interface on the STM32F2xx to perform pretty well and the library code to be a big time-saver. Just don’t expect it to be production-ready code.

More information can be found in the posting: SDIO Interface Part 2.