Frank's Random Wanderings

STM32F2xx SDIO SD Card Interface

The STM32F2xx has a great SD Card interface. It’s a true 4-bit parallel interface, and in general it works pretty well. I have come across a few, fairly minor but still significant, considerations when using the interface that I thought I’d pass on.

Initialisation Sequence

Proper initialization of the SD card is important, because an SD card has no reset line and it won’t behave if its internal state machine wanders off to somewhere you don’t expect. It’s a good idea to have some way of removing power from the card (a p-channel MOSFET, for example) so you can reset it if it goes crazy on you.

At power-up it seems to be helpful to have all signals in the idle state, which is high. This can be done by first configuring the SDIO pins as GPIO outputs driven high. Then you can switch those pins to their alternate function for the SDIO port.
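Here is a minimal sketch of that two-step sequence. It assumes the usual STM32F2xx GPIO register semantics (MODER with 2 bits per pin, AFR with 4 bits per pin). The struct is a simplified stand-in so the logic can be tried on a PC, and the function names are my own, not from the ST library. On the STM32F2xx the SDIO pins are, as far as I know, PC8 to PC12 plus PD2 on alternate function 12, but check the datasheet for your part.

```c
#include <stdint.h>

/* Simplified stand-in for a GPIO port: only the fields used here, NOT the
 * real register layout. On hardware use GPIO_TypeDef from the ST header. */
typedef struct {
    volatile uint32_t MODER;   /* 2 bits per pin: 01 = output, 10 = alternate function */
    volatile uint32_t ODR;     /* output data, 1 bit per pin */
    volatile uint32_t AFR[2];  /* 4 bits per pin: AFR[0] = pins 0-7, AFR[1] = pins 8-15 */
} gpio_regs_t;

/* Step 1: drive every pin in pin_mask high as a plain GPIO output. */
void sdio_pins_idle_high(gpio_regs_t *gpio, uint16_t pin_mask)
{
    gpio->ODR |= pin_mask;  /* set the level first so the pins never drive low */
    for (unsigned pin = 0; pin < 16; pin++) {
        if (pin_mask & (1u << pin)) {
            uint32_t moder = gpio->MODER & ~(3u << (pin * 2));
            gpio->MODER = moder | (1u << (pin * 2));
        }
    }
}

/* Step 2: hand the same pins over to the peripheral (alternate function af). */
void sdio_pins_to_alternate(gpio_regs_t *gpio, uint16_t pin_mask, uint8_t af)
{
    for (unsigned pin = 0; pin < 16; pin++) {
        if (pin_mask & (1u << pin)) {
            unsigned shift = (pin % 8) * 4;
            uint32_t afr = gpio->AFR[pin / 8] & ~(0xFu << shift);
            gpio->AFR[pin / 8] = afr | ((uint32_t)(af & 0xFu) << shift);

            uint32_t moder = gpio->MODER & ~(3u << (pin * 2));
            gpio->MODER = moder | (2u << (pin * 2));
        }
    }
}
```

For port C you would call sdio_pins_idle_high(port, 0x1F00), wait for the card supply to settle, then call sdio_pins_to_alternate(port, 0x1F00, 12).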

Don’t Stop The Clock

The SD Card specification allows for the stopping of the clock. This can be helpful at times. The STM32F2xx allows for this, but when I first tried it, it didn’t work: I wasn’t able to reestablish communications with the card afterwards. Soon afterwards ST published an erratum that said this feature doesn’t work. I don’t know if they plan to fix this, and I personally don’t care too much, since I don’t plan on using it anyway. But if your application needs this, you should probably check with ST what their intentions are.

Voltage Level Translators & SD Card Timing

The STM32F2xx can be run at 1.8V, however the SD Card is a 3.3V device (most of them anyway). Ideally the processor would have a Vcc pin specifically for the SDIO pins (some processors do have this) but the STM32F2xx does not, so in this example its SDIO pins will be at 1.8V. This is fine for reading from the SD card, because the STM32F2xx is 3.3V tolerant on its inputs. But it’s no good for writing to the card, because the SD card won’t recognize 1.8V as being a logic “high”. A level translator is required.

Some processors provide a “direction” pin as part of their SDIO interface; this can be used to drive an external low-cost bidirectional buffer. The STM32F2xx does not provide this pin, so an automatic-switching bidirectional buffer is required. ST has one; however, the most commonly used buffer appears to be the Texas Instruments TXS0108. There are several others.

Using an external buffer substantially affects the timing for SD card reads. Some processors provide a “clock input” pin as part of their SDIO interface, which is the clock used for read cycles. The STM32F2xx does not implement this either – the SDIO clock output by the STM32F2xx is the clock used for both writes and reads.

What this means is that for an SD read cycle, the clock “arrives” at the processor much earlier than the data does. Consider the read cycle for a moment. The processor outputs a rising edge of the clock, and the processor then expects to clock in data on the next falling edge of that clock. Once the SD card sees the rising edge there will be a delay within the card before it outputs the data. So the sequence looks like this:

Rising edge output from processor -> clock delayed through level translator en route to card -> delay due to SD card response time and then data is placed on bus -> data delayed through level translator en route to processor -> processor reads data on falling edge of clock

With a 25 MHz SD clock, assuming a perfect 50% duty cycle, there’s only 20 ns between the rising and the falling edge of the clock. In those 20 ns we need 2 trips through the level translator (clock going out and data coming in) plus the delay due to the SD card, plus any setup time for the processor read (which is zero thankfully). 20 ns is not sufficient – the level translator simply isn’t fast enough, nor necessarily is the card.

You need to do the timing analysis yourself for your exact components and system, but I think you’ll find that running at 25 MHz with a level translator on the STM32F2xx simply isn’t possible. Somewhere between 15 – 20 MHz for the SDIO is probably where you’ll end up.
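The arithmetic behind that estimate can be sketched as follows. The budget is: two trips through the level translator, plus the card’s output delay, plus the processor’s setup time, all of which must fit into half a clock period (assuming an ideal 50% duty cycle). The delay figures in the usage note are illustrative placeholders only, not datasheet values, and the function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Delay from the rising clock edge leaving the processor until valid data
 * is back at the processor pins: clock out through the translator, card
 * response time, data back through the translator, plus input setup time. */
uint32_t read_path_delay_ns(uint32_t translator_ns, uint32_t card_ns,
                            uint32_t setup_ns)
{
    return 2u * translator_ns + card_ns + setup_ns;
}

/* True if the path delay fits between the rising and falling clock edge. */
bool read_timing_ok(uint32_t clock_hz, uint32_t path_delay_ns)
{
    uint32_t half_period_ns = 500000000u / clock_hz;  /* 1e9 ns / (2 * f) */
    return path_delay_ns <= half_period_ns;
}

/* Highest clock whose half period still covers the path delay. */
uint32_t max_clock_hz(uint32_t path_delay_ns)
{
    return 500000000u / path_delay_ns;
}
```

With, say, 6 ns per translator pass and 14 ns for the card, the path delay is 26 ns. That fails at 25 MHz (20 ns half period), passes at 16 MHz, and puts the ceiling a little above 19 MHz. Plug in the worst-case numbers for your own parts.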

Busy Signalling and Data Transfer

The STM32F2xx SDIO port contains hardware support for the card to signal busy. If the card cannot accept data it indicates this by pulling its Data0 line low. Once it’s able to accept data again, it sets Data0 high and data transfer can continue. For the most part the STM32F2xx SD interface handles this pretty seamlessly, pausing when the card is busy and continuing when it’s able. It’s quite transparent to the programmer. One intermittent exception I’ve found is when the processor is about to start sending data to the card. If the card is signalling ‘busy’ at the time the processor wants to commence sending data, sometimes (not always) the processor attempts the data transmit, stops (when it realises the card is busy), and then generates a data CRC error (DCRCFAIL, SDIO_STA register bit 1). This error halts the entire SD transfer. Usually the processor handles a ‘busy’ at the beginning of a transfer normally; only occasionally does it end in this CRC error.

The solution to this problem, quite obviously, is to hold off initiating a data transfer until the card is no longer busy. There are a couple of ways of accomplishing this. One is to tie the SDIO Data0 line to a free GPIO pin, which you can then read to ensure the pin is high before kicking off a data transfer. Another is to poll the card (this is what I do). Probably the easiest thing to do is to send the card a CMD13 “SEND_STATUS” command. The response to this command is the 32-bit “R1” response, which is the card’s “card status”. Bit 8 will be high if the card is ready for data, low if the card is busy. Just sit in a loop, sending the card CMD13 commands and checking bit 8 of the response until it’s high.
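A minimal sketch of that polling loop follows. The function that actually sends CMD13 is passed in as a callback, because it depends on your driver; the callback type and all names here are hypothetical. A small simulated card is included purely so the loop can be exercised off-target.

```c
#include <stdbool.h>
#include <stdint.h>

#define SD_R1_READY_FOR_DATA (1u << 8)  /* card status bit 8 */

/* Hypothetical driver hook: sends CMD13 and stores the R1 card status.
 * Returns false if the command itself failed (timeout, CRC error, ...). */
typedef bool (*sd_send_status_fn)(void *ctx, uint32_t *card_status);

/* Poll CMD13 until the card reports ready, giving up after max_polls. */
bool sd_wait_ready(sd_send_status_fn send_status, void *ctx, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        uint32_t status;
        if (!send_status(ctx, &status)) {
            return false;               /* command error: let the caller decide */
        }
        if (status & SD_R1_READY_FOR_DATA) {
            return true;
        }
    }
    return false;                       /* still busy after max_polls attempts */
}

/* Simulated card for trying the loop on a PC: busy for a few polls. */
typedef struct {
    uint32_t busy_polls_left;
} fake_card_t;

bool fake_send_status(void *ctx, uint32_t *card_status)
{
    fake_card_t *card = ctx;
    if (card->busy_polls_left > 0) {
        card->busy_polls_left--;
        *card_status = 7u << 9;                           /* state 7 (prg), not ready */
    } else {
        *card_status = (4u << 9) | SD_R1_READY_FOR_DATA;  /* state 4 (tran), ready */
    }
    return true;
}
```

In a real driver it is better to bound the loop by elapsed time rather than by a poll count, since the duration of one CMD13 exchange depends on the bus clock.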

SDIO_STA register TXACT bit

Be careful interpreting the TXACT bit (bit 12) in the STM32F2xx SDIO_STA register. The documentation says:

Bit 12 TXACT: Data transmit in progress

This would imply the bit is set while a data transfer is in progress, and clear when it’s not. You might think you can look at this bit to determine if the SDIO port has finished transmitting data to the card, so you know when you can start transmitting your next chunk of data to the card.

I’ve found this bit only behaves that way for single-block writes (CMD24 commands). For multi-block writes (CMD25 commands) I’ve seen this bit remain set even after the SDIO has sent all the data to the card and the SDIO_DCOUNT register is zero. If the card is still in its receive-data state (state 6) at the completion of the data transfer, the TXACT bit may still be set.

There are other ways to know if the SDIO port has finished transmitting its data. If you’re using DMA (and you should be) then you can check your DMA NDTR register to confirm it’s zero. The SDIO_STA register DATAEND bit (bit 8) will have been set (and generated an interrupt if you have it enabled) at the completion of the data transfer. And of course the SDIO_DCOUNT register will be zero. You don’t need to rely on the TXACT bit, and I suggest you don’t because it can be a bit misleading, at least the way it’s currently documented.
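Those three checks can be combined into one small helper. This is only a sketch: it takes the register values as plain arguments (so it can be tested without hardware), the names are my own, and the set of error flags checked is an assumption you should adjust to your needs. TXACT is deliberately not consulted.

```c
#include <stdint.h>

#define STA_DCRCFAIL (1u << 1)   /* data CRC failure */
#define STA_DTIMEOUT (1u << 3)   /* data timeout */
#define STA_TXUNDERR (1u << 4)   /* transmit FIFO underrun */
#define STA_DATAEND  (1u << 8)   /* data end */

typedef enum {
    SDIO_TX_BUSY,
    SDIO_TX_DONE,
    SDIO_TX_ERROR
} sdio_tx_state_t;

/* Classify a write transfer from SDIO_STA, SDIO_DCOUNT and the DMA stream's
 * NDTR value. TXACT (bit 12) is ignored on purpose. */
sdio_tx_state_t sdio_tx_state(uint32_t sta, uint32_t dcount, uint32_t dma_ndtr)
{
    if (sta & (STA_DCRCFAIL | STA_DTIMEOUT | STA_TXUNDERR)) {
        return SDIO_TX_ERROR;
    }
    if ((sta & STA_DATAEND) && dcount == 0 && dma_ndtr == 0) {
        return SDIO_TX_DONE;
    }
    return SDIO_TX_BUSY;
}
```

On the target you would call it as sdio_tx_state(SDIO->STA, SDIO->DCOUNT, stream->NDTR), where stream is whichever DMA stream you have assigned to the SDIO.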

CRC Error with CMD5

The SDIO peripheral calculates a CRC regardless of whether a CRC is actually present or not. This results in the SDIO hardware generating CRC errors for commands whose response doesn’t contain a CRC. Be aware that when you send CMD5 to the card, the return data does not contain a CRC. The SDIO hardware will generate a CRC error in this case: the CCRCFAIL bit in the SDIO_STA register will be set and may generate an interrupt if you have the interrupt enabled in the SDIO_MASK register. This reported CRC error is wrong, so make sure your software is prepared to accommodate this “special case” for CMD5.
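One way to handle the special case is to filter the flag by command index when you evaluate the response. A minimal sketch, with hypothetical names; the spurious flag still has to be cleared afterwards through SDIO_ICR like any other status flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define SDIO_STA_CCRCFAIL (1u << 0)   /* command response CRC check failed */
#define SD_CMD5           5u          /* response carries no CRC */

/* Returns true only for a genuine command CRC failure. For CMD5 the
 * CCRCFAIL flag is expected and therefore ignored. */
bool sd_cmd_crc_failed(uint8_t cmd_index, uint32_t sta)
{
    if (cmd_index == SD_CMD5) {
        return false;
    }
    return (sta & SDIO_STA_CCRCFAIL) != 0;
}
```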

Update Dec 2011: This is now mentioned in the STM32F4xx errata, however the STM32F2xx errata still does not mention this.

Standard Peripherals Library SD Card Software

The Standard Peripherals Library for the STM32F2xx is a set of example software routines that can be downloaded from the ST website. If you’re using this processor the library is very valuable. Aside from providing a bunch of examples for using different peripherals and features of the chip, it also provides a standard set of definitions, some example start-up code, and more.

This doesn’t mean you should blindly use this code for your production product however. ST tries to make it clear this code is “example” code, and in many cases that’s all it is. Certainly this statement is true for the SD Card examples. You need to go through the code carefully and make sure it meets your requirements, or modify it to suit your needs if it doesn’t. It’s a great starting point, but don’t assume it’s anything more than just a starting point.

With regards to the SD Card example code in there, I’ve come across a few things worth noting.

Timeouts

The SD Card specification suggests timeout values for various operations, and the STM32F2xx SDIO peripheral contains a hardware timer you can use to implement this. It’s a simple clock counter. Alternatively you can use one of the many general-purpose timer/counters the processor provides.

The example SDIO code sometimes uses the SDIO “clock counter” timeout; when it does it sets it to its maximum value. That’s not very useful: yes, it will eventually time out, but not for a really long time. Usually when the example code implements a timer, it uses a simple loop counter, for example:

The problem with this is you’ve no idea what the value of the timeout is. A compiler can potentially optimize it away to nothing, or it could take a long time. In practice I’ve found these timeouts expiring very prematurely, resulting in the functions returning errors before the SDIO transaction has had a chance to complete. There are also many places in the example code where there’s no timeout implemented at all, meaning the code can potentially hang in those locations.
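One way to get a timeout of known duration is to measure against a free-running tick counter (for example a millisecond count maintained by the SysTick interrupt) instead of counting loop iterations. The helpers below are a sketch with made-up names; they only assume a 32-bit counter that wraps around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Ticks elapsed between two readings of a free-running 32-bit counter.
 * Unsigned subtraction gives the right answer across a counter wrap. */
uint32_t ticks_elapsed(uint32_t start, uint32_t now)
{
    return now - start;
}

/* True once at least timeout_ticks have passed since start. */
bool timeout_expired(uint32_t start, uint32_t now, uint32_t timeout_ticks)
{
    return ticks_elapsed(start, now) >= timeout_ticks;
}

/* Typical use, where get_ticks() is a hypothetical function returning the
 * current tick count and operation_done() is whatever flag test you need:
 *
 *     uint32_t start = get_ticks();
 *     while (!operation_done()) {
 *         if (timeout_expired(start, get_ticks(), 250)) {
 *             return SD_ERROR;   // report the timeout to the caller
 *         }
 *     }
 */
```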

4 GB Maximum Card Size

The SDIO example code uses an unsigned 32-bit variable (a uint32_t) for the card address. For example:

SD_ReadBlock (uint8_t *readbuff, uint32_t ReadAddr, uint16_t BlockSize)

A little math: 2^32 bytes = 4 GB. Beyond that this address variable overflows. SD Cards can be up to 2 TB in size (2^32 x 512 bytes). Whether this limitation is a problem for you depends upon what kind of cards you intend to support. You may want to consider changing the following 5 functions to use a “sector” parameter instead of an “address” parameter. Given that modern large cards all use 512-byte sectors (or blocks) this allows the code to match up with how the card behaves.

SD_ReadBlock (uint8_t *readbuff, uint32_t ReadAddr, uint16_t BlockSize)

SD_ReadMultiBlocks (uint8_t *readbuff, uint32_t ReadAddr, uint16_t BlockSize, uint32_t NumberOfBlocks)

SD_WriteBlock(uint8_t *writebuff, uint32_t WriteAddr, uint16_t BlockSize)

SD_WriteMultiBlocks (uint8_t *writebuff, uint32_t WriteAddr, uint16_t BlockSize, uint32_t NumberOfBlocks)

SD_Erase(uint32_t startaddr, uint32_t endaddr)
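A sketch of what the sector-based approach might look like at the point where the command argument is built. High-capacity cards take a block number as the argument, while standard-capacity cards take a byte address, so only the latter needs the multiplication (and an overflow check). The function name and signature are my own, not part of the ST library.

```c
#include <stdbool.h>
#include <stdint.h>

#define SD_SECTOR_SIZE 512u

/* Convert a sector number into the 32-bit argument for a read/write command.
 * Returns false if the sector cannot be addressed on this type of card. */
bool sd_sector_to_cmd_arg(uint32_t sector, bool high_capacity, uint32_t *arg)
{
    if (high_capacity) {
        *arg = sector;                      /* block-addressed card */
        return true;
    }
    if (sector > UINT32_MAX / SD_SECTOR_SIZE) {
        return false;                       /* byte address would overflow */
    }
    *arg = sector * SD_SECTOR_SIZE;         /* byte-addressed card */
    return true;
}
```

With this in place the public functions can take a uint32_t sector and cover the full 2 TB range without needing a 64-bit address.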

SD Card Initialisation

This was mentioned earlier, but just to reiterate. Card initialization seems to be more reliable if the SDIO pins are placed in a high “idle” state before the pins are switched to the SDIO peripheral “alternate function”. The example code does not do this.

Blocksize

I haven’t personally experienced this, but it’s been reported on the forums that some smaller cards (e.g. 2 GB) have problems because the example read and write functions do not issue a CMD16 blocksize command to the card before performing the transaction. SDHC cards have a fixed blocksize of 512 bytes and do not require the CMD16 command, however non-SDHC cards do need that command.

SDIO_SetPowerState() function

Many thanks to Brad & Andrew over at the STM32 forum for finding this one. The issue is that during SDIO port power-up, which is part of the SDIO initialisation routines, the power-up may not always succeed. If you find the function SDIO_SetPowerState() contains this:

then try changing those two lines to this:

I believe the problem with the original code is described in the documentation for the SDIO_POWER register:

Note: At least seven HCLK clock periods are needed between two write accesses to this register.
Note: After a data write, data cannot be written to this register for three SDIOCLK (48 MHz) clock periods plus two PCLK2 clock periods.

You can see the original code does two writes to the register in quick succession. That would be bad. This code change helped things for me.
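A general way to respect that restriction is to compute the new register value in a local variable and then write the register exactly once. The helper below is a sketch of that idea with a hypothetical name; it is not necessarily the exact change suggested on the forum.

```c
#include <stdint.h>

/* Update one field of a register using a single write access.
 * The read-modify step happens in a local variable, so the register
 * itself is written only once. */
void reg_update_field(volatile uint32_t *reg, uint32_t field_mask,
                      uint32_t field_value)
{
    uint32_t value = *reg;                              /* one read */
    value = (value & ~field_mask) | (field_value & field_mask);
    *reg = value;                                       /* the only write */
}
```

For the power control field (bits 1:0 of SDIO_POWER) that would be reg_update_field(&SDIO->POWER, 0x3, 0x3) to switch the port on. If two writes really are unavoidable, put a short delay between them.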

SD_SendSDStatus() function

Updated 7 Dec 2011. This was a tough one to reliably reproduce and hence to find. In the STM32F2xx SD code, the function SD_Init() calls the function SD_GetCardStatus() which in turn calls the function SD_SendSDStatus() passing it a pointer to a buffer, like so:

The purpose of SD_SendSDStatus() is not immediately obvious from the code (read the comment block for SD_GetCardStatus() if you need a good laugh), but what it does is send an ACMD13 command to the card. It then retrieves the 512-bit status that the card sends in reply and writes it into the buffer. The problem? This:

By my math, 16 bytes = 128 bits. So what happens is that SD_SendSDStatus() writes 64 bytes of data into a 16 byte buffer, resulting in a big buffer overrun and a bunch of innocent SRAM locations being stomped on. Which creates all manner of flaky problems. The simple and obvious fix is to increase the size of SDSTATUS_Tab, although a more robust solution would include a rewrite of SD_SendSDStatus().
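One way to keep the buffer and the status size from drifting apart is to derive the array length from the 512-bit figure and let the compiler check it. A small sketch with made-up names (it needs a C11 compiler for _Static_assert):

```c
#include <stdint.h>

#define SD_STATUS_BITS   512u                          /* size of the SD status */
#define SD_STATUS_BYTES  (SD_STATUS_BITS / 8u)         /* 64 bytes */
#define SD_STATUS_WORDS  (SD_STATUS_BYTES / sizeof(uint32_t))

/* Buffer to receive the SD status, sized from the definition above
 * instead of a hand-typed number. */
uint32_t sd_status_buf[SD_STATUS_WORDS];

/* Fails the build if someone shrinks the buffer by accident. */
_Static_assert(sizeof(sd_status_buf) == 64, "SD status is 512 bits = 64 bytes");
```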

Summary

It should be clear by now that the standard peripheral library SDIO code cannot reliably be used as-is. It contains far too many limitations, ranging from a serious lack of error-handling (and is sometimes error-generating) to outright functional restrictions. It’s a good starting point to show how things can work, but it’s far from being production-ready. For any real product you have no choice except to grab a copy of the SD Card specification and get busy. With that said, I’ve found the SD Card interface on the STM32F2xx to perform pretty well and the library code to be a big time-saver. Just don’t expect it to be production-ready code.

More information can be found in the posting: SDIO Interface Part 2.

39 thoughts on “STM32F2xx SDIO SD Card Interface”

  1. HC Srinivasa

    Hi,

    I’m facing a strange problem while interfacing Samsung eMMC device KLMxGxxEMx-B031 on the SDIO interface of STM32F205RET.
    I initiate a WriteBlock() operation, without DMA. After transferring all the data, I have a WHILE() loop checking for DATAEND bit to be set in SDIO_STA register @0x40012C34. I see that SDIO_DCOUNT register @0x40012C30 becomes ZERO, but the SDIO_STA register still shows a value 0x00045000, i.e., TXFIFOE=1, TXFIFOHE=1 and TXACT=1.
    The SDIO register content when I get stuck in the while-loop are given below.

    0x40012C00: 00000003 00005100 00400000 00000458
    0x40012C10: 00000018 00000900 0f5903ff f6dbffef
    0x40012C20: 8a404066 ffffffff 00000200 00000091
    0x40012C30: 00000000 00045000 00000000 00000000

    Have any of you faced such a problem? What could be wrong here? When I use DMA, I also face such issues where not all of the data gets transferred. In one DMA mode, I get DCRCFAIL=1 indicating a CRC failure; this CRC error is of course in line with the errata published by STM, cautioning us not to use Peripheral-Control-Mode.

    Really appreciate any input on this. I have tried many sequences of SDIO programming, but am not able to get past this.

    Thanks a lot, in anticipation, – HC Srinivasa

  2. frank Post author

    I doubt if you could. However, why not connect the two STMs using their SPI ports? That would certainly work.

  3. Kahraman

    This is good data. Do you know if one can program two STM32s to communicate with each other via SDIO? Do you see such capability?

  4. frank Post author

    As far as I know, the answer is “no”. FYI, I once looked at bit-bashing 4-bit SDIO, but quickly discovered that was a bad idea because there’s some kind of checksum or something that also gets sent, basically at hardware speed – it wasn’t possible to calculate it quickly enough to send it in time. I don’t exactly remember the details, it was a couple of years ago that I looked at it. Anyway, if 4 bit SDIO isn’t possible for you, consider using 1-bit mode, ie SPI. Yes it’s slow, but for some applications that’s often OK.

  5. Shivang

    I am in a bit of a conundrum. I have working SDIO interface code for the stm32f103. But for some reason the lines D2 and D3 are not available for use. Is it possible to use D6 and D7 instead of D2 and D3?

  6. Govind

    Thanks for documenting these issues. I encountered some similar problems on a non ST board. For the “IDLE state at power up”, the best way would be to use pull up resistors on all the lines except for clock.

  7. dhquan

    Thank you for your post. It’s really useful for me. But I get a problem when I create a new file on a micro SD card.

    I’m using the FatFs library ver R0.11 to access the SD card. Here is my code:

    //...//
    dis_initialize(0); //-> return SD_OK
    f_mount(&fs,"",); //-> return FR_OK
    f_open(&fsrc,"data.txt",FA_OPEN_ALWAYS | FA_WRITE); // -> return FR_DISK_ERR
    In this case function check_fs(fs, bsect) return 3 -> /* An error occured in the disk I/O layer */

    I use Samsung microSDHC cards, 4 GB and 32 GB.

  8. Christopher Head

    To wait for busy signalling from the card on D0 to end, you don’t need to tie D0 to another GPIO; even with the alternate function enabled on the pin, the regular D0 pin is still a GPIO and can still be read as an input. In order to wait for the end of busy signalling, I just read the pin and, if I observe it low, enable a rising-edge interrupt on it (with a double-check to avoid the obvious race condition). I came up with this after being really annoyed at the lack of an interrupt on deassertion of TXACT, which would otherwise have done the job (since the latter appears to remain asserted until busy signalling ends).

  9. frank Post author

    That’s good news – it’s great to hear ST is actually updating and improving their code. Thanks very much for the update.

  10. Matt

    Hey all, I’ve been reading blogs like this and forum posts about using SDIO with DMA. Several people mentioned (even below this comment) that you should initialize the DMA prior to sending a read or write command to the SD card. Well, ST finally caught on! In version 1.5.0 of the StdPeriph_Lib (March 2015) in the file stm324x7i_eval_sdio_sd.c, they have moved the 3 DMA function calls above the SD card commands. This was not the case in earlier versions. I have taken the code from that eval file and implemented it into a project, and it is performing well.

  11. Bar Maley

    SDIO+DMA FEIF error.

    I didn’t find a way to avoid this error other than ignoring it, and I wasn’t the first to do so. There is an example from Keil: .\Keil\ARM\Boards\Keil\MCBSTM32F400\RL\FlashFS\SD_File\SDIO_STM32F4xx.c
    [code]
    static BOOL WriteBlock (U32 bl, U8 *buf, U32 cnt) {
      /* Write a cnt number of 512 byte blocks to Flash Card. */
      U32 i;

      SDIO->DLEN = cnt * 512;
      SDIO->DTIMER = cnt * DATA_WR_TOUT_VALUE;
      SDIO->DCTRL = SDIO_DCTRL_DBLOCKSIZE_3 | SDIO_DCTRL_DBLOCKSIZE_0 |
                    SDIO_DCTRL_DMAEN | SDIO_DCTRL_DTEN ;

      for (i = DMA_TOUT; i; i--) {
        if (DMA2->LISR & DMA_LISR_TEIF3) {
          break;
        }

        if (DMA2->LISR & DMA_LISR_TCIF3) {
          if ((SDIO->STA & (SDIO_STA_DBCKEND|SDIO_STA_DATAEND)) == (SDIO_STA_DBCKEND|SDIO_STA_DATAEND)) {
            /* Data transfer finished. */
            return (__TRUE);
          }
        }
      }
      /* DMA Transfer timeout. */
      return (__FALSE);
    }
    [/code]

    If you look inside their for loop, they are just ignoring FEIF.
    I verified that card access works (at least every time I’ve tried) and, as I stated before, I don’t know any way to avoid this error.
    The STM errata doesn’t shed light on this issue, or I didn’t find it…
    Does anyone know a better way to deal with this issue?

    Thanks.

  12. Bar Maley

    Trying to read a single block with polling. Getting the DCRCFAIL flag. Tried both 400 kHz and 24 MHz clock frequencies as well as 4-bit and 1-bit bus widths. No difference. Could you please publish the source code to learn from?

    Thank you in advance Bar.

  13. Brad

    Just in case anyone is still experiencing overruns. I found it necessary to move the following three lines of code to occur before sending the read command to the SD card.

    … set up block size
    … setup SDIO data configurations

    SDIO_ITConfig ( SDIO_IT_DATAEND, ENABLE );
    SDIO_DMACmd ( ENABLE );
    SD_LowLevel_DMA_RxConfig ( (uint32_t *)pReadBuff );

    … send READ_MULTIBLOCK or READ_BLOCK command to SD card.

    After this change I have experienced 0 RX overruns.

  14. Jon

    Please scratch that last observation on the polling times. I goofed and was reading the period reading on my logic analyser for the “polling” loop. :$ Both times were actually pretty similar with and without the SEND_STATUS polling. Oops.

  15. Jon

    Turns out the DCRCFAIL was the same issue. Because I had 2 CLKCR writes after one another to go into wide bus mode (one to clear the bits and the next to set them) the CLKCR writes would intermittently not “take” and I’d be in wide bus mode one time and 1 bus mode the next, always pretty randomly (Being in the wrong bus mode would set off the DCRCFAIL flag, not surprisingly).
    While I had the board hooked up to the scope, I also noticed another reason to try and rig up a GPIO interrupt to check the card busy status on the D0 line. Aside from the polling CPU overhead of writing SEND_STATUS commands non-stop, it actually took my card longer to finish its write when it was being polled by the STM32. I got about a 3.3 ms busy time after the write when polling the Card Status with SEND_STATUS and only 2.4 ms when it was just left alone. I’ll be the first to admit this wasn’t a scientific survey with one card tested about 20 times, but the behaviour was very consistent.

  16. Jon

    Think I found a new silicon limitation that you might want to check out (STM32F4 in this case, but they seem to handle the SDIO stuff the same way). I was running my own init sequence and noticed that the SDIO_CK divider wasn’t always being set to the correct value. Upon further mucking around, I realized that that register (CLKCR) has the same “Don’t write too quickly in succession” limitation as the POWER register. It’s a little confusing because the Reference Manual has the “Note: After a data write, data cannot be written to this register for three SDIOCLK (48 MHz) clock periods plus two PCLK2 clock periods” warning, but NOT the “Note: At least seven HCLK clock periods are needed between two write accesses to this register” warning for the CLKCR register. Anyway, writing the whole register at once or introducing a short delay between writes seems to clear it up. I’m going to play it safe and assume ALL the registers in the SDIO module behave this way from now on!
    As a side note, I’m still having issues with a weird DCRCFAIL error happening at the beginning of some block reads (even after making sure that the STATUS returns ready), but I’ll poke around some more on my own before coming back here begging for help 🙂

  17. Jon

    Just wanted to add my thanks. I was having the “Busy” problem with CRC error flags popping up. Would have been a tough one to figure out without this!! Shame the Ref manual isn’t more specific on how it handles the D0 busy signal. At least it would give a clue as to what’s messing up. Thanks again.

  18. Wolk

    AFAIK the SPL library is no longer supported. ST now offers STM32CubeMX with the “HAL” libraries, a rewrite of the SPL.
    I’ve looked at the SDIO library for the STM32L151RD. It looks better, but is still raw. They removed support for MMC cards and I’ve found one bug: I don’t remember exactly which case, but it will not work correctly with SDSC or SDHC cards.

  19. frank Post author

    There’s no update that I’m aware of. Perhaps someone else can chime in if they know, or you can post on the ST forums. For example code, proof of concept code, the ST code is fine. But for robust production use, you would really need to review and update it to handle all the usage and error cases your application might encounter. Yes, ideally ST would provide production-ready code, but I’ve rarely seen that from any manufacturer. You pretty much have to put some work in yourself.

  20. frank Post author

    I don’t know of any. I tend to think the ST code is best treated as “example” code, and should probably be thoroughly reviewed and edited for production use.

  21. Brad

    Hello, I have been using the open sourced STM SDIO library for SD initialization and access. How limited do you feel the library provided is? I am starting to see issues with RX Overruns. Do you know of any other open source STM interfaced libraries for SDIO access?

  22. Bongo

    Hi. The info about the SDSTATUS_Tab buffer was very useful. About the 32-bit/4 GB limit of the StdLib: another solution is to define the address parameter as int64; then it works on SD and SDHC.
    Is there a newer release of the ST library? Mine is from 2012, but the bugs mentioned above could have been fixed “officially” by now. Or has ST ended support for this library…

  23. chris

    Frank, thanks for all the information. From a beginner standpoint, it looks, in a nutshell, totally impossible. I could spend a year on it and never get anywhere.

    Why not wrap up a project and sell it. “Large SD cards with FAT on the ST Discovery” obfuscated code = free, code + docs $50.

    At least it will give me something useful to save for.

  24. frank Post author

    I suggest you post your question over at the STM32 forums – hopefully someone there can help you.

  25. Amol

    Hi,
    I am using the STM32 SDIO library to interface an SD card in 4-bit mode, but my card init function does not work: I keep getting an error in it. Please suggest what the issue might be in sdio.c; I have gone through it but can’t see anything wrong. The initial frequency is 400 kHz and 4-bit mode is selected.
    What might the problem be? My card is working properly.

  26. frank Post author

    Interesting. I don’t know, but I do encourage you to post it on the ST forums to see if someone else there does.

  27. Peter Butler

    Frank,

    At the end of this message is a bit of assembly code for my STM32 F103xE demo board.

    1) This code is not complete. For many reasons.
    2) But this code should not fail in the way that it does.

    The bx lr returns to a “b .”
    I am reading one block. The block number 10 which is all F’s.
    The DMA transfer works for most of the block but stops short by 7 words.
    Using a debugger I see DMA2_CNDTR4 = 0x00000007.
    It looks like the SDIO peripheral is happy with the transfer. SDIO_FIFOCNT = 0xFFFF80
    Do you have a clue about what could do this?

    ;**********************************************************************
    ; readDMA
    ;
    ; on entry
    ; r0 = memory address
    ; r1 = block number
    ; r2 = number of blocks
    ;after this is fixed up we will interrupt when done
    ;but for now we loop and wait for xfer done
    readDMA
    ; first set up DMA
    movs r3,#0 ;to disable channel
    str r3,[r6,#DMA_CCR+chanOffset]

    str r0,[r6,#DMA_CMAR+chanOffset]

    lsl r0,r2,#7 ;number of words to transfer
    str r0,[r6,#DMA_CNDTR+chanOffset]

    mov r0,#baseValCCR ;bit4=0: read
    str r0,[r6,#DMA_CCR+chanOffset] ;enable DMA channel for read

    ; then start SDIO read

    ldr r0,=staClrAll ;might as well clear ’em all
    str r0,[r7,#SDIO_ICR]

    ldr r0,=loCap
    ldr r0,[r0]
    cmp r0,#0
    it ne
    lslsne r1,r1,#9 ;loCap uses byte#
    str r1,[r7,#SDIO_ARG]

    ldr r0,=0xFFFFF
    str r0,[r7,#SDIO_DTIMER] ;too much? always enough?

    ;             +--------- block size is 512
    ;             |   +----- DMA=1
    ;         fix |\  |?---- DTMODE block=0, 1=stream or SDIO multibyte
    ;         me  ||\ |?+--- read from card to controller
    ;         ????++++|?|+-- start transfer
      mov r0,#000010011011b
    ; mov r0,#000000001011b ;?? block size 1 byte ??
    ;         109876543210
    str r0,[r7,#SDIO_DCTRL]
    cmp r2,#1
    ite le
    movle r0,#enaCPSM|rspWaitShort|SD_CMD_READ_SINGLE_BLOCK
    movgt r0,#enaCPSM|rspWaitShort|SD_CMD_READ_MULT_BLOCK

    lsls r2,r2,#9 ;blocks*512 ==> byte length
    str r2,[r7,#SDIO_DLEN]

    str r0,[r7,#SDIO_CMD] ;issue command

    bx lr

  28. Root

    Hi. Thanks for the informative post.

    Concerning the TXACT flag, I believe I’ve discovered that this flag actually seems to indicate when the chip is done writing its data and is ready to receive more. Have you or anyone else seen this or been able to confirm?

    I am trying to find a way to trigger an interrupt when the DAT0 line comes up after a write, but have not found a way yet. Catching the interrupt off of the TXACT flag seems to only generate an IRQ when TXACT is active, and not when it becomes inactive, which is what I want.

    The idea of tying DAT0 to another pin to use as an input could, however, solve this problem. Has anyone tried this that you know of?

    Thanks again.

  29. Andy

    I appreciate creation of the blog.

    I’ve ported ST’s mass storage example to an STM32F103VET-based dev/eval board from China.
    It took a while, so here are a few caveats for those who would like to repeat the exercise 🙂

    Unzip recent and well-paired versions (distributed as one zip file) of CMSIS, STM32F10x_StdPeriph_Driver, and STM32_USB-FS-Device_Driver from ST’s webpage.

    Fix a few trivial C compilation bugs to get it to compile with gcc.

    My board has a built-in SD slot connected to the SDIO port. That’s why I’ve put -DUSE_STM3210E_EVAL -DSTM32F10X_HD -DUSE_STDPERIPH_DRIVER into my project’s main makefile.

    Disable the card detect pin: get rid of the code around SD_DETECT_PIN and make uint8_t SD_Detect(void) always return SD_PRESENT.

    Check that interrupts are still working: for example, get SysTick running, and place a non-zero-size ISR stack outside the main stack in the linker script (stm32f10x_flash_hd.ld in my case).

    Connect the interrupt service routines to your startup code; the names were defined in the startup_stm32f10x_hd.S file in my case.

    Removed code for LM75 / keys / leds / joy

    Removed code around FSMC and MAL_Init(1) (no such device on my board, just one SD card connected to SDIO, so MAL_Init(0) = one USB device suffices).

    Implemented many robustness fixes from this thread, but it still did not really work.

    The block size reported/read from the 2 GB SD card as 1024 B (and the time spent trying to correct it) was not really causing an issue (I went back to the original max of 512 B in the end).

    It was kind of working (the USB device was found under Windows), but no operations were possible. The external disk showed up, but generally neither the label nor any read operation worked. I saw the disk label recognized only 2-3 times in 2 days. Solution: my CPU was running at the maximum allowed speed of 72 MHz. After making the change, I can see the USB device/disk, and file operations work nicely :)

    /**
    * @brief SDIO Data Transfer Frequency (25MHz max)
    */
    //#define SDIO_TRANSFER_CLK_DIV ((uint8_t)0x01)
    #define SDIO_TRANSFER_CLK_DIV ((uint8_t)0x04)

    Thanks again for creating the lib and the post (giving good feedback), so that fixes can be pulled into recent versions of the lib by its authors.

  30. Décio

    I’ve found some guidance from ST’s own reference manual (for the STM32F103xx series) regarding the order of operations of a write. From section 22.3.2 of the latest manual:

    ***
    SDIO/DMA interface: procedure for data transfers between the SDIO and memory

    In the example shown, the transfer is from the SDIO host controller to an MMC (512 bytes using CMD24 (WRITE_BLOCK)). The SDIO FIFO is filled by data stored in a memory using the DMA controller.

    1. Do the card identification process

    2. Increase the SDIO_CK frequency

    3. Select the card by sending CMD7

    4. Configure the DMA2 as follows:
    a) Enable DMA2 controller and clear any pending interrupts
    b) Program the DMA2_Channel4 source address register with the memory location’s base address and DMA2_Channel4 destination address register with the SDIO_FIFO register address
    c) Program DMA2_Channel4 control register (memory increment, not peripheral increment, peripheral and source width is word size)
    d) Enable DMA2_Channel4

    5. Send CMD24 (WRITE_BLOCK) as follows:
    a) Program the SDIO data length register (SDIO data timer register should be already programmed before the card identification process)
    b) Program the SDIO argument register with the address location of the card where data is to be transferred
    c) Program the SDIO command register: CmdIndex with 24 (WRITE_BLOCK); WaitResp with ‘1’ (SDIO card host waits for a response); CPSMEN with ‘1’ (SDIO card host enabled to send a command). Other fields are at their reset value.
    d) Wait for SDIO_STA[6] = CMDREND interrupt, then program the SDIO data control register: DTEN with ‘1’ (SDIO card host enabled to send data); DTDIR with ‘0’ (from controller to card); DTMODE with ‘0’ (block data transfer); DMAEN with ‘1’ (DMA enabled); DBLOCKSIZE with 0x9 (512 bytes). Other fields are don’t care.
    e) Wait for SDIO_STA[10] = DBCKEND

    6. Check that no channels are still enabled by polling the DMA Enabled Channel Status register.
    ***
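
    The data control value from step 5d can be sketched as below. The bit positions (DTEN bit 0, DTDIR bit 1, DTMODE bit 2, DMAEN bit 3, DBLOCKSIZE bits 7:4) are my reading of the reference manual; the macro and function names are hypothetical, not ST's. Note that DBLOCKSIZE holds log2 of the block length, not the byte count itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SDIO_DCTRL bit layout; names are illustrative only. */
#define DCTRL_DTEN            (1u << 0) /* data transfer enable */
#define DCTRL_DTDIR           (1u << 1) /* 0 = controller to card */
#define DCTRL_DTMODE          (1u << 2) /* 0 = block transfer */
#define DCTRL_DMAEN           (1u << 3) /* DMA enable */
#define DCTRL_DBLOCKSIZE_POS  4u        /* field holds log2(block length) */

/* Hypothetical helper: DCTRL value for a DMA block write to the card. */
static inline uint32_t dctrl_block_write(uint32_t block_len)
{
    /* The block length must be a power of two, at most 16384 bytes. */
    assert(block_len != 0u && (block_len & (block_len - 1u)) == 0u);
    assert(block_len <= 16384u);

    uint32_t log2_len = 0u;
    while ((1u << log2_len) < block_len) {
        log2_len++;
    }
    /* DTDIR and DTMODE stay 0: controller to card, block mode. */
    return DCTRL_DTEN | DCTRL_DMAEN | (log2_len << DCTRL_DBLOCKSIZE_POS);
}
```

    For a 512-byte block this gives 0x99: DBLOCKSIZE = 0x9, DMAEN = 1, DTEN = 1.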

    So I guess the right order is 3, 1, 2 indeed. I’ll try it again and report on my findings.

  31. Décio

    Thanks for the tip. So I guess there are two possibilities for the order of the write operation: 3, 1, 2 or 3, 2, 1.

    I believe 3, 1, 2 may work, but again, if an interrupt fires between 1 and 2, and if the card enforces some sort of timeout which I don’t know about, a write may be dead before it starts. But if there’s no such timeout, then this should work.

    As for 3, 2, 1, it might be completely safe, since when the command is sent the DPSM and the DMA are already configured, so there’s no risk of underrun. But this also depends on the SDIO peripheral not starting the data transfer immediately after doing 2 (SDIO_DataConfig(…)). Here’s what the reference manual says:

    “Depending on the transfer direction (send or receive), the data path state machine (DPSM) moves to the Wait_S or Wait_R state when it is enabled:

    • Send: the DPSM moves to the Wait_S state. If there is data in the transmit FIFO, the DPSM moves to the Send state, and the data path subunit starts sending data to a card.”

    So my understanding is, if DMA’s already configured, then the moment I call SDIO_DataConfig(), it’ll start the transfer, even before the command is sent. So I guess this order is out. Do you agree?

    Also, if you’ll allow me to contribute a tidbit that wasn’t mentioned in your post: SD_FindSCR() from ST’s SDIO driver is also buggy. When reading the response to SD_CMD_SD_APP_SEND_SCR, the code sometimes gets stuck in the loop after reading the full 64 bits of the response — somehow it doesn’t realize the transfer has ended. You might add some code to break from the loop after reading 64 bits, but it feels like a kludge. One of my attempts to fix it was using DMA for the transfer. At that time I didn’t know about the ordering of DMA and command transmission, so I did it in the wrong order and still had problems from time to time. Which is weird, since the SDIO peripheral’s FIFO is deep enough to hold the 64 bits of the response to that command, but still, I had problems. Maybe if I tried changing the order, it might always work.

    But before trying that I realized that there’s no point in calling this function. It’s only used at one point in the driver code, where it queries the SCR to see if the card supports 4-bit mode before enabling it. But the SD simplified physical layer spec section 5.6 says quite clearly: “Since the SD Memory Card shall support at least the two bus modes 1-bit or 4-bit width, then any SD Card shall set at least bits 0 and 2 (SD_BUS_WIDTH="0101").” Hence, at least for this one use, there’s no need to query the SCR since we know the answer beforehand. If you have no other uses for the SCR (and I didn’t in my FATFS port) then you can just do away with SD_FindSCR().
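
    For anyone who does keep the SCR query, a minimal sketch of the bus-width check follows. It assumes the 8 SCR bytes are stored in the order they arrive on the wire (most significant byte first), which puts the bus-width field, SCR bits 51:48, in the low nibble of the second byte. ST's driver may store the words differently, and the names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCR_BUS_WIDTH_1BIT (1u << 0) /* bit 0: 1-bit bus (DAT0) */
#define SCR_BUS_WIDTH_4BIT (1u << 2) /* bit 2: 4-bit bus (DAT0-3) */

/* Hypothetical helper: extract the bus-width field (SCR bits 51:48).
 * Assumption: scr[0] holds SCR bits 63:56, scr[1] holds bits 55:48. */
static inline uint8_t scr_bus_widths(const uint8_t scr[8])
{
    assert(scr != 0);
    return (uint8_t)(scr[1] & 0x0Fu);
}

/* True when the card advertises 4-bit bus support. */
static inline bool scr_supports_4bit(const uint8_t scr[8])
{
    return (scr_bus_widths(scr) & SCR_BUS_WIDTH_4BIT) != 0u;
}
```

    A card following the spec sentence quoted above would return 0x5 from scr_bus_widths(), so scr_supports_4bit() would always be true.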

    Hope this is useful for you or someone else.

  32. frank Post author

    For both a read and a write, it’s important to set up the DMA before initiating the data transfer. For the read, you’ve already worked that out. For the write, you need the DMA to be enabled before starting the write. Remember that when you set up the SDIO DMA, the DMA is configured as “peripheral controlled”. The SDIO is the only peripheral with this ability. This means that even though you’ve enabled the DMA, the DMA is not actually doing anything – it’s just sitting there, waiting for the peripheral (the SDIO) to tell it to run. There’s no harm in having the DMA enabled beforehand.

    But for the write it is important. Because once the SDIO commences the write to the card, it pretty much immediately requires data to give to the card. If it doesn’t get it pronto, you’ll end up with an SDIO error (usually a FIFO error because the transmit FIFO underran).
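
    The write ordering described here (DMA armed first, then the command, then the data path) can be sketched on a host PC with stand-in functions that only record the order of calls. Everything below is hypothetical scaffolding, not ST's driver: on real hardware the three stubs would be replaced by the DMA setup, SDIO_SendCommand(…) and SDIO_DataConfig(…) respectively.

```c
#include <assert.h>
#include <stddef.h>

enum step { STEP_DMA_ARMED, STEP_CMD24_SENT, STEP_DATA_PATH_ENABLED };

/* Records the order in which the stand-in driver steps were called. */
struct trace {
    enum step steps[3];
    size_t count;
};

static inline void record(struct trace *t, enum step s)
{
    assert(t->count < 3u);
    t->steps[t->count++] = s;
}

/* Hypothetical stand-ins for the real driver calls. */
static inline void arm_tx_dma(struct trace *t)       { record(t, STEP_DMA_ARMED); }
static inline void send_cmd24(struct trace *t)       { record(t, STEP_CMD24_SENT); }
static inline void enable_data_path(struct trace *t) { record(t, STEP_DATA_PATH_ENABLED); }

/* Write sequence: the DMA is waiting before the SDIO can ask for data. */
static inline void write_block_sequence(struct trace *t)
{
    t->count = 0u;
    arm_tx_dma(t);       /* DMA idles until the SDIO requests data */
    send_cmd24(t);       /* WRITE_BLOCK, then wait for the response */
    enable_data_path(t); /* DPSM starts and pulls from the FIFO */
}
```

    The point of the ordering is that nothing time-critical remains to be done once the data path is enabled, so an interrupt landing between the steps cannot starve the transmit FIFO.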

  33. Décio

    Have you found any issues with the order of CPSM/DPSM/DMA initialization in the SD_ReadBlock() and SD_WriteBlock() functions?

    What the sample SD_ReadBlock() does is (in summary):

    1. SDIO_DataConfig(…);
    2. SDIO_SendCommand(…);
    3. Enable DMA

    It’s possible (particularly in interrupt-heavy code) for a long time to pass between steps 2 and 3. In my code I changed the order to 1, 3, 2 and I don’t recall ever having read errors again.

    As for SD_WriteBlock(), what it does is (again in summary):

    1. SDIO_SendCommand(…);
    2. SDIO_DataConfig(…);
    3. Enable DMA

    Now I don’t see why this should be a problem, even with many interrupts firing, unless there’s some kind of timeout that the SD card enforces which I missed from reading the spec. Yet, from time to time, some of my cards (particularly some old 1 GB and 2 GB SD cards, not SDHC) will time out after attempting a write, and inspecting the DMA registers reveals that DMA_CNDTR (i.e. the remaining number of words to send) is something like 125 out of the original 128 (128 words x 4 bytes/word = 512 bytes).

    The impression I get is that the transmission started before enabling DMA, and by the time it was enabled, it was near the end of the transfer and it only managed to send a few words before the transfer completed. I doubt this explanation is right, though, seeing as the FIFO should be empty before enabling the DMA, and so I don’t see how the DPSM would go from the Wait_S to the Send state.

  34. Edward Keyes

    Just wanted to say thanks a million for documenting these issues with the ST library code. I was tearing my hair out about the block-size problem: “Why am I only getting the first 8 bytes of each sector?!” Just the mere mention of that issue saved me a ton of debugging time…

  35. Memphis

    The biggest problem is that the reference manual has too little information about exactly how to do read and write operations with DMA support.

    The given examples are really only a starting point; production code must be totally rewritten by yourself (that means the Utilities directory too!). Only the library can be used as it is, and sometimes even it needs a little fix.

    I am also a little bit surprised by the SDIO. I expected the hardware to do more of the work itself rather than needing a bunch of code just for simple read and write operations. I compared the SPI library for SD cards with the SDIO one, and it looks like the simple SPI version is written with less code, which doesn’t make sense :-/
