It’s funny how a topic as apparently mundane as the DMA controllers on the STM32F2xx and STM32F4xx processors can be such a can of worms. I’ve already provided 2 postings on the subject, here and here, and now we have a third.
This one is a biggie. From what we’re seeing, it appears you can only have 2 DMA transactions taking place at any one time. It appears that having 3 simultaneous DMA transfers causes intermittent failures.
To explain….
We’ve been doing some work with the STM32F207 and STM32F407 processors. We had been performing 3 DMAs simultaneously. Specifically: receiving data from the DCMI port, receiving data from the ADC, and either receiving data from or sending data to the SDIO port (SD card). What we saw happening was:
- Sometimes the SDIO transfer would simply stop partway through. There was no error, and the card was not busy. There was no apparent reason for the data transfer to stop, but it would. Querying the SDIO status register SDIO_STA would indicate its transmit FIFO was empty (in the case of writing to the SD card) and that it was in the middle of a transfer.
- Sometimes the ADC would report an ADC overrun error. This should be impossible. With the ADC data being read out via DMA, completely outside of any software control, there’s no way it should ever overflow. But sometimes it would.
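For anyone wanting to trap these two symptoms in their own firmware, here is a minimal sketch of the checks, assuming the bit positions I read from the STM32F2xx/F4xx reference manuals (TXACT = bit 12 and TXFIFOE = bit 18 of SDIO_STA, OVR = bit 5 of ADC_SR). The macro and function names are hypothetical, and the functions take the raw register value as an argument so they can be exercised off-target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions taken from the STM32F2xx/F4xx reference
   manuals; verify them against the manual for your exact part. */
#define SDIO_STA_TXACT_BIT   (1u << 12)  /* data transmit in progress */
#define SDIO_STA_TXFIFOE_BIT (1u << 18)  /* transmit FIFO empty */
#define ADC_SR_OVR_BIT       (1u << 5)   /* ADC overrun */

/* True when the SDIO reports an active transmit while its FIFO is empty.
   A brief occurrence can be normal; it is only suspicious if it persists. */
static bool sdio_tx_starved(uint32_t sdio_sta)
{
    const uint32_t mask = SDIO_STA_TXACT_BIT | SDIO_STA_TXFIFOE_BIT;
    return (sdio_sta & mask) == mask;
}

/* True when the ADC status word has its overrun flag set. */
static bool adc_overrun(uint32_t adc_sr)
{
    return (adc_sr & ADC_SR_OVR_BIT) != 0u;
}
```

On target you would pass in the live register values (for example `SDIO->STA` and `ADC1->SR` when using the CMSIS device header) and treat a starved state that lasts longer than some timeout as a stalled transfer.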
We initially put these behaviours down to buggy peripherals, but one day we noticed the problems stopped when we turned off the DCMI port. Hmm, how puzzling. Further investigation showed that we had no problems as long as only 2 of these 3 peripherals were running, which led to the question: “what’s common between these 3 peripherals?”. The obvious answer: the DMA. In this case, specifically DMA2.
It fits the symptoms very nicely. What we theorise is that when a peripheral needs a DMA service, it raises a flag which results in the DMA servicing that peripheral, either by writing it data or reading data from it. If 3 peripherals raise their flags simultaneously, it can happen that one of them is “permanently forgotten”. It doesn’t get serviced, then or later. In the case of the ADC this results in an overrun. In the case of the SDIO this results in it eventually running its FIFO empty (when writing to a SD card). In the case of the DCMI we saw some apparent failures, but they proved hard to nail down so I’ll reserve judgement there as to exactly what happens.
Based on this theoretical understanding, we put together a test case demonstrating the problem (on both the F2x and F4x) and sent it to ST at the beginning of Nov 2011. We heard back at the beginning of Dec that they’d replicated the problem, but we haven’t heard anything since.
Then towards the end of December I was speaking with another company using the STM32F2xx and they just happened to mention a similar problem. They also had 3 simultaneous DMAs active, but theirs were SPI, USART and SDIO. A different set of 3 to us, and theirs were spread across both DMA1 and DMA2. Yet their basic symptom of unexplained data stoppages was the same.
Based on all this, I can only assume that any 3 simultaneous DMAs, from any one DMA controller or a mix of both DMA controllers, can produce this problem.
In our case, we took the lowest bandwidth peripheral, namely the ADC, and moved it to operating under interrupt. Because the STM32F407 ADC has no internal FIFO it can overrun very easily, hence its interrupt must be at the highest priority and be permitted to preempt other interrupts. Since we’ve done this all 3 peripherals have behaved normally.
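The interrupt-driven ADC readout can be sketched roughly as below. This is not our production code: the register struct is a hypothetical stand-in for the two ADC registers the handler touches (so the logic can be run on a host), the ring buffer and all names are illustrative, and the EOC and OVR bit positions are my reading of the reference manual.

```c
#include <stdint.h>

#define ADC_SR_EOC_BIT (1u << 1)   /* end of conversion (assumed position) */
#define ADC_SR_OVR_BIT (1u << 5)   /* overrun (assumed position) */
#define ADC_RING_SIZE  64u         /* must be a power of two */

/* Hypothetical stand-in for the real registers (ADC1->SR, ADC1->DR). */
typedef struct {
    volatile uint32_t SR;
    volatile uint32_t DR;
} adc_regs_t;

typedef struct {
    volatile uint16_t samples[ADC_RING_SIZE];
    volatile uint32_t head;      /* advanced by the interrupt handler */
    volatile uint32_t tail;      /* advanced by the consumer */
    volatile uint32_t overruns;  /* hardware overruns seen */
    volatile uint32_t dropped;   /* samples lost because the ring was full */
} adc_ring_t;

/* Body of the ADC interrupt handler: keep it as short as possible. */
static void adc_isr_body(adc_regs_t *adc, adc_ring_t *ring)
{
    uint32_t sr = adc->SR;

    if (sr & ADC_SR_OVR_BIT) {
        ring->overruns++;
        /* On hardware the SR flags are, as I read the manual, cleared by
           writing 0 and unaffected by writing 1, so this clears only OVR.
           In the host mock it simply overwrites SR. */
        adc->SR = ~ADC_SR_OVR_BIT;
    }
    if (sr & ADC_SR_EOC_BIT) {
        uint16_t sample = (uint16_t)adc->DR;  /* reading DR clears EOC on hardware */
        uint32_t next = (ring->head + 1u) & (ADC_RING_SIZE - 1u);
        if (next != ring->tail) {
            ring->samples[ring->head] = sample;
            ring->head = next;
        } else {
            ring->dropped++;
        }
    }
}
```

On target, the real `ADC_IRQHandler` would call this with the actual ADC registers, and the interrupt would be given the most urgent priority, for example with the CMSIS calls `NVIC_SetPriority(ADC_IRQn, 0)` and `NVIC_EnableIRQ(ADC_IRQn)`, so that it can preempt everything else.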
I need to re-emphasise that what we believe to be happening is no more than a theory based on the symptoms we’ve been seeing. Although ST has confirmed seeing these symptoms, they haven’t provided any information about what’s actually going on, so we’re flying in the dark to a certain degree. What I can say with some confidence is that you’d be very wise to limit the number of simultaneous DMAs to 2. At least until ST provides more information about what’s actually causing this.
Update June 2012:
In January 2012 ST provided a response in their user forums here:
It does not answer all the questions, but it seems to indicate the problem lies not with the number of DMAs, but instead might lie specifically with DMA2. They stated:
We confirm your findings and it is a limitation that concerns only our DMA2, and here is the detailed description :
DMA2 controller could corrupt data when managing AHB and APB2 peripherals in a concurrent way.
Description :
This case is somehow critical for peripherals embedding FIFO and generates data corruption. For memories, the impact is a multiple access but the data is not corrupted. AHB Peripherals embedding FIFO are DCMI, CRYPT, HASH. on STM32F2/40xx without CRYPTO only the DCMI is impacted.
Implications:
The data transferred by the DMA to the AHB peripherals could be corrupted in case of a FIFO target or multiply accesses in case of memories access.
Workarounds :
Avoid concurrent AHB and APB2 transfer using DMA2. One of the following approach could be used to solve the issue:
* If DMA2 is used to manage AHB peripheral (DCMI, CRYPT, HASH), we can use the Cortex-M CPU to manage APB2 peripherals.
* If DMA2 is used to manage APB2 peripheral, we can use the CPU to manage AHB peripheral (DCMI, CRYPT, HASH).
Obviously, we will update our errata on web soon.
I don’t know how ST measures time – they say their errata will be updated “soon”, but as I write this it’s 5 months since they stated that, and their STM32 errata, at revision 2.0, does not contain this information, let alone any more detail about it. If anyone knows anything more, please post a comment.
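One way to apply ST's workaround is to audit the DMA2 assignments at design time. The sketch below is only an illustration of that rule: the types and function name are hypothetical, and it assumes you classify each peripheral as AHB or APB2 yourself from the reference manual (ST lists DCMI, CRYPT and HASH as the AHB peripherals concerned).

```c
#include <stdbool.h>
#include <stddef.h>

/* Which bus a DMA2 client peripheral sits on (classification is up to you). */
typedef enum { BUS_AHB, BUS_APB2 } dma2_bus_t;

typedef struct {
    const char *name;      /* for reporting only */
    dma2_bus_t  bus;
    bool        uses_dma2; /* false if the CPU or interrupts service it */
} dma2_client_t;

/* True when DMA2 serves at least one AHB peripheral and at least one APB2
   peripheral, the combination ST says to avoid. */
static bool dma2_mixes_ahb_and_apb2(const dma2_client_t *clients, size_t count)
{
    bool ahb = false;
    bool apb2 = false;

    for (size_t i = 0; i < count; i++) {
        if (!clients[i].uses_dma2) {
            continue;
        }
        if (clients[i].bus == BUS_AHB) {
            ahb = true;
        } else {
            apb2 = true;
        }
    }
    return ahb && apb2;
}
```

For example, a table containing DCMI (AHB) and ADC1 (APB2), both marked as using DMA2, would be flagged; marking ADC1 as interrupt-driven instead clears the flag.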
As you probably saw from one of my posts, ST did acknowledge the DMA arbiter problem. Last time I checked (a few weeks ago) it was still listed in their errata. So who knows – did they fix it? Maybe your configuration doesn’t expose the problem. It’s great you’re not seeing it.
I am working on a system based on the STM32F427. We have numerous sensors and communication ports active all the time, consequently we maxed out all the DMA channels. I haven’t seen data corruption yet, and we have all sorts of safeguards and error traps that would indicate right away any failed data acquisition issue.
All I can say is that the problems you have observed must be configuration related. I would suggest you revisit the memory requirements for DMA configuration, because putting it into the wrong region can definitely cause unforeseen behavior.
I use DMA2 for ADC&SPI. I am seeing Bill’s problem too. Maybe the ADC overrun error leads to this.
I’m seeing Bill’s problem too. I’m using SPI1 both directions on stream 0 and 5, and SDIO on stream 3, and am also seeing transfers just stop.
SDIO is in slave mode, as it controls the transfer, SPI in master mode.
I’m glad to hear you got it working – nicely done.
Omg Frank, thanks for your work on this. Of course I would have been lost trying to figure all this out myself. I’m using DCMI and the ADC. I got corruption in the DCMI data coming out. When I turned off the ADC it went away instantly. I slowed down the ADC as much as possible and it did help, but I still had corruption in my video data, which is not acceptable. I implemented the ADC in an interrupt-based system that switches the channel after each conversion. It’s not great but it works.
Hi,
Can you post the “DMA Maximum Transactions” example on this blog? I am currently working with DCMI and SDIO. Both are working fine on their own. The problem happens when the DCMI data is copied to SDIO: SDIO seems to write nothing, although I have set the SD card to receiver mode and started a bulk transfer from DCMI (about 1 MByte of data).
I am seeing this DMA stoppage issue also. DMA2 using USART1_TX, ADC1 & SDIO. I don’t see any data corruption, only a DMA transfer sometimes stopping before it completes. I think there’s another serious problem here that has yet to be ack’ed by ST.
I’ve no idea, and I’ve never used uClinux on those parts. If I had to guess, I’d guess it was a coincidence, but in truth I’ve no idea.
At megabaud rates, it looks like the emcraft/uClinux tty driver has a similar problem WRT USART3 on DMA1
Is that a coincidence or is this not software related at all????
TTY BUFFER CORRUPTION
Hmm. I wouldn’t assume a hardware problem for that one. You might want to double-check any interrupt service routines and anything involving data handling. Running off different DMA channels, and with the USART being such a low datarate device, chances are pretty good there’s a software bug in there someplace. Good luck!
You’re right – though I have a similar problem: USART3 on DMA1 and ADC3 on DMA2 can’t operate at the same time, or it leads to corrupted data.
Different peripherals are connected to DMA1 vs DMA2. The DCMI port is hooked up to the DMA2 controller (see the table in chapter 9.3.3 of the STM32F2xx Reference Manual for the DMA connections for all the peripherals). So no, DMA1 is not going to work; the DCMI port isn’t connected to DMA1.
Curious:
Why do they say one should use the CPU instead of DMA2, what about DMA1 – is that not going to work?
Thanks for your inquiries and reports.
We don’t entirely know. And ST isn’t providing full details on the issue. See the thread:
https://my.st.com/public/STe2ecommunities/mcu/Lists/cortex_mx_stm32/Flat.aspx?RootFolder=https%3a%2f%2fmy.st.com%2fpublic%2fSTe2ecommunities%2fmcu%2fLists%2fcortex_mx_stm32%2fWarning%20limit%20simultaneous%20DMAs%20to%202&FolderCTID=0x01200200770978C69A1141439FE559EB459D7580009C4E14902C3CDE46A77F0FFD06506F5B&currentviews=636
They have provided some info, but certainly not all. You would be wise to thoroughly test your system if you have multiple concurrent DMAs.
Hi.
First, sorry for my English.
Second, thank you for your blog.
And now my question. I read the text above and the ST forum posts it mentions, but in the end I don’t know if your problem is solved. So is it true that the problem occurs only if DMA transactions run simultaneously on the DMA2 controller and the peripherals have a FIFO (as written in the errata sheet), or could the problem in fact occur whenever more than 2 transactions run simultaneously on any DMA controller?
Thank you for your reply;-)
The new revision 3.0 of the STM32 errata has this issue included: http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/ERRATA_SHEET/DM00037591.pdf
Thanks for finding it and pointing it out!
Have you heard anything further from ST regarding this issue? I’m having a problem getting circular DMA working with a STM32F205, but it could well be my problem. It fills the buffer once (I get one HT interrupt and one TC interrupt), but then it doesn’t loop back and refill the buffer again. It looks like the ADC and DMA are still running, the ADC status does not show an ADC overrun, but nothing more is transferred.