It’s funny how a topic as apparently mundane as the DMA controllers on the STM32F2xx and STM32F4xx processors can be such a can of worms. I’ve already provided 2 postings on the subject, here and here, and now we have a third.

This one is a biggie. From what we’re seeing, it appears you can only have 2 DMA transactions taking place at any one time. It appears that having 3 simultaneous DMA transfers causes intermittent failures.

To explain…

We’ve been doing some work with the STM32F207 and STM32F407 processors. We had been performing 3 DMAs simultaneously. Specifically: receiving data from the DCMI port, receiving data from the ADC, and either reading from or writing to the SDIO port (SD card). What we saw happening was:

  • Sometimes the SDIO transfer would simply stop partway through. There was no error, and the card was not busy. There was no apparent reason for the data transfer to stop, but it would. Querying the SDIO status register SDIO_STA would indicate its transmit FIFO was empty (in the case of writing to the SD card) and that it was in the middle of a transfer.
  • Sometimes the ADC would report an ADC overrun error. This should be impossible. With the ADC data being read out via DMA, completely outside of any software control, there’s no way it should ever overflow. But sometimes it would.
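The first symptom can be tested for in software. The following is a minimal sketch, assuming the SDIO_STA bit positions as I read them in the STM32F2/F4 reference manual (TXACT at bit 12, TXFIFOE at bit 18, error flags at bits 0 to 5 and 9). The `EX_` macro names and the function name are made up for this example, so check the values against your device header before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as read from the reference manual; EX_ names are local
 * to this sketch and are NOT the vendor header's names. Verify them. */
#define EX_SDIO_STA_ERRORS   0x0000023Fu  /* bits 0-5 and 9: CRC, timeout, underrun, overrun, start bit */
#define EX_SDIO_STA_TXACT    (1u << 12)   /* data transmit in progress */
#define EX_SDIO_STA_TXFIFOE  (1u << 18)   /* transmit FIFO empty */

/* Returns true when a status snapshot matches the stall described above:
 * no error flagged, a transmit still in progress, and an empty FIFO. */
static bool sdio_write_looks_stalled(uint32_t sta)
{
    bool no_error   = (sta & EX_SDIO_STA_ERRORS) == 0u;
    bool tx_active  = (sta & EX_SDIO_STA_TXACT) != 0u;
    bool fifo_empty = (sta & EX_SDIO_STA_TXFIFOE) != 0u;
    return no_error && tx_active && fifo_empty;
}
```

On target you would pass in the value read from SDIO_STA. A FIFO can be empty for a moment during a healthy transfer, so treat the result as a stall only if it persists across several polls.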

We initially put these behaviours down to buggy peripherals, but one day we noticed the problems stopped when we turned off the DCMI port. Hmm, how puzzling. Further investigation showed that we had no problems as long as only 2 of these 3 peripherals were running, whichever 2 they were. That led to the question: “what’s common between these 3 peripherals?” The obvious answer: the DMA. In this case, specifically DMA2.

It fits the symptoms very nicely. What we theorise is that when a peripheral needs a DMA service, it raises a flag which results in the DMA servicing that peripheral, either by writing data to it or reading data from it. If 3 peripherals raise their flags simultaneously, it can happen that one of them is “permanently forgotten”. It doesn’t get serviced, then or later. In the case of the ADC this results in an overrun. In the case of the SDIO this results in it eventually running its FIFO empty (when writing to an SD card). In the case of the DCMI we saw some apparent failures, but they proved hard to nail down so I’ll reserve judgement there as to exactly what happens.

Based on this theoretical understanding, we put together a test case demonstrating the problem (on both the F2x and F4x) and sent it to ST at the beginning of Nov 2011. We heard back at the beginning of Dec that they’d replicated the problem, but we haven’t heard anything since.

Then towards the end of December I was speaking with another company using the STM32F2xx and they just happened to mention a similar problem. They also had 3 simultaneous DMAs active, however theirs were: SPI, USART and SDIO. A different set of 3 from ours, and theirs were spread across both DMA1 and DMA2. Yet their basic symptom of unexplained data stoppages was the same.

Based on all this, I can only assume that any 3 simultaneous DMAs, from any one DMA controller or a mix of both DMA controllers, can produce this problem.

In our case, we took the lowest bandwidth peripheral, namely the ADC, and moved it from DMA to interrupt-driven operation. Because the STM32F407 ADC has no internal FIFO it can overrun very easily, hence its interrupt must be at the highest priority and be permitted to preempt other interrupts. Since we did this, all 3 peripherals have behaved normally.
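To give a feel for the interrupt-driven approach, here is a minimal sketch of a single-producer, single-consumer ring buffer that an ADC interrupt handler could fill. It is an illustration rather than our production code: the names (`adc_isr_push`, `adc_pop`, `ADC_RING_SIZE`) are invented for this example, the hardware side (reading the ADC data register, setting the interrupt priority) is deliberately left out, and it assumes a single core where the interrupt handler is the only writer of the head index.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADC_RING_SIZE 64u  /* power of two so the index wraps with a mask */

static volatile uint16_t adc_ring[ADC_RING_SIZE];
static volatile uint32_t adc_head;     /* written only by the interrupt handler */
static volatile uint32_t adc_tail;     /* written only by the main loop */
static volatile uint32_t adc_dropped;  /* samples lost because the ring was full */

/* Call from the ADC interrupt handler with the value just read from the
 * ADC data register. Kept short: with no hardware FIFO, the next
 * conversion can overwrite the data register at any moment. */
static void adc_isr_push(uint16_t sample)
{
    uint32_t head = adc_head;
    if (head - adc_tail >= ADC_RING_SIZE) {
        adc_dropped++;               /* consumer fell behind */
        return;
    }
    adc_ring[head & (ADC_RING_SIZE - 1u)] = sample;
    adc_head = head + 1u;            /* publish only after the slot is written */
}

/* Call from the main loop; returns false when no sample is waiting. */
static bool adc_pop(uint16_t *out)
{
    uint32_t tail = adc_tail;
    if (tail == adc_head) {
        return false;
    }
    *out = adc_ring[tail & (ADC_RING_SIZE - 1u)];
    adc_tail = tail + 1u;
    return true;
}
```

The head and tail are free-running counters, so "full" is simply a difference of `ADC_RING_SIZE`, and unsigned wraparound keeps the arithmetic correct. The dropped counter gives you a software-visible equivalent of the overrun flag if the main loop ever falls behind.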

I need to re-emphasise that what we believe to be happening is no more than a theory based on the symptoms we’ve been seeing. Although ST has confirmed seeing these symptoms, they haven’t provided any information about what’s actually going on, so we’re flying in the dark to a certain degree. What I can say with some confidence is that you’d be very wise to limit the number of simultaneous DMAs to 2. At least until ST provides more information about what’s actually causing this.

Update June 2012:

In January 2012 ST provided a response in their user forums here:

It does not answer all the questions, but it seems to indicate the problem lies not with the number of DMAs, but instead might lie specifically with DMA2. They stated:

We confirm your findings and it is a limitation that concerns only our DMA2, and here is the detailed description :

DMA2 controller could corrupt data when managing AHB and APB2 peripherals in a concurrent way.

Description :
This case is somehow critical for peripherals embedding FIFO and generates data corruption. For memories, the impact is a multiple access but the data is not corrupted. AHB Peripherals embedding FIFO are DCMI, CRYPT, HASH. on STM32F2/40xx without CRYPTO only the DCMI is impacted.

The data transferred by the DMA to the AHB peripherals could be corrupted in case of a FIFO target or multiply accesses in case of memories access.

Workarounds :
Avoid concurrent AHB and APB2 transfer using DMA2. One of the following approach could be used to solve the issue:
* If DMA2 is used to manage AHB peripheral (DCMI, CRYPT, HASH), we can use the Cortex-M CPU to manage APB2 peripherals.
* If DMA2 is used to manage APB2 peripheral, we can use the CPU to manage AHB peripheral (DCMI, CRYPT, HASH).

Obviously, we will update our errata on web soon.
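(End of ST’s reply.) As a rough illustration of the rule ST describes, and not anything ST supplied, the check below flags a set of DMA2 users that mixes AHB and APB2 peripherals. The type and function names are hypothetical, and the bus assignments in the usage note (DCMI on AHB, ADC and SDIO on APB2) are my reading of the STM32F2/F4 bus layout, so confirm them for your part.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which bus a DMA2-served peripheral sits on. Hypothetical names. */
typedef enum { EX_BUS_AHB, EX_BUS_APB2 } ex_bus_t;

typedef struct {
    const char *name;  /* for diagnostics only */
    ex_bus_t    bus;
} ex_dma2_user_t;

/* Returns true if the list contains at least one AHB peripheral and at
 * least one APB2 peripheral, i.e. the combination ST says to avoid
 * running concurrently on DMA2. */
static bool dma2_plan_mixes_buses(const ex_dma2_user_t *users, size_t count)
{
    bool has_ahb = false;
    bool has_apb2 = false;
    for (size_t i = 0; i < count; i++) {
        if (users[i].bus == EX_BUS_AHB) {
            has_ahb = true;
        } else {
            has_apb2 = true;
        }
    }
    return has_ahb && has_apb2;
}
```

Under those assumptions, a plan containing DCMI and the ADC would be flagged, while one containing only the ADC and SDIO would pass.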

I don’t know how ST measures time – they say their errata will be updated “soon”, but as I write this it’s 5 months since they stated that, and their STM32 errata, at revision 2.0, does not contain this information, let alone any more detail about it. If anyone knows anything more, please post a comment.