3

I am using an STM32 G474 to create a waveform with its internal DAC. I give a lookup table to the direct memory access (DMA) module, which passes the values to the corresponding DAC channel at the right time. What sounds like the hard part is actually pretty straightforward and works just fine.

#define NS  64                              /* number of samples */

uint32_t Wave_Low[NS] = {2048,[...],2047};  /* lookup table */

int main(void)
{
    /* start DMA on DAC2, channel 1 */
    HAL_DAC_Start_DMA(&hdac2, DAC_CHANNEL_1, (uint32_t*)Wave_High, NS, DAC_ALIGN_12B_R);
}

As the next step I want to change the signal form within the code. As I want this to happen without interruption, stopping the DMA and reinitializing it doesn't work (there is a 500 µs delay without a signal in between). Therefore I need to overwrite the lookup table. I've tried it like this:

#define NS  64                              /* number of samples */

uint32_t Wave_Low[NS] = {2048,[...],2047};  /* lookup table 1 */
uint32_t Wave_High[NS] = {4096,[...],4067}; /* lookup table 2 */
uint32_t Wave_Active[NS];                   /* lookup table in use */

int main(void)
{
    /* assign the high wave as the currently used one */
    memcpy(Wave_Active, Wave_High, NS);

    /* start DMA on DAC2, channel 1 */
    HAL_DAC_Start_DMA(&hdac2, DAC_CHANNEL_1, (uint32_t*)Wave_Active, NS, DAC_ALIGN_12B_R);
}

From my understanding this code should show the exact same behavior, but the DAC signal differs significantly: it shows the positive part of a sawtooth signal instead of the centered sine wave it's supposed to show. I'm a bit rusty with embedded C, but that behavior definitely puzzles me.

Tarick Welling
  • The third parameter is the number of _bytes_ to copy. So you're only copying a quarter of what you seem to have intended. Instead of `NS` you can use `sizeof(Wave_Low)` since it's an array, or you can use `NS * sizeof(uint32_t)` – paddy Jan 21 '21 at 14:00
  • Well, one thing that can be the origin of the problem is that memcpy takes the number of bytes, not the number of elements. Therefore the memcpy line should be as follows: `memcpy(Wave_Active, Wave_High, NS*sizeof(uint32_t));` – Motaz Hammouda Jan 21 '21 at 14:02
  • Unrelated: Why the cast here: `(uint32_t*)Wave_High` and here: `(uint32_t*)Wave_Active`? – Ted Lyngmo Jan 21 '21 at 14:02
  • Also unrelated: Don't do `#define NS 64`. Use proper types. `constexpr std::size_t NS = 64;` – Ted Lyngmo Jan 21 '21 at 14:03
  • @paddy that's it. I feel a bit dumb but thank you very much! – phil_o_matic Jan 21 '21 at 14:04
  • The memcpy bug is only one of several problems. I'll re-open this. – Lundin Jan 21 '21 at 14:09
  • @phil_o_matic Best to tag one of C, C++, but not both. The problem may be C/C++, but the best solution is often language dependent. – chux - Reinstate Monica Jan 21 '21 at 15:12

3 Answers

6

Several problems:

  • DMA buffers need to be volatile qualified or otherwise the compiler might go bananas when generating the code accessing them.

  • You use memcpy incorrectly; it should have been `memcpy(Wave_Active, Wave_High, sizeof Wave_Active);`

  • The use of memcpy to begin with is often incorrect when it comes to hardware-related programming. Copying 256 bytes takes a lot of time. Worst case, your DAC might even request new data before you are done copying.

    The correct way to write such code would be to have several allocated buffers, then swap an "active" pointer to point at the one used. With the disclaimer that I don't understand the purpose of these arrays, something like this would be an immense speed optimization:

      volatile uint32_t Wave_Low[NS] = {2048,[...],2047};  /* lookup table 1 */
      volatile uint32_t Wave_High[NS] = {4096,[...],4067}; /* lookup table 2 */
      volatile uint32_t* Wave_Active = Wave_High;
    
      ...
      if(DMA_flag)
      {
        Wave_Active = (Wave_Active==Wave_Low) ? Wave_High : Wave_Low;
    
        /* you might have to tell the DMA which array to use next time here */
      }
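
The size mistake from the second bullet can be reproduced on a PC. This is only a minimal sketch, assuming a 64-entry table like the one in the question; `copy_wrong`, `copy_right` and `count_copied` are hypothetical helper names, and the table contents are made-up placeholders, not the asker's data:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NS 64  /* number of samples, as in the question */

/* Hypothetical helper: copies with the byte count used in the question. */
void copy_wrong(uint32_t *dst, const uint32_t *src)
{
    memcpy(dst, src, NS);                  /* 64 bytes = only 16 uint32_t values */
}

/* Hypothetical helper: copies the whole table. */
void copy_right(uint32_t *dst, const uint32_t *src)
{
    memcpy(dst, src, NS * sizeof dst[0]);  /* 256 bytes = all 64 values */
}

/* Hypothetical helper: fills a source table with non-zero placeholders,
   copies it into a zeroed destination and counts the entries that arrived. */
size_t count_copied(void (*copy)(uint32_t *, const uint32_t *))
{
    uint32_t src[NS];
    uint32_t dst[NS] = {0};
    size_t copied = 0;

    for (size_t i = 0; i < NS; i++)
        src[i] = 1000u + (uint32_t)i;      /* placeholder values, all non-zero */

    copy(dst, src);

    for (size_t i = 0; i < NS; i++)
        if (dst[i] == src[i])
            copied++;
    return copied;
}
```

With these placeholders, `count_copied(copy_wrong)` reports 16 entries and `count_copied(copy_right)` reports 64. In the question's code the other 48 entries of the zero-initialised global `Wave_Active` would simply stay 0.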
    
Lundin
  • To generate a continuous waveform, OP needs to use circular mode. To change the address of the sample table, the DMA has to be disabled, then the DMA peripheral reset and started again. Your method will not work using this particular hardware. – 0___________ Jan 21 '21 at 14:50
  • The first two points are great, the second one fixed my actual issue. On the other hand, avoiding memcpy doesn't work for me as I can't reinitialize the DMA without stopping it first (which again causes the waveform to get interrupted) – phil_o_matic Jan 21 '21 at 14:50
  • @phil_o_matic you replace the table the incorrect way. You need to copy only the part of the table that the hardware is not accessing during the copy process – 0___________ Jan 21 '21 at 14:53
  • @0___________ Ah that's too bad, but maybe then it shouldn't have been implemented with a DAC, but a PWM to OP amp. Depending on resolution and speed required, of course. – Lundin Jan 21 '21 at 14:55
  • Or well, write to the DAC using interrupts... but that's always icky and CPU intensive, DMA is much more elegant. – Lundin Jan 21 '21 at 14:57
  • @Lundin no - it is enough to use DMA interrupts. When the DMA finishes the transfer of the complete buffer, it starts from the beginning. You can then copy the data to the second half of the sample buffer. When the DMA reaches half of the transfers, you can safely copy the first half. That is the reason for having those 2 interrupts. – 0___________ Jan 21 '21 at 15:00
  • @Lundin here is described in more details https://www.st.com/resource/en/application_note/cd00259245-audio-and-waveform-generation-using-the-dac-in-stm32-products-stmicroelectronics.pdf – 0___________ Jan 21 '21 at 15:06
  • Maybe OP can use double buffered mode, if the HW supports it. It takes some coding, but it's possible to set `M0AR` & `M1AR` HW pointers while the DMA is running. – Tagli Jan 21 '21 at 15:08
  • In one of my projects, I had to implement a triple-buffer using a similar method to resolve a SPI vs USB buffer race. https://stackoverflow.com/questions/60132112/how-will-circular-dma-periph-to-memory-behave-at-the-end-of-the-transfer-in-stm3/60134124#60134124 – Tagli Jan 21 '21 at 15:16
  • @Tagli not all STM32 DMA controllers support it. – 0___________ Jan 21 '21 at 22:31
  • @Tagli and STM32G4 does not allow CMARx registers to be changed when DMA is on, so this advice is wrong https://i.stack.imgur.com/J5HNJ.png – 0___________ Jan 21 '21 at 22:35
2

The memcpy problem is probably caused by a typo and I will not focus on it.

To copy new data to the buffer, you need to copy only the part that is not currently being read by the DMA. Otherwise the waveform might be disrupted. To achieve this you need to:

  1. Enable the Transfer complete interrupt and the Half transfer interrupt.
  2. In the interrupt handler or (as you use HAL) in the HAL callback function, check for the cause of the interrupt. If it is the Transfer complete interrupt, copy the second half of the table; in the Half transfer interrupt, copy the first half of the table.

Example for STM32F3:

if(hdac2 -> Instance -> ISR & DMA_ISR_TCIFx)
   memcpy(&Wave_Active[NS/2] , &New_Wave[NS/2], (NS / 2) * sizeof(Wave_Active[0]));
if(hdac2 -> Instance -> ISR & DMA_ISR_HTIFx)
   memcpy(&Wave_Active[0] , &New_Wave[0], (NS / 2) * sizeof(Wave_Active[0]));

where x is the DMA channel used.
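
The two steps above can be sketched in a form that also runs on a PC. This is a rough sketch under assumptions, not tested on the G474: `request_wave`, `on_half_transfer`, `on_transfer_complete`, `pending_wave` and `first_half_done` are hypothetical names. With HAL, the two handlers would presumably be called from `HAL_DAC_ConvHalfCpltCallbackCh1` and `HAL_DAC_ConvCpltCallbackCh1`; that mapping is an assumption and should be checked against the HAL version in use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NS 64

uint32_t Wave_Active[NS];                       /* buffer the circular DMA reads from */
static const uint32_t *volatile pending_wave;   /* hypothetical: table to switch to, or NULL */
static volatile bool first_half_done;           /* hypothetical: first half already replaced */

/* Hypothetical: request a switch; the actual copy happens in the two handlers. */
void request_wave(const uint32_t *new_wave)
{
    first_half_done = false;
    pending_wave = new_wave;
}

/* Half transfer: the DMA has moved on to the second half, so the first half is free. */
void on_half_transfer(void)
{
    const uint32_t *src = pending_wave;
    if (src == NULL)
        return;
    memcpy(&Wave_Active[0], &src[0], (NS / 2) * sizeof Wave_Active[0]);
    first_half_done = true;
}

/* Transfer complete: the DMA has wrapped to the first half, so the second half is free. */
void on_transfer_complete(void)
{
    const uint32_t *src = pending_wave;
    if (src == NULL || !first_half_done)
        return;                                 /* wait until the first half was replaced */
    memcpy(&Wave_Active[NS / 2], &src[NS / 2], (NS / 2) * sizeof Wave_Active[0]);
    pending_wave = NULL;                        /* switch finished */
}
```

Replacing the first half before the second is a design choice of this sketch, so that the new table always starts playing at the beginning of a period; the answer itself does not require a particular order.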

If the memcpy function is too slow, I would personally write my own one (in this case copying 32 32-bit words). It will be very fast.

static inline __attribute__((always_inline)) void mymemcpy32WORDS(void *dest, const void *src)
{
    const uint64_t *src64 = src;
    uint64_t *dest64 = dest;

    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
    *dest64++ = *src64++;
}

https://godbolt.org/z/4xnq5v

For the best performance, you need to make sure that your arrays are 64-bit aligned. If they are, copying the 32 32-bit words will take 1.13 µs (assuming a 170 MHz clock), while the DMA-to-DAC transfer of 32 words will take 2 µs. You have enough time to process everything in the interrupt routine.
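
The timing budget above can be checked with a little arithmetic. The following is only a sketch: `dma_time_ns`, `cycles_in` and `Wave_Aligned` are hypothetical names, the 16 MS/s rate is taken from the asker's comments, and `_Alignas` assumes a C11 compiler.

```c
#include <stdint.h>

#define NS 64

/* Hypothetical example of a 64-bit aligned sample buffer (C11). */
_Alignas(8) uint32_t Wave_Aligned[NS];

/* Time in nanoseconds the DMA needs to output `samples` values at `rate_hz` samples/s. */
uint64_t dma_time_ns(uint32_t samples, uint32_t rate_hz)
{
    return (uint64_t)samples * 1000000000u / rate_hz;
}

/* Number of core clock cycles that elapse during `time_ns` at `core_hz`. */
uint64_t cycles_in(uint64_t time_ns, uint32_t core_hz)
{
    return time_ns * core_hz / 1000000000u;
}
```

For half a buffer, `dma_time_ns(32, 16000000)` gives 2000 ns, and `cycles_in(2000, 170000000)` gives 340 cycles at 170 MHz. The quoted 1.13 µs copy corresponds to about 192 of those cycles, which leaves the rest for interrupt entry and exit.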

It should be fine to run from FLASH (this has to be checked), but if not, you should put the interrupt handler in SRAM or, even better, the data in SRAM and the code in CCMRAM.

0___________
  • I'm using the DMA in circular mode. The high and low waveforms refer to their respective amplitude (the whole project is a test for amplitude shift keying). The same lookup table gets used repeatedly until an event causes the overwrite of the lookup table – phil_o_matic Jan 21 '21 at 14:55
  • I only show you the way of replacing data in the actual buffer. Changed the code to avoid confusion. It does not matter what the new data is. The order is important – 0___________ Jan 21 '21 at 14:58
  • That's technically the right way, but it doesn't work for my application. I'm going with 16 MS/s, so a 250 kHz sine wave. I've measured memcpy to take around 20 µs to overwrite all 64 values. This means your approach would result in the same behaviour of reading a value while it gets written to. – phil_o_matic Jan 22 '21 at 09:07
  • @phil_o_matic no – 0___________ Jan 22 '21 at 10:22
2

@0___________ suggested a solution with the right approach: overwriting one half of the lookup table while the DMA reads the other half. This would avoid reading a value while it's being written. The problem with this is that I'm sampling faster than memcpy can write the values.

Therefore I've tried the simple approach and bluntly overwrite the array with memcpy no matter what. This partly causes funny signal patterns (you can actually see which part of the lookup table gets overwritten first), but overall it works. The signal transition takes around 20 µs, which is sufficient. As I can live with the imperfect signal pattern, that will be my solution. Below is an oscilloscope screenshot of the signal transition from low to high.

Thanks for your help!

Oscilloscope screenshot of the transition from low to high lookup table

Tarick Welling
  • Really? You are happy with that? It is indicative of an error - it is not "just the way it is". As @Lundin has suggested, the memcpy is unnecessary. The transients are probably caused by not making the data ready before the DMA completes the previous buffer. You would do better to have a single buffer `wave[2][NS]`, configure the DMA to operate in circular mode, then on the half-transfer interrupt update `wave[0]`, and on the full-transfer interrupt update `wave[1]`. – Clifford Jan 22 '21 at 22:15
  • Then increase the core clock speed. 16 MSPS is nothing compared with the speed of the core. You can run your micro at 170 MHz; assuming 6 clocks per transfer -> 20M transfers per second, which is more than 16 MSPS. – 0___________ Jan 22 '21 at 22:55
  • @Clifford you can't do double buffering using this micro. To change the address in the DMA you need to disable the DMA, reset it, then set it up again with the new address and start it again, which will stop the waveform generation for the duration of these operations. – 0___________ Jan 22 '21 at 22:57
  • See my amended answer - There you have a fast 32 dwords copy function. – 0___________ Jan 22 '21 at 23:17
  • @Clifford what are you talking about? What HAL? Please tell me how you can switch the buffer address in the STM32G4 DMA without abusing the documentation and without switching off the DMA. Paypal reward £10. – 0___________ Jan 22 '21 at 23:42
  • Bear in mind the RM : https://i.stack.imgur.com/H8Gv9.png – 0___________ Jan 22 '21 at 23:47
  • @Clifford so how does it differ from my answer? If you want to replace data you need to copy new data to this buffer. So how `the memcpy is unnecessary` (as memcpy I understand any function writing some data to this buffer - it may copy or calculate new values) – 0___________ Jan 22 '21 at 23:50
  • @0___________ Clearly you can. But reading the comments for information that should be in the question, my suggestion may not be what is required here. The discontinuity is caused by the data changing while it is being output. – Clifford Jan 22 '21 at 23:55
  • @0___________ Sorry, you are commenting on a comment I deleted. I misunderstood the question, though the answer to the question is only the memcpy size error. Arguably how to make it work is a different question. – Clifford Jan 23 '21 at 00:02
  • @Clifford yes it is obvious. So he needs a function which will copy data from one memory location to another fast enough. For 32 samples (16 MSPS) he has 2 µs to complete the job. So in my answer you have another memcpy function with no branches – 0___________ Jan 23 '21 at 00:04
  • @Clifford memcpy size problem: I think it is a simple typo or too fast typing. I think everyone makes such stupid mistakes from time to time (at least I do). – 0___________ Jan 23 '21 at 00:08
  • @0___________ indeed, but that was the cause of the malformed waveform, which is what he was asking about. If after that it still did not work, then that is a different question. One where information from his own answer and subsequent comments become essential to answering. – Clifford Jan 23 '21 at 07:56