I have been measuring clock cycle counts on the Cortex-M4 and would now like to do the same on the Cortex-M7. The board I use is an STM32F746ZG.

For the m4 everything worked with:

volatile unsigned int *DWT_CYCCNT;
volatile unsigned int *DWT_CONTROL;
volatile unsigned int *SCB_DEMCR;

void reset_cnt(){
    DWT_CYCCNT   = (volatile unsigned int *)0xE0001004; //address of the register
    DWT_CONTROL  = (volatile unsigned int *)0xE0001000; //address of the register
    SCB_DEMCR    = (volatile unsigned int *)0xE000EDFC; //address of the register
    *SCB_DEMCR   = *SCB_DEMCR | 0x01000000;
    *DWT_CYCCNT  = 0; // reset the counter
    *DWT_CONTROL = 0; 
}

void start_cnt(){
    *DWT_CONTROL = *DWT_CONTROL | 0x00000001 ; // enable the counter
}

void stop_cnt(){
     *DWT_CONTROL = *DWT_CONTROL & 0xFFFFFFFE ; // disable the counter    
}

unsigned int getCycles(){
    return *DWT_CYCCNT;
}

The problem is that on the M7 the DWT_CTRL register does not change: it remains 0x40000000 instead of becoming 0x40000001, so the cycle count is always zero. From what I have read in other posts, it seems you need to write 0xC5ACCE55 to the FP_LAR register to be able to change DWT_CTRL.

I added these defines (have tried both FP_LAR_PTR addresses below):

#define FP_LAR_PTR ((volatile unsigned int *) 0xe0000fb0) //according to reference
//#define FP_LAR_PTR ((volatile unsigned int *) 0xe0002fb0) //according to guy on the internet
// Lock Status Register lock status bit
#define DWT_LSR_SLK_Pos                1
#define DWT_LSR_SLK_Msk                (1UL << DWT_LSR_SLK_Pos)
// Lock Status Register lock availability bit
#define DWT_LSR_SLI_Pos                0
#define DWT_LSR_SLI_Msk                (1UL << DWT_LSR_SLI_Pos)
// Lock Access key, common for all
#define DWT_LAR_KEY                    0xC5ACCE55

and this function:

void dwt_access_enable(unsigned int ena){
    volatile unsigned int *LSR;
    LSR = (volatile unsigned int *) 0xe0000fb4;
    uint32_t lsr = *LSR;
    //printf("LSR: %.8X - SLI MASK: %.8X\n", lsr, DWT_LSR_SLI_Msk);

    if ((lsr & DWT_LSR_SLI_Msk) != 0) {
        if (ena) {
            //printf("LSR: %.8X - SLKMASK: %.8X\n", lsr, DWT_LSR_SLK_Msk);
            if ((lsr & DWT_LSR_SLK_Msk) != 0) {    //locked: access need unlock
                *FP_LAR_PTR = DWT_LAR_KEY;
                printf("FP_LAR directly after change: 0x%.8X\n", *FP_LAR_PTR);
            }
        } else {
            if ((lsr & DWT_LSR_SLK_Msk) == 0) {   //unlocked
                *FP_LAR_PTR = 0;
                 //printf("FP_LAR directly after change: 0x%.8X\n", *FP_LAR_PTR);
            }
        }
    }
}

When I call the uncommented print I get 0xC5ACCE55, but when I print it after the function returns I get 0x00000000, and I have no idea why. Am I on the right track, or is this completely wrong?

Edit: I think it also would be good to mention that I have tried without all the extra code in the function and only tried to change the LAR register.

BR Gustav

G. Johnsson
  • According to the Cortex-M7 TRM, [DWT_LAR is a _write-only_ register](http://infocenter.arm.com/help/topic/com.arm.doc.ddi0489c/BABJFFGJ.html)... – Notlikethat Jul 13 '16 at 15:46
  • Oh, my bad, I didn't notice that, although it seems like I am able to read from it sometimes. Anyway, if we overlook my mistake, I still get 0 clock cycles when I: start the counter -> call a function I want to measure -> stop counter -> read clock cycles. I have also tried without any read from the LAR register, in case that would ruin it, and it still doesn't work. – G. Johnsson Jul 13 '16 at 15:59
  • does your cortex-m7 have that feature implemented? There are other timers (systick) which if implemented can also count the ARM core clocks. – old_timer Jul 13 '16 at 16:58
  • When I read DWT_CTRL it says 0x40000000, according to https://web.eecs.umich.edu/~prabal/teaching/eecs373-f10/readings/ARMv7-M_ARM.pdf page C1-48 and C1-49 the 25th bit NOCYCCNT should be 1 if CYCCNT is not supported and it is 0 in this case. Hope I answered what you asked. – G. Johnsson Jul 13 '16 at 18:08
  • hmm so far I get 0x00000000 when I read 0xE0001000, I do get 0xFFF02003 when I read 0xE00FF004 – old_timer Jul 13 '16 at 21:00
  • Table 11-1 shows the DWT registers. Depending on the implementation of your processor, some of these registers might not be present. Any register that is configured as not present reads as zero – old_timer Jul 13 '16 at 21:02
  • I am using an stm32f7 discovery with the STM32F746NGH6 microcontroller – old_timer Jul 13 '16 at 21:02
  • okay caught up, when setting TRCENA in DEMCR I also see 0x40000000 but no count in the DWT counter. – old_timer Jul 13 '16 at 21:18
  • trying to set the lower bits (including the cycle count enable) in DWT control, they read back as zeros. so is it really writing to dwt control? – old_timer Jul 13 '16 at 21:26
  • No, it doesn't and that is the effect of the problem that I'm trying to describe. I somehow need to be able to alter bit 0 in DWT_CTRL. I will try Notlikethat's answer below and get back with the result :) – G. Johnsson Jul 13 '16 at 22:25
  • From my own experiments it does appear that DWT_LAR and DWT_LSR are at 0xE0001FB0 and 0xE0001FB4. The arm documentation is wrong, but the TRM table does look a bit fishy having only those two registers without the 0x1xxx – old_timer Jul 14 '16 at 03:37
  • If you have "read in other posts" it is useful to provide a link or citation so we can check that what you are reading is truly relevant. CM7 has no FP_LAR register AFAIK, so it is unclear what you are referring to - though I make an educated guess in my answer, and it differs from the register you are accessing. – Clifford Jul 14 '16 at 08:31
  • I have reported the documentation error to ARM via their online documentation feedback. – Clifford Jul 14 '16 at 08:45
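
For reference, the DWT_CTRL value 0x40000000 discussed in the comments can be decoded with a few bit tests. This is only a small sketch based on the ARMv7-M field layout (NUMCOMP in bits 31:28, NOCYCCNT in bit 25, CYCCNTENA in bit 0); the helper and macro names are made up for illustration and are not CMSIS names.

```c
#include <stdbool.h>
#include <stdint.h>

/* DWT_CTRL fields as laid out in the ARMv7-M architecture manual.
 * The CTRL_* names below are hypothetical, chosen for this sketch. */
#define CTRL_NUMCOMP_SHIFT 28u
#define CTRL_NUMCOMP_MASK  0xFu
#define CTRL_NOCYCCNT      (1u << 25)
#define CTRL_CYCCNTENA     (1u << 0)

/* Number of comparators implemented in the DWT unit. */
unsigned dwt_ctrl_numcomp(uint32_t ctrl)
{
    return (ctrl >> CTRL_NUMCOMP_SHIFT) & CTRL_NUMCOMP_MASK;
}

/* True when the cycle counter exists (NOCYCCNT reads as 0). */
bool dwt_ctrl_has_cyccnt(uint32_t ctrl)
{
    return (ctrl & CTRL_NOCYCCNT) == 0;
}

/* True when the cycle counter is currently enabled. */
bool dwt_ctrl_cyccnt_enabled(uint32_t ctrl)
{
    return (ctrl & CTRL_CYCCNTENA) != 0;
}
```

For 0x40000000 this gives four comparators, a cycle counter that is present, and a counter that has not been enabled, which matches the symptom described in the question.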

2 Answers

Looking at the docs again, I'm now incredibly suspicious of a typo or copy-paste error in the ARM TRM. 0xe0000fb0 is given as the address of ITM_LAR, DWT_LAR and FP_LAR (and equivalently for *_LSR). Since all the other ITM registers are in page 0xe0000000, it looks an awful lot like whoever was responsible for that part of the Cortex-M7 documentation took the Cortex-M4 register definitions, added the new LAR and LSR to the ITM page, then copied them to the DWT and FPB pages, updating the names but neglecting to update the addresses.

I'd bet my dinner that you're unwittingly unlocking ITM_LAR (or the real FP_LAR), and DWT_LAR is actually at 0xe0001fb0.

EDIT by dwelch

Somebody owes somebody a dinner.

hexstring(GET32(0xE0001FB4));
hexstring(GET32(0xE0001000));
hexstring(GET32(0xE0001004));
hexstring(GET32(0xE0001004));

PUT32(0xE000EDFC,0x01000000);

hexstring(GET32(0xE0001FB4));
hexstring(GET32(0xE0001000));
hexstring(GET32(0xE0001004));
hexstring(GET32(0xE0001004));

PUT32(0xE0001000,0x40000001);

hexstring(GET32(0xE0001FB4));
hexstring(GET32(0xE0001000));
hexstring(GET32(0xE0001004));
hexstring(GET32(0xE0001004));

PUT32(0xE0001FB0,0xC5ACCE55);
PUT32(0xE0001000,0x40000001);

hexstring(GET32(0xE0001FB4));
hexstring(GET32(0xE0001000));
hexstring(GET32(0xE0001004));
hexstring(GET32(0xE0001004));

output

00000000
00000000
00000000
00000000
00000003
40000000
00000000
00000000
00000003
40000000
00000000
00000000
00000001
40000001
0000774F
0000B311

The table in the TRM looks odd. As the other documentation shows, you add 0xFB0 and 0xFB4 to the base; the rest of the DWT for the Cortex-M7 is at 0xE0001xxx, and indeed it appears that the LAR and LSR are at 0xE0001FB0 and 0xE0001FB4.
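
Pulling the sequence from the experiment above into one place, a minimal sketch in C might look like the following. The function names (`cyccnt_start`, `cyccnt_stop`, `cyccnt_read`) and the `DWT_ON_TARGET` build switch are hypothetical, introduced only for illustration. Without that switch the registers are replaced by plain variables, so the order of writes can be checked on a host. The addresses are the ones that worked in the experiment, not the ones printed in the TRM.

```c
#include <stdint.h>

#ifdef DWT_ON_TARGET   /* hypothetical switch: define when building for the M7 */
#define REG32(addr)     (*(volatile uint32_t *)(addr))
#define DEMCR_REG       REG32(0xE000EDFCu)
#define DWT_CTRL_REG    REG32(0xE0001000u)
#define DWT_CYCCNT_REG  REG32(0xE0001004u)
#define DWT_LAR_REG     REG32(0xE0001FB0u)   /* address found by experiment */
#else
/* Host stand-ins: ordinary variables, so nothing actually counts here. */
static volatile uint32_t sim_demcr, sim_dwt_ctrl, sim_dwt_cyccnt, sim_dwt_lar;
#define DEMCR_REG       sim_demcr
#define DWT_CTRL_REG    sim_dwt_ctrl
#define DWT_CYCCNT_REG  sim_dwt_cyccnt
#define DWT_LAR_REG     sim_dwt_lar
#endif

#define DEMCR_TRCENA        (1u << 24)    /* enables the DWT unit */
#define DWT_LAR_KEY         0xC5ACCE55u   /* lock access key */
#define DWT_CTRL_CYCCNTENA  (1u << 0)     /* cycle counter enable */

/* Enable trace, unlock the DWT, zero the counter, then start it. */
void cyccnt_start(void)
{
    DEMCR_REG |= DEMCR_TRCENA;
    DWT_LAR_REG = DWT_LAR_KEY;   /* write-only on hardware: do not read back */
    DWT_CYCCNT_REG = 0;
    DWT_CTRL_REG |= DWT_CTRL_CYCCNTENA;
}

/* Stop the counter without clearing it. */
void cyccnt_stop(void)
{
    DWT_CTRL_REG &= ~DWT_CTRL_CYCCNTENA;
}

/* Return the current cycle count. */
uint32_t cyccnt_read(void)
{
    return DWT_CYCCNT_REG;
}
```

The unlock write is placed after TRCENA is set because, in the output above, the LSR only reads non-zero once TRCENA has been written.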

old_timer
Notlikethat
  • Okay I will definitely try that when I get back on Monday and get back to you. I also read somewhere else that the reference was wrong but he suggested 0xe000**2**fb0 instead but I tried that and it didn't work. – G. Johnsson Jul 13 '16 at 22:04
  • Yeah, e0002fb0 would look to be the corresponding lock for the FPB unit, as I hinted at. Seems the guy in the middle is the only one nobody tried ;) – Notlikethat Jul 13 '16 at 22:11
  • Yes, also I guess I should change `FP_LAR_PTR` in the code to `DWT_LAR_PTR`, I got confused when researching this :) – G. Johnsson Jul 13 '16 at 22:16
  • @Notlikethat I edited your answer instead of adding my own; this was your finding, I just did the experiment a few days earlier than the OP. Please re-edit to your liking (or remove my edit altogether, whatever). – old_timer Jul 14 '16 at 03:36
  • 0xe0001fb0 worked for me also! Only thing now is that it actually shows more clock cycles for a dft and fft operation compared to the m4, but that is a different problem :) – G. Johnsson Jul 18 '16 at 09:56

I would advise against creating your own register definitions when they are defined as part of the CMSIS - to do so requires that both the documentation and your interpretation of it are correct. In this case it appears that the documentation is indeed incorrect, but that the CMSIS headers are correct. It is a lot easier to validate the CMSIS headers automatically than it is to verify the documentation is correct, so I would trust the CMSIS every time.

I am not sure what register FP_LAR might refer to. Your address assignment refers to ITM_LAR, but it seems more likely that you intended DWT_LAR, which the Cortex-M4 lacks.

Despite my advice to trust it, CMSIS 4.00 does not define masks for DWT_LSR/DWT_LAR, but I believe they are identical to the corresponding ITM masks.

Note also that the LAR is a write-only register - any attempt to read it is meaningless.
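
Since the LAR cannot be read back, the lock state has to be observed through the LSR instead. A rough sketch of decoding it, using the bit positions from the question's own defines (bit 0 says a lock is implemented, bit 1 says it is currently locked); the enum and function names are hypothetical:

```c
#include <stdint.h>

#define LSR_SLI (1u << 0)   /* lock mechanism is implemented */
#define LSR_SLK (1u << 1)   /* lock is currently engaged */

typedef enum {
    DWT_LOCK_ABSENT,    /* no lock implemented: writes are always allowed */
    DWT_LOCK_OPEN,      /* lock implemented and currently unlocked */
    DWT_LOCK_ENGAGED    /* lock implemented and locked: write the key to LAR */
} dwt_lock_state;

/* Classify a raw LSR value. */
dwt_lock_state dwt_lock_decode(uint32_t lsr)
{
    if ((lsr & LSR_SLI) == 0) {
        return DWT_LOCK_ABSENT;
    }
    return (lsr & LSR_SLK) ? DWT_LOCK_ENGAGED : DWT_LOCK_OPEN;
}
```

Applied to the values in the other answer's output, 0x00000003 decodes as locked and 0x00000001 (read after the key was written) as unlocked.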

Your code using CMSIS would be:

#include "core_cm7.h"  // Applies to all Cortex-M7

void reset_cnt()
{
    CoreDebug->DEMCR |= 0x01000000;
    DWT->CYCCNT = 0; // reset the counter
    DWT->CTRL = 0; 
}

void start_cnt()
{
    DWT->CTRL |= 0x00000001 ; // enable the counter
}

void stop_cnt()
{
     DWT->CTRL &= 0xFFFFFFFE ; // disable the counter    
}

unsigned int getCycles()
{
    return DWT->CYCCNT ;
}

// Not defined in CMSIS 4.00 headers - check if defined
// to allow for possible correction in later versions
#if !defined DWT_LSR_Present_Msk 
    #define DWT_LSR_Present_Msk ITM_LSR_Present_Msk
#endif
#if !defined DWT_LSR_Access_Msk 
    #define DWT_LSR_Access_Msk ITM_LSR_Access_Msk
#endif
#define DWT_LAR_KEY 0xC5ACCE55

void dwt_access_enable( unsigned ena )
{
    uint32_t lsr = DWT->LSR;

    if( (lsr & DWT_LSR_Present_Msk) != 0 ) 
    {
        if( ena ) 
        {
            if ((lsr & DWT_LSR_Access_Msk) != 0) //locked: access need unlock
            {    
                DWT->LAR = DWT_LAR_KEY;
            }
        } 
        else 
        {
            if ((lsr & DWT_LSR_Access_Msk) == 0) //unlocked
            {   
                DWT->LAR = 0;
            }
        }
    }
}
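
Once the counter is running, a measurement can also be taken as the difference between two CYCCNT reads rather than by resetting the counter each time. A minimal sketch, assuming a hypothetical helper `cycles_elapsed`; unsigned 32-bit subtraction gives the right answer even if CYCCNT wraps once between the two reads:

```c
#include <stdint.h>

/* Cycles between two CYCCNT snapshots. The subtraction is modulo 2^32,
 * so a single wrap of the counter between the reads is handled. */
uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return (uint32_t)(end - start);
}

/* Intended use on the target (not compiled here; needs the CMSIS headers):
 *
 *     uint32_t t0 = DWT->CYCCNT;
 *     function_under_test();            // hypothetical function to measure
 *     uint32_t t1 = DWT->CYCCNT;
 *     uint32_t cycles = cycles_elapsed(t0, t1);
 */
```

The result includes the overhead of the reads themselves, so for very short functions it may help to measure an empty region first and subtract that.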
DipSwitch
Clifford
  • You are going on the assumption that CMSIS is correct, which is equally bad/good as assuming the docs are correct. Someone wrote the CMSIS headers using some resource. The wise thing for a company to do is keep one database of information and generate the documentation addresses and header files from that. I would bet that is the exception, not the rule. CMSIS has its own baggage; maybe cut and paste from it when in doubt, but don't assume it is any more correct than the docs. – old_timer Jul 14 '16 at 13:35
  • Likewise how do the docs get fixed if nobody reads them? I already posted a ticket to arm on this one, will see where it goes but at least someone there has been notified. – old_timer Jul 14 '16 at 13:36
  • @dwelch : I am assuming nothing; I am simply saying that the CMSIS is amenable to automated unit testing and validation in a way that the documentation is not. Even if the CMSIS were to match the documentation, it would be no worse than user-implemented code written from the documentation, so why reinvent it? Moreover, core_cmX is fundamental to the entirety of the CMSIS and used by thousands of projects, by many more developers than have probably read the documentation to that level of detail. In the CMSIS, DWT_LAR/LSR differ from the documentation (and from ITM), which is more plausible. – Clifford Jul 14 '16 at 20:35
  • @dwelch : You still have to read the documentation to know what the registers are for and how they work - the CMSIS won't tell you that. In this case it is certainly true that using the CMSIS would have avoided the problem. Were I to use the CMSIS and find a problem I'd cross-check the documentation in the same way (reversed), but starting with the CMSIS would be less work. – Clifford Jul 14 '16 at 20:44
  • @Clifford FWIW, FP_LAR is the CoreSight LAR for the [Flash Patch/Breakpoint Unit](https://developer.arm.com/docs/ddi0489/latest/9-debug/93-about-the-fpb/932-fpb-programmers-model) (which doesn't actually support the thing it's primarily named after, go figure...) – Notlikethat Jul 14 '16 at 20:45
  • @Notlikethat : Ah! FP = FlashPatch (not floating-point). In this case, however, it seems more likely that DWT_LAR/LSR were intended. The documentation is incorrect for the FP_LAR/LSR too. The FPB is not defined in the CMSIS core_cm7.h - I would expect it to be of more use to debuggers than to direct access from code - moreover it is not implemented on all CM7 devices. – Clifford Jul 14 '16 at 20:59