Issue with global variable while making 32-bit counter

Question

I am trying to do quadrature decoding using atmel xmega avr microcontroller. Xmega has only 16-bit counters. And in addition I have used up all the available timers.

Now to make 32-bit counter I have used one 16-bit counter and in its over/under flow interrupt I have increment/decrement a 16-bit global variable, so that by combining them we can make 32-bit counter.

ISR(timer_16bit)
{

   if(quad_enc_mov_forward)
    {
      timer_over_flow++;
    }

   else if (quad_enc_mov_backward)
    {
      timer_over_flow--;
    }
}

so far it is working fine. But I need to use this 32-bit value in various tasks running parallel. I'm trying to read 32-bit values as below

uint32_t current_count = timer_over_flow;
         current_count = current_count << 16;
         current_count = current_count + timer_16bit_count;
`timer_16_bit_count` is a hardware register.

Now the problem I am facing is when I read the read timer_over_flow to current_count in the first statement and by the time I add the timer_16bit_count there may be overflow and the 16bit timer may have become zero. This may result in taking total wrong value.

And I am trying to read this 32-bit value in multiple tasks .

Is there a way to prevent this data corruption and get the working model of 32-bit value.

Details sought by different members:

My motor can move forward or backward and accordingly counter increments/decrements.
In case of ISR, before starting my motor I'm making the global variables(quad_enc_mov_forward & quad_enc_mov_backward) set so that if there is a overflow/underflow timer_over_flow will get changed accordingly.
Variables that are modified in the ISR are declared as volatile.
Multiple tasks means that I'm using RTOS Kernel with about 6 tasks (mostly 3 tasks running parallel).
In the XMEGA I'm directly reading TCCO_CNT register for the lower byte.

Code needs to prevent interrupts for a short time while `uint32_t current_count = timer_over_flow;` occurs. Solution is implementation dependent. Really need a [MCVE] for a good answer. — chux - Reinstate Monica, Jan 30 '18 at 02:54
That can't be the case. Because we will be losing motor movement data. — Vinod kumar, Jan 30 '18 at 03:09
Only problem I am expecting is when timer overflow is read zero and then overflow occurs, then hardware register count starts again from zero. So instead of reading higher value we end up reading a very low value. — Vinod kumar, Jan 30 '18 at 03:12
Still need the [MCVE] . Need to see declarations of `quad_enc_mov_forward, timer_over_flow, quad_enc_mov_backward, timer_16bit_count` and all their uses and initialization. — chux - Reinstate Monica, Jan 30 '18 at 03:12
@chux: No, we do not need an MVCE. The problem is sufficiently described and is classic. — Eric Postpischil, Jan 30 '18 at 03:25
What does 'multiple tasks' in your question mean in particular? Are you aware of the fact that reading 16bit HW counter register in XMega does use (transparently) special TEMP register (shared between all counters!), so interrupt/context switch between reading low and high byte can put you directly into problems too? That is completely regardless of any 32bit extension. — Martin, Jan 30 '18 at 16:55
@Martin You can have a look at my edited version of the question. — Vinod kumar, Jan 30 '18 at 17:12
@Vinodkumar The thing with direction is still not completely clear to me. You assume that encoder moves in the direction which the motor is instructed to run only, right? So it is ensured that encoder can not jump not only single step in opposite direction? — Martin, Jan 30 '18 at 17:25
OK, then Eric's solution + my note about reading TCCO_CNT should work for you. I believe ISR latency won't be higher than cycles necessary to read high byte of counter. Just be sure to use interrupts are enabled when using Eric's loop (and re-enable them after reading counter exactly as shown in my answer, not later). — Martin, Jan 31 '18 at 08:56

Eric Postpischil · Accepted Answer · 2018-02-02T21:08:28.707

3

One solution is:

uint16_t a, b, c;
do {
    a = timer_over_flow;
    b = timer_16bit_count;
    c = timer_over_flow;
} while (a != c);
uint32_t counter = (uint32_t) a << 16 | b;

Per comment from user5329483, this must not be used with interrupts disabled, since the hardware counter fetched into b may be changing while the interrupt service routine (ISR) that modifies timer_over_flow would not run if interrupts are disabled. It is necessary that the ISR interrupt this code if a wrap occurs during it.

This gets the counters and checks whether the high word changed. If it did, this code tries again. When the loop exits, we know the low word did not wrap during the reads. (Unless there is a possibility we read the high word, then the low word wrapped, then we read the low word, then it wrapped the other way, then we read the high word. If that can happen in your system, an alternative is to add a flag that the ISR sets when the high word changes. The reader would clear the flag, read the timer words, and read the flag. If the flag is set, it tries again.)

Note that timer_over_flow, timer_16bit_count, and the flag, if used, must be volatile.

If the wrap-two-times scenario cannot happen, then you can eliminate the loop:

Read a, b, and c as above.
Compare b to 0x8000.
If b has a high value, either there was no wrap, it was read before a wrap upward (0xffff to 0), or it was read after a wrap downward. Use the lower of a or c.
Otherwise, either there was no wrap, b was read after a wrap upward, or it was read before a wrap downward. Use the larger of a or c.

edited Feb 02 '18 at 21:08

answered Jan 30 '18 at 03:21

Eric Postpischil

195,579
13
168
312

Given OP's ISR, `timer_16bit_count` could overflow (the ISR is called) without changing `timer_over_flow`. The idea of multiple reads is sound, yet needs a bit more in this case. – chux - Reinstate Monica Jan 30 '18 at 03:56
@chux: If the ISR does not change `timer_over_flow`, there is no inconsistency in the reads; whatever value of `timer_16bit_count` is read may be correctly paired with either read of `timer_over_flow` (which have the same value). I do not see why they would not want the ISR to adjust in one direction or the other, but that is a separate matter. – Eric Postpischil Jan 30 '18 at 04:05
Good point about "may be correctly paired with either read" - seems odd. – chux - Reinstate Monica Jan 30 '18 at 04:07
Can you make it into a macro. So that I can make a single line assignment everywhere in my code. Like 32_bit_count = macro; – Vinod kumar Jan 30 '18 at 14:23
Be careful that, depending on your exact situation this solution can be wrong/insufficient. 1) timer_over_flow is variable updated in ISR but timer_16bit_count is HW register, these two does not change atomically in the single moment (study in *great* details ISR timings etc. and think twice if it will work for you) 2) if your encoder can move both up and down, you have probably big issue in the ISR already (if the value will oscillate between zero/max few times before ISR is processed). Exact definition of quad_enc_mov_forward/backward would be good to know. – Martin Jan 30 '18 at 16:19
@Vinodkumar : Better, and simpler to create a function than a macro. Mark it `inline` and let the compiler optimise it if necessary. Yes you can create a macro, but best advice is not to. If you really want an answer, you need to post a new question - it cannot easily be answered in SO comments. – Clifford Feb 01 '18 at 10:27
Please note: This code MUST NOT be called with disabled interrupts. Otherwise the hardware counter may wrap around and requests the corresponding interrupt, which is not executed and the overflow counter doesn't work as a extension to your hardware counter anymore. – user5329483 Feb 02 '18 at 19:38
@EricPostpischil: But then it's to late to detect the wraparound. Lets asume OverflowCnt=0, HwCounter=0xFFFF. In the loop `a` becomes 0. The time ticks and a wraparound occurs, so `b` becomes 0. The triggered interrupt would have changed the OverflowCnt, but is disabled. So `c` becomes the outdated 0. And the loop is exited and returns 0 to the caller, instead of 0xFFFF or 0x10000, which would be acceptable. – user5329483 Feb 02 '18 at 19:49
@user5329483: I see, `timer_16bit_count` is updated by hardware even with interrupts disabled. I’ve noted this in the answer, thanks. – Eric Postpischil Feb 02 '18 at 21:09

score 3 · Answer 2 · answered Jan 30 '18 at 09:23

3

The #1 fundamental embedded systems programming FAQ:

Any variable shared between the caller and an ISR, or between different ISRs, must be protected against race conditions. To prevent some compilers from doing incorrect optimizations, such variables should also be declared as volatile.

Those who don't understand the above are not qualified to write code containing ISRs. Or programs containing multiple processes or threads for that matter. Programmers who don't realize the above will always write very subtle, very hard-to-catch bugs.

Some means to protect against race conditions could be one of these:

Temporary disabling the specific interrupt during access.
Temporary disabling all maskable interrupts during access (crude way).
Atomic access, verified in the machine code.
A mutex or semaphore. On single-core MCU:s where interrupts cannot be interrupted in turn, you can use a bool as "poor man's mutex".

answered Jan 30 '18 at 09:23

Lundin

195,001
40
254
396

Variable are indeed volatile. It's obvious, that's the reason I didn't mention explicitly. – Vinod kumar Jan 30 '18 at 14:22
1

@Vinodkumar You don't show your variable declaration nor do you mention how you are protecting it, so I'll assume you have some bugs there. – Lundin Jan 30 '18 at 14:26
I would like to see your #1 commandment to be restricted to _atomically_ accessible variables, which leads into the realm POSIX and/or implementation defined behaviour. Ok, you mention it below, but I think it is as central as `volatile`. – Vroomfondel Jan 30 '18 at 17:25
In this instance a mutex or semaphore is not appropriate. You'd deadlock if the ISR tries to take a mutex locked by a thread context - in most systems locking a mutex or taking a semaphore in an ISR will fail. – Clifford Jan 30 '18 at 18:03
1

@Clifford The terms mutex, and particularly semaphore, aren't necessarily items provided by some OS. If you invent your own bit flag for the purpose of protecting variables shared with an ISR, then the bit flag is a semaphore. Assembler programmers have been using the term since long before multi-tasking OS were even invented. Anyway, the _principle_ is the same when protecting variables shared with an ISR, as when protecting variables shared with another process/thread/callback. – Lundin Jan 31 '18 at 07:27
You might have to explain how that works with an example. If the normal context locks a resource using a _mutex_, and is preemted by an ISR that also tries to lock the resource, how long will the ISR have to wait for the resource to become free? - Forever - is the answer. A mutex makes no sense in an ISR, and a semaphore can be used to _signal_ another context, not to lock a resource. – Clifford Jan 31 '18 at 22:57
What you are perhaps referring to is more commonly referred to as a spin-lock - In this case the ISR sets a flag to indicate that it ran and modified the resource, and the main thread clears the flag and reads the protected data and repeats if the flag is set after. This is essentially what Eric's answer does, but optimised by using the change in the data itself as the "flag". – Clifford Jan 31 '18 at 23:00
@Clifford There's an example in the linked post, of the most simple kind of semaphore. Indeed an actual mutex, as provided by an OS, could naturally not be waited for from inside an ISR. (Except perhaps if the ISR fist clears the global interrupt mask first, which would be a fishy design.) – Lundin Feb 01 '18 at 08:00
In the linked example, for _"poor man's mutex"_ read "_not really a mutex at all_". It does not match the semantics of a mutex at all. It results in _data loss_ by simply ignoring the interrupt if the main thread is accessing the data. In this application failing to update _timer_over_flow_ will result in gross errors in the 16 bit timer value. It is neither a mutex nor a solution to this problem. – Clifford Feb 01 '18 at 10:12
In a non-multitasking environment, _mutual-exclusion_ as an _effect_ rather than _a mutex_ as an _object_ can be achieved simply by disabling interrupts during access by the main thread. The interrupt may then be delayed but won't be ignored. In this case however that will still fail because the LSB on the timer value is a hardware timer-counter. So in fact neither of your suggested mechanism will work in this _specific_ case. The answer describes some important general principles, but it does not directly address the specific question or offer any workable solution. – Clifford Feb 01 '18 at 10:16

Martin · Answer 3 · 2018-01-30T17:54:05.650

Just reading TCCO_CNT in multithreaded code is race condition if you do not handle it correctly. Check the section on reading 16bit registers in XMega manual. You should read lower byte first (this will be probably handled transparently by compiler for you). When lower byte is read, higher byte is (atomically) copied into the TEMP register. Then, reading high byte does read the TEMP register, not the counter. In this way atomic reading of 16bit value is ensured, but only if there is no access to TEMP register between low and high byte read.

Note that this TEMP register is shared between all counters, so context switch in right (wrong) moment will probably trash its content and therefore your high byte. You need to disable interrupts for this 16bit read. Because XMega will execute one instruction after the sei with interrupts disabled, the best way is probably:

cli
ld [low_byte]
sei
ld [high byte]

It disables interrupts for four CPU cycles (if I counted it correctly).

An alternative would to save shared TEMP register(s) on each context switch. It is possible (not sure if likely) that your OS already does this, but be sure to check. Even so, you need to make sure colliding access does not occur from an ISR.

This precaution should be applied to any 16bit register read in your code. Either make sure TEMP register is correctly saved/restored (or not used by multiple threads at all) or disable interrupts when reading/writing 16bit value.

score 0 · Answer 4 · answered Jan 31 '18 at 15:44

This problem is indeed a very common and very hard one. All solutions will toit will have a caveat regarding timing constraints in the lower priority layers. To clarify this: the highest priority function in your system is the hardware counter - it's response time defines the maximum frequency that you can eventually sample. The next lower priority in your solution is the interrupt routine which tries to keep track of bit 2^16 and the lowest is your application level code which tries to read the 32-bit value. The question now is, if you can quantify the shortest time between two level changes on the A- and B- inputs of your encoder. The shortest time usually does occur not at the highest speed that your real world axis is rotating but when halting at a position: through minimal vibrations the encoder can double swing between two increments, thereby producing e.g. a falling and a rising edge on the same encoder output in short succession. Iff (if and only if) you can guarantee that your interrupt processing time is shorter (by a margin) than this minmal time you can use such a method to virtually extend the coordinate range of your encoder.

Issue with global variable while making 32-bit counter

4 Answers4

The #1 fundamental embedded systems programming FAQ:

Linked