Preventing torn reads with an HCS12 microcontroller

Question

Summary

I'm trying to write an embedded application for an MC9S12VR microcontroller. This is a 16-bit microcontroller but some of the values I deal with are 32 bits wide and while debugging I've captured some anomalous values that seem to be due to torn reads.

I'm writing the firmware for this micro in C89 and running it through the Freescale HC12 compiler, and I'm wondering if anyone has any suggestions on how to prevent them on this particular microcontroller assuming that this is the case.

Details

Part of my application involves driving a motor and estimating its position and speed based on pulses generated by an encoder (a pulse is generated on every full rotation of the motor).

For this to work, I need to configure one of the MCU timers so that I can track the time elapsed between pulses. However, the timer has a clock rate of 3 MHz (after prescaling) and the timer counter register is only 16-bit, so the counter overflows every ~22ms. To compensate, I set up an interrupt handler that fires on a timer counter overflow, and this increments an "overflow" variable by 1:

// TEMP
static volatile unsigned long _timerOverflowsNoReset;

// ...

#ifndef __INTELLISENSE__
__interrupt VectorNumber_Vtimovf
#endif
void timovf_isr(void)
{
  // Clear the interrupt.
  TFLG2_TOF = 1;

  // TEMP
  _timerOverflowsNoReset++;

  // ...
}

I can then work out the current time from this:

// TEMP
unsigned long MOTOR_GetCurrentTime(void)
{
  const unsigned long ticksPerCycle = 0xFFFF;
  const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
  const unsigned long ticks = _timerOverflowsNoReset * ticksPerCycle + TCNT;
  const unsigned long microseconds = ticks / ticksPerMicrosecond;

  return microseconds;
}

In main.c, I've temporarily written some debugging code that drives the motor in one direction and then takes "snapshots" of various data at regular intervals:

// Test
for (iter = 0; iter < 10; iter++)
{
  nextWait += SECONDS(secondsPerIteration);
  while ((_test2Snapshots[iter].elapsed = MOTOR_GetCurrentTime() - startTime) < nextWait);
  _test2Snapshots[iter].position = MOTOR_GetCount();
  _test2Snapshots[iter].phase = MOTOR_GetPhase();
  _test2Snapshots[iter].time = MOTOR_GetCurrentTime() - startTime;
  // ...

In this test I'm reading MOTOR_GetCurrentTime() in two places very close together in code and assign them to properties of a globally available struct.

In almost every case, I find that the first value read is a few microseconds beyond the point the while loop should terminate, and the second read is a few microseconds after that - this is expected. However, occasionally I find the first read is significantly higher than the point the while loop should terminate at, and then the second read is less than the first value (as well as the termination value).

The screenshot below gives an example of this. It took about 20 repeats of the test before I was able to reproduce it. In the code, <snapshot>.elapsed is written to before <snapshot>.time so I expect it to have a slightly smaller value:

For snapshot[8], my application first reads 20010014 (over 10ms beyond where it should have terminated the busy-loop) and then reads 19988209. As I mentioned above, an overflow occurs every 22ms - specifically, a difference in _timerOverflowsNoReset of one unit will produce a difference of 65535 / 3 in the calculated microsecond value. If we account for this:

$19988209 + \frac{65535}{3} - 20010014 = 20010054 - 20010014 = 40$

A difference of 40 isn't that far off the discrepancy I see between my other pairs of reads (~23/24), so my guess is that there's some kind of tear going on involving an off-by-one read of _timerOverflowsNoReset. As in while busy-looping, it will perform one call to MOTOR_GetCurrentTime() that erroneously sees _timerOverflowsNoReset as one greater than it actually is, causing the loop to end early, and then on the next read after that it sees the correct value again.

I have other problems with my application that I'm having trouble pinning down, and I'm hoping that if I resolve this, it might resolve these other problems as well if they share a similar cause.

Edit: Among other changes, I've changed _timerOverflowsNoReset and some other globals from 32-bit unsigned to 16-bit unsigned in the implementation I now have.

I think you're over-complicating things. This approach should work: Atomically (i.e., within SEI//CLI block, or if you don't know the current CCR[I] state: TPA/PSHA/SEI//PULA/TAP) copy current counter and overflow, and clear both. Then, perform your calculations as you do but using the copies. — tonypdmtr, Jul 02 '18 at 16:43
On another issue: Given that `ticksPerCycle` is 0xFFFF, you could save some overhead; by `const unsigned long ticks = (_timerOverflowsNoReset << 16) + TCNT;` and if you could set the frequency to 1, 2 or 4MHz rather the 3, you could avoid the potentially _really_ expensive division conversion for microseconds. — Clifford, Jul 02 '18 at 18:41
@Clifford I used `ticksPerCycle` because it's clearer than `<< 16` (it avoids magic numbers). However, I realised afterwards that it's an off-by-one error anyway (should be `0x10000`) and I've switched to `<< 16`. I assume that the compiler is smart enough to simplify multiplication to a shift operation when it's a power of 2. As for the part about microseconds, I'm only doing that temporarily for debugging purposes (easier to write tests in terms of microseconds than ticks) and otherwise I'd just be keeping all values in ticks. — Tagc, Jul 02 '18 at 18:45
That is probably a poor reason - to write inefficient code because you cannot think of a better way to conform to a coding standard. instead use `sizeof(TCNT) * CHAR_BIT` or just define a constant such as `counterBitLength = 16` and shift it instead of multiply. On the other hand a good compiler will fix it for you. — Clifford, Jul 02 '18 at 18:55
I actually find a lot of embedded code to be pretty hard to read relative to code written in other languages, to the point where people even do things that have no impact on code performance whatsoever like abbreviating terminology (e.g. "timer" -> "tmr", "overflow" -> "ovf"). However, I'd have assumed that something `const unsigned long ticksPerCycle = 0x10000; something * ticksPerCycle` to compile to the same machine code as `something << 16`, while being a lot more explicit to the reader what I'm doing. — Tagc, Jul 02 '18 at 19:00
If that's not the case, then yeah doing what you advise (`counterBitLength = 16`) or adding a comment would be the second best thing. — Tagc, Jul 02 '18 at 19:01
Moreover `_timerOverflowsNoReset` should be a 16 bit type (suggest `uint16_t`), during calculation of `ticks` the upper 16 bits of a 32 bit type will be lost in any event. — Clifford, Jul 02 '18 at 19:04
That's true, I've changed that to a 16 bit type and a few other globals since I posted the question. — Tagc, Jul 02 '18 at 19:05
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/174186/discussion-between-clifford-and-tagc). — Clifford, Jul 02 '18 at 19:05
@tonypdmtr : TCNT is a volatile hardware register and will change even when interrupts are disabled - if TCNT overflows while in the critical section, the overflow count will be incorrect - either too low or too high by one depending on the order you read them - so your _simple_ solution is a little too simple perhaps. — Clifford, Jul 02 '18 at 19:19
@Clifford You can know this condition by reading the Overflow pending flag and adding one more to your overflow count while (in this case) setting the new overflow counter to -1 (to have the ISR that will fire right after the CLI instruction bring it back to zero), or clearing the flag. Order of reads is important. — tonypdmtr, Jul 03 '18 at 12:28
@tonypdmtr you can, but it is then _less simple_ and has no advantage over solutions that do not disable interrupts - in fact it's worse because it does disable interrupts. — Clifford, Jul 03 '18 at 12:31
@Clifford It's a few assembly instructions. But, in C, it could be worse than the loop idea. — tonypdmtr, Jul 03 '18 at 12:32
@tonypdmtr it is not the instruction count that would concern me; rather the consequent variability in interrupt latency. It unnecessarily reduces determinism. — Clifford, Jul 03 '18 at 12:35
@Clifford. OK, but don't go that extreme. Even each instruction has to disable interrupts while it executes, so there is variability there as each has a different number of cycles. So, think of it as a higher cycles instruction. :) — tonypdmtr, Jul 03 '18 at 12:37
I agree with Clifford. I use the timer overflow interrupt not just to synthesise a 32-bit count but also to perform safety-critical checks at regular intervals. If I disable the interrupts in `MOTOR_GetCurrentTicks` and that procedure is called frequently, it could starve out the interrupt routine. A solution that avoids disabling the interrupt seems ideal. — Tagc, Jul 03 '18 at 12:38
People the disabling of interrupts for so few instructions has no practical difference in your code execution. — tonypdmtr, Jul 03 '18 at 12:38
@tonypdmtr in your applications perhaps, and in this instance perhaps. The non disabling solution is no more complex, so why would you not do that? — Clifford, Jul 03 '18 at 12:42
@Clifford Sure, you can do it any way that works. No argument about that. I just think the no-ints solution is much quicker in terms of worse case execution (and can be made to have a fixed number of cycles -- which sometimes is important). — tonypdmtr, Jul 03 '18 at 12:46
@tonypdmtr you have posted an _answer_ inappropriately in _comments_ and now we are discussing it inappropriately. Post a proper answer, so we can see clearly what you are proposing, and it can be more broadly considered by your peers. If you don't want to do that, don't abuse the SO comments system. — Clifford, Jul 03 '18 at 12:52

score 4 · Answer 1 · answered Jul 02 '18 at 09:40

4

You can read this value TWICE:

unsigned long GetTmrOverflowNo()
{
    unsigned long ovfl1, ovfl2;
    do {
        ovfl1 = _timerOverflowsNoReset;
        ovfl2 = _timerOverflowsNoReset;
    } while (ovfl1 != ovfl2);
    return ovfl1;
}

unsigned long MOTOR_GetCurrentTime(void)
{
  const unsigned long ticksPerCycle = 0xFFFF;
  const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
  const unsigned long ticks = GetTmrOverflowNo() * ticksPerCycle + TCNT;
  const unsigned long microseconds = ticks / ticksPerMicrosecond;

  return microseconds;
}

If _timerOverflowsNoReset increments much slower then execution of GetTmrOverflowNo(), in worst case inner loop runs only two times. In most cases ovfl1 and ovfl2 will be equal after first run of while() loop.

answered Jul 02 '18 at 09:40

Alexey Esaulenko

499
2
10

This is a good way to do it. Other option would be to temporarily disable interrupts while reading the timer value but that introduces jitter. – teroi Jul 02 '18 at 09:47
Thanks, this looks promising. I'll give this a try when I can. – Tagc Jul 02 '18 at 10:01
Sorry but this is nonsense given that HCS12 is a limited 16 bit MCU which cannot read 32 bit variables like this. You'll rather end up reading the value 4 times, in very unsafe and uncontrolled ways - which is the problem inherited from the originally buggy code. Just don't do this, it won't solve anything. – Lundin Jul 02 '18 at 11:06
This fails, when `GetTmrOverflowNo()` returns a stable value, but then TCNT overflows – jeb Jul 02 '18 at 11:38
1

@teroi Disabling interrupts doesn't help in this case, as the overflow occours independ of interrupts. – jeb Jul 02 '18 at 11:39
@jeb, you are right. Reading of overflow counter and TCNT must be performed in same do { .. } while() – Alexey Esaulenko Jul 02 '18 at 11:41

Tagc · Answer 2 · 2018-07-02T16:40:54.410

2

Based on the answers @AlexeyEsaulenko and @jeb provided, I gained understanding into the cause of this problem and how I could tackle it. As both their answers were helpful and the solution I currently have is sort of a mixture of the two, I can't decide which of the two answers to accept, so instead I'll upvote both answers and keep this question open.

This is how I now implement MOTOR_GetCurrentTime:

unsigned long MOTOR_GetCurrentTime(void)
{
  const unsigned long ticksPerMicrosecond = 3; // 24 MHZ / 8 (prescaler)
  unsigned int countA;
  unsigned int countB;
  unsigned int timerOverflowsA;
  unsigned int timerOverflowsB;
  unsigned long ticks;
  unsigned long microseconds;

  // Loops until TCNT and the timer overflow count can be reliably determined.
  do
  {
    timerOverflowsA = _timerOverflowsNoReset;
    countA = TCNT;
    timerOverflowsB = _timerOverflowsNoReset;
    countB = TCNT;
  } while (timerOverflowsA != timerOverflowsB || countA >= countB);

  ticks = ((unsigned long)timerOverflowsA << 16) + countA;
  microseconds = ticks / ticksPerMicrosecond;
  return microseconds;
}

This function might not be as efficient as other proposed answers, but it gives me confidence that it will avoid some of the pitfalls that have been brought to light. It works by repeatedly reading both the timer overflow count and TCNT register twice, and only exiting the loop when the following two conditions are satisfied:

the timer overflow count hasn't changed while reading TCNT for the first time in the loop
the second count is greater than the first count

This basically means that if MOTOR_GetCurrentTime is called around the time that a timer overflow occurs, we wait until we've safely moved on to the next cycle, indicated by the second TCNT read being greater than the first (e.g. 0x0001 > 0x0000).

This does mean that the function blocks until TCNT increments at least once, but since that occurs every 333 nanoseconds I don't see it being problematic.

I've tried running my test 20 times in a row and haven't noticed any tearing, so I believe this works. I'll continue to test and update this answer if I'm wrong and the issue persists.

Edit: As Vroomfondel points out in the comments below, the check I do involving countA and countB also incidentally works for me and can potentially cause the loop to repeat indefinitely if _timerOverflowsNoReset is read fast enough. I'll update this answer when I've come up with something to address this.

edited Jul 02 '18 at 16:40

answered Jul 02 '18 at 13:02

Tagc

8,736
7
61
114

This code will crash and burn when an interrupt fires while the function is executed, because you still have the rookie bug of not protecting the variable shared with the ISR. This means you will eventually read half the variable from the previous timer cycle and the other half from the current one, leading to garbage results. You still need to do one of these 3, pick one: 1) either disable the interrupt while accessing the variable, 2) use some manner of semaphore or 3) ensure atomic access. – Lundin Jul 02 '18 at 13:13
Btw you will keep accumulating your oscillator inaccuracy through algorithms like this. So depending on for how long you need to log values, you need to pick an oscillator accordingly. Internal RC oscillator is a no-go, but luckily most S12 don't have one (just in "limp home mode"). – Lundin Jul 02 '18 at 13:19
I would protect the variable if I could, but with this system I'm not sure how. I'm not using an RTOS like with other MCUs I've worked with, so I guess I'd have to do something like manually creating memory barriers in assembly. Maybe a volatile bool would work? Also, wouldn't disabling/enabling interrupts during calls to this function cause problems if I'm calling it inside loops that run thousands of times a second? I'm also not clear on what specific sequence of events could cause the crash/burn you describe - I thought double-reading the values would prevent that, even if tearing occurs. – Tagc Jul 02 '18 at 13:20
"Btw you will keep accumulating your oscillator inaccuracy through algorithms like this" Shouldn't be an issue, at most I'd need to run the timer for a few seconds before resetting it. I also have a fair amount of leeway for drift. – Tagc Jul 02 '18 at 13:22
I already wrote you an answer explaining how to protect the code. It's as simple as pressing the disassembler button in your IDE and checking what code you ended up with. The C code may end up atomic but only if you use 8 or 16 bit variables. Otherwise you either have to disable TOF interrupt during the read (shouldn't be an issue since the timer is still running and flag still set) or take advantage of that the interrupt can't get interrupted and implement a ["poor man's mutex"](https://stackoverflow.com/a/41242234/584518) with a bool. – Lundin Jul 02 '18 at 13:25
I peeked at the HCS12 ECT timer manual and it should be safe to do something like `TSCR2 &= (uint8_t)~TOF; /*access shared variable */ TSCR2 |= TOF;`. Probably the easiest solution. – Lundin Jul 02 '18 at 13:30
I'm using a very old IDE (CodeWarrior v5.9) so the disassembly output isn't as helpful as other IDEs, but I think I've captured the right part. https://i.imgur.com/oRCNCGe.png No MOV's, looks like a series of "load from address" and "store in register" instructions followed by some checks/jumps for the loops. As for enabling/disabling TOF, I can do that by setting the TOI bit (bit 7) in TSCR2 like you say. It's nice to know that a volatile bool will work too. – Tagc Jul 02 '18 at 13:47
1

I'm going to refrain from doing either of these for the time being because I'm still not clear what exact sequence of events can produce a bug, and I don't feel comfortable introducing code when I'm not clear on why exactly I'm doing it. For example, [jeb's answer](https://stackoverflow.com/a/51135178/1636276) made it clear what kind of race condition would occur. But if you're right and the code still is unsafe, if it blows up again I'll have a good idea how to fix it. – Tagc Jul 02 '18 at 13:49
Regarding the disassembler, classic CW IDE has a nice and helpful disassembler with the C code inlined. Should be something like Project menu -> Disassemble, when you have a .c file open. Use that one, not the one from the debugger. But as you can see, the 32 bit arithmetic spews a flood of code, including some mysterious JSR 0xC0E8 which is maybe the jump to the 32 bit arithmetic routines added by CW. – Lundin Jul 02 '18 at 14:04
The exact sequence that will provoke the bug is: read part of `_timerOverflowsNoReset`, ISR triggers, whole of `_timerOverflowsNoReset` is updated, ISR returns, keep reading `_timerOverflowsNoReset` except it has now changed so the result is corrupted. This is a classic race condition bug and any introductory class to real-time embedded systems ought to teach it. – Lundin Jul 02 '18 at 14:04
I get what you mean, I've been programming for a while and race conditions aren't unique to embedded code. However, `_timerOverflowsNoReset` is only read in two places: within the ISR as `_timerOverflows++` since incrementing a variable is a non-atomic read/write combo, and within `MOTOR_GetCurrentTime` where I read it *twice* in a loop and confirm the two reads are the same before proceeding. Cheers about the CodeWarrior tip, I've got more useful disassembly available [here](https://hastebin.com/hijanohuhe.http). – Tagc Jul 02 '18 at 14:21
Ok looks like you changed to a 16 bit type. It reads it in a single instruction LDD, but you still can get an interrupt before storing in the variable (STD 6,SP). This means you will be missing interrupts every time an interrupt occurs between LDD and STD and thus your timing will be very subtly screwed up, like missing one out of ten thousand interrupts. – Lundin Jul 02 '18 at 14:31
Ah yeah sorry for not mentioning, I did change `_timerOverflowsNoReset` and some other global variables to 16-bit. – Tagc Jul 02 '18 at 14:35
I tried very carefully going over the scenario you outlined and I can't identify any issue. If an interrupt occurs between LDD and STD within `Motor_GetCurrentTime`, it will simply cause the loop to repeat and try the same procedure again. https://i.imgur.com/y54cuKs.png – Tagc Jul 02 '18 at 15:09
It is fairly simple. `timerOverflowsA` reads 1, `timerOverflowsB` reads 1 into the double acc D with LDD, ISR occurs and increases `_timerOverflowsNoReset` to 2, ISR returns, `_timerOverflowsNoReset` is now 2, `timerOverflowsB` gets assigned value 1 with STD. You have lost 22ms and the loop won't help because `timerOverflowsA == timerOverflowsB`. – Lundin Jul 02 '18 at 15:18
It's not as simple as you make out. Even if an interrupt does occur beween `LDX` and `STX 4,SP`, you're still ignoring the checking done around `TCNT`, which is read twice and also used in the loop condition. With both checks it's pretty hard to reason about admittedly - but I've tested it enough times right now to be reasonably happy, and I'm reluctant to switch at this point to another untested solution (like poor man's mutex) unless I find evidence that this doesn't behave. But it was a nice exercise all the same. – Tagc Jul 02 '18 at 15:32
1

While I upvoted your answer for the obvious dedication to make the question a good, SO-conformant one, your code still has one nasty bug which can surface in a couple of scenarios, being it a different optimization, a slightly different cycles/instruction CPU model or similar: your test for `countA >= countB` will yield (unintendedly, that is) true and therefore keep your function in the loop if you happen to read `TCNT` fast enough to always hit the same value in both reads. This is a theoretical starvation scenario and a real source for unexplained glitches in execution time. – Vroomfondel Jul 02 '18 at 16:35
@Vroomfondel yeah you're right, I noticed that myself just as I was writing that last comment I posted. I've finished work for the day so I'll need to think about how I'll tackle that when I get to work tomorrow. Might be simpler at this point to just do the poor man's mutex and try another 20 tests or so. – Tagc Jul 02 '18 at 16:38
Or you use my solution: as it stands, your routine can't protect you from a hold-up of more than 22ms falling right between `timeOverflowsB = ..` and `countB = ..` either. – Vroomfondel Jul 02 '18 at 16:43
@Tagc Alright it's like talking to a wall. I have programmed HC12:s since they were released somewhere around 2002, so what do I know. Done helping you, happy debugging. – Lundin Jul 02 '18 at 17:34
I'm not sure what you're annoyed about? It was a productive discussion, a lot of avenues were explored and I've even said I'm considering using the poor man's mutex approach that you suggested as one possibility. – Tagc Jul 02 '18 at 17:36
@Tagc Hint: what does `||` mean and how does it work? Your code contains obvious bugs, I explain them to you step by step and then you stubbornly keep telling me there is no bug. The double check is nonsense since I told you how the race condition bug causes `timerOverflowsA == timerOverflowsB`, which in turn breaks the loop and short-circuits the weird count check. "I tested it enough times" does nothing, this is a rare intermittent bug that might manifest itself after weeks of up-time, you can't find it through dynamic testing. – Lundin Jul 03 '18 at 07:00
Jesus Christ. "what does || mean and how does it work?" It works by doing a short-circuiting logical OR, which means that even *if* the two timer overflow values are equal (i.e. the first operand of || is false), it will then evaluate the *next* condition (the TCNT comparison) and still potentially loop off that. There is a bug in this (the one Vroomfondel points out) but that's different from what you're talking about. Challenging your points isn't "being stubborn", it's called critical analysis. – Tagc Jul 03 '18 at 07:06
"timerOverflowsA == timerOverflowsB, which in turn breaks the loop and short-circuits the weird count check." No it doesn't. Look again. The loop condition is `timerOverflowsA != timerOverflowsB || countA >= countB`. If `timerOverflowsA == timerOverflowsB`, then the first operand is *false* so the condition becomes `false || countA >= countB`, which reduces to `countA >= countB`. Can we please drop this now? If you consider this stubbornness for some reason - fine. The discussion has become unproductive at this point. – Tagc Jul 03 '18 at 07:07

Clifford · Accepted Answer · 2018-07-02T19:22:24.093

2

Calculate the tick count, then check if while doing that the overflow changed, and if so repeat;

#define TCNT_BITS 16 ; // TCNT register width

uint32_t MOTOR_GetCurrentTicks(void)
{
   uint32_t ticks = 0 ;
   uint32_t overflow_count = 0;

   do
   {
       overflow_count = _timerOverflowsNoReset ;
       ticks = (overflow_count << TCNT_BITS) | TCNT;
   }
   while( overflow_count != _timerOverflowsNoReset ) ;

   return ticks ;
}

the while loop will iterate either once or twice no more.

edited Jul 02 '18 at 19:22

answered Jul 02 '18 at 19:13

Clifford

88,407
13
85
165

Thanks, this looks good. I repeated my test about 20 times with this implementation and I didn't observe any tearing, so I'm confident enough to use this. It's simpler than my solution and avoids the bug in it. – Tagc Jul 03 '18 at 07:51
Regarding our earlier discussion, the one change I did make is to use `ticksPerCycle` instead of an explicit shift. https://hastebin.com/dedibaxofo.c I consider this more readable, and also more maintainable in case I later switch to using a dedicated timer channel match interrupt instead of a timer overflow interrupt (I just change the value of `ticksPerCycle`). As I was hoping, the compiler produces identical machine code if you multiply by a power of 2, so there's no impact on performance. https://hastebin.com/bodemixoti.go – Tagc Jul 03 '18 at 07:52
@Tagc : That is fine, but using a full register overflow to synthesise a 32 bit timer from 16 bit hardware is not an uncommon solution. I would recommend your solution where an interrupting timebase is required in addition to a free-running counter, and you are constrained by hardware timer availability. Ideally I'd use a separate timer for that. – Clifford Jul 03 '18 at 08:17

jeb · Answer 4 · 2018-07-02T11:52:42.333

1

The atomic reads are not the main problem here.
It's the problem that the overflow-ISR and TCNT are highly related.
And you get problems when you read first TCNT and then the overflow counter.

Three sample situations:

TCNT=0x0000, Overflow=0   --- okay
TCNT=0xFFFF, Overflow=1   --- fails
TCNT=0x0001, Overflow=1   --- okay again

You got the same problems, when you change the order to: First read overflow, then TCNT.

You could solve it with reading twice the totalOverflow counter.

disable_ints();
uint16_t overflowsA=totalOverflows;
uint16_t cnt = TCNT;
uint16_t overflowsB=totalOverflows;
enable_ints();

uint32_t totalCnt = cnt;

if ( overflowsA != overflowsB )
{
   if (cnt < 0x4000)
      totalCnt += 0x10000;
}

totalCnt += (uint32_t)overflowsA << 16;

If the totalOverflowCounter changed while reading the TCNT, then it's necessary to check if the value in tcnt is already greater 0 (but below ex. 0x4000) or if tcnt is just before the overflow.

edited Jul 02 '18 at 11:52

answered Jul 02 '18 at 11:30

jeb

78,592
17
171
225

Ah yeah that actually does make sense. In a higher-level language I'd be locking around `TCNT` and `_timerOverflows` together, but here I'm not sure how to short of manually constructing memory barriers in assembly, and like you say it's very possible that I read one variable and then the other just at the point that an overflow occurs. – Tagc Jul 02 '18 at 11:38
@Tagc Even locking wouldn't help here, only temporarily stopping the counter would work, but that would produce inaccurate time results – jeb Jul 02 '18 at 11:42
Believe me, if you have ever looked at HCS12 assembler generated from accessing unsigned longs, you would know that this is a major problem in the original code. In addition, failing to protect variables shared with an ISR is by far the most common embedded system newbie bug, and it always causes very subtle phenomenon. – Lundin Jul 02 '18 at 11:45
@Lundin Once upon a time, I wrote code for the HCS12 ... And you are right, reading the overflow counter without disabling interrupts isn't a good idea. I added the disable_/enable_int() block – jeb Jul 02 '18 at 11:54
@jeb It should only disable the specific timer channel, not the main interrupt mask. Which leads us back to the root of the problem: no can do since specific timer channels weren't used. – Lundin Jul 02 '18 at 11:57
(or there might be a specific TOF interrupt disable, I don't remember) – Lundin Jul 02 '18 at 11:59

supercat · Answer 5 · 2018-07-04T19:35:54.530

One technique that can be helpful is to maintain two or three values that, collectively, hold overlapping portions of a larger value.

If one knows that a value will be monotonically increasing, and one will never go more than 65,280 counts between calls to "update timer" function, one could use something like:

// Note: Assuming a platform where 16-bit loads and stores are atomic
uint16_t volatile timerHi, timerMed, timerLow;

void updateTimer(void)  // Must be only thing that writes timers!
{
  timerLow = HARDWARE_TIMER;
  timerMed += (uint8_t)((timerLow >> 8) - timerMed);
  timerHi  += (uint8_t)((timerMed >> 8) - timerHi);
}

uint32_t readTimer(void)
{
  uint16_t tempTimerHi  = timerHi;
  uint16_t tempTimerMed = timerMed;
  uint16_t tempTimerLow = timerLow;
  tempTimerMed += (uint8_t)((tempTimerLow >> 8) - tempTimerMed);
  tempTimerHi  += (uint8_t)((tempTimerMed >> 8) - tempTimerHi);
  return ((uint32_t)tempTimerHi) << 16) | tempTimerLow;
}

Note that readTimer reads timerHi before it reads timerLow. It's possible that updateTimer might update timerLow or timerMed between the time readTimer reads timerHi and the time it reads those other values, but if that occurs, it will notice that the lower part of timerHi needs to be incremented to match the upper part of the value that got updated later.

This approach can be cascaded to arbitrary length, and need not use a full 8 bits of overlap. Using 8 bits of overlap, however, makes it possible to form a 32-bit value by using the upper and lower values while simply ignoring the middle one. If less overlap were used, all three values would need to take part in the final computation.

Lundin · Answer 6 · 2018-07-02T11:19:49.307

0

The problem is that the writes to _timerOverflowsNoReset isn't atomic and you don't protect them. This is a bug. Writing atomic from the ISR isn't very important, as the HCS12 blocks the background program during interrupt. But reading atomic in the background program is absolutely necessary.

Also, have in mind that Codewarrior/HCS12 generates somewhat ineffective code for 32 bit arithmetic.

Here is how you can fix it:

Drop unsigned long for the shared variable. In fact you don't need a counter at all, given that your background program can service the variable within 22ms real-time - should be very easy requirement. Keep your 32 bit counter local and away from the ISR.
Ensure that reads of the shared variable are atomic. Disassemble! It must be a single MOV instruction or similar; otherwise you must implement semaphores.
Don't read any volatile variable inside complex expressions. Not only the shared variable but also the TCNT. Your program as it stands has a tight coupling between the slow 32 bit arithmetic algorithm's speed and the timer, which is very bad. You won't be able to reliably read TCNT with any accuracy, and to make things worse you call this function from other complex code.

Your code should be changed to something like this:

static volatile bool overflow;


void timovf_isr(void)
{
  // Clear the interrupt.
  TFLG2_TOF = 1;

  // TEMP
  overflow = true;

  // ...
}

unsigned long MOTOR_GetCurrentTime(void)
{
  bool of = overflow;   // read this on a line of its own, ensure this is atomic!
  uint16_t tcnt = TCNT; // read this on a line of its own

  overflow = false; // ensure this is atomic too

  if(of)
  {
    _timerOverflowsNoReset++;
  }

  /* calculations here */

  return microseconds;
}

If you don't end up with atomic reads, you will have to implement semaphores, block the timer interrupt or write the reading code in inline assembler (my recommendation).

Overall I would say that your design relying on TOF is somewhat questionable. I think it would be better to set up a dedicated timer channel and let it count up a known time unit (10ms?). Any reason why you can't use one of the 8 timer channels for this?

edited Jul 02 '18 at 11:19

answered Jul 02 '18 at 11:05

Lundin

195,001
40
254
396

1

I don't see how this code would work. Where do you reset `overflow`? Also, `MOTOR_GetCurrentTime` shouldn't have side effects - your implementation will make it so that every time something queries the current time, the current time will change. "I think it would be better to set up a dedicated timer channel and let it count up a known time unit" I need to be able to keep track of time for several seconds at minimum for my application to work properly. – Tagc Jul 02 '18 at 11:11
@Tagc Oops I missed adding the reset, fixed. But of course the current time will change each time you read the function, isn't that the whole point? TCNT is a hardware register. Regarding using the dedicated timer, keeping track of seconds is not a problem. – Lundin Jul 02 '18 at 11:22
The idea of mutating state in `MOTOR_GetCurrentTime` seems problematic. It's like having a clock where the hands move every time you look at it. With your current implementation, wouldn't it lose time if several overflow interrupts occur in-between calls to `MOTOR_GetCurrentTime` i.e. if `timovf_isr` fires several times between calls to `MOTOR_GetCurrentTime`? – Tagc Jul 02 '18 at 11:28
This fails, when TCNT is 0xFFFF you read `oveflow=false`, but `uint16_t tcnt = TCNT` set `tcnt=0`, you are missing the overflows at the edges – jeb Jul 02 '18 at 11:32
@Tagc That's why I told you you need a soft real-time requirement of 22ms to service the routine. – Lundin Jul 02 '18 at 11:35
@jeb That part is handled by the hardware setting the TOF flag, not the software. The only thing that matters is to check for TOF at least once per 22ms. Also this is pseudo code. – Lundin Jul 02 '18 at 11:39
@Lundin I can't see why setting the TOF flag should change the problem at the TCNT transistion from 0xFFFF to 0x0000, there your `of` flag can be okay or wrong. Btw. you need to disable interrups for the sequence of `bool of=overflow; overflow = false` (or an atomic operation). – jeb Jul 02 '18 at 11:49
All right so just drop this inherently bad solution and use dedicated timer channels like I advised, problem solved. I would never read TCNT on the fly like this simply because it doesn't make much sense with the given hardware available. – Lundin Jul 02 '18 at 11:54
Sorry for the flurry of comments. I'm not clear on what the benefit of using a dedicated timer channel would be. I assume you mean using one of the unused channels to produce an interrupt when its value equals a match register, then responding to the interrupt by incrementing a variable like I do with a timer overflow. Then the current time would be calculated as `numberOfInterrupts * channelMatchRegister + currentChannelValue`. Wouldn't that be essentially just be the same as what I'm already doing? Or am I completely off the mark about what you mean? – Tagc Jul 02 '18 at 13:53
@Tagc The nice thing with a dedicated channel is that you get interrupts per time unit, removing a lot of overhead code, and that you can do calculations without worrying about overflows and latency etc. Basically your whole code could be replaced with `void tcx_interrupt (void) { uint16_t tcx = TCx; // read timer channel. TSOMETHING |= TCFLAG /*whatever register and mask this is*/. time_counter++; TCx = TCx + delay;}` You'd still have to protect "time_counter" from re-entrancy bugs when you read it, but the whole code turns much faster and prettier. – Lundin Jul 02 '18 at 14:22
Correct me if I I'm wrong but I don't think that would help me in my case because I'm not trying to perform operations per any particular time unit. I'm using the timer overflow for two reasons: 1) to perform operations that need to be performed regularly (doesn't matter if it's 10 ticks, 20 ticks or 65535 ticks - just so long as it's frequent enough); 2) to maintain a running clock that is too long to be kept exclusively within any channel value register or even TCNT itself - if I used a dedicated channel, I'd simply be handling match interrupts in the same way I handle overflow interrupts. – Tagc Jul 02 '18 at 14:31
You'd have to add some logic like for example `if(ms_counter == 1000) { ms_counter=0; s_counter++; }`. This can also be done with 16 bit arithmetic. – Lundin Jul 02 '18 at 14:34

Vroomfondel · Answer 7 · 2018-07-02T12:42:41.587

It all boils down to the question of how often you do read the timer and how long the maximum interrupt sequence will be in your system (i.e. the maximum time the timer code can be stopped without making "substantial" progress).

Iff you test for time stamps more often than the cycle time of your hardware timer AND those tests have the guarantee that the end of one test is no further apart from the start of its predecessor than one interval (in your case 22ms), all is well. In the case your code is held up for so long that these preconditions don't hold, the following solution will not work - the question then however is whether the time information coming from such a system has any value at all.

The good thing is that you don't need an interrupt at all - any try to compensate for the inability of the system to satisfy two equally hard RT problems - updating your overflow timer and delivering the hardware time is either futile or ugly plus not meeting the basic system properties.

unsigned long MOTOR_GetCurrentTime(void)
{
  static uint16_t last;
  static uint16_t hi;
  volatile uint16_t now = TCNT;

  if (now < last)
  {
     hi++;
  }
  last = now;
  return now + (hi * 65536UL);
}

BTW: I return ticks, not microseconds. Don't mix concerns.

PS: the caveat is that such a function is not reentrant and in a sense a true singleton.

With your implementation, what happens if multiple overflows occur between calls to `MOTOR_GetCurrentTime`? What happens if you call that function when TCNT = 30,000, then hits 65,535, resets to 0 and reaches 40,000 before the next call? "BTW: I return ticks, not microseconds. Don't mix concerns." This is a temporary function used for debugging - I have a very similar function that returns the time in ticks. — Tagc, Jul 02 '18 at 12:41
If multiple overflows occur, then the premise "test for time stamps more often than the cycle time of your hardware timer" is broken. In systems which *really* need time stamp of the delivered resolution, this is hardly ever a problem though. In the worst case one can test for the overflow flag of the hardware timer a issue a dummy read. — Vroomfondel, Jul 02 '18 at 12:46

Preventing torn reads with an HCS12 microcontroller

Summary

Details

7 Answers7