5

We are using an ARM AM1808-based embedded system with an RTOS and a file system, programmed in C. We have a watchdog timer implemented inside the application code, so whenever something goes wrong in the application code, the watchdog timer takes care of the system.

However, we are experiencing an issue where the system hangs before the watchdog timer task starts. The system hangs because the file system code is badly written and full of while loops. Sometimes, due to a bad NAND (or at least the file system code thinks it is bad), the code gets stuck in a while loop and never gets out of it. What we get is a dead board.

So, the point of giving all this information is to ask whether there is any mechanism that could be implemented in the code that runs before the application code. Is there any hardware watchdog? What steps can be taken to make sure we don't get a dead board caused by some while loop?

  • 3
    I would suggest firing the guy/gal who cannot write disk drivers:) – Martin James Dec 21 '17 at 22:16
  • 2
    Start the watchdog before running code that breaks the system. – Weather Vane Dec 21 '17 at 22:21
  • 1
    A watchdog won’t stop your code breaking, and expecting your watchdog to dig you out of the mess your code puts you in is a very head-in-the-sand approach. You’re going to have to fix that dodgy disk driver code sooner rather than later, so get on and do it. A watchdog is only there for real disasters, not for routine operation. – DisappointedByUnaccountableMod Dec 24 '17 at 14:46
  • 1
    And to answer your question, the mechanism for ensuring you don’t get a dead board caused by some while loop is to do some strong design and review work which gets rid of those while loops. – DisappointedByUnaccountableMod Dec 24 '17 at 14:48
  • @MartinJames, in a way we have fired that person. We are not using that vendor's product anymore; we switched to a new vendor :-) – user9128860 Dec 28 '17 at 16:16
  • @WeatherVane We cannot do this in the current code. – user9128860 Dec 28 '17 at 16:17
  • @barny I agree with you, but this is an old driver software, which we are not going to use anymore. However, we still need to support old boards until they are switched to newer software. – user9128860 Dec 28 '17 at 16:20

2 Answers

8

Professional embedded systems are designed like this:

  • Pick an MCU with power-on-reset interrupt and on-chip watchdog. This is standard on all modern MCUs.
  • Implement the below steps from inside the reset interrupt vector.
  • If the MCU memory is simple to set up, such as just setting the stack pointer, then do so as the first thing out of reset. This enables C programming. You can usually write the reset ISR in C as long as you don't declare any variables - disassemble to make sure that it doesn't touch any RAM addresses until those are available.
  • If the memory setup is complex - there is an MMU setup or similar - C code will have to wait and you'll have to stick to assembler to prevent accidental stacking caused by C code.
  • Set up the most fundamental registers, such as mode/peripheral routing registers, watchdog and system clock.
  • Set up the low-voltage detect hardware, if applicable. Hopefully the out-of-reset state for LVD on the MCU is a sound one.
  • Application-specific, critical registers such as GPIO direction and internal pull resistor registers should be set from here. Many MCUs have pins as inputs by default, making them vulnerable. If they are not meant to be inputs in the application, the time they are kept as such out of reset should be minimized, to avoid problems with noise, transients and ESD.
  • Set up the MMU, if applicable.
  • Everything else the "CRT" does, such as initialization of .data and .bss.
  • Call main().
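
The ordering above can be sketched in host-compilable C. This is only an illustration of the sequence, not startup code for any particular MCU: the `hw_*` functions, `crt_init_memory`, `reset_sequence` and the step log are hypothetical names, and on a real target the hooks would write MCU-specific registers while the `.data`/`.bss` ranges would come from linker symbols rather than function arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical boot steps, recorded in call order so the sequence can be checked. */
enum boot_step { STEP_WATCHDOG, STEP_CLOCK, STEP_GPIO, STEP_MEMORY, STEP_COUNT };

static enum boot_step boot_log[STEP_COUNT];
static size_t boot_log_len;

static void log_step(enum boot_step s)
{
    assert(boot_log_len < STEP_COUNT);
    boot_log[boot_log_len++] = s;
}

/* Placeholders (assumptions): a real target writes MCU-specific registers here. */
static void hw_watchdog_enable(void) { log_step(STEP_WATCHDOG); }
static void hw_clock_init(void)      { log_step(STEP_CLOCK); }
static void hw_gpio_init(void)       { log_step(STEP_GPIO); }

/* The "CRT" part: copy .data initializers and zero .bss. */
static void crt_init_memory(uint32_t *data_dst, const uint32_t *data_src,
                            size_t data_words, uint32_t *bss, size_t bss_words)
{
    for (size_t i = 0; i < data_words; i++)
        data_dst[i] = data_src[i];
    for (size_t i = 0; i < bss_words; i++)
        bss[i] = 0u;
    log_step(STEP_MEMORY);
}

/* Watchdog and clock are handled before the comparatively slow memory init. */
static void reset_sequence(uint32_t *data_dst, const uint32_t *data_src,
                           size_t data_words, uint32_t *bss, size_t bss_words)
{
    boot_log_len = 0;
    hw_watchdog_enable();
    hw_clock_init();
    hw_gpio_init();
    crt_init_memory(data_dst, data_src, data_words, bss, bss_words);
    /* A real reset handler would call main() at this point. */
}
```

With the watchdog armed first, a hang anywhere later in the sequence, including the memory initialization, ends in a reset rather than a dead board.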

Please note that pre-made startup code for your MCU is not necessarily made by professionals! It is fairly common that there's an amateur-level "CRT" delivered with your toolchain, which fails to set up the watchdog and clock early on. This is of course unacceptable since:

  1. This makes any program running on that platform a notable safety/poor quality hazard, should the "CRT" crash or hang for whatever reason.
  2. This makes the initialization of .data and .bss needlessly, painfully slow, as it is then typically executed with the clock running on the default on-chip RC oscillator or similar.

Please note that even de facto industry-standard startup code such as ARM CMSIS fails to do some of the MCU-specific hardware setups mentioned above. This may or may not be a problem.

Lundin
1

There is a hardware watchdog that can be started before the application runs. The ARM AM1808 has a timer that can be configured as a watchdog, as per the documentation: www.ti.com/lit/ds/symlink/am1808.pdf. So you may wish to enable it at least during the part of the program that runs through the critical and long section. You may also wish to have a piece of boot code that first sets up this watchdog and, after correct initialization, jumps to the application. In fact, this is a very common approach.
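
As a rough, host-runnable sketch of that approach: the boot code arms the watchdog first, and any wait on the NAND is bounded so it reports failure instead of spinning forever. Everything here is an assumption for illustration. `struct wdt_regs` is a simulated register block and not the AM1808 register layout (the real timer registers and unlock sequence must be taken from TI's documentation), and `wait_ready_bounded`, `fake_nand_ready` and the poll budget are hypothetical names and values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simulated watchdog register block; NOT the AM1808 layout. */
struct wdt_regs {
    volatile uint32_t period;  /* timeout in ticks */
    volatile uint32_t enable;  /* non-zero once armed */
    volatile uint32_t kicks;   /* number of service writes (simulation only) */
};

/* Arm the watchdog as the first action of the boot code. */
static void wdt_arm(struct wdt_regs *wdt, uint32_t period_ticks)
{
    assert(wdt != NULL);
    wdt->period = period_ticks;
    wdt->enable = 1u;
}

static void wdt_kick(struct wdt_regs *wdt)
{
    wdt->kicks = wdt->kicks + 1u;
}

typedef bool (*ready_fn)(void *ctx);

/* Poll until ready or until max_polls is used up; never spins forever. */
static bool wait_ready_bounded(struct wdt_regs *wdt, ready_fn ready,
                               void *ctx, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (ready(ctx))
            return true;
        wdt_kick(wdt);  /* serviced only inside a loop with a fixed budget */
    }
    return false;       /* caller decides: retry, mark block bad, or reset */
}

/* Simulated NAND status: reports ready after a given number of polls. */
struct fake_nand {
    uint32_t polls_until_ready;
};

static bool fake_nand_ready(void *ctx)
{
    struct fake_nand *nand = ctx;
    if (nand->polls_until_ready == 0u)
        return true;
    nand->polls_until_ready--;
    return false;
}
```

Because the loop gives up after a fixed number of polls, a NAND that never reports ready produces an error return the boot code can act on, and the armed watchdog remains as a backstop for hangs elsewhere.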

VladP