Could a tight loop destroy cells of a microcontroller's flash?

Question

It is well known that Flash memory has limited write endurance. Less well known is that reads might also have an upper limit, as mentioned in the Conclusion (3rd point) of this Flash endurance test.

On a microcontroller the code is typically stored in Flash and is executed by fetching code words directly from the Flash cells (at least this is most common on 8-bit micros; some 32-bit micros might have a small buffer).

Depending on the particular code, a location might be accessed extremely frequently, for example if the main execution path contains a busy loop that waits for an interrupt (say, from a timer that synchronizes execution to a fixed interval).
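
A minimal sketch of such a wait loop, for illustration only. The names `tick_flag`, `timer_isr` and `wait_for_tick` are made up here, and hooking `timer_isr` to an actual timer vector is toolchain-specific and not shown:

```c
#include <stdint.h>

/* Set by the timer interrupt handler; volatile so each loop pass re-reads it. */
static volatile uint8_t tick_flag = 0;

/* Hypothetical ISR body. On a real part this would be attached to the
   timer vector using the toolchain's own ISR syntax (assumption). */
void timer_isr(void)
{
    tick_flag = 1;
}

/* Spin until the ISR raises the flag, then clear it. While waiting, the CPU
   fetches the same few instruction words from Flash over and over. */
void wait_for_tick(void)
{
    while (tick_flag == 0) {
        /* busy wait */
    }
    tick_flag = 0;
}
```

The loop body compiles to only a handful of instructions, which is why the fetches concentrate on so few Flash words.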

This could generate 100K or more read accesses per second on average to a single Flash cell (depending on the clock and the particular code).
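
As a rough worked figure, the sketch below multiplies an assumed average read rate by an assumed service life. The 100K reads/s and the 10 years are illustrative inputs, not values from any datasheet, and `total_reads` is a name invented for this example:

```c
#include <stdint.h>

/* Total read accesses to one location for a given average rate and lifetime.
   Both inputs are illustrative assumptions, not datasheet values. */
uint64_t total_reads(uint64_t reads_per_second, uint32_t years)
{
    /* 365 days, ignoring leap years: 31,536,000 seconds */
    const uint64_t seconds_per_year = 365ULL * 24ULL * 60ULL * 60ULL;
    return reads_per_second * seconds_per_year * years;
}
```

With `total_reads(100000, 10)` this comes to 31,536,000,000,000 reads of the same location, roughly 3 × 10^13.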

Could such code actually destroy the cells of the Flash underneath it?

(Is there any need to be concerned about this when designing code for microcontrollers, for instance as part of a system meant to operate for years? Of course the Flash could be verified periodically by CRC, but that doesn't prevent the system from failing if corruption happens; it only makes it more likely that the failure happens in a controlled manner.)

I've never seen this happen. I've only ever tried to exceed the write cycles, with development boards that wore out after thousands and thousands of write cycles. At my job we have embedded software deployed that has been running for decades. I understand your concern, but I don't think you should be worried about exceeding the read cycles of internal flash. — Morten Jensen, Sep 30 '16 at 07:06
@AlexandreLavoie Possibly it suits there as well, but also here I guess. If the problem existed, then it becomes a software problem (you need to design so there are no such tight loops in your code). — Jubatian, Sep 30 '16 at 09:38
@MortenJensen I also think so, the problem just surfaced elsewhere, and I was surprised that I can't find any definite answer to it (while the read cycle limit exists for the Flash technology). — Jubatian, Sep 30 '16 at 09:41
@Jubatian I just asked two of my veteran colleagues. They say there is no upper read count on flash memory. Only for erase cycles. — Morten Jensen, Sep 30 '16 at 09:48
@Jubatian You are asking a hardware-related question that has absolutely nothing to do with programming. — Alexandre Lavoie, Sep 30 '16 at 17:02
That is not well suited for a programming Q&A site. Read the datasheets of your device, that should include endurance values for erase/program cycles. Also all serious manufacturers provide more in-depth information — too honest for this site, Sep 30 '16 at 22:23
Read disturb is more likely, but you obviously can't use or cannot tolerate that in a microcontroller environment (reading the same area fast enough causes a starvation, if you will, and you get a bad read there or nearby). It is the erase cycles you need to worry about the most. As pointed out, simply look up the specs for your part. — old_timer, Oct 02 '16 at 22:35

score 3 · Answer 1 · answered Sep 30 '16 at 07:51

Only erasing/writing will affect the memory cells, not reading. You don't need to consider the number of reads when designing the program.

Programmed flash memory does age, though, meaning that the value of the cells might no longer be reliable after a certain number of years. This is known as data retention and depends mainly on temperature. MCU manufacturers typically specify a worst case in years, assuming that the part is kept at the maximum specified ambient temperature.

This is something to consider for products that are expected to live long (> 10 years), particularly in environments where high temperatures can be expected. CRC and/or ECC is a good counter-measure against data retention, although if you do find that a cell has been corrupted, it typically just means that the application should shut down to a non-recoverable safe state.

I know about data retention and the countermeasures. Possibly, when specifying data retention for a part, manufacturers already assume the kind of worst case I refer to (continuous access of a cell at maximum frequency). In fact, on small micros any Flash cell is in quite heavy use (for an ATtiny with 1K of flash, the best possible case for continuous operation would be only 1/1000th of the worst-case wear per cell). — Jubatian, Sep 30 '16 at 09:49
You don't take measures *against* data retention, but rather against data *loss*. The retention metric is a measure of the guaranteed retention period with no re-writes. — Clifford, Sep 30 '16 at 21:30

Morten Jensen · Answer 2 · 2016-09-30T21:47:56.290

0

I know of two techniques to approach this issue:

1) One technique is to set aside a const 32-bit integer variable in the system code, calculate a CRC32 checksum of the compiled binary image, and insert the checksum into the reserved variable using an ELF editor. A module in the system software then calculates a CRC32 over the flash area occupied by the application and compares it to the stored value. If you are using GCC, the linker can define a symbol that tells you where the segment ends. This method can detect errors but cannot correct them.

2) Another technique is to use a microcontroller that supports Flash ECC. TI sells Cortex-R4 MCUs which support Flash ECC (Hercules series).
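
Technique 1 could be sketched roughly as below. This is an illustration under assumptions, not the code of any particular product: `crc32_calc` and `flash_image_ok` are hypothetical names, the CRC is the common reflected CRC-32 (polynomial 0xEDB88320), and on a real target the start address and length would come from linker-defined symbols while `stored_crc` would be the reserved const patched in after the build. The region passed in is assumed to exclude the reserved variable itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320 (the zlib/Ethernet variant).
   Slow but table-free, so it costs no extra Flash for a lookup table. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return ~crc;
}

/* Returns 1 if the image matches the stored checksum, 0 otherwise.
   On target, start/len would describe the application's Flash area
   (assumption: taken from linker symbols, not shown here). */
int flash_image_ok(const uint8_t *start, size_t len, uint32_t stored_crc)
{
    return crc32_calc(start, len) == stored_crc;
}
```

As a sanity check of the CRC routine, the nine ASCII bytes `123456789` should give `0xCBF43926`, the standard check value for this CRC-32 variant. A failed `flash_image_ok` only reports corruption; what the application does next (typically entering a safe state) is a separate decision.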

edited Sep 30 '16 at 21:47

answered Sep 30 '16 at 09:55


Method 1 does not *circumvent* the issue - it merely detects flash memory corruption. – Clifford Sep 30 '16 at 21:44
@Clifford You're right about that. The "circumvention" consists of going to unsafe state and flagging the error. ECC can circumvent it somewhat depending on the hardware capabilities. I've edited the post to reflect your critique. – Morten Jensen Sep 30 '16 at 21:46

Clifford · Answer 3 · 2016-09-30T22:27:27.960

I doubt that this is a practical concern. The article you cited asserts that this can happen, but offers no supporting evidence and does not quantify the effect. The first mention is in the introduction:

[...] flash degrades over time from erasing/writing (or even just reading, although that decay is slower) [...]

Then again in the conclusion:

We did not check flash decay for reads, but reading also causes long term decay. It would be interesting to see if we can read a spot enough times to cause failure.

The author may be referring to read disturb in NAND flash, but microcontrollers do not use NAND flash for code storage/execution since it is not random-access. Read disturb is not a permanent effect; erasing and re-writing the affected block clears it. NAND controllers maintain read counts for blocks and automatically copy and erase blocks as necessary. They also employ ECC to detect and correct errors, and to identify "write-worn" areas.

There is the potential for long-term "bit-rot", but I doubt that it is caused specifically by reading rather than just ageing.

Most RTOS-based systems spend the majority of their processing time in a do-nothing idle loop, and run happily 24/7, 365 days a year. Some processors support a wait-for-interrupt instruction that halts the CPU in the idle loop, but by no means all, and it is not uncommon for such an instruction to go unused. Processors with flash accelerators or caches may also prevent continuous rapid fetches from a single location, but again that does not apply to all microcontrollers.
