Why must the first 31 characters of an identifier be unique?

Question

MISRA 2004 rule 5.1 states that all identifiers must have the first 31 characters unique. What is the reason for this rule? Is it a technical limitation with some compilers?

It used to be some old limitation in some very old compilers (and perhaps some old C standards). And in practice 31 characters is comfortable enough — Basile Starynkevitch, Nov 11 '13 at 12:06
It's a linker limitation, discussed in wonderful detail in [this Stack Overflow answer](http://stackoverflow.com/a/18290577/53870). — Andy Dent, Nov 11 '13 at 12:12
possible duplicate of [What's the exact role of "significant characters" in C (variables)?](http://stackoverflow.com/questions/18290165/whats-the-exact-role-of-significant-characters-in-c-variables) — unwind, Nov 11 '13 at 12:15

Arkku · Answer 1 · score 7 · 2021-09-22T12:10:33.447

The C standards only guarantee that a certain number of initial characters in identifiers are significant. For C99 this is 31 characters for external identifiers. Even this is a huge step up from ANSI/ISO C (C90), which guarantees only 6 significant characters for external identifiers… (So if you're wondering why so many old C functions have unpronounceable names, this is one reason.)
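
To make the difference between those limits concrete, here is a minimal sketch in C. The helper name `identifiers_collide` and the example identifiers are made up for illustration. Only the limits of 6 and 31 characters come from the standards mentioned above. It models a toolchain that looks at a fixed number of initial characters and nothing else.

```c
#include <stddef.h>
#include <string.h>

/* Guaranteed significant initial characters for external identifiers,
   as stated in the text: 6 in C90, 31 in C99. */
#define C90_EXTERNAL_SIGNIFICANT 6
#define C99_EXTERNAL_SIGNIFICANT 31

/* Hypothetical helper: returns 1 when `a` and `b` are different names that
   a toolchain honouring only `significant` initial characters could treat
   as the same identifier, and 0 otherwise. */
int identifiers_collide(const char *a, const char *b, size_t significant)
{
    if (strcmp(a, b) == 0) {
        return 0; /* the very same name is not a collision */
    }
    /* strncmp stops at the terminating null, so a shorter name never
       collides with a longer one that merely starts the same way. */
    return strncmp(a, b, significant) == 0;
}
```

For example, `get_temperature` and `get_tension` share their first six characters (`get_te`), so a toolchain that honoured only the C90 minimum could treat them as the same external symbol, while under a 31-character limit they stay distinct.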

In practice compilers tend to support a higher number of significant characters in identifiers (and IIRC the C standard even has a footnote encouraging this), but MISRA probably wanted to pick a “safe” limit already guaranteed by the then-most-recent C standard, C99, without imposing the limit of 6 that would be guaranteed by C90 which MISRA 2004 otherwise follows.

edit: Since it has been questioned twice in the comments, let me clarify: MISRA 2004 does not follow C99, and there is no hard evidence that the C99 standard contributed to MISRA's chosen limit of specifically 31 characters. However, the limit does not come from C90 (ISO C), because C90 specifies a limit of 6 characters. So, one must either accept that MISRA picked the number 31 independently, or followed the example of C99 in this particular decision. Of course it might be that both picked the same number due to that being the lower bound in popular compilers of the day, but at the very least it can be argued that the example of the older C99 validates the choice.

edited Sep 22 '21 at 12:10

answered Nov 11 '13 at 12:11

Arkku


On the contrary, MISRA explicitly demands more from the compiler than the C standard - this is the purpose of the rule. C99 is irrelevant for MISRA-C 2004. – Lundin Nov 12 '13 at 07:47
@Lundin I was speaking only in the context of this particular limitation, where MISRA does _not_ require more than C99 (the most recent C standard established before MISRA 2004) – 31 characters is an arbitrary limit, so my guess stands that they took it from the C99 standard to maximise compatibility with modern compilers while avoiding the ridiculously strict 6-character guarantee of ANSI C. (The part of the rationale you pasted agrees with this guess; they do not mention C99, but the number 31 almost certainly comes from there, since pretty much every compiler supports more.) – Arkku Nov 12 '13 at 09:40
This question is about MISRA-C:2004, which explicitly forbids using any other standard than C90, so when speaking of "the C standard" together with MISRA 2004, you always mean C90. (For C99 support you would use MISRA-C:2012, which is an entirely different document) – Lundin Nov 12 '13 at 10:20
@Lundin I only meant that _I_ was speaking of C99, in the context of my answer; it does not matter what MISRA specifies elsewhere. But this argument is quite pointless; whatever the reason for MISRA picking a limit of 31 characters, it is clearly based on the fact that standards-conforming compilers are not required to support arbitrarily many significant chars in identifiers. MISRA's own rationale does not specify where they got 31, so my personal guess stands that it is from C99. Whether that guess is correct or not does not even matter for purposes of answering the question. =) – Arkku Nov 12 '13 at 10:57
Arkku - as @Lundin states, MISRA-C:2004 cites compliance with C90 only (Rule 1.1). The 31 character limit stems from C90. Any reference to C99 is irrelevant when discussing MISRA-C:2004. – Andrew Nov 13 '13 at 17:00
@Andrew C90 specifies a limit of 6 characters, not 31, so it is irrelevant. 31 characters is specified in MISRA 2004, and the asker wanted to know the source of this limitation. It is clearly not C90, and my _guess_ is that it is chosen to match C99 (because I don't know of any compiler with that specific limit). The other alternative is that they just picked a number independently. Whatever the case, in the context of this question C90 is irrelevant, because it does not contribute in any way to this limit (and as Lundin's quote shows, MISRA acknowledges that it does not follow C90 here). – Arkku Nov 13 '13 at 17:56
For externals it was set to the same as internals (because that is what most compilers at the time did). I repeat, C99 does not come into it. – Andrew Nov 15 '13 at 06:25
[citation required] =) – Arkku Nov 15 '13 at 11:38

score 2 · Answer 2 · answered Nov 12 '13 at 07:46

MISRA-C:2004 follows the C90 standard, which only requires the first 6 characters of an external identifier to be treated as distinct. You can read the rationale in the MISRA document.

MISRA-C:2004 Rule 5.1:

The ISO standard requires external identifiers to be distinct in the first 6 characters. However compliance with this severe and unhelpful restriction is considered an unnecessary limitation since most compilers/linkers allow at least 31 character significance (as for internal identifiers).

The ISO standard referred to is ISO 9899:1990 (C90). The purpose of the rule is to ensure that you are using a sane, safe compiler with enough characters of significance.
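
As a rough illustration of what a checker for this rule looks for, the sketch below counts pairs of distinct names that agree in their first 31 characters. The function name `count_rule_5_1_clashes` and the macro are hypothetical, and real MISRA tools work on the parsed program and its scopes rather than on a flat list of strings. Declarations are kept at the top of the block so the code stays within C90.

```c
#include <stddef.h>
#include <string.h>

/* Number of initial characters the rule allows code to rely on. */
#define MISRA_2004_SIGNIFICANT 31

/* Hypothetical checker: returns how many pairs of different names in
   `names` are identical in their first 31 characters. */
size_t count_rule_5_1_clashes(const char *const names[], size_t count)
{
    size_t clashes = 0;
    size_t i;
    size_t j;

    for (i = 0; i < count; ++i) {
        for (j = i + 1; j < count; ++j) {
            /* Skip repeated occurrences of the very same name. */
            if (strcmp(names[i], names[j]) != 0 &&
                strncmp(names[i], names[j], MISRA_2004_SIGNIFICANT) == 0) {
                ++clashes;
            }
        }
    }
    return clashes;
}
```

Given `motor_controller_phase_current_limit_a` and `motor_controller_phase_current_limit_b`, the two names only differ in their final character, well past position 31, so the pair would be counted as one clash.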
