37

I don't remember the standard saying anything about a maximum length for identifiers, so in theory they can be arbitrarily long. In practice, name lengths could be limited at least by the compiler and linker implementations.

While this should work on all systems

int a;

this snippet

#!/usr/bin/perl
print "int " . "b" x 2**16 . ";";

creates a declaration that produces undefined references to std::something symbols from ld when compiling and linking (using gcc/MinGW).
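
For readers without Perl at hand, the same test input can be produced from C++. This is only a sketch of the generator idea above; the helper name `make_long_declaration` is made up for illustration and is not part of the original question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the question): builds the text
// "int bbb...b;" where the identifier consists of `length` copies of 'b'.
std::string make_long_declaration(std::size_t length) {
    return "int " + std::string(length, 'b') + ";";
}
```

Writing `make_long_declaration(std::size_t{1} << 16)` to a file gives a declaration with a 65536-character identifier, comparable to the Perl output, which can then be fed to the toolchain under test.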

So what are the size limits for an identifier on different systems?

mbx
  • Be warned that name length limits apply to *mangled names*. I have been bitten many times by annoying warnings from MSVC when I wrote template functions and passed them some `boost::transform_iterator`. The mangled name of the instantiation gets crazily long. – Alexandre C. May 22 '11 at 09:54
  • @Alexandre: Until now I never had that problem. The only issue related to templates I remember was that I had to increase the template depth in some projects which made use of meta programming. – mbx May 22 '11 at 10:07
  • gcc has no such problem (see @Anders' answer). This is only a warning that occasionally appears with MS tools when a mangled name exceeds 2048 characters. It usually does not play havoc with your builds though. – Alexandre C. May 22 '11 at 11:05
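
To get a feel for how quickly names grow with template nesting, as the comments above describe, one can compare the strings returned by `typeid(...).name()`. This is a rough sketch: the format of that string is implementation-defined (on GCC and Clang it is typically the mangled type name), it is not the linker symbol of a function, and `Wrap` and `type_name_length` are names invented for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <typeinfo>

// Hypothetical wrapper template used only to create nested instantiations.
template <typename T>
struct Wrap {};

// Length of the implementation-defined name string for type T.
// Requires RTTI, which is enabled by default on common compilers.
template <typename T>
std::size_t type_name_length() {
    return std::strlen(typeid(T).name());
}
```

On common toolchains `type_name_length<Wrap<Wrap<Wrap<int>>>>()` is larger than `type_name_length<Wrap<int>>()`, and real-world iterator adaptors nest far deeper than this.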

3 Answers

54
Anders Lindahl
18

Annex B of the C++ Standard says that an implementation should support identifiers at least 1024 characters long, but this is not mandatory.

  • One word of caution: this length applies to the mangled symbol, if memory serves me right. Functions taking template classes as arguments can get pretty hairy mangled names, so the limit is not so far-fetched. – Matthieu M. May 15 '11 at 16:14
  • @Matthieu Not obviously. In fact as far as I'm aware, the standard doesn't discuss name mangling. –  May 15 '11 at 16:18
  • @Neil: I do not think it does; each compiler may (and in fact should) develop its own ABI (or, if reusing someone else's, make sure that the code produced really is compatible). I just wanted to raise the fact that the standard specifies a length for the tool chain, but this is different from the length seen by a linker, since a linker only sees mangled (decorated) symbols. – Matthieu M. May 15 '11 at 17:59
  • _Should_ sounds like there might be some pitfalls using older compilers/linkers. – mbx May 22 '11 at 16:51
  • @mbx You cannot link C++ code from different compilers or even from different compiler versions. That's the price you pay for type-safe linking, which believe me is worth it. –  May 22 '11 at 16:58
  • @Neil: but I can link, e.g., NASM-assembled code with gcc. So the question is about the system you build with. – mbx May 23 '11 at 17:11
4

Based on MISRA C 2004:

Rule 5.1 (required): Identifiers (internal and external) shall not rely on the significance of more than 31 characters. [Undefined 7; Implementation 5, 6]

The ISO standard requires internal identifiers to be distinct in the first 31 characters to guarantee code portability. This limitation shall not be exceeded, even if the compiler supports it. This rule shall apply across all name spaces. Macro names are also included and the 31 character limit applies before and after substitution.

The ISO standard requires external identifiers to be distinct in the first 6 characters, regardless of case, to guarantee optimal portability. However this limitation is particularly severe and is considered unnecessary. The intent of this rule is to sanction a relaxation of the ISO requirement to a degree commensurate with modern environments and it shall be confirmed that 31 character/case significance is supported by the implementation.

Note that there is a related issue with using identifier names that differ by only one or a few characters, especially if the identifier names are long. The problem is heightened if the differences are in easily mis-read characters like 1 (one) and l (lower case L), 0 and O, 2 and Z, 5 and S, or n and h. It is recommended to ensure that identifier names are always easily visually distinguishable. Specific guidelines on this issue could be placed in the style guidelines (see section 4.2.2).

I follow this rule. Someone might ask, "Have you ever seen a compiler that can't distinguish identifiers longer than 31 characters?" Yes: I remember that in IAR RL78 v2.21.1 I defined two identifiers without any warning or error, but then ran into a problem when accessing them (I no longer remember the exact scenario).
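
A check in the spirit of Rule 5.1 can be sketched as a small helper that tests whether two names still differ when only their first 31 characters are significant. The function `distinct_within` and the constant `kSignificantChars` are invented for this illustration; the comparison is case-sensitive and does not model any particular compiler or MISRA checking tool.

```cpp
#include <cstddef>
#include <string>

// 31 is the number of significant characters quoted from Rule 5.1 above.
const std::size_t kSignificantChars = 31;

// Hypothetical helper: returns true if `a` and `b` remain distinguishable
// when only their first `significant` characters are compared.
bool distinct_within(const std::string& a, const std::string& b,
                     std::size_t significant = kSignificantChars) {
    // compare() clips the count to the string length, so short names are safe.
    return a.compare(0, significant, b, 0, significant) != 0;
}
```

For example, `sensor_calibration_table_entry_primary` and `sensor_calibration_table_entry_backup` share their first 31 characters, so `distinct_within` returns false for that pair. A toolchain that truncates identifiers at 31 characters would treat them as the same name, which is the kind of silent clash described above.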

SpongeBob