
I'm translating a huge project from C++ to Delphi and I'm finalizing the translation. One of the things I still have left is the '\0' monster.

if (*asmcmd=='\0' || *asmcmd==';')

where asmcmd is char*.

I know that \0 marks the end of a character array (a string) in C++, but I need to know its value as a byte. Is it 0?

In other words, would the code below be the equivalent of the C++ line?

if(asmcmd^=0) or (asmcmd^=';') then ...

where asmcmd is PAnsiChar.

You don't need to know Delphi to answer my question; just telling me the value of \0 as a byte would work too. :)

qwerty101

3 Answers


'\0' equals 0. It's a relic from C, which has no string type at all and uses char arrays instead. The null character marks the end of a string; not a very wise decision in retrospect, since most other string implementations store the length in a dedicated counter variable, which makes finding the end of a string O(1) instead of C's O(n).

*asmcmd=='\0' is just a convoluted way of checking length(asmcmd) == 0 or asmcmd.is_empty() in a hypothetical language.
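A minimal C++ sketch of that equivalence, for illustration only. The helper names `is_empty`, `length_before_comment`, and `demo` are made up here and do not come from the project in the question. The second helper shows the other common use of the same test: walking a buffer until the terminator or a `;` is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: true when the C string has no characters.
// It inspects only the first byte, so it is O(1), unlike strlen().
bool is_empty(const char* s) {
    return *s == '\0';
}

// Hypothetical helper: counts characters up to the terminator or a ';'.
std::size_t length_before_comment(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0' && s[n] != ';') {
        ++n;
    }
    return n;
}

// Hypothetical demo: the first-byte test agrees with the strlen() test.
void demo() {
    assert(is_empty("") == (std::strlen("") == 0));
    assert(is_empty("nop") == (std::strlen("nop") == 0));
}
```

Calling `demo()` from a test or from `main` should pass silently. Whether the original line is an emptiness check or a loop exit depends on the surrounding code.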

tdammers
  • Convoluted, perhaps... but it is important to know when you **should** use `'\0'`. An example is `for(char *c = str; *c != '\0'; c++)` vs `for (int i = 0; i < strlen(str); i++)`. (Of course, depending on your application you may find little need to iterate over the characters in a string anyway.) – David Aug 06 '10 at 12:21
  • @David: "`*c != '\0'`" why not just `for (char *c = str; *c; c++)`? – SigTerm Aug 06 '10 at 12:30
  • thank you, this completely differs from what I thought it is. :) – qwerty101 Aug 06 '10 at 12:33
  • @SigTerm Good point - no reason not to remove the extra few characters of code. Anyway, now the OP knows both ways. – David Aug 06 '10 at 12:39
  • Depending on the code around this, it does NOT mean that the code is checking the length. Far more likely is that it is wanting to stop at the end of a string, or in this case also a semi-colon section. – mj2008 Aug 06 '10 at 12:57
  • @SigTerm, David: Yes, there are good reasons not to remove extra characters -- how 'bout readability? I'd actually write that as `*c != END_OF_STRING;` (where END_OF_STRING is a `const char`). Learn to friggin' type. It's just 3 more keystrokes. – James Curran Aug 06 '10 at 14:01
  • And Qwerty, the Delphi equivalent for C++'s '\0' is '#0'. You could even write your code as `if asmcmd^ in [#0, ';'] then ...` – Rob Kennedy Aug 06 '10 at 15:32
  • @James Curran: Not meaning to fan the flames too much here, but I think there's an important tradeoff to be made between readability and idiomaticity. Giving `'\0'` a special name is reminiscent of the various "frameworks" that have their own `typedef`s for all of the standard integer types, etc. Those kinds of coding standards are fine when adopted for projects large enough to warrant forcing newcomers to learn a couple of conventions. But in general, readability means doing things the way people are used to. OTOH, @SigTerm's version was terser than I would have gone for, hence my `!= '\0'`. – David Aug 06 '10 at 15:37
  • @David: I used END_OF_STRING a lot in the early days when I was doing straight C with minimal 3rd party libs, so I was doing a lot of complex string manipulation myself. For a simple loop-to-end, `!= '\0'` is fine. – James Curran Aug 06 '10 at 15:52
  • Also keep in mind that `'\0'` will be different sizes for `char` vs. `wchar_t`. So, loosely, sometimes it is `0`, but sometimes it is `00`. :) – i_am_jorf Aug 06 '10 at 22:03

Strictly it is an escape sequence for the character with the octal value zero (which is of course also zero in any base).

Although you can write any character code in octal as a backslash followed by one to three octal digits (for example, '\040' is a space character in ASCII encoding), you would seldom have cause to do so. '\0' is idiomatic for specifying a NUL character (because you cannot type such a character from the keyboard or display it in your editor).

You could equally specify '\x0', which is a NUL character expressed in hexadecimal.
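The escape spellings above can be checked at compile time. This is only a sketch; `char_value` is a hypothetical helper added for illustration.

```cpp
// All of these spellings denote the same character: the one with value zero.
static_assert('\0' == 0, "octal escape for zero");
static_assert('\x0' == 0, "hexadecimal escape for zero");
static_assert('\0' == '\x0', "same character either way");

// An octal escape is a backslash followed by one to three octal digits.
// Octal 40 is decimal 32, which is the space character in ASCII.
static_assert('\040' == 32, "octal 40 is decimal 32");
static_assert('\40' == '\040', "the leading zero is optional");

// Hypothetical helper: the numeric value of a char, usable in constant expressions.
constexpr int char_value(char c) {
    return static_cast<int>(c);
}
```

If any of the assertions were wrong, the file would fail to compile, so there is nothing to run.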

The NUL character is used in C and C++ to terminate a string stored in a character array. This representation is used for literal string constants and, by convention, for strings that are manipulated by the <cstring>/<string.h> library. In C++ the std::string class can be used instead.
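A small C++ sketch of the terminator at work, under the assumption that the strings are ordinary narrow `char` strings. The names `scan_length`, `stored_length`, and `demo_lengths` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// A string literal carries a hidden terminator: "abc" occupies 4 bytes.
static_assert(sizeof("abc") == 4, "3 characters plus the NUL terminator");

// Hypothetical helper: a hand-written strlen that scans for the terminator.
std::size_t scan_length(const char* s) {
    const char* p = s;
    while (*p != '\0') {
        ++p;
    }
    return static_cast<std::size_t>(p - s);
}

// Hypothetical helper: std::string records its length, so size() does not scan.
std::size_t stored_length(const std::string& s) {
    return s.size();
}

// Hypothetical demo: both approaches agree with the library's strlen().
void demo_lengths() {
    assert(scan_length("abc") == std::strlen("abc"));
    assert(stored_length(std::string("abc")) == std::strlen("abc"));
}
```

The scanning version has to visit every character, while `std::string::size()` is specified to run in constant time.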

Note that in C++ a character constant such as '\0' or 'a' has type char. In C, for perhaps obscure reasons, it has type int.
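The C++ half of that statement can be verified by the compiler. In this sketch `char_literal_size` is a hypothetical helper. The same `sizeof` expression compiled as C would yield `sizeof(int)` instead.

```cpp
#include <cstddef>
#include <type_traits>

// In C++ a character literal has type char, so its size is exactly 1.
static_assert(std::is_same<decltype('\0'), char>::value, "'\\0' is a char in C++");
static_assert(std::is_same<decltype('a'), char>::value, "'a' is a char in C++");
static_assert(sizeof('\0') == 1, "sizeof(char) is 1 by definition");

// Hypothetical helper: exposes the size of a character literal as a value.
constexpr std::size_t char_literal_size() {
    return sizeof('\0');
}
```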

Clifford

That is the null character, i.e. the char with value 0. It marks the end of a string.

Daniel A. White
  • So it is 0 as I thought. Thank you! – qwerty101 Aug 06 '10 at 12:10
  • `\0` is not `null` but `NUL` :) The former is a pointer, the latter a character. – fredoverflow Aug 06 '10 at 12:39
  • ...and there's the old "the null pointer might not be at address 0" thing too: http://stackoverflow.com/questions/2759845/why-is-address-zero-used-for-null-pointer/2759875#2759875 – leander Aug 06 '10 at 13:06
  • The word *null* without any code formatting applied to it is just an ordinary English word and perfectly acceptable to use when talking about character 0. It's the null character, and its ASCII name is NUL. Similarly, the character with value 2 is the start-of-text character, and its name is STX. – Rob Kennedy Aug 06 '10 at 15:29