I was playing around with Dart strings and noticed this:

print("\x00nullbyte".length);
print("\x00nullbyte");

If you run this, the first line prints 9, so the length includes the null byte. But the second `print` produces no output at all.

Trusting Google engineers more than myself, programming-wise, I'm thinking there might be a reason for that. What could it be?

user2864740
conradkleinespel
  • Maybe because when they allocate the memory for the string, it's sized to the input. It wouldn't iterate through each character allocating one more byte at a time, and it would set the length property at the same time. `print` is different: it acts on the memory once it's already there. – clancer Feb 12 '14 at 04:17
  • Maybe it's an artifact of the `print` target .. – user2864740 Feb 12 '14 at 04:22
  • @clancer: I get your point. Dart probably keeps track of the length separately (although I haven't looked at the actual source). But why not call `write(1, str, DartStringLength(str));`? This wouldn't stop at the first `\0`. @user2864740: why keep that artifact? – conradkleinespel Feb 12 '14 at 04:24
  • @conradk Anyway, sometimes it's nice to be able to NUL-terminate your strings manually when reusing memory. Such a trick can be useful, on top of keeping the print statement simple. – clancer Feb 12 '14 at 04:42

1 Answer

The Dart string has length 9, and contains all nine code units. NUL characters are perfectly valid in a Dart string. They are not valid inside C strings though, where a NUL marks the end of the string. When printing, the string is eventually converted to a C string to call the system library's output function. At that point, the system library sees the NUL as the very first character, treats the string as empty, and prints nothing.

Try:

main() { print("ab\x00cd"); }  // prints "ab".

The String.length getter works entirely on the Dart String object, and doesn't go through the C strlen function. It's unaffected by the limits of C.

Arguably, the Dart print functionality should detect NUL characters and print the rest of the string anyway.

lrn
  • It is not possible to store every possible ASCII or UTF-8 string in a null-terminated string, as the encoding of the NUL character is a zero byte. However, it is common to store the subset of ASCII or UTF-8 not containing the NUL character in null-terminated strings. Some systems use "modified UTF-8", which encodes the NUL character as two non-zero bytes (0xC0, 0x80) and thus allows all possible strings to be stored. – mezoni Feb 12 '14 at 17:05