24

The Protocol Buffers documentation says a bytes field can contain any arbitrary sequence of bytes. But if my data contains '\0', how can Protocol Buffers encode my whole data from a string variable?

Flexo
xiemeilong
  • What do you mean by "a string variable"? Is this a `char *`? Is it C, Java or something else? – Marcelo Cantos Jul 13 '12 at 09:21
  • 1
    @Marcelo I'm guessing he means `string` from `<string>`... – Marc Gravell Jul 13 '12 at 09:22
  • 2
    I'm not a c++ person any more, but I was under the impression that `\0` has no special significance in a c++ string **unless** you are using methods that specifically handle `\0`. So... just don't use those methods? – Marc Gravell Jul 13 '12 at 09:23
  • Strings can contain `\0`. Some APIs assume strings are `\0`-terminated (e.g. the traditional C runtime), but modern APIs operate on pointer-and-length or pointer-to-start-and-pointer-to-end representations, which do not require `\0` to be considered 'special'. – Remus Rusanu Jul 13 '12 at 09:25
  • Duh. I must have missed the "c++" in the heading (or maybe it was added during the early-edit window; I remember looking for the language and not seeing it). – Marcelo Cantos Jul 13 '12 at 12:21

1 Answer

35

The C++ implementation of protocol buffers returns the bytes and string types as std::string. A std::string stores its length alongside the data (available through size() or length()), so embedded \0 characters have no special significance.

The setters accept a std::string as well, and there are overloads that accept a buffer and a length. To set a field you can just do this:

pb.set_foo( std::string( data, data_length ) );

or

pb.set_foo( data, data_length );
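
To see why the explicit length matters, here is a minimal standalone sketch that uses only std::string, with no protobuf dependency. The helper names `from_buffer`, `from_c_string` and `demo` are hypothetical and exist only for illustration; the point is that the pointer-and-length constructor keeps every byte, while the single-pointer constructor stops at the first \0.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from a raw buffer and an
// explicit length. Every byte is copied, including any '\0'.
std::string from_buffer(const char* data, std::size_t data_length) {
    return std::string(data, data_length);
}

// Hypothetical helper: builds a std::string from a C-style string.
// Copying stops at the first '\0', so later bytes are lost.
std::string from_c_string(const char* data) {
    return std::string(data);
}

// Hypothetical self-check comparing the two constructors.
void demo() {
    const char raw[] = {'a', 'b', '\0', 'c', 'd'};

    std::string full = from_buffer(raw, sizeof raw);
    assert(full.size() == 5);   // all five bytes are kept
    assert(full[2] == '\0');    // the embedded NUL is ordinary data
    assert(full[3] == 'c');     // bytes after the NUL survive

    std::string cut = from_c_string(raw);
    assert(cut.size() == 2);    // only "ab" was copied
}
```

Calling `demo()` from `main` runs the checks. When the data may contain \0, build the string with the length-taking constructor (or pass the buffer and length directly) rather than relying on a terminating \0.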
Michael Anderson