
I'm trying to write a parser for SCGI requests. I'm trying to parse the string described in the example but for some reason I cannot find the position of the second null character, the one that separates the content length value and the next property name.

This is my test string:

string scgi_request(
    "70:CONTENT_LENGTH\027\0SCGI\01\0REQUEST_METHOD\0POST\0REQUEST_URI\0" \
    "/deepthought\0,What is the answer to life?"
   , 91);

I can find the position of the first null character, position 18. But once I try to find the one after that, the position returned is invalid, off by a few characters, all the way up to position 24.

This is my algorithm:

size_t contentLengthEnd = scgi_request.find('\0');
size_t contentLengthValueEnd = scgi_request.find('\0', ++contentLengthEnd);
std::cerr << contentLengthEnd << std::endl; // 19, because I shifted this one forward 
                                            // otherwise I'd always get the same 
                                            // character
std::cerr << contentLengthValueEnd << std::endl; // 24, no clue why.
ruipacheco
  • What's `request`? How, if at all, is it related to `scgi_request`? – Igor Tandetnik Oct 16 '14 at 22:23
  • @user657267 It shouldn't be, as in the ctor he specifies the length of the string, and `\0` is not treated as a special character on construction – vsoftco Oct 16 '14 at 22:26
  • I thought creating the string by passing the length would avoid that? – ruipacheco Oct 16 '14 at 22:26
  • Sorry I only noticed now that the linebreaks were added. – user657267 Oct 16 '14 at 22:27
  • The first `\0` is at 18th as you already stated. The next characters `SCGI\01` occupy positions 19-23. The second `\0` is on position 24. Isn't that correct? – alvits Oct 16 '14 at 22:34
  • To avoid having the magic number `91` in your code, I'd suggest writing `char const c_request[] = "70:CONTENT etc...."; std::string request( c_request, c_request + sizeof c_request - 1 );` – M.M Oct 16 '14 at 22:35
  • @alvits I don't think it's ever at position 24. How are you counting? – ruipacheco Oct 18 '14 at 15:01

1 Answer


Your string starts:

"70:CONTENT_LENGTH\027\0SCGI\01\0REQUEST_METHOD\0POST\0REQUEST_URI\0" 

These outputs are actually correct for the string you gave. I'm guessing you may be overlooking that \027 is a single octal escape sequence (one character with value 23), and likewise \01 is one character with value 1. The characters and their indices are:

16: 'H'
17: '\027'
18: '\0'
19: 'S'
20: 'C'
21: 'G'
22: 'I'
23: '\01'
24: '\0'
25: 'R'

Your program finds the first two '\0' characters, at 18 and 24, but you apply ++ to the first position before printing it, hence the output of 19 and 24.
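If you want to check this yourself, here is a minimal sketch that collects the index of every '\0' in the first part of your literal. The names null_positions and question_prefix are mine, invented for illustration, and the literal is cut off after REQUEST_METHOD to keep it short.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): returns the index of every
// '\0' character in s, in ascending order.
std::vector<std::size_t> null_positions(const std::string& s) {
    std::vector<std::size_t> positions;
    for (std::size_t pos = s.find('\0'); pos != std::string::npos;
         pos = s.find('\0', pos + 1)) {
        positions.push_back(pos);
    }
    return positions;
}

// The beginning of the literal from the question, unchanged. The explicit
// length (sizeof minus the implicit terminator) keeps the embedded nulls.
std::string question_prefix() {
    static const char raw[] =
        "70:CONTENT_LENGTH\027\0SCGI\01\0REQUEST_METHOD\0";
    return std::string(raw, sizeof raw - 1);
}
```

For this shortened literal, null_positions(question_prefix()) should hold 18, 24 and 39, and the character at index 17 is the single character '\027', not a null.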

If you meant '\0' then '2' then '7', you need to keep the digits from immediately following the \0 escape, e.g. by taking advantage of string literal concatenation:

"70:CONTENT_LENGTH\0"
"27\0" 
"SCGI\0"
"1\0"
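As a rough sketch of how the split literal behaves, the code below builds a std::string from it (using the sizeof approach suggested in the comments instead of a magic length) and walks the null-terminated fields. The names fixed_request, make_request and next_field are hypothetical, and the header is shortened to the first two name/value pairs.

```cpp
#include <cstddef>
#include <string>

// Escape sequences are interpreted per literal before adjacent literals are
// joined, so "\0" followed by "27" stays three separate characters.
const char fixed_request[] =
    "70:CONTENT_LENGTH\0"
    "27\0"
    "SCGI\0"
    "1\0";

// sizeof counts the implicit terminator, so subtract one to get the payload.
std::string make_request() {
    return std::string(fixed_request, sizeof fixed_request - 1);
}

// Hypothetical helper: returns the field that starts at pos and ends at the
// next '\0' (or at the end of s), then moves pos past that terminator.
std::string next_field(const std::string& s, std::size_t& pos) {
    std::size_t end = s.find('\0', pos);
    if (end == std::string::npos) {
        end = s.size();
    }
    std::string field = s.substr(pos, end - pos);
    pos = (end < s.size()) ? end + 1 : end;
    return field;
}
```

With this literal the first two nulls land at 17 and 20, and starting next_field at index 3 (just past the "70:" prefix) should yield CONTENT_LENGTH, 27, SCGI and 1 in turn.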
M.M