I have a stringstream holding the content of an HTTP request. As you know, an HTTP request ends with a CRLF break, but operator>> doesn't treat CRLF the way it treats a normal end-of-file. How can I detect this CRLF break?

EDIT: All right, actually I'm using Boost.Iostreams, but I don't think that should make any difference.

char head[] = "GET / HTTP/1.1\r\nConnection: close\r\nUser-Agent: Wget/1.12 (linux-gnu)\r\nHost: www.baidu.com\r\n\r\n";
io::stream<My_InOut> in(head, sizeof head);
string s;
while (in >> s) {
    // Peek at the next character to check whether it is a normal break, i.e. whether 's' is a complete word.
    char c = in.peek();
    switch (c) {
    case -1:
        // is it eof or an incomplete word?
        break;
    case 0x20: // a complete word
        break;
    case 0x0d:
    case 0x0a: // CR or LF (\r\n) should indicate a complete word
        break;
    }
}

In this code, I assume the request may arrive split into parts because of how it is transmitted, so I want to tell whether '-1' stands for the actual end of the request or just for a word that was cut off, in which case I need to read more data to complete the request.

prehawk

1 Answer

First of all, peek returns an int, not a char (at least, std::istream::peek returns int--I don't know about boost). This distinction is important for recognizing -1 as the end of the file rather than a character with the value of 0xFF.

Also be aware that I/O streams in text mode will transform the platform's line separator into '\n' (which, in C and C++, usually has the same value as a line feed, but it might not). So if you're running this on Windows, where the native line separator is CR+LF, you'll never see the CR. But if you run the same code on a Linux box, where the native separator is simply LF, you will. (This translation happens in file-backed streams; an in-memory std::stringstream passes the bytes through unchanged.)

So given your question:

How can I detect this CRLF break?

The answer is to open the stream in binary mode and check for the character values 0x0D followed by 0x0A.

That said, it's not unheard of for HTTP code to overlook that the network protocol requires CR+LF. If you want to abide by the "be liberal in what you accept" maxim, just watch for either CR or LF and then skip the next character if it's the complement.

Adrian McCarthy