I'm having trouble parsing some HTTP headers in C++. Right now I'd like to find the carriage return/linefeed pair that ends each HTTP header entry. I'm doing this with std::string::find() like so:
string hdr; //filled with the header data
int line_end_pos = hdr.find("\r\n"); //also tried "\\r\\n", same results
Even though I know the header contains a carriage return followed by a linefeed, find() keeps returning -1. What am I missing here?
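For comparison, here is a minimal sketch of the lookup in isolation, assuming the std::string really holds the raw header bytes. The helper name first_crlf is hypothetical and not part of any library. Note that find() returns an unsigned std::string::size_type and signals "not found" with std::string::npos, which typically shows up as -1 once it is stored in an int.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from any library): returns the offset of the
// first CR LF pair in hdr, or std::string::npos if there is none.
std::size_t first_crlf(const std::string& hdr) {
    // "\r\n" is the two-byte sequence 0x0D 0x0A. By contrast, "\\r\\n" would
    // search for four literal characters: backslash, r, backslash, n.
    return hdr.find("\r\n");
}
```

Called on a string that begins with "GET /p/libcrafter/ HTTP/1.1\r\n", this should return 27, which matches offset 0x1B of the 0D byte in the hex dump further down. Comparing the result against std::string::npos, rather than against -1 in an int, avoids relying on the narrowing conversion.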
EDIT:
The library I'm using offers a couple of different functions for displaying the data. A sample of the header data looks like this in string format:
GET /p/libcrafter/ HTTP/1.1
Host: code.google.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en,en-us;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
Cookie: PREF=ID=ad8fd3ab4b0bd3c9:U=e1bd88556eeb2dce:FF=0:TM=1382531357:LM=1382531841:S=Pbh-JiokGeVbsSh-; NID=67=olK2k5sUZ95mRApV77s7CfXscytJSfmVuyubiSCMotOdBBvijqrTwyyifLQZbZA_SCTVQXqTEoE6hqaqVJkRpqoY2RPDFBPghbe5czX6QxKw7lBdOaP6-IpzGXYMWl6Q; OGPC=4061029-5:; __utma=247248150.2068354019.1382532826.1382532826.1382532826.1; __utmb=247248150.10.10.1382532826; __utmc=247248150; __utmz=247248150.1382532826.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
Connection: keep-alive
Cache-Control: max-age=0
It looks like this in "Hex Dump" format:
47455420 2F702F6C 69626372 61667465 GET /p/libcrafte 00000000
722F2048 5454502F 312E310D 0A486F73 r/ HTTP/1.1..Hos 00000010
743A2063 6F64652E 676F6F67 6C652E63 t: code.google.c 00000020
6F6D0D0A 55736572 2D416765 6E743A20 om..User-Agent: 00000030
4D6F7A69 6C6C612F 352E3020 28583131 Mozilla/5.0 (X11 00000040
3B205562 756E7475 3B204C69 6E757820 ; Ubuntu; Linux 00000050
7838365F 36343B20 72763A32 342E3029 x86_64; rv:24.0) 00000060
20476563 6B6F2F32 30313030 31303120 Gecko/20100101 00000070
46697265 666F782F 32342E30 0D0A4163 Firefox/24.0..Ac 00000080
63657074 3A207465 78742F68 746D6C2C cept: text/html, 00000090
6170706C 69636174 696F6E2F 7868746D application/xhtm 000000A0
6C2B786D 6C2C6170 706C6963 6174696F l+xml,applicatio 000000B0
6E2F786D 6C3B713D 302E392C 2A2F2A3B n/xml;q=0.9,*/*; 000000C0
713D302E 380D0A41 63636570 742D4C61 q=0.8..Accept-La 000000D0
6E677561 67653A20 656E2C65 6E2D7573 nguage: en,en-us 000000E0
3B713D30 2E350D0A 41636365 70742D45 ;q=0.5..Accept-E 000000F0
6E636F64 696E673A 20677A69 702C2064 ncoding: gzip, d 00000100
65666C61 74650D0A 444E543A 20310D0A eflate..DNT: 1.. 00000110
436F6F6B 69653A20 50524546 3D49443D Cookie: PREF=ID= 00000120
61643866 64336162 34623062 64336339 ad8fd3ab4b0bd3c9 00000130
3A553D65 31626438 38353536 65656232 :U=e1bd88556eeb2 00000140
6463653A 46463D30 3A544D3D 31333832 dce:FF=0:TM=1382 00000150
35333133 35373A4C 4D3D3133 38323533 531357:LM=138253 00000160
31383431 3A533D50 62682D4A 696F6B47 1841:S=Pbh-JiokG 00000170
65566273 53682D3B 204E4944 3D36373D eVbsSh-; NID=67= 00000180
6F6C4B32 6B357355 5A39356D 52417056 olK2k5sUZ95mRApV 00000190
37377337 43665873 6379744A 53666D56 77s7CfXscytJSfmV 000001A0
75797562 6953434D 6F744F64 42427669 uyubiSCMotOdBBvi 000001B0
6A717254 77797969 664C515A 625A415F jqrTwyyifLQZbZA_ 000001C0
53435456 51587154 456F4536 68716171 SCTVQXqTEoE6hqaq 000001D0
564A6B52 70716F59 32525044 46425067 VJkRpqoY2RPDFBPg 000001E0
68626535 637A5836 51784B77 376C4264 hbe5czX6QxKw7lBd 000001F0
4F615036 2D49707A 4758594D 576C3651 OaP6-IpzGXYMWl6Q 00000200
3B204F47 50433D34 30363130 32392D35 ; OGPC=4061029-5 00000210
3A3B205F 5F75746D 613D3234 37323438 :; __utma=247248 00000220
3135302E 32303638 33353430 31392E31 150.2068354019.1 00000230
33383235 33323832 362E3133 38323533 382532826.138253 00000240
32383236 2E313338 32353332 3832362E 2826.1382532826. 00000250
313B205F 5F75746D 623D3234 37323438 1; __utmb=247248 00000260
3135302E 31302E31 302E3133 38323533 150.10.10.138253 00000270
32383236 3B205F5F 75746D63 3D323437 2826; __utmc=247 00000280
32343831 35303B20 5F5F7574 6D7A3D32 248150; __utmz=2 00000290
34373234 38313530 2E313338 32353332 47248150.1382532 000002A0
3832362E 312E312E 75746D63 73723D28 826.1.1.utmcsr=( 000002B0
64697265 6374297C 75746D63 636E3D28 direct)|utmccn=( 000002C0
64697265 6374297C 75746D63 6D643D28 direct)|utmcmd=( 000002D0
6E6F6E65 290D0A43 6F6E6E65 6374696F none)..Connectio 000002E0
6E3A206B 6565702D 616C6976 650D0A43 n: keep-alive..C 000002F0
61636865 2D436F6E 74726F6C 3A206D61 ache-Control: ma 00000300
782D6167 653D300D 0A0D0A x-age=0.... 00000310
Finally, it looks like this as a "Raw String":
\x47\x45\x54\x20\x2f\x70\x2f\x6c\x69\x62\x63\x72\x61\x66\x74\x65\x72\x2f\x20\x48
\x54\x54\x50\x2f\x31\x2e\x31\xd\xa\x48\x6f\x73\x74\x3a\x20\x63\x6f\x64\x65\x2e\x67
\x6f\x6f\x67\x6c\x65\x2e\x63\x6f\x6d\xd\xa\x55\x73\x65\x72\x2d\x41\x67\x65\x6e\x74
\x3a\x20\x4d\x6f\x7a\x69\x6c\x6c\x61\x2f\x35\x2e\x30\x20\x28\x58\x31\x31\x3b\x20\x55
\x62\x75\x6e\x74\x75\x3b\x20\x4c\x69\x6e\x75\x78\x20\x78\x38\x36\x5f\x36\x34\x3b\x20
\x72\x76\x3a\x32\x34\x2e\x30\x29\x20\x47\x65\x63\x6b\x6f\x2f\x32\x30\x31\x30\x30\x31
\x30\x31\x20\x46\x69\x72\x65\x66\x6f\x78\x2f\x32\x34\x2e\x30\xd\xa\x41\x63\x63\x65\x70
\x74\x3a\x20\x74\x65\x78\x74\x2f\x68\x74\x6d\x6c\x2c\x61\x70\x70\x6c\x69\x63\x61\x74
\x69\x6f\x6e\x2f\x78\x68\x74\x6d\x6c\x2b\x78\x6d\x6c\x2c\x61\x70\x70\x6c\x69\x63\x61
\x74\x69\x6f\x6e\x2f\x78\x6d\x6c\x3b\x71\x3d\x30\x2e\x39\x2c\x2a\x2f\x2a\x3b\x71\x3d
\x30\x2e\x38\xd\xa\x41\x63\x63\x65\x70\x74\x2d\x4c\x61\x6e\x67\x75\x61\x67\x65\x3a\x20
\x65\x6e\x2c\x65\x6e\x2d\x75\x73\x3b\x71\x3d\x30\x2e\x35\xd\xa\x41\x63\x63\x65\x70\x74
\x2d\x45\x6e\x63\x6f\x64\x69\x6e\x67\x3a\x20\x67\x7a\x69\x70\x2c\x20\x64\x65\x66\x6c\x61
\x74\x65\xd\xa\x44\x4e\x54\x3a\x20\x31\xd\xa\x43\x6f\x6f\x6b\x69\x65\x3a\x20\x50\x52
\x45\x46\x3d\x49\x44\x3d\x61\x64\x38\x66\x64\x33\x61\x62\x34\x62\x30\x62\x64\x33\x63
\x39\x3a\x55\x3d\x65\x31\x62\x64\x38\x38\x35\x35\x36\x65\x65\x62\x32\x64\x63\x65\x3a
\x46\x46\x3d\x30\x3a\x54\x4d\x3d\x31\x33\x38\x32\x35\x33\x31\x33\x35\x37\x3a\x4c\x4d
\x3d\x31\x33\x38\x32\x35\x33\x31\x38\x34\x31\x3a\x53\x3d\x50\x62\x68\x2d\x4a\x69\x6f
\x6b\x47\x65\x56\x62\x73\x53\x68\x2d\x3b\x20\x4e\x49\x44\x3d\x36\x37\x3d\x6f\x6c\x4b
\x32\x6b\x35\x73\x55\x5a\x39\x35\x6d\x52\x41\x70\x56\x37\x37\x73\x37\x43\x66\x58\x73
\x63\x79\x74\x4a\x53\x66\x6d\x56\x75\x79\x75\x62\x69\x53\x43\x4d\x6f\x74\x4f\x64\x42
\x42\x76\x69\x6a\x71\x72\x54\x77\x79\x79\x69\x66\x4c\x51\x5a\x62\x5a\x41\x5f\x53\x43
\x54\x56\x51\x58\x71\x54\x45\x6f\x45\x36\x68\x71\x61\x71\x56\x4a\x6b\x52\x70\x71\x6f
\x59\x32\x52\x50\x44\x46\x42\x50\x67\x68\x62\x65\x35\x63\x7a\x58\x36\x51\x78\x4b\x77
\x37\x6c\x42\x64\x4f\x61\x50\x36\x2d\x49\x70\x7a\x47\x58\x59\x4d\x57\x6c\x36\x51\x3b
\x20\x4f\x47\x50\x43\x3d\x34\x30\x36\x31\x30\x32\x39\x2d\x35\x3a\x3b\x20\x5f\x5f\x75
\x74\x6d\x61\x3d\x32\x34\x37\x32\x34\x38\x31\x35\x30\x2e\x32\x30\x36\x38\x33\x35\x34
\x30\x31\x39\x2e\x31\x33\x38\x32\x35\x33\x32\x38\x32\x36\x2e\x31\x33\x38\x32\x35\x33
\x32\x38\x32\x36\x2e\x31\x33\x38\x32\x35\x33\x32\x38\x32\x36\x2e\x31\x3b\x20\x5f\x5f
\x75\x74\x6d\x62\x3d\x32\x34\x37\x32\x34\x38\x31\x35\x30\x2e\x31\x30\x2e\x31\x30\x2e
\x31\x33\x38\x32\x35\x33\x32\x38\x32\x36\x3b\x20\x5f\x5f\x75\x74\x6d\x63\x3d\x32\x34
\x37\x32\x34\x38\x31\x35\x30\x3b\x20\x5f\x5f\x75\x74\x6d\x7a\x3d\x32\x34\x37\x32\x34
\x38\x31\x35\x30\x2e\x31\x33\x38\x32\x35\x33\x32\x38\x32\x36\x2e\x31\x2e\x31\x2e\x75
\x74\x6d\x63\x73\x72\x3d\x28\x64\x69\x72\x65\x63\x74\x29\x7c\x75\x74\x6d\x63\x63\x6e
\x3d\x28\x64\x69\x72\x65\x63\x74\x29\x7c\x75\x74\x6d\x63\x6d\x64\x3d\x28\x6e\x6f\x6e
\x65\x29\xd\xa\x43\x6f\x6e\x6e\x65\x63\x74\x69\x6f\x6e\x3a\x20\x6b\x65\x65\x70\x2d\x61
\x6c\x69\x76\x65\xd\xa\x43\x61\x63\x68\x65\x2d\x43\x6f\x6e\x74\x72\x6f\x6c\x3a\x20\x6d
\x61\x78\x2d\x61\x67\x65\x3d\x30\xd\xa\xd\xa
As you can see, in the hex dump each line ends with the bytes 0D and 0A, and in the raw string format each line ends with \xd and \xa. My question remains, though: how can I find these end-of-line characters when working with the data as a string (or is that not possible)?
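To make the goal concrete, this is roughly what I would like to end up with: a sketch that walks the string and cuts it at every 0D 0A pair. It assumes the bytes shown in the dumps above are exactly what the std::string contains. The name split_header_lines is made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits hdr at every CR LF pair and returns the pieces,
// stopping at the empty line that terminates the header block.
std::vector<std::string> split_header_lines(const std::string& hdr) {
    std::vector<std::string> lines;
    std::size_t start = 0;
    while (true) {
        // Search for the next CR LF, beginning at the current line start.
        std::size_t end = hdr.find("\r\n", start);
        if (end == std::string::npos) break;  // no further CR LF in the data
        if (end == start) break;              // empty line: end of the headers
        lines.push_back(hdr.substr(start, end - start));
        start = end + 2;                      // skip past the CR and the LF
    }
    return lines;
}
```

For the sample above, the first element would be "GET /p/libcrafter/ HTTP/1.1" and the second "Host: code.google.com". Any trailing data that is not terminated by CR LF is deliberately dropped in this sketch.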