I'm using a socket to connect to various XML web services. But when I convert the received bytes to a string (usually UTF-8 encoded), I get extra strings interspersed in the data. Most of the time the returned string starts with something like "4000\r\n", and then "\r\n4000\r\n" is interspersed through the data. Other times the extra string can be "\r\nd1ef\r\n" or other combinations of 4-8 hex "letters". Sometimes the data arrives all at once, with no extra strings. Some things I noticed:
- If there is no "xxxx\r\n" in the beginning, the string is clean
- I always get the same result (same extra strings at the same locations) if I call the same URL multiple times
- The strings are usually 4 hex chars with "\r\n" around them, but they can also be 8 hex chars
- It happens with many different webservices, so it's probably not on the server side
- Since it always starts and ends with "\r\n" it cannot be random extra bytes of data
I'm guessing this is some kind of HTTP "paging"-feature or something that I am not aware of.
This is my code:
var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
client.ReceiveTimeout = timeout;
client.SendTimeout = timeout;
client.NoDelay = true;
client.Connect(server, port);
//send HTTP request
client.Send(totalData, totalData.Length, SocketFlags.None);
//read the data
var buffer = new byte[32];
byteStream = new MemoryStream();
while (true)
{
    var readCount = client.Receive(buffer, buffer.Length, SocketFlags.None);
    if (readCount > 0)
    {
        byteStream.Write(buffer, 0, readCount);
    }
    else
        break;
}
client.Disconnect(false);
client.Close();
//get the HTTP response
var bytes = byteStream.ToArray();
var ascii = Encoding.ASCII.GetString(bytes);
var bodyPosition = ascii.IndexOf("\r\n\r\n") + 4;
var bodyBytes = new byte[bytes.Length - bodyPosition];
Array.Copy(bytes,bodyPosition,bodyBytes,0,bodyBytes.Length);
var body = dataEncoding.GetString(bodyBytes);
Does anyone know what I'm doing wrong?
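The pattern described above (a hex number followed by "\r\n" at the start of the body, then "\r\n" + hex number + "\r\n" between blocks of data) looks consistent with HTTP/1.1 chunked transfer encoding, where each hex number is the byte length of the chunk that follows. That is an assumption; it could be confirmed by checking the response headers for "Transfer-Encoding: chunked". A minimal sketch of decoding such a body, where the class name `ChunkedBody` and its `Decode` method are hypothetical helpers and not part of the code above:

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class ChunkedBody
{
    // Decodes a body that was sent with "Transfer-Encoding: chunked".
    // Assumes the complete body is already in memory; trailer headers are ignored.
    public static byte[] Decode(byte[] body)
    {
        var output = new MemoryStream();
        var pos = 0;
        while (pos < body.Length)
        {
            // Each chunk starts with its size in hex, terminated by CRLF.
            var lineEnd = IndexOfCrLf(body, pos);
            if (lineEnd < 0)
                throw new InvalidDataException("Missing CRLF after chunk size.");

            var sizeLine = Encoding.ASCII.GetString(body, pos, lineEnd - pos);

            // Optional chunk extensions follow a ';' and are dropped here.
            var semicolon = sizeLine.IndexOf(';');
            if (semicolon >= 0)
                sizeLine = sizeLine.Substring(0, semicolon);

            var size = int.Parse(sizeLine.Trim(), NumberStyles.HexNumber,
                                 CultureInfo.InvariantCulture);
            if (size < 0)
                throw new InvalidDataException("Chunk size out of range.");

            pos = lineEnd + 2;      // move past the size line
            if (size == 0)
                break;              // a zero-length chunk ends the body

            if (pos + size > body.Length)
                throw new InvalidDataException("Truncated chunk.");

            output.Write(body, pos, size);
            pos += size + 2;        // skip the chunk data and its trailing CRLF
        }
        return output.ToArray();
    }

    // Returns the index of the next "\r\n" at or after start, or -1 if none.
    private static int IndexOfCrLf(byte[] data, int start)
    {
        for (var i = start; i + 1 < data.Length; i++)
        {
            if (data[i] == (byte)'\r' && data[i + 1] == (byte)'\n')
                return i;
        }
        return -1;
    }
}
```

If the headers do show chunked encoding, `bodyBytes` could be passed through `ChunkedBody.Decode(bodyBytes)` before calling `dataEncoding.GetString`. Because the decoding works on bytes, the hex lengths stay correct even when the payload is multi-byte UTF-8.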