Read from TcpClient.GetStream() without knowing the length

Question

I'm working on a TCP-based communication protocol. As far as I know, there are several ways to determine when to stop reading:

1. Closing the connection at the end of the message
2. Putting the length of the message before the data itself
3. Using a separator: some value which will never occur in the normal data (or would always be escaped somehow)

Typically I'm trying to send a file over a WiFi network (which may be unstable and slow).

1. Because of the RSA and AES key exchange, I don't want to close the connection each time (can't use 1).
2. It's a large file whose length I can't predict, so I can't prefix it with its length (can't use 2).
3. Checking for a special value when reading and escaping it when writing requires a lot of processing (can't use 3).
4. The method should be compatible with both C# and Java.

What do you suggest?

More general problems:

How to identify end of InputStream in java

C# - TcpClient - Detecting end of stream?

More information

I'm coding a TCP client-server communication.

First, the server generates an RSA public key and sends it to the client.

Then the client generates an AES key and IV and sends them back, encrypted with RSA.

Up to here everything works fine.

But I want to send a file over this connection. Here is my current packet: EncryptUsingAES(new AES.IV (16 bytes) + file.content (any size))

On the server I can't capture all the data sent by the client, so I need to know how much data to read with TcpClient.GetStream().Read(buffer, 0, bufferSize). Current code:

```
List<byte> message = new List<byte>();
int bytes = -1;
do
{
    byte[] buffer = new byte[bufferSize];
    bytes = stream.Read(buffer, 0, bufferSize);
    if (bytes > 0)
    {
        byte[] tmp = new byte[bytes];
        Array.Copy(buffer, tmp, bytes);
        message.AddRange(tmp);
    }
    // Stops as soon as one read returns fewer bytes than bufferSize,
    // which can happen before the whole message has arrived.
} while (bytes == bufferSize);
```

_"It's a large file that i cant predict the length of it so i cant act as method"_ - Don't send the entire file at once; read a few kB of it at a time and insert the length before you send each packet. You must be able to know at least some number of bytes, or else what you're trying would be pretty much impossible, length-prefixing or not. – Visual Vincent, Sep 05 '16 at 19:26
@VisualVincent Yes, you're right about reading in chunks, but I also use AES, so it's hard to determine the total length. AES adds padding depending on the mode and configuration, and a varying buffer size makes it more complicated. – M at, Sep 05 '16 at 19:29
Insert the length _after_ you've performed the AES encryption then, so you send the length of the encrypted message. Don't use any buffer other than the one for reading the entire message ("packet"). – Visual Vincent, Sep 05 '16 at 19:40
Get your structure to look something like this: `[Encrypted msg length][Encrypted data]`. – Visual Vincent, Sep 05 '16 at 19:45
Here is the tricky part, @VisualVincent: 1) I can't perform AES on a large file; it's time-consuming and needs a lot of memory or disk space. 2) I wanted to use a dynamic buffer size for performance reasons. – M at, Sep 05 '16 at 19:45
Yet again, **never** process the entire file at once. Read a few kB of it and encrypt those chunks only. As for the buffer, you just read until you've read `Message length` bytes, keeping track of how many bytes you've read so far. If fewer bytes are left than your buffer holds, request only the number of bytes left. – Visual Vincent, Sep 05 '16 at 19:48
If you like, we can talk in the chat room: http://chat.stackoverflow.com/rooms/122696/read-from-tcpclient-getstream-without-knowing-the-length – M at, Sep 05 '16 at 19:49

score 2 · Accepted Answer · edited May 23 '17 at 12:25

Your second method is the best one. Prefixing each packet with its length creates a reliable message framing protocol which, if done correctly, ensures that every message is received in exactly the size it was sent (that is, no partial messages and no messages lumped together).

Recommended packet structure:
```
[Data length (4 bytes)][Header (1 byte)][Data (?? bytes)]
```
- The header in question is a single byte you can use to indicate what kind of packet this is, so that the endpoint will know what to do with it.
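As a rough illustration of the packet structure above, here is a minimal sketch of writing and reading one frame. The class and method names (`FrameCodec`, `WriteFrame`, `ReadFrame`, `ReadFully`) are made up for this example. It also makes two assumptions the answer does not state: the length is sent big-endian (which matches what Java's `DataInputStream.readInt` expects), and it counts only the data bytes, not the header byte.

```csharp
using System;
using System.IO;

static class FrameCodec
{
    // Writes [length (4 bytes, big-endian)][header (1 byte)][data].
    // Assumption: the length counts only the data bytes, not the header byte.
    public static void WriteFrame(Stream stream, byte header, byte[] data)
    {
        int n = data.Length;
        byte[] prefix =
        {
            (byte)(n >> 24), (byte)(n >> 16), (byte)(n >> 8), (byte)n, header
        };
        stream.Write(prefix, 0, prefix.Length);
        stream.Write(data, 0, n);
    }

    // Reads exactly one frame and returns its data; the header byte is
    // returned through the out parameter.
    public static byte[] ReadFrame(Stream stream, out byte header)
    {
        byte[] prefix = ReadFully(stream, 5);
        int length = (prefix[0] << 24) | (prefix[1] << 16) | (prefix[2] << 8) | prefix[3];
        if (length < 0)
            throw new InvalidDataException("Negative frame length.");
        header = prefix[4];
        return ReadFully(stream, length);
    }

    // A single Read call may return fewer bytes than requested, so keep
    // reading until exactly 'count' bytes have arrived.
    static byte[] ReadFully(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed mid-frame.");
            offset += read;
        }
        return buffer;
    }
}
```

With a `TcpClient`, the stream passed in would be the one returned by `GetStream()`. In real code you would probably also reject lengths above some sensible maximum before allocating the buffer.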

Sending files

The sender of a file almost always knows how much data it is about to send (after all, it usually has the file stored locally), so keeping track of how much of the file has been sent is not a problem.

The method I use and recommend is that you start by sending an "info packet", which tells the endpoint that it is about to receive a file and how many bytes that file consists of. After that you start sending the actual data, preferably in chunks, since it's inefficient to process the entire file at once (at least if it's a large file).

- Always keep track of how many bytes of the file you've received so far. By doing so the receiver can automatically tell when it has received the whole file.
- Send a file a few kilobytes at a time (I use 8192 bytes = 8 kB as a file buffer). That way you don't have to read the entire file into memory nor encrypt it all at the same time.
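The info-packet approach could be sketched roughly as follows. Everything named here is an assumption for illustration: the class `FileSender`, the header values `0x01` (file info) and `0x02` (file chunk), and the choice to send the file length as 8 big-endian bytes. Only the 8192-byte chunk size comes from the answer.

```csharp
using System;
using System.IO;

static class FileSender
{
    const byte FileInfoHeader = 0x01;  // hypothetical header value
    const byte FileChunkHeader = 0x02; // hypothetical header value
    const int ChunkSize = 8192;        // file buffer size from the answer

    // Sends one info packet with the total file length, then the file
    // itself in length-prefixed chunks of at most ChunkSize bytes.
    public static void SendFile(Stream source, long fileLength, Stream network)
    {
        // Info packet: the file length as 8 big-endian bytes (assumption).
        byte[] info = new byte[8];
        for (int i = 0; i < 8; i++)
            info[i] = (byte)(fileLength >> (56 - 8 * i));
        WriteFrame(network, FileInfoHeader, info, info.Length);

        // Only one chunk is held in memory at a time.
        byte[] buffer = new byte[ChunkSize];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            WriteFrame(network, FileChunkHeader, buffer, read);
    }

    // Writes [length (4 bytes, big-endian)][header (1 byte)][data].
    static void WriteFrame(Stream stream, byte header, byte[] data, int count)
    {
        byte[] prefix =
        {
            (byte)(count >> 24), (byte)(count >> 16), (byte)(count >> 8), (byte)count, header
        };
        stream.Write(prefix, 0, prefix.Length);
        stream.Write(data, 0, count);
    }
}
```

A caller would typically pass a `FileStream` as `source`, its `Length` as `fileLength`, and the `NetworkStream` from `TcpClient.GetStream()` as `network`.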

Encrypting the data

Dealing with encryption will not be a problem. If you use length-prefixing, just encrypt the data itself and leave the data length header unencrypted. The length header must then be computed from the size of the encrypted data, like so:

1. Encrypt the data.
2. Get the length of the encrypted data.
3. Produce the following packet:
```
[Encrypted data length][Encrypted data]
```
(Insert a header byte in there if you need to)
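The three steps above might look like this in code. This is a sketch under several assumptions that are not in the answer: the class name `EncryptedFraming` is made up, each chunk gets a fresh IV that travels unencrypted in front of the ciphertext, the cipher is AES with the .NET defaults (CBC mode, PKCS7 padding), and the length prefix is 4 big-endian bytes.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class EncryptedFraming
{
    // Step 1: encrypt one chunk. Returns IV (16 bytes) followed by ciphertext.
    public static byte[] EncryptChunk(byte[] plain, int count, byte[] key)
    {
        using (Aes aes = Aes.Create()) // defaults: CBC mode, PKCS7 padding
        {
            aes.Key = key;
            aes.GenerateIV();          // fresh IV for every chunk (assumption)
            byte[] iv = aes.IV;
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] cipher = encryptor.TransformFinalBlock(plain, 0, count);
                byte[] packet = new byte[iv.Length + cipher.Length];
                Buffer.BlockCopy(iv, 0, packet, 0, iv.Length);
                Buffer.BlockCopy(cipher, 0, packet, iv.Length, cipher.Length);
                return packet;
            }
        }
    }

    // Steps 2 and 3: measure the ENCRYPTED data and send [length][encrypted data].
    public static void WriteEncryptedFrame(Stream network, byte[] plain, int count, byte[] key)
    {
        byte[] encrypted = EncryptChunk(plain, count, key);
        int n = encrypted.Length; // includes the IV and the AES padding
        byte[] prefix = { (byte)(n >> 24), (byte)(n >> 16), (byte)(n >> 8), (byte)n };
        network.Write(prefix, 0, prefix.Length);
        network.Write(encrypted, 0, n);
    }
}
```

Because the prefix is computed after encryption, the padding that AES adds never has to be predicted: the receiver is simply told how many encrypted bytes follow.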

Receiving an encrypted file

Receiving an encrypted file and knowing when everything has been received is in fact not very hard. Assuming you're using the method described above for sending the file, you would just have to:

1. Receive the encrypted packet → decrypt it.
2. Get the length of the decrypted data.
3. Increment a variable keeping track of the amount of file-bytes received.
4. If the received amount is equal to the expected amount: close the file.
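The receiving steps above can be sketched as a loop that counts decrypted bytes rather than encrypted ones. The names (`EncryptedFileReceiver`, `ReceiveFile`) are hypothetical, and the wire format is an assumption: each packet is `[length (4 bytes, big-endian)][IV (16 bytes)][AES-CBC/PKCS7 ciphertext]`, with the expected file length already known from an earlier info packet.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class EncryptedFileReceiver
{
    // Reads encrypted packets until 'expectedLength' decrypted bytes have
    // been written to 'output'. Returns the number of file bytes received.
    public static long ReceiveFile(Stream network, Stream output, long expectedLength, byte[] key)
    {
        long received = 0;
        while (received < expectedLength)
        {
            byte[] prefix = ReadFully(network, 4);
            int length = (prefix[0] << 24) | (prefix[1] << 16) | (prefix[2] << 8) | prefix[3];
            if (length < 32) // 16-byte IV plus at least one cipher block
                throw new InvalidDataException("Packet too short.");

            byte[] plain = Decrypt(ReadFully(network, length), key);
            output.Write(plain, 0, plain.Length);
            received += plain.Length; // count DECRYPTED bytes, not wire bytes
        }
        if (received > expectedLength)
            throw new InvalidDataException("Received more data than announced.");
        return received; // the caller can now close the file
    }

    static byte[] Decrypt(byte[] packet, byte[] key)
    {
        using (Aes aes = Aes.Create()) // defaults: CBC mode, PKCS7 padding
        {
            aes.Key = key;
            byte[] iv = new byte[16];
            Buffer.BlockCopy(packet, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
                return decryptor.TransformFinalBlock(packet, 16, packet.Length - 16);
        }
    }

    // Loops because one Read call may deliver only part of a packet.
    static byte[] ReadFully(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed mid-packet.");
            offset += read;
        }
        return buffer;
    }
}
```

Counting the decrypted bytes is what lets the receiver compare against the original file size, since the encrypted size differs because of the IV and padding.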

Additional resources/references

You can refer to two of my previous answers that I wrote about TCP length-prefixed message framing:

I never thought of keeping track of decrypted packets instead of encrypted packets. Thanks. – M at, Sep 05 '16 at 21:15

score 0 · Answer 2 · answered Sep 05 '16 at 07:13

The easiest way would be to use your #2. If you cannot predict message length, buffer up to a certain amount of bytes (like 1 KiB or something along those lines), and insert a length header for every one of those chunks instead of prefixing the whole message once.
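For the case where the total length is unknown, per-chunk length headers could be sketched like this. The names (`ChunkedMessage`, `Write`, `Read`) are hypothetical. The end-of-message rule is an assumption layered on top of the answer: a chunk shorter than the full 1 KiB marks the end, so a message whose size is an exact multiple of 1 KiB is closed with one extra zero-length chunk.

```csharp
using System;
using System.IO;

static class ChunkedMessage
{
    const int ChunkSize = 1024; // 1 KiB, as suggested in the answer

    // Sends 'source' as length-prefixed chunks without knowing its total size.
    public static void Write(Stream source, Stream network)
    {
        byte[] buffer = new byte[ChunkSize];
        int filled;
        do
        {
            filled = Fill(source, buffer);
            byte[] prefix = { (byte)(filled >> 24), (byte)(filled >> 16), (byte)(filled >> 8), (byte)filled };
            network.Write(prefix, 0, prefix.Length);
            network.Write(buffer, 0, filled);
        } while (filled == ChunkSize); // a short (or empty) chunk ends the message
    }

    // Reads chunks into 'output' until a chunk shorter than ChunkSize arrives.
    public static void Read(Stream network, Stream output)
    {
        int length;
        do
        {
            byte[] prefix = ReadFully(network, 4);
            length = (prefix[0] << 24) | (prefix[1] << 16) | (prefix[2] << 8) | prefix[3];
            if (length < 0 || length > ChunkSize)
                throw new InvalidDataException("Invalid chunk length.");
            byte[] chunk = ReadFully(network, length);
            output.Write(chunk, 0, length);
        } while (length == ChunkSize);
    }

    // Fills the buffer completely unless the source ends first, so that a
    // short chunk is only ever produced at the true end of the data.
    static int Fill(Stream source, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = source.Read(buffer, offset, buffer.Length - offset);
            if (read == 0) break;
            offset += read;
        }
        return offset;
    }

    // Loops because one Read call may return only part of what was requested.
    static byte[] ReadFully(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed mid-chunk.");
            offset += read;
        }
        return buffer;
    }
}
```

The connection stays open after the final chunk, so further messages can follow on the same stream.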

answered by jwueller

So how should I know when there are no more chunks? I guess it's better to use NetworkStream as in https://msdn.microsoft.com/en-us/library/system.net.sockets.networkstream.read(v=vs.110).aspx – M at Sep 05 '16 at 07:45
The last chunk is the only one with less than 1 KiB length. – jwueller Sep 05 '16 at 07:49
Is it possible that, due to low network speed, I receive some chunks split in half? For example, could a 2048-byte chunk arrive as a 2000-byte chunk and a 48-byte chunk? – M at Sep 05 '16 at 08:00
Your chunks are TCP payload which is guaranteed to be identical to the original (via checksums, reordering and such). – jwueller Sep 05 '16 at 08:08
Great thing to know. So could I also use a fixed-length chunk with fixed content at the end of the message? – M at Sep 05 '16 at 08:13
In my experience it doesn't matter how you write; the data is kept as bytes until someone reads it. E.g. write(1); write(2); write(3); read(); // => will be 123, not just 1 – M at Sep 05 '16 at 23:30
