I've worked with a few protocols, and written my own. I have written some message formats with only 1 char to identify the message, and some with 4 chars. I don't feel that I'm experienced enough to tell which is better, so I'm looking for an answer which describes in which scenario one might be better than the other.
For performance, you would imagine that sending 2 bytes (A%1i) is faster than sending 5 bytes (ABCD%1i). However, I have noticed that with a 1-byte prefix, a bug that causes your code to not read enough data from the socket can let garbage data into your system.
So is the purpose of a 4-byte prefix just to provide a guarantee that your message is clean? Is it worth the performance you sacrifice? Do you really sacrifice any performance at all? Maybe it's better to have a 2- or 3-byte prefix?
I'm not sure if this question should be specific to TCP, or whether it applies to all transport protocols. Advice on this would be interesting.
Update: For interest, I will mention that Synergy uses 4-byte message prefixes, so for a mouse move delta the header is the same size as the actual data. Some have suggested just having a 1 or 2 byte prefix to improve efficiency. I wonder what drawbacks this would have?
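As a rough way to put numbers on the efficiency question, the sketch below counts bytes per message under some loud assumptions: each message travels alone in its own TCP segment over IPv4, both headers are at their 20-byte minimum with no options, and link-layer framing is ignored. The 4-byte payload matches the mouse move case above, where the header is the same size as the data. The function name is made up for illustration.

```python
IPV4_HEADER = 20  # minimum IPv4 header size, no options (assumed)
TCP_HEADER = 20   # minimum TCP header size, no options (assumed)


def bytes_on_wire(tag_len, payload_len):
    """Bytes sent for one message that travels alone in one TCP segment."""
    return IPV4_HEADER + TCP_HEADER + tag_len + payload_len


PAYLOAD = 4  # mouse move delta: same size as the 4-byte prefix

baseline = bytes_on_wire(4, PAYLOAD)
for tag_len in (1, 2, 4):
    total = bytes_on_wire(tag_len, PAYLOAD)
    saving = 100 * (baseline - total) / baseline
    print(f"{tag_len}-byte prefix: {total} bytes, {saving:.2f}% less than a 4-byte prefix")
```

Under these assumptions the prefix length is a small share of what is actually sent. The picture changes if many messages are batched into one segment, because the fixed header cost is then shared and the prefix becomes a larger share of each message.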
Update: Also, I wonder if only the handshake really matters, if you're worried about garbage data. Synergy has a long handshake (a few bytes), so are the 4-byte message prefixes needed? I recently made a protocol that has only a 1-byte handshake, and that turned out to be a bad idea, since incompatible protocols were spamming the system with bad data (off the back of this, I would recommend at least having a long handshake).
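For reference, a longer handshake could look something like the sketch below: a fixed identifier plus a version, checked once when the connection opens. The magic value, the field layout and the function names are hypothetical and are not Synergy's actual handshake.

```python
import struct

MAGIC = b"EXAMPLE1"             # hypothetical 8-byte protocol identifier
HELLO = struct.Struct("!8sHH")  # magic, major version, minor version (network byte order)


def make_hello(major, minor):
    """Build the 12-byte greeting sent right after connecting."""
    return HELLO.pack(MAGIC, major, minor)


def check_hello(data, expected_major):
    """Validate a peer's greeting; raise ValueError if it is not ours."""
    if len(data) < HELLO.size:
        raise ValueError("greeting too short")
    magic, major, minor = HELLO.unpack(data[:HELLO.size])
    if magic != MAGIC:
        raise ValueError(f"not our protocol: {magic!r}")
    if major != expected_major:
        raise ValueError(f"incompatible major version {major}")
    return major, minor


print(check_hello(make_hello(1, 4), expected_major=1))

# A peer speaking some other protocol is turned away before any
# message parsing starts.
try:
    check_hello(b"GET / HTTP/1.1\r\n", expected_major=1)
except ValueError as err:
    print("rejected:", err)
```

This only filters out peers that speak a different protocol at connect time. It does nothing about a reader that loses alignment mid-stream, which is the case the per-message prefix addresses.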