The title says it all. Basically, I'm using TCP for a client-server setup and I'm wondering whether there is any advantage to converting strings to binary before sending the data over TCP.
-
strings ARE binary. they're just generally considered human-readable. – Marc B Mar 12 '13 at 20:10
-
So it doesn't matter. I guess transforming strings to binary is a waste of processing power then right? – pandoragami Mar 12 '13 at 20:11
-
depends. if your string is textual representations of hex values, e.g. `$txt = '0x12 0x23 0x34'` and so on, you could save a few bytes by converting to the raw numeric values, sending 3 bytes instead of 14. – Marc B Mar 12 '13 at 20:12
-
Ok, so it's the space in terms of bytes that's different. Why didn't you post this as an answer? – pandoragami Mar 12 '13 at 20:13
-
there's no right/wrong answer. e.g. what if you're writing a webserver and sending html? there's no point in "binary-izing", because html is supposed to be plain text anyways. it depends on what you're sending, what the receiving end is expecting, and if the cost of conversion outweighs the bandwidth savings. – Marc B Mar 12 '13 at 20:14
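
The byte-count point from the comments above can be checked with a short C# sketch. The class and method names (`HexTextDemo`, `ParseHexTokens`, `TextByteCount`) are made up for illustration, and it assumes the text is plain ASCII with tokens separated by single spaces:

```csharp
using System;
using System.Linq;
using System.Text;

static class HexTextDemo
{
    // Parses space-separated tokens such as "0x12" into raw bytes.
    // Convert.ToByte with base 16 accepts an optional "0x" prefix.
    public static byte[] ParseHexTokens(string text)
    {
        return text
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => Convert.ToByte(token, 16))
            .ToArray();
    }

    // Number of bytes the text itself would occupy if sent as ASCII.
    public static int TextByteCount(string text)
    {
        return Encoding.ASCII.GetByteCount(text);
    }
}
```

For `"0x12 0x23 0x34"`, `TextByteCount` gives 14 and `ParseHexTokens` gives a 3-byte array, which is the saving described above. Whether the conversion is worth it still depends on what the receiver expects.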
3 Answers
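Strings are already binary data, or can at least be converted easily to a `byte[]`, for example with:

    // Copies the string's in-memory UTF-16 code units into a byte array
    // (two bytes per char, in the platform's byte order).
    static byte[] GetStringBytes(string str)
    {
        byte[] bytes = new byte[str.Length * sizeof(char)];
        System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
    }
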
If you compress or encode the data before sending it, then whether it started out as a string or as binary data, you will most likely end up sending about the same total number of bytes.
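
As a possible alternative to `GetStringBytes`, here is a minimal sketch that uses an explicit encoding, so both ends of the connection agree on the byte layout. `StringBytesDemo` and its method names are hypothetical; `Encoding.UTF8` and `Encoding.Unicode` (UTF-16 little-endian) are the standard .NET encodings:

```csharp
using System;
using System.Text;

static class StringBytesDemo
{
    // UTF-8: one byte per ASCII character, more for other characters.
    public static byte[] ToUtf8(string s)
    {
        return Encoding.UTF8.GetBytes(s);
    }

    // Decodes bytes produced by ToUtf8 back into a string.
    public static string FromUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // UTF-16 little-endian: two bytes per char, the same size GetStringBytes produces.
    public static byte[] ToUtf16(string s)
    {
        return Encoding.Unicode.GetBytes(s);
    }
}
```

For the string `"hello"`, `ToUtf8` returns 5 bytes and `ToUtf16` returns 10, so the choice of encoding affects the size on the wire more than any "string versus binary" distinction does.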

No real advantage in the vast majority of cases. Also, binary data tends to be more platform-dependent, so if you want to extend your client/server to a multi-platform environment, you're probably better off sticking with strings.

Some data is natively encoded in binary, such as DER.
If your application needs to send this kind of data, for example a CA certificate, you can send the binary directly. If you send it as a string instead, such as base64, it takes more bytes to represent.
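
To get a feel for the size difference, here is a small sketch. `Base64SizeDemo` and `Base64Length` are illustrative names, and the payload is just a placeholder buffer rather than a real DER certificate:

```csharp
using System;

static class Base64SizeDemo
{
    // Length in characters of the base64 text form of a binary payload.
    // Base64 emits 4 characters for every 3 input bytes (padded up).
    public static int Base64Length(byte[] payload)
    {
        return Convert.ToBase64String(payload).Length;
    }
}
```

A 300-byte payload comes out as 400 base64 characters, so the text form is roughly a third larger than the raw bytes.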
At lower levels, such as TCP itself, the information exchanged between two endpoints (the handshake, for example) is binary, for reasons of size, performance and fault tolerance.
But as an application developer who needs to iterate quickly, you have more to consider than size, such as compatibility and maintenance. Say you need to return some metadata along with the binary data; you might use JSON to represent it:

    {
        "version": 1,
        "data": "base64 encoded data"
    }

This makes things easier for clients, because there are plenty of tools for parsing JSON, and debugging is easier when you can read the data without special tools (it is human-readable). On the other hand, with a binary encoding (like DER) you have to update your protocol documentation so that clients can adapt their code to the new format, and you also need to think about backward and forward compatibility.
That said, there are tools that make binary encodings more compatible and maintainable, such as Apache Thrift, Protocol Buffers and Apache Avro.
Looking more closely at performance: unless you are simply comparing having an encode/decode step against not having one (where the answer is obvious), you need to run load tests to find out.
For example, some JSON parsers are extremely fast and can easily beat binary parsers that claim better performance than others.
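
A JSON envelope like the one above could be produced and read back roughly as follows. This is only a sketch using `System.Text.Json`; `EnvelopeDemo`, `Wrap` and `Unwrap` are hypothetical names, and the `version`/`data` fields simply mirror the example:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

static class EnvelopeDemo
{
    // Wraps a binary payload in a JSON envelope, base64-encoding the bytes.
    public static string Wrap(int version, byte[] payload)
    {
        var envelope = new Dictionary<string, object>
        {
            ["version"] = version,
            ["data"] = Convert.ToBase64String(payload),
        };
        return JsonSerializer.Serialize(envelope);
    }

    // Reads the version and decodes the base64 payload back into bytes.
    public static (int Version, byte[] Data) Unwrap(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;
        return (root.GetProperty("version").GetInt32(),
                root.GetProperty("data").GetBytesFromBase64());
    }
}
```

The client can inspect the envelope with any JSON tool, and only the `data` field needs decoding. The price is the base64 overhead on the payload.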
