
What I am aiming to do is send JSON containing a header object and compressed data in a byte-array field.

[JsonObject(MemberSerialization.OptOut)]
public class Message
{
    public Message()
    {
        Header = new Header();
    }

    public Header Header { get; set; }

    public byte[] Data { get; set; }
}

The byte array is a gzip-compressed JSON object, but that is not especially relevant. The issue I am having is that when I serialize the message, the byte array gets converted into a string representation and then back to bytes, which increases the message size quite a bit.

I am constrained by a maximum message size, and I already have splitting of the compressed data in place; but when I send the JSON containing the compressed data in a byte array together with the uncompressed header, serializing the JSON object puts me way over the message size limit.

Is there any reliable way of converting a JSON object straight to a byte array?

// Full message (header + compressed data)
var stringMessage = JsonConvert.SerializeObject(message, Formatting.None);
var bytes = Encoding.UTF8.GetBytes(stringMessage);

// Header only
var stringMessage2 = JsonConvert.SerializeObject(message.TransportHeader, Formatting.None);
var bytes2 = Encoding.UTF8.GetBytes(stringMessage2);

// Compressed data only
Message eventMessage = new Message(bytes);
var bytes3 = Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(message.Transportdata));

Compressed data size = 243905 bytes

Full JSON serialized to bytes = 325313

Header alone = 90 bytes

Compressed data alone, serialized and converted back to bytes = 325210 (the size increases when `JsonConvert.SerializeObject` produces the string representation; converting the string back to bytes does not shrink it)

The size clearly goes up quite a bit, and it is caused by the byte array.
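The numbers are consistent with base64: Json.NET writes a `byte[]` as a quoted base64 string, and 4 × ⌈243905 / 3⌉ + 2 quote characters = 325210 bytes, exactly the size reported above. A minimal sketch of the effect (class and field names are illustrative, not from the real message types):

```csharp
using System;
using Newtonsoft.Json;

class Base64GrowthDemo
{
    static void Main()
    {
        var data = new byte[243905];

        // JSON has no binary type, so Json.NET writes a byte[] as a
        // quoted base64 string: 4 output chars per 3 input bytes.
        string json = JsonConvert.SerializeObject(data);
        Console.WriteLine(json.Length); // 325210 = 4 * ceil(243905 / 3) + 2 quotes
    }
}
```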

  • If you use the answer by @ygaradon, pass in a `MemoryStream` and then use `ToArray()` to get the `byte[]` – Camilo Terevinto Oct 04 '18 at 15:50
  • It's not a duplicate. His problem is that his serialized size is unexpectedly high. – usr Oct 04 '18 at 16:10
  • Have you considered sending your data over the wire using a multipart/mixed content type instead of straight JSON? Put your JSON in one part and the binary data in another part. – Brian Rogers Oct 05 '18 at 02:17
  • Well, the JSON part is there so the binary data can be identified and pieced together; without a JSON header it would be impossible to recombine the split, compressed data. And I can't send the two separately, as there would be no way of knowing which header belongs to which package. – Aistis Taraskevicius Oct 05 '18 at 08:17

2 Answers


I found a way to do what I wanted. It is not precisely JSON, but BSON, also known as Binary JSON. Since finding the solution was pure luck, I am not sure how well known BSON is.

Anyway, Newtonsoft supports it via the Newtonsoft.Json.Bson NuGet package at https://www.nuget.org/packages/Newtonsoft.Json.Bson/1.0.1

Some code for serialization/deserialization:

foreach (var message in transportMessageList)
{
    using (var ms = new MemoryStream())
    {
        using (var writer = new BsonDataWriter(ms))
        {
            var serializer = new JsonSerializer();
            serializer.Serialize(writer, message);
        }

        var bsonByteArray = ms.ToArray();
        Assert.True(bsonByteArray.Length != 0);
        bsonList.Add(bsonByteArray);
    }
}

var deserializedTransportMessageList = new List<TransportMessage>();
foreach (var byteArray in bsonList)
{
    TransportMessage message;
    using (var ms = new MemoryStream(byteArray))
    using (var reader = new BsonDataReader(ms))
    {
        var serializer = new JsonSerializer();
        message = serializer.Deserialize<TransportMessage>(reader);
    }
    Assert.True(message.Transportdata.Length != 0);
    deserializedTransportMessageList.Add(message);
}

You can use the same classes/objects you use for JSON, and serializing a compressed byte array no longer causes an increase in size, because BSON stores binary data natively instead of encoding it as base64.

Please note that the BSON documentation on the Newtonsoft website is outdated and currently lists only deprecated API calls. The code above uses the proper, non-deprecated API.
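As a sanity check, here is a sketch comparing the two serializations for a payload with a binary field (the `Message` class here is illustrative; requires the Newtonsoft.Json and Newtonsoft.Json.Bson packages):

```csharp
using System;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Bson;

class SizeComparison
{
    class Message
    {
        public byte[] Data { get; set; }
    }

    static void Main()
    {
        var message = new Message { Data = new byte[243905] };

        // JSON: the byte[] becomes a base64 string, ~4/3 of the raw size.
        var jsonBytes = System.Text.Encoding.UTF8.GetBytes(
            JsonConvert.SerializeObject(message));

        // BSON: the byte[] is written as a native binary element,
        // so the payload stays close to the raw size.
        byte[] bsonBytes;
        using (var ms = new MemoryStream())
        {
            using (var writer = new BsonDataWriter(ms))
            {
                new JsonSerializer().Serialize(writer, message);
            }
            bsonBytes = ms.ToArray();
        }

        Console.WriteLine(jsonBytes.Length); // ~325 KB
        Console.WriteLine(bsonBytes.Length); // ~244 KB
    }
}
```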


JSON is a character-based format, so there is necessarily character data involved. I suspect you used UTF-16 encoding, which turns each char into two bytes. If you use UTF-8 you will not experience any meaningful size overhead.
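A quick sketch of the difference between the two encodings for ASCII-only JSON (in .NET, `Encoding.Unicode` is UTF-16):

```csharp
using System;
using System.Text;

class EncodingDemo
{
    static void Main()
    {
        string json = "{\"Header\":{},\"Data\":\"AAAA\"}"; // 27 characters

        // UTF-8 stores each ASCII character in a single byte...
        Console.WriteLine(Encoding.UTF8.GetBytes(json).Length);    // 27

        // ...while UTF-16 uses two bytes per char, doubling the size.
        Console.WriteLine(Encoding.Unicode.GetBytes(json).Length); // 54
    }
}
```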

  • I am using UTF8 to get bytes after serializing. – Aistis Taraskevicius Oct 04 '18 at 15:43
  • Then please post your code and what kinds of size expansion you are experiencing. – usr Oct 04 '18 at 15:47
  • Updated original post with serialization and size increases. – Aistis Taraskevicius Oct 04 '18 at 15:54
  • @AistisTaraskevicius thanks. How many characters are there? It is not clear why you think that the conversion to bytes is increasing the size. Relative to what do you notice an increase? – usr Oct 04 '18 at 16:11
  • If you look at the bottom of the post, I pointed out that the compressed data has a size of 243905 bytes; when the same data is serialized and converted back to bytes (so I can send it to Azure), the size goes up to 325210. It goes up when serialization takes place: the string representation is that long, and converting it back to bytes does nothing to help with the size. – Aistis Taraskevicius Oct 04 '18 at 16:27
  • @AistisTaraskevicius I see... What does `new Message(bytes)` look like? Maybe the bytes end up serialized as base64 which does blow up the size. You are double-serializing as JSON. Are you compressing the data before serializing again or after? You should compress after. Then, the compression should make the base64 size increase almost undone. – usr Oct 04 '18 at 16:34
  • The size increase happens before `new Message(bytes)`, when the data becomes a byte array, or rather gets serialized. I am compressing before, but that's the requirement: an uncompressed header and compressed data, which sounded fine at the time but proves to be tricky now. Surely there must be a way to convert JSON to a byte array instead of serializing to a string. – Aistis Taraskevicius Oct 04 '18 at 19:53
  • The problem is not any string/bytes issue. The problem is the way JSON.NET converts `byte[] Data` to JSON text. Look at the JSON, it's probably base64. JSON does not support binary data at all. Apply compression to `bytes3` and the size will shrink back. @AistisTaraskevicius – usr Oct 04 '18 at 20:12
  • bytes3 is just a test; what I need is an uncompressed header and compressed data in JSON. I can't compress bytes3 again because I still have to put it into the main JSON containing the header, where I get the same issue as I have now. – Aistis Taraskevicius Oct 04 '18 at 21:05
  • @AistisTaraskevicius if you want to transmit binary data compactly I think you're simply out of luck with JSON. If you can change the format then invent a small custom binary format or use protocol buffers. – usr Oct 05 '18 at 08:41