4

I am starting a project that needs to serialize data from a .NET application (a C# app) and pass it through a network to a Java-based server application. I would therefore like to know which serialization mechanism is the most efficient, given that the serialized object must be deserializable by the Java application.

Questions:

I have heard that protobuf is much faster than other serialization formats such as XML. Is it possible to use protobuf to meet the requirement above?

In Java there is a newly developed serialization framework named "Kryo", which is reportedly even more efficient than protobuf. Is there anything similar in the .NET environment that is also language independent?

Marc Gravell
  • Possible Duplicate: http://stackoverflow.com/questions/20049887/best-practice-sending-objects-over-tcp-ip-between-platforms – El Mac Jan 28 '14 at 13:22

4 Answers

5

Yes, protobuf is language independent. The Java version is provided by Google, and there are multiple C# implementations (I would recommend protobuf-csharp-port if you want very similar code at both ends, and protobuf-net if you prefer the .NET code to look like idiomatic .NET).

Re Kryo - I genuinely don't know enough to comment, but the only way to answer the "is it more efficient" question is to test it (also: define what efficiency means to you: is that serialization size? CPU time? resource usage? or...?). Personally, I'd be a little skeptical that it is going to be smaller, but: there's a sure fire way to find out: you try it.

I do not know whether Kryo is language agnostic, but I can only see Java mentioned.
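To illustrate why a documented wire format is what makes this language independence possible, here is a minimal Java sketch that hand-encodes a single integer field the way the protobuf encoding documentation describes it (a varint field header followed by a varint value). The class and method names are hypothetical and exist only for illustration; real code would use a generated class from protobuf-java, or one of the C# libraries mentioned above, rather than writing bytes by hand.

```java
import java.io.ByteArrayOutputStream;

final class ProtoWireSketch {
    // Base-128 varint: 7 payload bits per byte, least significant group first;
    // the high bit is set on every byte except the last one.
    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Field header: (fieldNumber << 3) | wireType, where wire type 0 means "varint".
    static byte[] encodeVarintField(int fieldNumber, long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, ((long) fieldNumber << 3) | 0);
        writeVarint(out, value);
        return out.toByteArray();
    }

    // Helper for printing the encoded bytes as space-separated hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Field number 1, value 150.
        System.out.println(hex(encodeVarintField(1, 150)));
    }
}
```

Run as-is, this prints `08 96 01` for field number 1 with the value 150. Any C# implementation that follows the same format documentation should produce the same three bytes, which is the property that matters when the two ends are written in different languages.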

Marc Gravell
  • +1. I'd add two things: don't forget compression in the tests (protobuf will still beat compressed XML, for example, but will it do so by enough of a factor to be worth it to someone already familiar with XML in the case in hand?). Also, language independence comes from format documentation first and libraries second; one can only future-proof by pointing to format documentation that allows building a library for an as-yet unconsidered language. – Jon Hanna Jan 28 '14 at 13:33
  • @JonHanna indeed, the main thing that occurred to me when I looked at the Kryo pages is "and... what is it actually going to produce? what is the raw format?" – Marc Gravell Jan 28 '14 at 13:34
  • Same thought process that led to my comment. – Jon Hanna Jan 28 '14 at 13:48
  • @MarcGravell, in my case efficiency refers to the size of the serialized output and the time to produce it, as well as the time to deserialize the object. I am concerned with size because I have to serialize a huge set of data (in the range of GB) and immediately pass it over the network to the server application, which is written in Java. – UserOverFlow Jan 28 '14 at 14:34
  • @UserOverFlow that's fair enough - however, I still think you'll struggle to beat protobuf as a general purpose format. You *can* get smaller, but only if you write every part of serialization yourself, and don't worry about things like version tolerance - for example, you could omit the field-headers if you **always** send all the expected fields. But that takes a lot of work. – Marc Gravell Jan 28 '14 at 14:47
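As a rough sketch of the kind of size test these comments describe, the following Java snippet measures one payload both raw and gzip-compressed using only the standard library. The sample XML records and all names here are made up for illustration; a real comparison would feed in your own data serialized by each candidate format, and would time the runs as well.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

final class SizeComparisonSketch {
    // Compresses a payload in memory so its size can be compared with the raw size.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Hypothetical sample payload: a list of simple records rendered as XML text.
    static byte[] sampleXml(int records) {
        StringBuilder sb = new StringBuilder("<readings>");
        for (int i = 0; i < records; i++) {
            sb.append("<reading><id>").append(i)
              .append("</id><value>").append(i * 2)
              .append("</value></reading>");
        }
        return sb.append("</readings>").toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] xml = sampleXml(1000);
        System.out.println("raw XML bytes:     " + xml.length);
        System.out.println("gzipped XML bytes: " + gzip(xml).length);
    }
}
```

The same `gzip` helper can be applied to the output of any serializer under test, so each format is compared both with and without compression.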
2

Hessian is a highly efficient, binary but language independent serialization protocol.

Implementations are available for Java, C++, C#, Objective-C, PHP, Ruby, JavaScript, etc.

A comparison of the performance of various remoting protocols can be found here:
Java Remoting: Protocol Benchmarks

Bob Bryan
Peter Walser
1

Hmm..

It depends on the type of data you want to share between applications, of course. Here's a brief overview of what I find to be the pros and cons. Could you explain what type of data structures you want to share?

I'd advise using either XML or JSON, to allow flexibility. Binary serialisation options will be harder to live with in the long run, because:

  • binary data can end up unreadable or unrecoverable if something goes wrong
  • support for the format could go away, and writing your own implementation to read the data back will be harder
  • XML and JSON have a clearer syntax, for which you can easily write your own wrapper if, for whatever reason, all tooling were to disappear, because they are human readable

JSON is an option

  1. human readable/editable
  2. can be parsed without knowing schema in advance
  3. excellent browser support
  4. less verbose than XML, but lacks "structure checking with schemas"

XML as well

  1. human readable/editable
  2. can be parsed without knowing schema in advance
  3. standard for SOAP etc
  4. good tooling support (xsd, xslt, sax, dom, etc)
  5. pretty verbose vs JSON

And then,

Protobuf

  1. very dense data (small output)
  2. hard to robustly decode without knowing the schema (data format is internally ambiguous, and needs schema to clarify)
  3. very fast processing
  4. not intended for human eyes (dense binary)
Alberto
Yves Schelpe
0

Language independence of communication is achieved by the contract-first approach. Create a clear and simple specification of the interchange format and then find the best tool at either end to help you adhere to it.

There are two basic choices for the wire format: XML and JSON.

XML has the advantage of a widely understood Schema specification, which then allows tools to generate binding code.

JSON has the advantage of being a simpler format to "hack together" in any language.

Any statement about the speed of a format is tightly bound to the existing implementations on a specific platform. There is no language-independent speed rating of a protocol.
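As a small illustration of the contract-first idea, the Java sketch below reads a message against a hypothetical agreed layout, `<reading id="INT"><value>DECIMAL</value></reading>`, using the DOM parser from the standard library. The message shape, the `Reading` class, and the method names are assumptions made for this example; in practice the contract would be written down first (for instance as an XML Schema) and the code at both ends derived from it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

final class ContractFirstSketch {
    // Hypothetical contract: <reading id="INT"><value>DECIMAL</value></reading>
    static final class Reading {
        final int id;
        final double value;

        Reading(int id, double value) {
            this.id = id;
            this.value = value;
        }
    }

    // Parses one message and rejects anything that does not match the contract.
    static Reading parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            if (!"reading".equals(root.getTagName())) {
                throw new IllegalArgumentException("unexpected root element: " + root.getTagName());
            }
            Node valueNode = root.getElementsByTagName("value").item(0);
            if (valueNode == null) {
                throw new IllegalArgumentException("missing <value> element");
            }
            int id = Integer.parseInt(root.getAttribute("id"));
            double value = Double.parseDouble(valueNode.getTextContent().trim());
            return new Reading(id, value);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("message does not match the contract", e);
        }
    }

    public static void main(String[] args) {
        Reading r = parse("<reading id=\"7\"><value>1.5</value></reading>");
        System.out.println(r.id + " -> " + r.value);
    }
}
```

The C# side would only need to emit text that matches the same agreed layout; neither end has to know which library the other one uses.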

Marko Topolnik