Serialization - Viewing the Object Graph from a Stream

Question

I'm wondering if there's a way to create a tree/view of a serialised object graph, and whether anyone has any pointers?

EDIT: The aim is that, should we encounter a deserialization problem, we can view or produce a report on the serialized data to help identify the cause before having to debug the code. In the future I also want to extend this to take two streams (version 1, version 2) and highlight the differences between them, to help ensure that we don't accidentally remove interesting information during code changes.

Traditionally we've used SOAP or XML serialization, but these are becoming too restrictive for our needs, and binary serialization would generally do all that we need. The reason it hasn't been adopted is that it's much harder to view the serialized contents to help fix upgrade issues etc.

So I've started looking into creating a view of the serialized information. I can do this from an ISerializable constructor to a certain extent:

public A(SerializationInfo info, StreamingContext context)
{}

Given the SerializationInfo, I can reflect on the m_data member and see the actual serialized contents. The problems with this approach are:

It only displays one branch of the tree. I want to display the entire tree from the root, and that isn't really possible from this position.
It's not a convenient place to interrogate the information. I'd like to pass a stream to a class and do the work there.
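The per-object inspection described above could be sketched roughly as follows. This is only an illustration: instead of reflecting on the private m_data field, it walks the public enumerator of SerializationInfo, and the SerializationInfoDump class and its Describe method are hypothetical names. It has the same limitation as noted above, since it only sees the entries of the one object being deserialized.

```csharp
using System;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical helper: lists the entries held by a SerializationInfo.
public static class SerializationInfoDump
{
    // Returns one line per entry in the form "name (TypeName) = value".
    public static string Describe(SerializationInfo info)
    {
        var sb = new StringBuilder();
        // SerializationInfo exposes GetEnumerator(), so foreach yields SerializationEntry items.
        foreach (SerializationEntry entry in info)
        {
            sb.Append(entry.Name)
              .Append(" (").Append(entry.ObjectType.Name).Append(") = ")
              .Append(entry.Value)
              .Append('\n');
        }
        return sb.ToString();
    }
}
```

Calling SerializationInfoDump.Describe(info) from inside the A(SerializationInfo info, StreamingContext context) constructor would give a text report of that object's serialized members.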

I've seen the ObjectManager class, but this works on an existing object graph, whereas I need to be able to work from the stream of data. I've looked through the BinaryFormatter, which uses an ObjectReader and a __BinaryParser and hooks into the ObjectManager (which I think will then have the entire contents, just maybe in a flat list). Replicating this, or invoking it all via reflection (2 of those 3 classes are internal), seems like quite a lot of work, so I'm wondering if there's a better approach.

Why is XML serialization 'becoming too restricted for our needs'? For binary components, e.g. images, you can use base64 encoding, which is relatively efficient. If you have huge objects to serialize, it is worth going binary, but then create your own editor to read it. — zmilojko, Nov 10 '11 at 11:37
XML serialization is proving to be too slow due to the quantities of information serialized, and we've had a number of encoding issues in the past. The other half of our product is ISerializable. We'd like to move to a common format, and XML has just given us too many headaches thus far — Ian, Nov 10 '11 at 11:44
can you specify the question more precisely, what are you looking for? A better format or a tool to read the binary serialized objects? Because obvious answer to your question is: deserialize it and show it in your app — zmilojko, Nov 10 '11 at 11:48
have you looked at protobuf? protobuf-net to be specific. It's blazingly fast and allows higher level analyses. — Polity, Nov 10 '11 at 11:54
@zmilojko: I've updated my question. It's more about when we can't deserialize the stream for some reason and we want to find out why. — Ian, Nov 10 '11 at 12:58
@Polity: I've looked at protobuf, but we've tried implementing custom serializers ourselves and they've been a headache. Reading about protobuf previously it looked like there were a few things it didn't support, and I didn't like the integer attribute convention, I believe it would make it very difficult to correctly handle upgrades etc in a larger codebase. For a smaller, less dynamic object graph I think it'd be great. — Ian, Nov 10 '11 at 13:00
I am baffled about 'when we cannot deserialize'. At least in my experience, all standard serializers work the same way. So make it work with XML (even if it is slow) and it will work with the binary one. If you are implementing your own, prepare to suffer! But the answer to your question would be: create a binary parser. And that is exactly the reason I like .NET: I do not have to do this myself, .NET already has it implemented. — zmilojko, Nov 10 '11 at 14:19
@zmilojko: I believe you're confusing XML with SOAP. SOAP has been deprecated, and XML doesn't handle a lot of simple cases (Dictionaries, Generics etc). Hence binary is the simplest choice. Errors in serialization tend to be due to errors in code, or arise when ISerializable hasn't been implemented and we want to write it explicitly. Being able to see the contents of the object graph and the member names from an example stream makes it much easier to write. — Ian, Nov 10 '11 at 14:56

score 1 · Answer 1 · edited Jan 04 '12 at 10:57

You could put a List<Child> in every parent class (even if the parent and child classes are the same),

and when you create a child you immediately place it in that list or, better yet, create it while adding it to the list.

For instance

ListName.Add(new Child(/* constructor args */));

Using this you would serialize them as one file which contains the hierarchy of the objects and the objects themselves.

If the parent and child classes are the same, there is no reason why you cannot have a dynamic, multi-level hierarchy.
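A minimal sketch of that idea, assuming a single hypothetical TreeItem class that acts as both parent and child (the class and member names are illustrative only):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical self-referencing type: every item keeps its own list of children,
// so serializing the root serializes the whole hierarchy with it.
[Serializable]
public class TreeItem
{
    public string Name;
    public List<TreeItem> Children = new List<TreeItem>();

    public TreeItem(string name) { Name = name; }

    // Adds a child and returns this item so calls can be chained.
    public TreeItem Add(TreeItem child)
    {
        Children.Add(child);
        return this;
    }

    // Counts all items below this one, at any depth.
    public int CountDescendants()
    {
        int count = 0;
        foreach (var child in Children)
        {
            count += 1 + child.CountDescendants();
        }
        return count;
    }
}
```

For example, new TreeItem("root").Add(new TreeItem("a").Add(new TreeItem("a1"))).Add(new TreeItem("b")) builds a two-level hierarchy in which the children are created as they are added.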

score 0 · Answer 2 · answered Nov 10 '11 at 13:49

To achieve what you describe, you would have to deserialize the whole object graph from the stream without knowing the type it was serialized from. But this is not possible, because the serializer doesn't store such information. AFAIK it works in the following way. Suppose you have a couple of types:

class A { public bool p1; }
class B { public string p1; public string p2; public A p3; }
// instantiate them:
var b = new B { p1 = "ppp1", p2 = "ppp2", p3 = new A { p1 = true } };

When the serializer writes this object, it walks the object graph in some particular order (I assume alphabetical order) and writes each object's type and then its contents. So your binary stream will look like this:

[B:[string:ppp1][string:ppp2][A:[bool:true]]]

You see, here there are only values and their types. The order is implicit: it is simply the order in which they were written. So, if you change your class B to, say,

class B { public A p1; public string p2; public string p3; }

the serializer will fail, because it will try to assign an instance of string (which was serialized first) to a reference to A. You could try to reverse engineer how binary serialization works, and then you may be able to create a dynamic tree of serialized objects. But this will require considerable effort.

For this purpose I would create a class similar to this:

class Node
{
    public string NodeType;
    public List<Node> Children;
    public object NodeValue;
}

Then, while reading from the stream, you can create those nodes, recreate the whole serialized tree and analyze it.
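As a rough sketch of the analysis step, assuming the nodes have already been built by whatever stream parser you write, the tree could be turned into an indented text report like this. The NodeReport class and its Render method are hypothetical names, and the Children list is initialised here only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class Node
{
    public string NodeType;
    public List<Node> Children = new List<Node>();
    public object NodeValue;
}

// Hypothetical helper: renders a Node tree as indented text, one node per line.
public static class NodeReport
{
    public static string Render(Node root)
    {
        var sb = new StringBuilder();
        Append(sb, root, 0);
        return sb.ToString();
    }

    private static void Append(StringBuilder sb, Node node, int depth)
    {
        // Two spaces of indentation per level, then the type and (if present) the value.
        sb.Append(' ', depth * 2).Append(node.NodeType);
        if (node.NodeValue != null)
        {
            sb.Append(" = ").Append(node.NodeValue);
        }
        sb.Append('\n');
        foreach (var child in node.Children)
        {
            Append(sb, child, depth + 1);
        }
    }
}
```

Two such reports, one per stream version, could then be compared with an ordinary text diff to spot members that have gone missing.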

Are you sure it works this way? From what I've looked at, I thought the binary serializer did indeed serialize type information... If you look at SOAP serialization, for example, the fully qualified assembly type names are contained in the SOAP XML. — Ian, Nov 10 '11 at 15:00
