Dump and restore a tree data structure to/from a file

Question

I have a n-ary tree data structure with a specific set of classes. The data structure goes through a set of transformations say 1 to n. At the end of the above transformations, the final tree is the output result which gets used to retrieve information.

Is there a way that I can (binary) dump the tree after every correct transformation. The dump will reflect the state of the tree after the previous transformation. Thus, if any transformation goes wrong, I can restore the dump in memory w/o going through the correct transformation again. It's similar to the Checkpoint functionality that GDB provides to save the snapshot of the program's state.

I looked at NoSQL databases like MongoDB, CouchDB, Redis etc but they are primarily key-value datastores(Redis) or store the information in document-type structure w/o storing the associations/relationships between the nodes in a tree(MongoDB).
I also looked at Neo4j graph database, which is a great tool for representing graph-like structures.

Redis-dump, Neo4j-dump and MongoDB-dump are available but I am not able to decide which one to pick among these. Which of the above would be easier to populate as the dump creation and restore time should not be large.

I wanted to know the opinion and feedback from fellow programmers who have faced this problem and how did they solve it. What would be the best way of doing this?

P.S. My existing implementation is in C++. Let me know if anything is unclear and I'll try to explain it in a better way.

doron · Answer 1 · 2013-10-22T10:24:21.687

2

What you need to do is add a serialize method to each node in the tree. This method will need to know how to serialize itself and all its children.

Serializing the node itself is specific to what data is stored in the node although you will probably use some form of Type-Length-Value format. Serializing the children is done recursively.

The only tricky thing is finding out how big a buffer you will need to create. To that end you will probably need a recursive method that can work out the buffer size.

One you have the data in a nice flat buffer, you can either save it to a flat file or if easier, to some sort of database.

edited Oct 22 '13 at 10:24

answered Oct 14 '13 at 09:08

doron

27,972
12
65
103

There's an open source project to help with the serialization https://code.google.com/p/protobuf/. It has some nice features like forwards and backwards compatibility. There are probably others. – Sorin Oct 18 '13 at 11:27
There is also always ASN.1 – doron Oct 18 '13 at 12:02
Thanks doron & @Sorin for your answer and feedback. Will look at the serialize approach. – Jatin Ganhotra Oct 22 '13 at 09:39

score 2 · Answer 2 · answered Oct 16 '13 at 13:58

2

I believe what you want is functionality provided by boost::serialize. It will serialize and de-serialize STL collections for you.

See here: http://www.boost.org/doc/libs/1_54_0/libs/serialization/doc/tutorial.html#stl

answered Oct 16 '13 at 13:58

Nathan Doromal

3,437
2
24
25

Thanks for your answer. Will look at the serialize approach. – Jatin Ganhotra Oct 22 '13 at 09:40

score 1 · Accepted Answer · edited May 23 '17 at 12:14

In Neo4j, would it work for you to simply maintain each generation of the pattern in the database until you are satisfied with the result, and then remove redundant generations? You could create a node e.g.

(t1:Transformation {transformationId:1})

to serve as an index or anchor for a set of transformations, then create each new generation of the pattern and relate it to the transformation node with a relationship that has an ordering property

(t1)-[:STEP {order:0}]->(root)-[:..*]->(branch) //tree
(t1)-[:STEP {order:1}]->(transformedRoot)-[:..*]->(transformedBranch) //first transformed tree
(t1)-[:STEP {order:2}]->(transformedRoot2)-[:..*]->(transformedBranch2) //second transformed tree

Nodes that are not altered in a transformation could be included directly in the new pattern

(t1)-[:STEP {order:3}]->(transformedRoot)-[:..*]->(originalBranch) // transformed tree with original branch

until they are indeed altered

(t1)-[:STEP {order:4}]->(transformedRoot)-[:..*]->(transformedBranch)

You would have a snapshot of each transformation for as long as you want it and you could interact with it to rollback, compare or do anything you want, in the DB, instead of exporting/importing dumps.

Edit:
Re your comment
1)
Would Neo4j extend or replace your C++ implementation? If replace, there are several tools for importing initial quantities of data into Neo4j, particularly note 1 and 2.

If extend, it depends on how you would interact with your data. As far as I know there are no very good Neo4j drivers for C++, though there are some more or less mature projects (1,2,3 and 3, also 4) that may be helpful. I would run Neo4j as a server and build a RESTful client (1,2,3), that uses Cypher & (de)serialized JSON to communicate with it. I would spend time learning to craft good cypher queries, and utilize the transactional server endpoint. I would however do this in Java or Python, or some other language with good driver support for Neo4j, probably not C++.

2)
It's hard to give examples without data. Start by looking at the modeling examples, then search the Google Group discussions for threads where people go into details about how they model their domain and design their queries. Here and here for instance are referred to two different ways to order with relationships; by relationship property and by relationship type. Then if you want help modelling or querying, put sample data into Neo4j Console and share the link along with your question on SO. (I put a tiny sample with query in the linked console, you can start playing with that.)

Thanks. That's a great idea. Since I am new to Neo4j, can you please share 1. what would be the best strategy to populate the Neo4j database from my C++ implementation and 2. some more info on ordering of transformated nodes. A link/tutorial would do great. — Jatin Ganhotra, Oct 16 '13 at 11:39
Thanks for your edit...Thanks a lot of information for me to chew and your effort is greatly appreciated. :) — Jatin Ganhotra, Oct 22 '13 at 09:41
I'm awarding the bounty to you, as my inner feeling is pushing me to try and use Neo4j. Will try the serialize approach too. — Jatin Ganhotra, Oct 22 '13 at 09:42

Dump and restore a tree data structure to/from a file

3 Answers3