
What is the best way to store big objects? In my case it's something like a tree or a linked list.

I tried the following:

1) Relational db

It is not a good fit for tree structures.

2) Document db

I tried RavenDB, but it raised a System.OutOfMemory exception when I called the SaveChanges method.

3) .Net Serialization

It works very slowly.

4) Protobuf

It can't deserialize List<List<>> types, and I'm not sure about linked structures.

So...?

Neir0
    Define "big objects". What size? – Matías Fidemraizer Jun 09 '12 at 08:39
  • @Matías Fidemraizer I already have a problem with a 100 MB file (after binary serialization) – Neir0 Jun 09 '12 at 08:41
  • Seems more like "many objects" than "big objects". Document it better; as it stands, this question is unanswerable. Provide a sample of the data, outline the requirements. – H H Jun 09 '12 at 08:44
  • @Henk Holterman yes, many objects, but all the objects are linked. It's a tree. In other words: "How do I store a big tree?" – Neir0 Jun 09 '12 at 08:48

4 Answers


You mention protobuf - I routinely use protobuf-net with objects that are many hundreds of megabytes in size, but: it does need to be suitably written as a DTO, and ideally as a tree (not a bidirectional graph, although that usage is supported in some scenarios).

In the case of a doubly-linked list, that might mean simply: marking the "previous" links as not serialized, then doing a fix-up in an after-deserialize callback to restore them. Pretty easy normally.
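A minimal sketch of that approach, assuming a hypothetical LinkedListItem type and protobuf-net's [ProtoAfterDeserialization] callback:

    using ProtoBuf;

    [ProtoContract]
    public class LinkedListItem
    {
        [ProtoMember(1)]
        public string Value;

        [ProtoMember(2)]
        public LinkedListItem Next;

        // Deliberately NOT a [ProtoMember]: serializing Prev as well would
        // turn the list into a bidirectional graph.
        public LinkedListItem Prev;

        // Fires once this node's fields (including Next) have been read;
        // restores the back-link we chose not to serialize.
        [ProtoAfterDeserialization]
        private void FixUpPrev()
        {
            if (Next != null) Next.Prev = this;
        }
    }

Serializer.Serialize(stream, head) and Serializer.Deserialize<LinkedListItem>(stream) then round-trip the whole chain with the Prev links rebuilt, one hop per node, as each callback fires.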

You are correct that it doesn't currently support nested lists. This is usually trivial to side-step by using a list of something that has a list, but I'm tempted to make this implicit - i.e. the library should be able to simulate this without you needing to change your model. If you are interested in me doing this, let me know.
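The wrapper work-around looks something like this (ListWrapper is just an illustrative name, not a library type):

    using System.Collections.Generic;
    using ProtoBuf;

    [ProtoContract]
    public class ListWrapper
    {
        [ProtoMember(1)]
        public List<int> Items = new List<int>();
    }

    [ProtoContract]
    public class Model
    {
        // List<List<int>> is not supported directly, but a list of
        // wrappers expresses the same nested shape.
        [ProtoMember(1)]
        public List<ListWrapper> Rows = new List<ListWrapper>();
    }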

If you have a concrete example of a model you'd like to serialize, and want me to offer guidance, let me know - if you can't post it here, then my email is in my profile. Entirely up to you.

Marc Gravell
  • Thanks for the answer. It would be nice if you could provide an example or a link to documentation on how to serialize a linked list. I don't understand. Say we have a simple linked list "class LinkedListItem{ string Value; LinkedListItem Prev; LinkedListItem Next; }", or what if we have a tree "LinkedListItem{ string Value; LinkedListItem Prev; List Next; }"? Is it possible to serialize such a type? – Neir0 Jun 09 '12 at 09:02
  • @Neir0 why is Next a list? That isn't really a list, that is a tree. If by "next" you mean "parent", then I think I see what you mean. But yes, in both the list and the tree examples, that should be serializable - I'd need a couple of minutes at a computer to give a full example (I'm on mobile at the moment). – Marc Gravell Jun 09 '12 at 09:06
  • @Neir0 I will try to do a tree example later. – Marc Gravell Jun 09 '12 at 09:11

Did you try Json.NET and storing the result in a file?
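For example (a rough sketch; root stands for the head of your tree, and PreserveReferencesHandling is what lets Json.NET round-trip the Prev/Next cycles):

    using System.IO;
    using Newtonsoft.Json;

    var settings = new JsonSerializerSettings
    {
        // Emit $id/$ref tokens so back-references round-trip instead of
        // triggering a self-referencing-loop error.
        PreserveReferencesHandling = PreserveReferencesHandling.Objects
    };

    File.WriteAllText("tree.json", JsonConvert.SerializeObject(root, settings));
    var restored = JsonConvert.DeserializeObject<LinkedListItem>(
        File.ReadAllText("tree.json"), settings);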

qianfg
  • According to http://stackoverflow.com/questions/3790728/performance-tests-of-serializations-used-by-wcf-bindings/3793091#3793091 it does not work fast, and RavenDB uses Newtonsoft Json.NET too. So... with RavenDB I get an OutOfMemory exception. – Neir0 Jun 09 '12 at 08:44

Option [2]: NoSQL (Document) Database

I suggest Cassandra.


From the Cassandra wiki:

Cassandra's public API is based on Thrift, which offers no streaming abilities -- any value written or fetched has to fit in memory. This is inherent to Thrift's design and is therefore unlikely to change. So adding large object support to Cassandra would need a special API that manually split the large objects up into pieces. A potential approach is described in http://issues.apache.org/jira/browse/CASSANDRA-265. As a workaround in the meantime, you can manually split files into chunks of whatever size you are comfortable with -- at least one person is using 64MB -- and making a file correspond to a row, with the chunks as column values.

So if your files are < 10 MB you should be fine; just make sure to limit the file size, or break large files up into chunks, as sketched below.
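A minimal sketch of the chunking idea (BlobChunker and the 64 MB constant are illustrative; this is only the splitting step, not any Cassandra client API):

    using System;
    using System.Collections.Generic;

    public static class BlobChunker
    {
        // Illustrative chunk size; use whatever your cluster is comfortable with.
        private const int ChunkSize = 64 * 1024 * 1024;

        // Split a serialized blob into pieces; each piece would be stored as
        // one column value under the row that represents the whole file.
        public static IEnumerable<byte[]> Split(byte[] blob)
        {
            for (int offset = 0; offset < blob.Length; offset += ChunkSize)
            {
                int length = Math.Min(ChunkSize, blob.Length - offset);
                var chunk = new byte[length];
                Buffer.BlockCopy(blob, offset, chunk, 0, length);
                yield return chunk;
            }
        }
    }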

Ahmed Ghoneim

CouchDb does a very good job with challenges like that one.

storing a tree in CouchDb

storing a tree in relational databases

Mare Infinitus