1

Possible Duplicate:
How to refer to children in a tree with millions of nodes

I'm trying to implement a tree which will hold millions of nodes, which in turn can have an unspecified number of children nodes.

To achieve this (since each node can have more than one child node), I'm storing a node's children within a Dictionary data structure. As a result of this, when each object node is created (out of millions), I've got a node object which contains a character value stored in the respective node, aswell as a separate Dictionary structure which holds a reference to the children nodes.

My tree works for a few thousand nodes, however when it reaches millions of nodes, an out of memory exception occurs. Is this due to the fact that each one of the millions of nodes running in memory also has its own Dictionary? i.e. I've got millions of objects running?

I need to have these objects running in memory, and cannot use files or databases. Could anyone suggest a solution?

Community
  • 1
  • 1
Dot NET
  • 4,891
  • 13
  • 55
  • 98
  • 3
    what good would this do you? is it readable? why can't you use files or db? – Daniel Casserly Mar 12 '12 at 15:39
  • You could change your dictionary object to an array for one. – N_A Mar 12 '12 at 15:40
  • 3
    Can you show us the structure of your Node class? – Tudor Mar 12 '12 at 15:41
  • What type of keys are you using? For "main" dictionary and for "leaf" dictionary. Is your data structure simply Dictionary> or what? – xanatos Mar 12 '12 at 15:42
  • 1
    I would recommend reading up on how to implement trees, since you're at the point where an efficient implementation obviously matters. – Alexander Corwin Mar 12 '12 at 15:42
  • Could you show some code what the nodes look like and what kind of code you use for lookup? – Joachim Isaksson Mar 12 '12 at 15:44
  • Are you modifying/adding items to the dictionary as you go, or creating the structures all at once? – thecoop Mar 12 '12 at 15:45
  • some details plus source code are necessary to help...As long as you have enough RAM available and compile your application for x64 and no single object is bigger than 2 GB there is no problem with having several million object/dictionaries etc. - so the question is: do you have enough RAM ? do you have a memory leak ? Have you configured the GC to run in "Server"-mode ? Is your tree-implementation efficient in terms of memory ? – Yahia Mar 12 '12 at 15:46
  • @Yahia: if objects get put on the large object heap, you can get OOM exceptions even when you have plenty of memory left: http://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/ – thecoop Mar 12 '12 at 15:53
  • @thecoop yes, I understand that but from the question it is not clear if there are even objects > 85 K which is why I asked for source code... – Yahia Mar 12 '12 at 15:56
  • Objects only contain a single-character `char` value, as well as a Dictionary which contains children of each node. – Dot NET Mar 12 '12 at 16:10
  • Do you work with classes or structs?
    Do you know the number of your childs when inializing or not?
    The children should be stored in Lists, not dictionaries.
    – weismat Mar 12 '12 at 15:43
  • Would storing children in Lists make such a difference? I'm using a Dictionary so that the corresponding node's value will be stored too. – Dot NET Mar 12 '12 at 16:12
  • Dictionaries add about 50 % overhead compared to Lists - if you know the number of nodes for the list, then the memory advantage might be even bigger. – weismat Mar 12 '12 at 16:20
  • Unfortunately I do not know the number of items, as each node can have an unspecified amount of children nodes stored in the datastructure. – Dot NET Mar 12 '12 at 16:21

5 Answers5

2

Your OOM exception may be due to LOH fragmentation, rather than actually running out of memory. You could try switching to SortedDictionary, which uses a red-black tree, rather than Dictionary, which uses a hashtable, and see if that improves matters. Or you could implement your own tree structure.

thecoop
  • 45,220
  • 19
  • 132
  • 189
1

You could try to use Windows with 64 bits and compile the program at 64 bits. This will give you much more memory... It isn't a magic bullet (there are still limits to how much big a memory structure can be)

xanatos
  • 109,618
  • 12
  • 197
  • 280
  • 1
    @KeithS I'm quite happy you think so, but in truth it's quite different. A single OBJECT is limited to 2 GB (so the reason for BigArray suggested by Dor Cohen) – xanatos Mar 12 '12 at 15:53
  • KeithS: I believe that's incorrect: http://stackoverflow.com/questions/982051/net-max-memory-use-2gb-even-for-x64-assemblies – thecoop Mar 12 '12 at 15:54
1

You can read about Memory Limits for Windows Releases.

http://msdn.microsoft.com/en-us/library/aa366778(v=vs.85).aspx

Think about using 64-bit..

Take a look on this question: Is there a memory limit for a single .NET process

For your scenario try using BigArray

http://blogs.msdn.com/b/joshwil/archive/2005/08/10/450202.aspx

Community
  • 1
  • 1
Dor Cohen
  • 16,769
  • 23
  • 93
  • 161
1

The solution isn't going to be easy.

Options are:

  1. Offload some of those nodes to disk, where you have a greater amount of "workspace" to deal with. Only load in memory what really really needs to be there. Due to the radical difference between disk and RAM speeds this could result in a huge performance penalty.

  2. Increase the amount of RAM in the machine to accommodate what you are doing. This might necessitate a move to 64 bit (if it is currently a 32 bit app). Depending on your real memory requirements this could be pretty expensive. Of course, if you have a 32bit app now AND have plenty of RAM available, switching to 64 bit is going to at least get you above the 3 to 4GB range...

  3. Stream line each node to have a much much smaller footprint. In other words, do you need everything Dictionary offers or can you do with just a struct defining the left and right links to other nodes? Basically, take a look at how you need to process this and look at the traditional ways of dealing with tree data structures.

NotMe
  • 87,343
  • 27
  • 171
  • 245
0

Another solution I can suggest if your only restriction is no file access (if you're opposed to ANY sort of database, this answer is moot) - is to use an in-memory database (like SQlite) or something else that provides similar functionality. This will help you in many ways, not the least of which is that the database will perform memory management for you and there are many proven algorithms for storing large trees in databases that you can adapt.

Hope this helps!

Ani
  • 10,826
  • 3
  • 27
  • 46