39

What is a clean/efficient method for storing the directory Hierarchy/tree in a Key-Value database (in my case MongoDB but any of them)?

For example a tree structure

- Cars 
   + Audi 
   + BMW
      - M5
   + Ford
- Color
   + Red
      - Apple
      - Cherry
   + Purple
- Funny

The method I am using now, each object links to it's parent

{ 
  dir: "red"
  parent-dir: "color"
}

This makes it very efficient/fast to insert and reorder any aspect of the tree (for example if I want to move Red and all it's children to the Cars directory).

But this method sucks when I want to all subdirectories and their children for a given directory recursively. To make it efficient to parse I can have a structure for example

{ 
  dir: "red"
  children: "audi, bmw, ford"
}

{ 
  dir: "bmw"
  children: "m5"
}

But if I want to modify the tree, a whole bunch of objects need to touched and modified.

Are there any other methods to storing a directory structure in a KV store?

Salvador Dali
  • 214,103
  • 147
  • 703
  • 753
The Unknown
  • 19,224
  • 29
  • 77
  • 93
  • 4
    Really this question is more general... What is the best way to store ANY hierarchical data in a KV data store... – dicroce Oct 24 '09 at 21:06
  • 1
    +1: I did not know about this KV-trend. I learned something new, thanks. – slashmais Dec 20 '09 at 06:57
  • 1
    PS: for those like me, here is a decent exposition of KV: http://www.readwriteweb.com/enterprise/2009/02/is-the-relational-database-doomed.php – slashmais Dec 20 '09 at 06:58
  • 1
    MonogoDB is **NOT** a key value store! [It's a document oriented database database](https://en.wikipedia.org/wiki/Document-oriented_database) – Alex Bitek Jan 15 '13 at 13:48

4 Answers4

61

The method you currently use now is called adjacency list model.

Another model to store hierarchical data in a (relational) database is the nested set model. Its implementation in SQL databases is well known. Also see this article for the modified preorder tree traversal algorithm.

A very simple method: you could store a path per object - with those it should be easy to query trees in NOSQL databases:

{ path: "Color", ... }
{ path: "Color.Red", ... }
{ path: "Color.Red.Apple", ... }
{ path: "Color.Red.Cherry", ... }

When nodes will be removed or renamed some paths must be updated. But in general, this method looks promising. You just have to reserve a special character as separator. The storage space overhead should be negligible.

edit: this method is called materialized path

Finally, here is a comparison of different methods for hierarchical data in NOSQL databases.

nedim
  • 1,767
  • 1
  • 18
  • 20
Frunsi
  • 7,099
  • 5
  • 36
  • 42
  • 3
    There is a pretty good article in the MongoDB Documentation about the possibilities to store trees: http://www.mongodb.org/display/DOCS/Trees+in+MongoDB – amiuhle Jun 06 '12 at 07:16
  • @Frunsi Why not use Zookeeper to store this information instead as it comes with in built support for hierarchy – Itachi Nov 25 '15 at 17:09
  • @Itachi: Why? Why not? This is as off-topic as if I'd ask you why don't you always use a child safety seat when driving in a car. – Frunsi Nov 25 '15 at 21:44
  • 1
    Working link for Tree Structures: https://docs.mongodb.com/manual/applications/data-models-tree-structures/ – duggi Jun 18 '20 at 07:09
1

I don't have a huge amount of NOSQL experience, so this isn't a definitive answer, but here's how I'd approach it:

I would likely use your first approach, where you have:

{
  dir: 'dir_name',
  parent_dir: 'parent_dir_name'
}

And then set up a map-reduce to quickly query the children of a directory. MongoDB's map-reduce functionality is still only available in the development branch and I haven't worked with it yet, but in CouchDB (and I assume, with a few modification, in MongoDB) you could do something like:

map:
function(doc) {
  emit( doc.parent_dir, doc.dir );
}

reduce:
function(key, values) {
  return( values );
}

Which would give you the list of sub-directories for each parent directory.

Emily
  • 17,813
  • 3
  • 43
  • 47
-1

I suggest storing a heap to the the id's of the data items. I think this is the best plan. If you need lots and lots of stuff any heap element could be an index to another heap.

eg

{ "id:xxx", "id:yyy", "sub-heap-id:zzz"....}

If this is not clear post a comment and I will explain more when I get home.

Hogan
  • 69,564
  • 10
  • 76
  • 117
-3

Make an index!

http://www.mongodb.org/display/DOCS/Indexes

Matt Williamson
  • 39,165
  • 10
  • 64
  • 72