1

I need to implement a simple graph database engine. What should I consider? First, I am confused about which data structure to use: a graph representation (such as an adjacency matrix or adjacency list) or the actual graph itself? It needs to be scalable. Next, how do I store the graph on the hard disk as files? Once the graph data is stored as files, I also need a way to selectively load only certain files into the graph, since I cannot load everything into RAM at once. Sorry for being vague, but I need someone to point me in the right direction. Also, please suggest a language: can I use Python for this project? Thank you.

ravenspoint
aditya sista
  • Why don't you just use an existing graph database like neo? – tddmonkey Feb 27 '16 at 19:06
  • because it defeats the purpose of the project? I'm talking about creating something like neo4j, but a much simpler version, not using it... – aditya sista Feb 27 '16 at 19:09
  • I already created such a database in Python, and there are several implementations. Have a look at https://pypi.python.org/pypi/ajgu and https://pypi.python.org/pypi/AjguDB, and also this post: http://hypermove.net/notes/do-it-yourself-a-graph-database-in-python/ – amirouche Feb 28 '16 at 10:14
  • The basic idea is that you use a key/value store like leveldb (but it's slow) and build the graph data structure on top of it. You can go with a document store too, but they usually don't provide good ACID semantics. – amirouche Feb 28 '16 at 10:16
  • @adityasista can you mark or at least upvote my answer if it answers your question? Thanks. – amirouche Mar 01 '16 at 19:08
  • @amirouche: I did, my reputation is too low for that to appear publicly... – aditya sista May 17 '16 at 09:26

1 Answer

2

Depending on your needs, you will expose a different interface to the database, i.e. an adjacency matrix or the graph itself.

Instead of building a file-based database yourself, an important step forward is to use a key/value store like bsddb, leveldb or wiredtiger (preferred). This will handle caching of frequently accessed data, provide ACID semantics, and, if you use wiredtiger, indices.
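As a rough illustration of the idea, here is a minimal sketch that keeps one adjacency list per key in an on-disk key/value store, so only the nodes you ask for are read back. It uses Python's standard-library `dbm.dumb` purely as a stand-in for the stores named above; `dbm.dumb` does not give you ACID guarantees or indices, and the `AdjacencyStore` class and its methods are hypothetical names made up for this example.

```python
import dbm.dumb
import json
import os
import tempfile


class AdjacencyStore:
    """Adjacency lists persisted one node per key (hypothetical helper)."""

    def __init__(self, path):
        # "c" opens the database for reading and writing, creating it if needed.
        self._db = dbm.dumb.open(path, "c")

    def add_edge(self, src, dst):
        # Read-modify-write of a single node's adjacency list.
        neighbours = self.neighbours(src)
        if dst not in neighbours:
            neighbours.append(dst)
            self._db[src.encode()] = json.dumps(neighbours).encode()

    def neighbours(self, node):
        # Only this node's entry is fetched; the rest of the graph stays on disk.
        key = node.encode()
        if key not in self._db:
            return []
        return json.loads(self._db[key].decode())

    def close(self):
        self._db.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "graph")

    store = AdjacencyStore(path)
    store.add_edge("a", "b")
    store.add_edge("a", "c")
    store.add_edge("b", "c")
    store.close()

    # Reopen the files and load a single node's neighbours.
    store = AdjacencyStore(path)
    print(store.neighbours("a"))
    store.close()
```

Swapping in a real key/value store would mostly mean replacing the `dbm.dumb.open` call and the item access with that store's own get/put calls.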

The storage layer built on top of the key/value store can have several layouts. Which one you choose depends on the final interface you need.
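To make "layout" a bit more concrete, here is one possible layout, sketched under assumptions: each edge gets its own key, and the keys are composed so that all edges of a node share a prefix. This relies on the store keeping keys sorted and supporting range scans (leveldb does, for instance). The `OrderedKV` class is an in-memory stand-in, not a real library, and the key scheme, the separator and the function names are all invented for illustration.

```python
import bisect


class OrderedKV:
    """In-memory stand-in for an ordered key/value store (hypothetical)."""

    def __init__(self):
        self._keys = []    # kept sorted so prefix scans are contiguous
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._values[key]


SEP = b"\x00"  # assumed separator; node ids must not contain this byte


def put_edge(kv, src, dst, label):
    # One key per edge: b"out" SEP src SEP dst -> label
    kv.put(SEP.join([b"out", src, dst]), label)
    # Mirror entry so incoming edges can also be found with a prefix scan.
    kv.put(SEP.join([b"in", dst, src]), label)


def outgoing(kv, node):
    prefix = SEP.join([b"out", node]) + SEP
    return [(key.split(SEP)[2], label) for key, label in kv.scan_prefix(prefix)]


def incoming(kv, node):
    prefix = SEP.join([b"in", node]) + SEP
    return [(key.split(SEP)[2], label) for key, label in kv.scan_prefix(prefix)]


if __name__ == "__main__":
    kv = OrderedKV()
    put_edge(kv, b"alice", b"bob", b"knows")
    put_edge(kv, b"alice", b"carol", b"likes")
    print(outgoing(kv, b"alice"))
    print(incoming(kv, b"carol"))
```

Compared with storing a whole adjacency list under one key, this layout makes adding an edge a plain write instead of a read-modify-write, at the cost of more keys and a scan to list a node's neighbours.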

To get started with developing custom databases on top of key/value stores, I recommend reading the answered questions on SO, most of which are about leveldb and bsddb.

Like the following:

amirouche