Implementing a basic graph database engine

Question

I need to implement a simple graph database engine. What should I consider? First, I am unsure which data structure to use: a graph representation (such as an adjacency matrix or adjacency list) or the actual graph itself? It needs to be scalable. Next, how do I store the graph on the hard disk as files? Once the graph data is stored in files, I also need a way to selectively load only certain files into the graph, since I cannot load everything into RAM at once. Sorry for being vague, but I need someone to point me in the right direction. Also, please suggest a language I can use. Can I use Python for this project? Thank you.

because it defeats the purpose of the project? I'm talking about creating something like neo4j, but a much simpler version, not using it... — aditya sista, Feb 27 '16 at 19:09
I already created such a database in Python. There are several implementations; have a look at https://pypi.python.org/pypi/ajgu and https://pypi.python.org/pypi/AjguDB, and also this post: http://hypermove.net/notes/do-it-yourself-a-graph-database-in-python/ — amirouche, Feb 28 '16 at 10:14
The basic idea is to use a key/value store like leveldb (but it's slow) and build the graph data structure on top of it. You can go with a document store too, but they usually don't provide good ACID semantics. — amirouche, Feb 28 '16 at 10:16
@adityasista can you mark or at least upvote my answer if it answers your question? Thanks. — amirouche, Mar 01 '16 at 19:08
@amirouche: I did, my reputation is too low for that to appear publicly... — aditya sista, May 17 '16 at 09:26

score 2 · Answer 1 · edited May 23 '17 at 11:44

Depending on your needs, you will implement a different interface to the database, i.e. an adjacency matrix or the graph itself.
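To make the difference between the two interfaces concrete, here is a minimal in-memory sketch. The class name `AdjacencyListGraph` and its methods are hypothetical, chosen only for illustration; it stores the graph as an adjacency list and derives an adjacency matrix from it on request.

```python
class AdjacencyListGraph:
    """Directed graph kept as an adjacency list (hypothetical illustration)."""

    def __init__(self):
        # Maps each vertex to the set of vertices it points to.
        self._out = {}

    def add_edge(self, source, target):
        self._out.setdefault(source, set()).add(target)
        # Register the target too, so it shows up as a vertex.
        self._out.setdefault(target, set())

    def neighbours(self, vertex):
        """The 'graph itself' interface: follow edges from one vertex."""
        return sorted(self._out.get(vertex, ()))

    def to_matrix(self):
        """The adjacency matrix interface: a dense 0/1 table of all edges."""
        vertices = sorted(self._out)
        index = {vertex: i for i, vertex in enumerate(vertices)}
        matrix = [[0] * len(vertices) for _ in vertices]
        for source, targets in self._out.items():
            for target in targets:
                matrix[index[source]][index[target]] = 1
        return vertices, matrix


graph = AdjacencyListGraph()
graph.add_edge("a", "b")
graph.add_edge("a", "c")
graph.add_edge("b", "c")
print(graph.neighbours("a"))
print(graph.to_matrix())
```

The adjacency list grows with the number of edges, while the matrix grows with the square of the number of vertices, which is usually why the list form is the one that gets persisted.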

Instead of using a file-based database, the most important step you can take is to use a key/value store like bsddb, leveldb or wiredtiger (preferred). It takes care of caching frequently accessed data and provides ACID semantics, plus indices if you use wiredtiger.
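As a rough illustration of what a key/value store buys you, the sketch below uses Python's standard-library `dbm.dumb` module purely as a stand-in for bsddb, leveldb or wiredtiger. That is an assumption made to keep the example dependency-free: `dbm.dumb` is slow and offers neither ACID guarantees nor ordered iteration. The function names `save_adjacency` and `load_neighbours` are hypothetical. The point is only that each vertex becomes one record, so a single vertex can be read back without loading the whole graph.

```python
import dbm.dumb
import json
import os
import tempfile


def save_adjacency(path, adjacency):
    """Write one record per vertex: key = vertex id, value = JSON list of neighbours."""
    db = dbm.dumb.open(path, "c")
    try:
        for vertex, neighbours in adjacency.items():
            db[vertex.encode("utf-8")] = json.dumps(sorted(neighbours)).encode("utf-8")
    finally:
        db.close()


def load_neighbours(path, vertex):
    """Read back the record of a single vertex; unknown vertices give an empty list."""
    db = dbm.dumb.open(path, "r")
    try:
        raw = db.get(vertex.encode("utf-8"))
    finally:
        db.close()
    return [] if raw is None else json.loads(raw.decode("utf-8"))


# Store a small graph in a temporary directory, then fetch one vertex only.
path = os.path.join(tempfile.mkdtemp(), "graph")
save_adjacency(path, {"a": {"b", "c"}, "b": {"c"}, "c": set()})
print(load_neighbours(path, "a"))
```

With a real store the calls look different, but the shape is the same: the store decides what stays in memory, and your code asks for records by key.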

The storage layer built on top of the key/value store can have several layouts. Which one to pick depends on the final interface you need.
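One possible layout, sketched under stated assumptions: vertices and edges are encoded into byte keys so that all edges of a vertex sit next to each other in key order and can be read with a prefix scan. `OrderedKV` is a hypothetical in-memory stand-in for an ordered key/value store, not the API of leveldb, bsddb or wiredtiger. The key prefixes `v`, `o` and `i` and the NUL separator are also illustrative choices, and the sketch assumes identifiers never contain a NUL byte.

```python
import bisect
import json

SEP = b"\x00"  # assumed never to appear inside an identifier


class OrderedKV:
    """In-memory stand-in for an ordered key/value store (hypothetical)."""

    def __init__(self):
        self._keys = []    # keys kept sorted, like an on-disk ordered store
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values.get(key)

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._values[key]


def make_key(*parts):
    return SEP.join(part.encode("utf-8") for part in parts)


def add_vertex(kv, vertex_id, properties):
    kv.put(make_key("v", vertex_id), json.dumps(properties).encode("utf-8"))


def add_edge(kv, source, label, target):
    # Each edge is written twice so both directions become a prefix scan.
    kv.put(make_key("o", source, label, target), b"")
    kv.put(make_key("i", target, label, source), b"")


def outgoing(kv, source, label):
    prefix = make_key("o", source, label) + SEP
    return [key[len(prefix):].decode("utf-8") for key, _ in kv.scan_prefix(prefix)]


def incoming(kv, target, label):
    prefix = make_key("i", target, label) + SEP
    return [key[len(prefix):].decode("utf-8") for key, _ in kv.scan_prefix(prefix)]


kv = OrderedKV()
add_vertex(kv, "alice", {"age": 30})
add_edge(kv, "alice", "knows", "bob")
add_edge(kv, "alice", "knows", "carol")
add_edge(kv, "bob", "knows", "carol")
print(outgoing(kv, "alice", "knows"))
print(incoming(kv, "carol", "knows"))
```

Writing each edge under two prefixes costs extra space but keeps both traversal directions cheap. If you only ever walk edges forward, the `i` entries can be dropped.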

To get started with developing custom databases on top of key/value stores, I recommend reading the answered questions on SO, mostly those about leveldb and bsddb.

Like the following:
