46

How would one go about storing and querying sparse directed or undirected graphs in Postgresql. There is something like pggraph, but that is still in planning stage.

I realize dedicated graph databases like Neo4J are best suited for this. However is there way to implement same within Postgresql, by using extension or a data type, which would avoid adding another database engine.dtata

Brian Burns
  • 20,575
  • 8
  • 83
  • 77
jethar
  • 2,223
  • 2
  • 22
  • 19
  • 1
    Currently, the needs are very basic. Being able to model relationships between entities, and able to query across the graph in efficient manner avoiding the join. My query is whether there is some available ready made solution for this. I have read - [graphs-in-the-database-sql-meets-social-networks](http://techportal.inviqa.com/2009/09/07/graphs-in-the-database-sql-meets-social-networks/) , but was just wondering if I was missing some obvious solution. – jethar Dec 26 '13 at 06:46
  • 8
    Conventional wisdom that graph databases will be best suited to modelling graph data may actually be wrong according to these IBM and Google researchers https://research.google.com/pubs/archive/43287.pdf How could it possibly be wrong? I think it boils down to the fact that postgres is just very good. It's hard to make a DB that can stand up to postgres in any category and most attempts to do so do not succeed, even for specialist use cases. – mako May 06 '16 at 04:46
  • 1
    Hello all and @mako! I'm very excited to see you all here. These question is so Important for me, just because I spent 3 years testing TitanDB as a junior developer from 0.5- to 1.0 version. I have very bad experience ( problem in me? ) with these stuff. Everytime I got some results in graph modeling, some error occured and was blocking my development. But for these 3 years I created many projects on Django+postgres and they are lilving. – Dmitry Yudin Aug 02 '16 at 18:08
  • I believe semantic web stuff often boils down into triples. RDF, N3, Turtle are all sort of the triple. sparQL? https://www.youtube.com/watch?v=M0pcFgI2-uE – grantwparks Dec 18 '19 at 05:30
  • 1
    An ISO standard is underway: https://www.gqlstandards.org/, maybe for PostgreSQL 13 or later ... – Christophe Roussy Dec 23 '19 at 16:08
  • The article linked in the 2013 comment from @jethar still exists. The URL has changed. Here's the updated URL: https://inviqa.com/blog/storing-graphs-database-sql-meets-social-network – vhs May 22 '22 at 13:52

5 Answers5

27

In essence, there are some techniques to efficiently query graph data within an SQL database, that apply to highly specialized scenarios.

You could opt to maintain a GRIPP index, for instance, if your interests lie in finding shortest paths. (It basically works a bit like pre-ordered tree index, applied to graphs.) To the best of my knowledge, none of these techniques are standardized yet.

With that being said, and seeing your comment that mentions social networks, the odds are that each of them will be overkill.

If your interest primarily lies in fetching data related to a user's friends, or something equivalent in the sense that it amounts to querying a node's neighborhood, the number of nodes you'll need to traverse in joins is so tiny that there is no need for specialized tools, data structures, etc.: simply use recursive CTEs.

http://www.postgresql.org/docs/current/static/queries-with.html

WITH provides a way to write auxiliary statements for use in a larger query. These statements, which are often referred to as Common Table Expressions or CTEs, can be thought of as defining temporary tables that exist just for one query.

For optimal performance when using the latter, shift as many where conditions within the with (...) part of the query, so as to eliminate nodes early.

Brian Burns
  • 20,575
  • 8
  • 83
  • 77
Denis de Bernardy
  • 75,850
  • 13
  • 131
  • 154
  • 1
    I agree, the question is vague, but there is certain purpose to madness. I have started using postgresql in last 6 months and still new to its nuances, plus I would not consider myself adept and using advanced topics. Your link to recursive CTEs is helpful though, as it gives helpful hints to proceed which I was not able to get by googling. I will +1 your answer, in case I do not get further useful replies . Thanks. – jethar Dec 26 '13 at 19:12
23

Use PostgreSQL for the underlying storage and use networkX or iGraph via PL/Python for the processing engine.

In their book Graph Databases, Ian Robinson, Jim Webber, and Emil Eifrem make a distinction between the underlying storage and the processing engine. If you look at the answer I followed in a recent problem (see here), you will see that I'm using PostgreSQL for the underlying storage and networkX as the processing engine. The performance gain relative to my original solution was huge (and similar to the ones described in the "Graph Databases" book) and implementing it was very easy.

Brian Burns
  • 20,575
  • 8
  • 83
  • 77
Ian Gow
  • 3,098
  • 1
  • 25
  • 31
  • 2
    I recommend to switch to using iGraph, see this benchmark: http://graph-tool.skewed.de/performance I also did extensiv tests by myself on facebook subgraph data. – Julius F Jan 23 '15 at 11:07
8

At this point I would recommend experimenting with AgensGraph, a promising multimodel distribution of PostgreSQL that offers first class graph databases and queries from both SQL and Cypher. Note that it is a full server, and not an extension like PostGIS, although PostgreSQL extensions can be added to it.

Update:

AgensGraph is now an extension named AGE and was handed over to the Apache foundation.

Eron Lloyd
  • 388
  • 4
  • 8
  • Link is broken. Try out: https://github.com/bitnine-oss/agensgraph – NateW Sep 23 '18 at 21:25
  • 2
    Looks like they've been doing some website upgrades; site is back up now. – Eron Lloyd Sep 25 '18 at 02:04
  • 1
    AgensGraph team just announced that they are working on their extension for Postgresql and that its beta version will be released in early next year. I think AgensGraph can be a good option if you are using PostgreSQL since its based on PG and its extension is coming soon. – Eya Aug 22 '19 at 21:12
  • 1
    @Eya That's GREAT news! I've been anticipating this for years. This is even more of a reason to check it out. – Eron Lloyd Aug 23 '19 at 22:07
  • @EronLloyd Yes, this is indeed great news! Have you used AgensGraph before in your projects? – Eya Aug 25 '19 at 01:37
  • I've been following AgensGraph on Github and there's a few issues about creating an extension, and nothing is said about one being developed. Eya's comment lead me to [their announcement](https://bitnine.net/blog-about-bitnine/blog-activities/agensgraph-postgres-vision-2019-review/), so it seems we really are going to have graph db capabilities soon! – Jan Sommer Aug 25 '19 at 22:14
  • 3
    The successor to AgensGraph is the [AGE extension](https://www.postgresql.org/about/news/announcing-age-a-multi-model-graph-database-extension-for-postgresql-2050/). – Ken Fehling Feb 16 '21 at 04:31
4

Since the question is generic, I would add a solution that can work for mostly planar graphs like street networks - PostgreSQL offers an excellent solution through Postgis Topology.

Postgis Topology stores geometries as edges, nodes and faces and their relative relationships. This means that from the geometry of a street network you can select edges and their starting and ending nodes and from this easily build a graph in the processing engine of your choice (networkx or graph-tool for Python are examples).

As I said, though, Postgresql/Postgis Topology works when we want to study geometries like street networks from the perspective of graph analysis.

Brian Burns
  • 20,575
  • 8
  • 83
  • 77
Duccio A
  • 1,402
  • 13
  • 27
0

Use AgensGraph from bitnine.net https://bitnine.net/

And if you want to visualise your graph you can use AgensBrowser also from bitnine

Jay Kominek
  • 8,674
  • 1
  • 34
  • 51
Joe
  • 36
  • 3