3

For the sake of efficient means of processing data, Neo4j said they search graph data in the way of "index-free adjacency". However, I know AgensGraph uses the way of "Btree" of PostgreSQL for query

What is the benefit to use "Btree" of PostgreSQL comparing to "Index-free adjacency" ?

toraritte
  • 6,300
  • 3
  • 46
  • 67
ryanlim
  • 71
  • 2

2 Answers2

8

Disclaimer: I lead the development of AgensGraph

Neo4j uses a fixed-size array to store information about nodes and relationships. One of the benefits the structure provides is that it can find a node or a relationship's location in the file with its internal ID by calculating multiplication of ID and the fixed size of the elements. So Neo4j doesn't need Btree structure (which is popular in RDBMS) and they insist that Neo4j provides "index-free adjacency" for graph data.

In contrary, AgensGraph uses to find a vertex (node) or an edge (relationship). So people might feel that AgensGraph's approach is not so fast as Neo4j because the asymptotic complexity is O(log n) compared to O(1) of Neo4j.

In theory, this is true. But in reality, there are several points to consider. First, in RDBMS, the base of log is very big. So the height of Btree (log n) is very small (most cases <= 3) and the internal pages of Btree are mostly cached in memory.

And more importantly, actually it is not that simple considering real disk I/O to be required when processing graph traversal.

When a query finds edges adjacent to a vertex (whose ID is v1), in AgensGraph, it searches Btree and it can retrieve all adjacent edges' ID with one loop-up of Btree and sequentially read the leaf entries of the Btree. The edges' are clustered in Btree' leaf entries so it is fast to retreive the adjacent edges. Although there are serveral adjacent edges, AgensGraph needs one lookup of Btree. But in Neo4j, the relationships can be stored in different pages of the fixed size array. The adjacent relationships are linked using doubly-linked list. So if they are scattered across the file, it needs more random I/O.

Actually, we find that AgensGraph is faster than Neo4j, and it is more efficient for updates too even in multi-client session because Btree is also developed to be optimized for these concurrent accesses.

Rob Grant
  • 7,239
  • 4
  • 41
  • 61
Kisung Kim
  • 463
  • 3
  • 6
  • Worth noting that BTree is a tree structure, and therefore a graph, so AgensGraph uses a graph algorithm to find adjacent edges. – Jasper Blues Nov 02 '19 at 12:22
1

As I understand it, "index-free adjacency" means that Neo4j can get the adjacent elements of a node in constant time O(1).

I don't know how AgensGraph works in particular, but in a BTree getting an element is O(log n).

David Perez
  • 478
  • 5
  • 17