
I need a way to load graphs in a structured form so that I can query them with a query language, without using an "external database", meaning a separate process that exposes an endpoint for requests.

To make sure I'm not falling into the XY problem: what I'm trying to build is a service that:

  • Receives a graph from a client
  • Receives a rule for this graph, meaning a set of assertions over the structure of the graph, such as "at least one node of type A". The rule can be a Cypher or Gremlin query, for example.
  • Runs the rule over the graph
  • Erases the "database". The graph exists only for the purpose of running the rule.

The reason for wanting a database is that the rule must be human-readable, so using an established language such as Cypher or Gremlin would be easier than creating a new one.

I've considered using:

  • AWS Neptune
  • Neo4j
  • Apache TinkerPop
  • RedisGraph

But all of them need a separate process that exposes an endpoint, so the graph will not be a "runtime" graph.

Incognitex
  • It sounds like you are describing an in-memory, ephemeral graph that runs in the same process address space as the application. Both TinkerGraph and JanusGraph support this mode of operation. If you don’t need transactions, TinkerGraph might be an ideal fit. If that sounds interesting I can expand in an answer. – Kelvin Lawrence Mar 24 '23 at 12:29
  • You describe it perfectly! Can you explain more, please? – Incognitex Mar 24 '23 at 12:39

2 Answers


If using Gremlin is an option, you can run TinkerGraph (part of the Apache TinkerPop project) embedded in your application. The only caveat is that the application needs to be written in a JVM language (e.g., Java, Groovy, or Scala). Configuring the graph is as simple as:

myGraph = TinkerGraph.open()
g = myGraph.traversal()

You can load data using Gremlin steps or using GraphSON (JSON) or GraphML formatted files.
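As a rough illustration of the load, check, and discard cycle described in the question, here is a minimal Java sketch. It assumes the `tinkergraph-gremlin` artifact is on the classpath; the class name `EphemeralGraphCheck`, the method name, the labels, and the sample data are all made up for the example. The rule is hard-coded as a traversal here rather than read from a query string.

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph;

// Hypothetical helper: builds a throwaway in-process graph, checks one rule, discards the graph.
public class EphemeralGraphCheck {

    public static boolean hasAtLeastOneNodeOfType(String label) {
        // The graph lives only in this JVM's memory; no server or endpoint is involved.
        TinkerGraph graph = TinkerGraph.open();
        try {
            GraphTraversalSource g = graph.traversal();

            // Sample data (assumption): two vertices joined by one edge.
            Vertex a = g.addV("A").property("name", "a1").next();
            Vertex b = g.addV("B").property("name", "b1").next();
            g.addE("connects").from(a).to(b).iterate();

            // The rule "at least one node of type <label>" expressed as a Gremlin traversal.
            return g.V().hasLabel(label).hasNext();
        } finally {
            // Nothing was persisted, so closing and dropping the reference discards the graph.
            graph.close();
        }
    }

    public static void main(String[] args) {
        System.out.println(hasAtLeastOneNodeOfType("A")); // prints "true"
        System.out.println(hasAtLeastOneNodeOfType("C")); // prints "false"
    }
}
```

Because every call opens a fresh `TinkerGraph`, no state leaks from one request to the next.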

The JanusGraph project also supports an in-memory graph. The main difference is that JanusGraph supports transactions, so you need to commit or roll back as necessary when you work with the graph.

If you don't need transaction support, you could be up and running with TinkerGraph in minutes. It can be installed either with Docker or by simply unpacking the JAR files locally. The official documentation is here.

You will find a worked example of setting up TinkerGraph and loading data here and an example using Java is here

Kelvin Lawrence
  • This seems to fit my case, thank you! Do you know if there is an advantage of using TinkerGraph instead of Neo4j embedded? – Incognitex Mar 24 '23 at 18:36
  • I think both would probably be fine. Perhaps more a question of do you have any preference regarding writing queries in Gremlin or Cypher. I think either approach should work for your case though. – Kelvin Lawrence Mar 24 '23 at 20:14
  • I've been reading your practical tutorial and wanted to know if it's possible to run Gremlin queries in Java using query strings. Something like: ... result = Gremlim.execute("g.V().has()...)", so that I can import queries from external files – Incognitex Mar 30 '23 at 13:20
  • Yes it can be done using the GremlinGroovyScriptEngine class. Perhaps, if you wouldn't mind, create a new post/question and I can put a code sample into the answer there. – Kelvin Lawrence Mar 30 '23 at 15:12
  • Sure: https://stackoverflow.com/questions/75890943/how-can-i-run-queries-in-tinkergraph-using-query-strings – Incognitex Mar 30 '23 at 16:48
  • It seems that TinkerGraph has supported transactions since 3.7.0. See https://tinkerpop.apache.org/docs/current/reference/#tinkergraph-gremlin-tx – yangty89 Aug 07 '23 at 03:46

You can embed Neo4j in a Java application. The input data can be bundled with the application or stored in a file on the same machine.

In fact, you can put the database file structure on the same machine so that it can be used immediately. Packaging everything in a Docker image is probably the best approach for this.
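For comparison, a minimal sketch of the embedded approach with Cypher might look like the following. It assumes the `org.neo4j:neo4j` embedded artifact is on the classpath and a recent Neo4j version in which `DatabaseManagementServiceBuilder` accepts a `java.nio.file.Path`. The class name `EmbeddedNeo4jCheck`, the labels, and the sample data are hypothetical. Note that embedded Neo4j stores its data in a directory, so this sketch uses a temporary one.

```java
import static org.neo4j.configuration.GraphDatabaseSettings.DEFAULT_DATABASE_NAME;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.neo4j.dbms.api.DatabaseManagementService;
import org.neo4j.dbms.api.DatabaseManagementServiceBuilder;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Transaction;

// Hypothetical helper: starts Neo4j inside this JVM, runs one Cypher rule, shuts it down.
public class EmbeddedNeo4jCheck {

    public static long countNodesWithLabelA() throws IOException {
        // Assumption: a fresh temporary directory per run, so each graph starts empty.
        Path dir = Files.createTempDirectory("ephemeral-neo4j");
        DatabaseManagementService dbms = new DatabaseManagementServiceBuilder(dir).build();
        try {
            GraphDatabaseService db = dbms.database(DEFAULT_DATABASE_NAME);

            // Load sample data (assumption): one A node connected to one B node.
            try (Transaction tx = db.beginTx()) {
                tx.execute("CREATE (:A {name: 'a1'})-[:CONNECTS]->(:B {name: 'b1'})");
                tx.commit();
            }

            // The rule "at least one node of type A" can be checked by counting A nodes.
            try (Transaction tx = db.beginTx()) {
                return (Long) tx.execute("MATCH (n:A) RETURN count(n) AS n").next().get("n");
            }
        } finally {
            // Stops the embedded database; the temporary directory still has to be deleted separately.
            dbms.shutdown();
        }
    }
}
```

The caller would treat a result of at least 1 as the rule passing, and delete the temporary directory afterwards to erase the graph.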

cybersam