
Is it possible to generate gremlin script from the bytecode?

I am working on a POC in which I need to query an Azure Cosmos DB graph database via the Gremlin API.

Currently, Azure Cosmos DB does not support bytecode. The Azure development team has started working on this, but no release timeline has been published so far.

I would like to prepare working code that will require minimal refactoring once bytecode support becomes generally available.

According to the Apache TinkerPop docs, there are two ways of submitting Gremlin queries, bytecode and script:

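# script
from gremlin_python.driver.client import Client
client = Client('ws://localhost:8182/gremlin', 'g')
# all() returns a Future, so result() is needed to get the actual list
results = client.submit("g.V().has('person','name',name).out('knows')", {'name': 'marko'}).all().result()

# bytecode
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
g = traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin', 'g'))
results = g.V().has("person", "name", "marko").out("knows").toList()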

The "bytecode way" seems much more productive to me (syntax checking, IDE IntelliSense, etc.). Moreover, I am interested in creating a DSL (Domain Specific Language).

Would it be possible to use the fluent API and serialize the traversal to a string, in a way similar to this:

client = Client('ws://localhost:8182/gremlin', 'g')
g = traversal()
q = g.V().has("person","name","marko").out("knows").toString()
list = client.submit(q).all()

I am using Python 3.5 and gremlinpython 3.4.0.

Wolfgang Fahl
Sebastian Widz
  • There is also [gremlinpy](https://github.com/emehrkay/gremlinpy), which builds scripts (with bound parameters) for you. Given the lackluster state of bytecode support from vendors (e.g. CosmosDB), I personally feel this route may be the better one if you want to use Python. – Sascha Mar 22 '19 at 13:03
  • @Sascha Thanks, I will definitely give it a try. I might be wrong, but I think that writing the translator would be more beneficial because it lets me write a DSL using GraphTraversal extensions and translate the bytecode output to a Groovy script. At some point, when bytecode is available, I will just switch to bytecode. Not sure if this would also be possible with gremlinpy. – Sebastian Widz Mar 25 '19 at 21:20

1 Answer


It's definitely possible to generate a String representation of a traversal from bytecode. TinkerPop already does it for Groovy and Python scripts (primarily for testing, but it has other uses too, such as supporting lambdas in bytecode). We accomplish this through ScriptTranslator implementations: there is one for Groovy and two for Python (one of which is actually for Jython). The problem, of course, is that all of these ScriptTranslator implementations are technically for the JVM, and it sounds like you need something for native Python.

Perhaps you could examine the PythonTranslator code and implement that in native Python? It's basically just a bunch of if-then and string concatenation.
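To make the "if-then and string concatenation" idea concrete, here is a minimal sketch in plain Python. It is not a port of PythonTranslator. It assumes the bytecode object exposes `source_instructions` and `step_instructions` as lists of `[name, arg1, arg2, ...]` (which is how gremlinpython's `Bytecode` appears to be laid out), and it uses a hypothetical `FakeBytecode` stand-in so that it runs without gremlinpython installed. Only strings, booleans, numbers, lists and nested traversals are handled; predicates, enums, lambdas and strategies are left out.

```python
class FakeBytecode:
    """Hypothetical stand-in for a bytecode object: two lists of [name, *args]."""

    def __init__(self, source_instructions=None, step_instructions=None):
        self.source_instructions = source_instructions or []
        self.step_instructions = step_instructions or []


def translate_arg(arg):
    """Render a single step argument as Groovy source text."""
    if isinstance(arg, bool):  # must come before the int check: bool subclasses int
        return "true" if arg else "false"
    if isinstance(arg, str):
        escaped = arg.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    if isinstance(arg, (int, float)):
        return repr(arg)
    if isinstance(arg, (list, tuple)):
        return "[" + ", ".join(translate_arg(a) for a in arg) + "]"
    if hasattr(arg, "step_instructions"):  # nested anonymous traversal
        return translate(arg, root="__")
    raise TypeError("unsupported argument type: %r" % type(arg))


def translate(bytecode, root="g"):
    """Concatenate every instruction as .name(arg, arg, ...) after the root alias."""
    parts = [root]
    for name, *args in bytecode.source_instructions + bytecode.step_instructions:
        rendered = ", ".join(translate_arg(a) for a in args)
        parts.append(".%s(%s)" % (name, rendered))
    return "".join(parts)


query = FakeBytecode(step_instructions=[
    ["V"],
    ["has", "person", "name", "marko"],
    ["out", "knows"],
])
print(translate(query))  # g.V().has('person', 'name', 'marko').out('knows')
```

With a real traversal, the same function would presumably be called as `translate(some_traversal.bytecode)` and the resulting string passed to `client.submit()`. Every argument type the sketch does not recognise raises a `TypeError`, which makes the gaps visible instead of silently producing a broken script.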

stephen mallette
  • Thanks for pointing me to PythonTranslator. I have also found a similar request for JavaScript: https://issues.apache.org/jira/browse/TINKERPOP-1959 and https://github.com/apache/tinkerpop/pull/952. I have not verified it yet, but maybe https://github.com/apache/tinkerpop/blob/master/gremlin-javascript/src/main/javascript/gremlin-javascript/lib/process/translator.js would be simpler code to port to Python? – Sebastian Widz Mar 21 '19 at 18:35
  • maybe so. whichever you find easier to read. the main point is parsing bytecode and building a Gremlin string from that. it's not super hard, as you can see from all the examples out there. it's just work to do. – stephen mallette Mar 21 '19 at 18:45
  • I have finally found some time to port PythonTranslator.java. It is almost finished, but I need a few guidelines here. I cannot find the TraversalStrategyProxy and ConnectiveP classes in Python. Is it safe to skip them in gremlin_python? – Sebastian Widz Mar 31 '19 at 20:38
  • I also need advice: is it worth creating bindings for the translated script? I have read somewhere (probably in TinkerPop's docs) that using parameterized queries helps performance. If so, which parts should and shouldn't be parameterized? E.g. should all args, including property and label names, be changed to parameters, or only values, e.g. in the .has() step? – Sebastian Widz Mar 31 '19 at 20:44
  • you can skip `ConnectiveP` as a class in python but you probably still need to account for it - that class basically handles `P.and` / `P.or`. Similarly, you should account for what `TraversalStrategyProxy` represents - basically, just calls to `g.withStrategy()` (i don't think it's more complicated than that offhand). I wouldn't try to dynamically parameterize despite the improvements it might generate. afaik, neither the groovy nor the js translators try to do that. – stephen mallette Mar 31 '19 at 23:05
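As an illustration of "accounting for" `P.and` / `P.or` without a `ConnectiveP` class, the sketch below renders predicates recursively. It assumes a predicate object carries `operator`, `value` and `other` attributes, and that a connective predicate is simply one whose operator is `"and"` or `"or"` with the two operand predicates stored in `value` and `other`. That appears to match gremlinpython's `P`, but it is worth checking against the source of the version in use. `FakeP` is a hypothetical stand-in so the example runs on its own; values are inlined as literals rather than bound as parameters, in line with the advice above. Strategies are not covered.

```python
class FakeP:
    """Hypothetical stand-in for a predicate: operator name plus one or two operands."""

    def __init__(self, operator, value, other=None):
        self.operator = operator
        self.value = value
        self.other = other


def literal(value):
    """Render a plain Python value as a Groovy literal."""
    if isinstance(value, bool):  # before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, str):
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(literal(v) for v in value) + "]"
    raise TypeError("unsupported literal type: %r" % type(value))


def translate_p(p):
    """Render a predicate, recursing into both sides of and/or connectives."""
    if p.operator in ("and", "or"):
        # left.and(right) / left.or(right), where each side is itself a predicate
        return "%s.%s(%s)" % (translate_p(p.value), p.operator, translate_p(p.other))
    # simple predicate: one operand (gt, eq, within, ...) or two (between, inside, ...)
    operands = [p.value] if p.other is None else [p.value, p.other]
    return "P.%s(%s)" % (p.operator, ", ".join(literal(o) for o in operands))


working_age = FakeP("and", FakeP("gt", 18), FakeP("lt", 65))
print(translate_p(working_age))  # P.gt(18).and(P.lt(65))
```

In a fuller translator, the argument-rendering function would dispatch to something like `translate_p` whenever it meets a predicate argument, for example inside a `has()` step.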