I'm using OrientDB 2.1.11 with Rexster 2.6, and Gremlin is my main query language. I connect via RexPro (and the Rexster REST API). My issue: how do I get Gremlin queries to hit the indexes? (I must use Gremlin, not OrientDB SQL.)

I have a vertex class zipcode with one property, zip_code, which is defined in the schema and indexed as a dictionary:

zipcode.zip_code    DICTIONARY  ["zip_code"]    SBTREE 

But when I query it using Gremlin, it's slow once there are more than about 25k records (I haven't tested with lower numbers). For context: I try to find the zipcode first, and if it doesn't exist I create the vertex for later use. The find query looks like this:

g.V('@class', 'zipcode').has('zip_code','10018')

Question: Is g.V('@class', ...) hitting the indexes? Isn't it iterating over all 1000000 objects of V? Is there a more efficient way to write this for my vertex class, zipcode? I just need to match a property of the vertices in that class.

Is it better to use has('zip_code', '12345') or filter {it.zip_code == '12345'}? Which one would hit the index I created?

What if I have to match on more than one property:

.has('zip_code', '12345').has('state','NY').has('city','NEW YORK') 

Would has() hit the indexes, or filter{}? Please advise.

Michela Bonizzi
Omair Jafri
  • I even tried Studio and saw the same slowness when the zipcode class has only 8000 records and zip_code has a dictionary index: g.V('@class', 'zipcode').has('zip_code','10018') Query executed in 1.936 sec. Returned 1 record(s). Limit: 20 – Omair Jafri Feb 18 '16 at 19:49
  • Some more info: when I use OrientDB SQL in Studio, it seems to hit the index: select * from zipcode where zip_code='10018' Query executed in 0.047 sec. Returned 1 record(s). Limit: 20. Please help: how can I hit the indexes using Gremlin? – Omair Jafri Feb 18 '16 at 23:28

1 Answer

OK, after some trial and error, I figured out how to make this work via Rexster/Gremlin. I changed my query to something like:

new GremlinPipeline(g.getVertices('city_state.city','PALMETTO')).has('state_code','FL')

or
g.getVertices('city_state.city','PALMETTO')._().has('state_code','FL')

The g.getVertices method accepts 'class.field' notation (which is required to hit the indexes), but it returns an Iterable rather than a pipe, so I have to wrap it in a GremlinPipeline, or use the alternative _(), in order to chain further Gremlin steps.
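To illustrate why the keyed lookup helps, here is a toy Java model of the two access patterns. It is only a sketch and not OrientDB, Blueprints, or Gremlin code: the `Vertex` class, the `zipIndex` map (a `TreeMap` standing in for the zipcode.zip_code dictionary index), and both `find...` methods are hypothetical names invented for the example. It assumes the slow query behaves like a scan over every vertex followed by a filter, and that the 'class.field' lookup goes to the index first and filters only what the index returns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: none of these types are OrientDB, Blueprints, or Gremlin classes.
final class ZipcodeLookupSketch {

    // Hypothetical stand-in for a vertex: a class name plus string properties.
    static final class Vertex {
        final String className;
        final Map<String, String> props = new HashMap<>();
        Vertex(String className) { this.className = className; }
    }

    // Every vertex in the model graph (what a scan has to walk through).
    static final List<Vertex> allVertices = new ArrayList<>();

    // Stand-in for a dictionary index on zip_code: one vertex per key.
    static final TreeMap<String, Vertex> zipIndex = new TreeMap<>();

    // Number of vertices examined by the most recent find call.
    static int lastVisited = 0;

    // Fill the model with n zipcode vertices, "00000" .. up to n-1, alternating states.
    static void populate(int n) {
        allVertices.clear();
        zipIndex.clear();
        for (int i = 0; i < n; i++) {
            Vertex v = new Vertex("zipcode");
            v.props.put("zip_code", String.format("%05d", i));
            v.props.put("state", i % 2 == 0 ? "NY" : "FL");
            allVertices.add(v);
            zipIndex.put(v.props.get("zip_code"), v);
        }
    }

    // Scan-and-filter: this model checks every vertex against all conditions.
    static Vertex findByScan(String zip, String state) {
        lastVisited = 0;
        Vertex match = null;
        for (Vertex v : allVertices) {
            lastVisited++;
            if ("zipcode".equals(v.className)
                    && zip.equals(v.props.get("zip_code"))
                    && state.equals(v.props.get("state"))) {
                match = v;
            }
        }
        return match;
    }

    // Index first, then filter: only the vertex returned by the index is examined.
    static Vertex findByIndex(String zip, String state) {
        Vertex v = zipIndex.get(zip);
        lastVisited = (v == null) ? 0 : 1;
        return (v != null && state.equals(v.props.get("state"))) ? v : null;
    }

    public static void main(String[] args) {
        populate(25_000);
        findByScan("10018", "NY");
        System.out.println("scan visited:  " + lastVisited);
        findByIndex("10018", "NY");
        System.out.println("index visited: " + lastVisited);
    }
}
```

In this model the extra condition on state costs almost nothing in the indexed version, because it is applied to a single candidate. That mirrors the shape of the rewritten query, where has('state_code','FL') runs only on what the keyed lookup returned.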

Hopefully this helps other folks as well. It cost me two days; it's hard when you are trying out a new product after coming from Neo4j (which has mastered its queries and support).

Omair Jafri
  • Any clues how to hit the index with pure Gremlin API? This looks like Blueprints API. – Mon Calamari Apr 23 '16 at 21:27
  • Do you have an example that you're stuck on? To me, Blueprints is used synonymously with Gremlin. The above answer did hit all the indexes using OrientDB – Omair Jafri Apr 26 '16 at 17:48
  • Yes, it did because of `getVertices` - this is blueprints API. I am using gremlin-orientdb and index is not hit with something like: `g.V().has(Key[String]("prop"), value)` – Mon Calamari Apr 26 '16 at 17:50
  • g.V() is the reason the indexes are not hit. I'd suggest trying g.getVerticesOfClass('className')._() or g.getVertices('class.property', value)._(). In my opinion this is just how OrientDB behaves; maybe one of their devs can help you out – Omair Jafri May 01 '16 at 03:14
  • I figured it out. The query is missing the label: `g.V().hasLabel("label").has(...`. Internally, the label is mapped to a class in OrientDB. Sadly, you cannot use the V superclass label to hit the index, even though the property exists on the V class. – Mon Calamari May 01 '16 at 08:18