I want to store a large graph in Redis and am trying to do this with RedisGraph. To check the performance characteristics, I first created a test graph. It is rather small compared to what we need:

  • Vertices: about 3.5 million
  • Edges: about 18 million

This is very limited for our purposes; we would need to scale to hundreds of millions of edges in a single database. In any case, I was checking space and performance requirements but stopped after loading only the vertices, because the following query:

GRAPH.QUERY gid 'MATCH (t:token {token: "some-string"}) RETURN t'

takes over 300 milliseconds for just this retrieval, which is absolutely unacceptable.

Am I missing an obvious way to improve the retrieval performance, or is that currently the limit of RedisGraph?

Thanks

Tom P.

2 Answers

Adding an index will speed things up a lot when matching.

CREATE INDEX ON :token(token)

From my investigations, I think at least one instance of the item must exist before an index can be created. I have not measured the extra overhead of creating the index early and then adding most of the nodes, compared with creating it after all items are in the graph so that they can be indexed en masse.
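As a minimal sketch of how the index and the lookup could be issued from Python: the two helpers below are hypothetical names, not part of any library, and they assume a client object that exposes a generic `execute_command(*args)` method (redis-py's `redis.Redis` has one). The graph id `gid`, the label, and the property come from the question; the backslash escaping of the value is an assumption about Cypher string literals.

```python
def create_token_index(client, graph_id="gid"):
    """Send the CREATE INDEX statement from this answer through GRAPH.QUERY."""
    return client.execute_command(
        "GRAPH.QUERY", graph_id, "CREATE INDEX ON :token(token)"
    )


def find_token(client, value, graph_id="gid"):
    """Run the lookup from the question for a given token value."""
    # Assumption: backslash escapes keep the value inside the string literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    query = f'MATCH (t:token {{token: "{escaped}"}}) RETURN t'
    return client.execute_command("GRAPH.QUERY", graph_id, query)
```

With redis-py installed, `client` could be `redis.Redis()`; calling `create_token_index(client)` once after loading some nodes and then `find_token(client, "some-string")` mirrors the commands above.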

Alister Bulman
  • That makes sense. I hadn't attempted creating an index yet as it is not included in the redis-graph documentation anywhere. There's even mention of indices not being supported. I have checked the syntax now and it does seem to work, I'll report back when I can test this on my larger set from home tomorrow. – Tom P. Dec 17 '18 at 10:01
  • It's there, just not well signposted - https://oss.redislabs.com/redisgraph/commands/#indexing – Alister Bulman Dec 17 '18 at 11:51
  • I can't believe I missed that. Thank you for the response. I have created a test set with 3.5 million nodes and simple strings with values 'test-number' and afterwards created an index on them. The index sped up search from 1400ms to ~0.2-22ms. Memory usage increased from 88MB to 342MB. – Tom P. Dec 18 '18 at 11:00
  • I have the same problem, and I have 100 fields, so creating an index for each one and for subsets would be huge! There should be a way to retrieve this quickly without an index, shouldn't there? – user1870400 Mar 27 '19 at 08:55

If all nodes are labeled "token", then RedisGraph has to scan all 3.5 million entities, comparing each entity's "token" attribute against the value you provided ("some-string").

To speed this up, I would recommend either adding an index or limiting the number of results returned using LIMIT.
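The difference between the two options can be illustrated with a toy model in plain Python. This is only a sketch of the general idea (linear scan versus keyed lookup), not RedisGraph's actual internals; the node dictionaries and function names are made up for illustration.

```python
from collections import defaultdict


def scan_match(nodes, value, limit=None):
    """Full label scan: compare every node's token, stopping early at `limit`."""
    matches = []
    for node in nodes:
        if node["token"] == value:
            matches.append(node)
            if limit is not None and len(matches) >= limit:
                break  # a LIMIT lets the scan stop once enough rows are found
    return matches


def build_index(nodes):
    """Map each token value to the nodes that carry it (built once, up front)."""
    index = defaultdict(list)
    for node in nodes:
        index[node["token"]].append(node)
    return index


# Toy data set; the real graph in the question has about 3.5 million nodes.
nodes = [{"id": i, "token": f"test-{i}"} for i in range(100_000)]
index = build_index(nodes)

by_scan = scan_match(nodes, "test-99999", limit=1)  # walks almost every node
by_index = index.get("test-99999", [])              # single keyed lookup
```

Note that a limit only shortens the scan when a match turns up early; for a value that sits near the end of the scan order, or that does not exist at all, the scan still visits nearly every node, which is why the index is the more dependable fix for exact-match lookups.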

It is also worth mentioning that the first query served might take a while longer than subsequent queries due to internal memory management.
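When benchmarking, it may therefore help to report the first call separately from the rest. The helpers below are a hypothetical sketch: `run_query` is assumed to be any zero-argument callable that issues the query, and the function names are invented for illustration.

```python
import time


def time_query(run_query, repeats=5):
    """Call run_query() `repeats` times and return each latency in milliseconds."""
    latencies_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms):
    """Split the first (cold) call from the average of the remaining (warm) calls."""
    first, rest = latencies_ms[0], latencies_ms[1:]
    warm_avg = sum(rest) / len(rest) if rest else None
    return {"first_ms": first, "warm_avg_ms": warm_avg}
```

For example, `summarize(time_query(lambda: client.execute_command("GRAPH.QUERY", "gid", query)))` would show whether a slow measurement comes only from the first call, assuming `client` and `query` are defined by the caller.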

SWilly22
  • Is there documentation on how to model vertices, similar to, say, the documentation on data model design (normalization) in SQL? – user1870400 Mar 27 '19 at 09:06