
We want to store 500k nodes in RedisGraph (Tweets, Users, Hashtags) with edges (wrote, mentioned, linked). The redisgraph-sizing-calculator estimated far less than 1 GB of RAM for everything. But with only about 5000 nodes, RedisGraph already uses over 2 GB of RAM (according to RedisInsight). So we run out of RAM, and at commit() the Python client throws an exception: "Connection closed by server" and sometimes "MISCONF Redis is configured to save RDB snapshot, but it is currently not able to persist on disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails...". In addition, Redis gets really slow (commands per second drop). We have tried committing at different positions in the code and running Redis on a device with more RAM. How can we store the entire graph?

This happens with the default Docker container (docker run -p 6379:6379 -it --rm redislabs/redisgraph) and the following code:

import redis
from redisgraph import Node, Graph

db_connection = redis.Redis(host='localhost', port=6379)
graph = Graph('Twitter', db_connection)

for i in range(100000):
    graph.add_node(Node(label='user', properties={'id': i, 'name': str(i)}))
    graph.commit() # the exception is raised here once i reaches roughly 5000 (varies with system RAM)

db_connection.close()

1 Answer


Try:

import redis
from redisgraph import Node, Graph

db_connection = redis.Redis(host='localhost', port=6379)
graph = Graph('Twitter', db_connection)

for i in range(100000):
    graph.add_node(Node(label='user', properties={'id': i, 'name': str(i)}))
    if i % 5000 == 0:
        graph.commit() # commit a batch of nodes at a time instead of after every single node

graph.commit() # commit the nodes added since the last full batch

db_connection.close()

This way you commit 5000 nodes at a time, which is much faster than committing after every node.

Another way to load data like this is with a single Cypher query:

UNWIND range(1, 100000) AS x CREATE (:user { id: x, name: toString(x) })
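
If you are using the Python client, a minimal sketch of sending that query directly, assuming the same redisgraph-py client as above (its query() method runs a Cypher string; reading nodes_created from the result at the end is just a sanity check):

import redis
from redisgraph import Graph

db_connection = redis.Redis(host='localhost', port=6379)
graph = Graph('Twitter', db_connection)

# Create all 100000 nodes server-side in a single round trip
# instead of building them one by one in Python.
result = graph.query(
    "UNWIND range(1, 100000) AS x CREATE (:user { id: x, name: toString(x) })"
)
print(result.nodes_created)  # expect 100000

db_connection.close()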
  • Please note `graph.commit` won't flush the python graph content, and so in the second iteration of the loop you'll be committing 10000 entities: the original first 0-5000 and the newly brach 5000-100000 I believe you should use `graph.flush` in this case, the flush method commits the Python's graph content and clears it. – SWilly22 May 23 '22 at 06:42
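
Building on that comment, a minimal sketch of the loop rewritten with graph.flush(), which (as described above) commits the buffered graph content and then clears it, so each batch is sent exactly once:

import redis
from redisgraph import Node, Graph

db_connection = redis.Redis(host='localhost', port=6379)
graph = Graph('Twitter', db_connection)

for i in range(100000):
    graph.add_node(Node(label='user', properties={'id': i, 'name': str(i)}))
    if (i + 1) % 5000 == 0:
        graph.flush()  # commit this batch, then clear the client-side buffer

graph.flush()  # a no-op here since 100000 is a multiple of 5000, but safe in general

db_connection.close()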