I have loaded a large RDF dataset (the 18 GB Geonames dataset) into PostgreSQL tables using rdflib_sqlalchemy.
I then ran the following simple query from a Python script using RDFLib. It took more than two hours to return a result. Is there any way to make this faster without loading the RDF data into a dedicated triplestore (e.g., Virtuoso)?
from rdflib import Graph
from rdflib_sqlalchemy import store

mystore = store.SQLAlchemy(configuration="postgresql://localhost:5873/postgres")
g = Graph(mystore, identifier="test")
results = g.query("""SELECT ?s ?p ?o WHERE {?s ?p ?o .} LIMIT 1""")
for row in results:
    print(row)
I am working on a cluster compute node. I have also tried executing the query against in-memory data, as follows, but it is still slow:
from rdflib import Graph

g = Graph()
g.parse('geonames.nt', format='nt')
results = g.query("""SELECT ?s ?p ?o WHERE {?s ?p ?o .} LIMIT 1""")
for row in results:
    print(row)
Please let me know your opinion. Thank you for your help.