I have a list of around 300,000 Wikidata IDs (e.g. Q1347065, Q731635, etc.) in an ndjson file, one object per line:
{"Q1347065": ""}
{"Q731635": ""}
{"Q191789": ""} ... etc
What I would like is to get the label of each ID and build a dictionary of ID-label pairs, such as
{"Q1347065": "epiglottitis", "Q731635": "Mount Vernon", ...}
Before the list of IDs got so large, I used the Wikidata Python library (https://pypi.org/project/Wikidata/):
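import json

import ndjson
from wikidata.client import Client

client = Client()

with open("claims.ndjson") as f, open("claims_to_strings.json", "w") as out:
    claims = ndjson.load(f)

    # Merge the one-key dicts from each ndjson line into a single dict.
    labels = {}
    for d in claims:
        labels.update(d)

    # Fetch each entity individually and store its label; this is the slow part.
    for key in labels:
        entity = client.get(key)
        labels[key] = str(entity.label)

    # json.dump writes to a file object; json.dumps only returns a string.
    json.dump(labels, out)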
But this is too slow (around 15 hours for 1,000 IDs). Is there a faster way to achieve this?
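One direction that might be worth comparing against, sketched here under assumptions and not benchmarked: the Wikidata API's `wbgetentities` action accepts several IDs per request (up to 50 for ordinary clients), so labels could be fetched in batches instead of one entity at a time. The helper names (`chunks`, `http_fetch`, `fetch_labels`), the User-Agent string, and the choice of English labels are illustrative assumptions, not part of the library used above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://www.wikidata.org/w/api.php"
BATCH_SIZE = 50  # wbgetentities limit per request for ordinary clients


def chunks(ids, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` IDs."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def http_fetch(params):
    """Send one GET request to the Wikidata API and decode the JSON reply."""
    req = Request(
        API_URL + "?" + urlencode(params),
        # Hypothetical User-Agent; replace with one identifying your script.
        headers={"User-Agent": "label-lookup-sketch/0.1 (example)"},
    )
    with urlopen(req) as resp:
        return json.load(resp)


def fetch_labels(ids, lang="en", fetch=http_fetch):
    """Return {id: label} for all IDs, using one request per batch."""
    labels = {}
    for batch in chunks(list(ids)):
        data = fetch({
            "action": "wbgetentities",
            "ids": "|".join(batch),
            "props": "labels",      # only labels, not full entities
            "languages": lang,
            "format": "json",
        })
        for qid, entity in data.get("entities", {}).items():
            # IDs without a label in `lang` are mapped to an empty string.
            labels[qid] = entity.get("labels", {}).get(lang, {}).get("value", "")
    return labels
```

With 300,000 IDs this would come to 6,000 requests rather than 300,000. The `fetch` parameter exists only so the HTTP call can be swapped out, for example to add retries or rate limiting.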