There are already questions in this direction, but my situation is slightly different:
The alias column contains dictionaries. If I read the file with the csv reader, I get them back as strings.
I have solved this with ast.literal_eval, but it is very slow and consumes a lot of resources.
The alternative, json.loads, does not work because of the encoding.
Any ideas on how to solve this?
CSV File:
id;name;partei;term;wikidata;alias
2a24b32c-8f68-4a5c-bfb4-392262e15a78;Adolf Freiherr Spies von Büllesheim;CDU;10;Q361600;{}
9aaa1167-a566-4911-ac60-ab987b6dbd6a;Adolf Herkenrath;CDU;10;Q362100;{}
c371060d-ced3-4dc6-bf0e-48acd83f8d1d;Adolf Müller;CDU;10;Q363453;{'nl': ['Adolf Muller']}
41cf84b8-a02e-42f1-a70a-c0a613e6c8ad;Adolf Müller-Emmert;SPD;10;Q363451;{'de': ['Müller-Emmert'], 'nl': ['Adolf Muller-Emmert']}
15a7fe06-8007-4ff0-9250-dc7917711b54;Adolf Roth;CDU;10;Q363697;{}
Code:
import ast
import csv

with open(PATH_CSV + 'mdb_file_2123.csv', "r", encoding="utf8") as csv8:
    csv_reader = csv.DictReader(csv8, delimiter=';')
    for row in csv_reader:
        if not (ast.literal_eval(row['alias'])):
            pass
        elif (ast.literal_eval(row['alias'])):
            known_as_list = list()
            for values in ast.literal_eval(row['alias']).values():
                for aliases in values:
                    known_as_list.append(aliases)
It works, but very slowly.
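One direction that might reduce the cost, as a rough sketch only: the loop above calls ast.literal_eval up to three times per row, and most rows contain just {}. The sketch below parses each cell at most once and skips parsing for the empty case. The helper names parse_alias and collect_aliases are made up for illustration, and the sample ids are shortened placeholders, not the real UUIDs. I have not measured how much this helps on the full file.

```python
import ast
import csv
import io


def parse_alias(cell):
    """Return the alias dict for one CSV cell, parsing it at most once."""
    cell = cell.strip()
    if cell == "{}":  # empty dict: no need to invoke the parser at all
        return {}
    return ast.literal_eval(cell)


def collect_aliases(csv_file):
    """Map each row id to a flat list of its aliases across all languages."""
    result = {}
    for row in csv.DictReader(csv_file, delimiter=";"):
        alias = parse_alias(row["alias"])
        result[row["id"]] = [a for values in alias.values() for a in values]
    return result


# Hypothetical in-memory sample in the same layout as the CSV file above.
sample = io.StringIO(
    "id;name;partei;term;wikidata;alias\n"
    "1;Adolf Roth;CDU;10;Q363697;{}\n"
    "2;Adolf Müller;CDU;10;Q363453;{'nl': ['Adolf Muller']}\n"
)
print(collect_aliases(sample))
```

For the real file, the same function could be called with the handle from open(PATH_CSV + 'mdb_file_2123.csv', encoding="utf8") instead of the StringIO sample.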