Instead of manually splitting each line by commas, use the csv module that you've already imported. It provides a DictReader
class that yields one dictionary per row, keyed by the column headers. Then you just need to add each row to your output dictionary.
import csv

# Create an empty dictionary
# We will add keys to this as needed
output = {}

# Keep track of number of rows, so we can add an empty column if needed
row_count = 0

# This function adds a row to the output dictionary
def add_row(row_dict):
    global row_count  # Need to declare this as global because we're assigning to the variable in this function
    if not row_dict:
        return  # If row is empty, do nothing
    # Loop over all key-value pairs in the row to add
    for k, v in row_dict.items():
        if k not in output:  # If the output doesn't contain this column, create a blank column
            output[k] = [None] * row_count
        output[k].append(v)  # Append the value to the correct column in output
    row_count += 1

input_file = r'C:\Users\_M92\csvFiles\csv0001.csv'
# newline='' is what the csv documentation recommends when opening files for the csv module
with open(input_file, 'r', newline='') as fh:
    reader = csv.DictReader(fh)  # Create a DictReader
    for row in reader:
        add_row(row)  # Add every row to the output
This gives the following output:
{'OsmID': ['1020287758', '1021229973', '1025409497'],
'NewName': ['NN1', 'NN2', 'NN3'],
'IdLocal': ['Id0001', 'Id0002', 'Id0003']}
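To see the per-row dictionaries that DictReader hands to add_row, here is a minimal sketch that reads from an in-memory string instead of a file. The sample text is made up to match the column names in the output above; the real file's layout is an assumption.

```python
import csv
import io

# Made-up sample data; the actual layout of the input file is an assumption.
sample = (
    "OsmID,NewName,IdLocal\n"
    "1020287758,NN1,Id0001\n"
    "1021229973,NN2,Id0002\n"
)

# io.StringIO lets DictReader read from a string as if it were an open file.
rows = list(csv.DictReader(io.StringIO(sample)))

# Each row is a mapping from header name to that row's value (always a string).
for row in rows:
    print(row)
```

Every value comes back as a string, which is why the OsmID entries in the output are quoted.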
Note: I removed the blank lines from the input CSV you provided, but they make no difference to the program: DictReader skips blank lines, and add_row ignores empty dicts anyway.
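Blank-line handling is easy to check directly with an in-memory file. The sketch below uses made-up sample data (an assumption about the file's layout) with blank lines between records and counts the rows DictReader yields. With CPython's csv module, rows that contain no fields are skipped, so the count covers only the data rows.

```python
import csv
import io

# Made-up sample with blank lines between records; the layout is an assumption.
sample = (
    "OsmID,NewName,IdLocal\n"
    "\n"
    "1020287758,NN1,Id0001\n"
    "\n"
    "1021229973,NN2,Id0002\n"
)

rows = list(csv.DictReader(io.StringIO(sample)))

# Number of rows actually yielded by DictReader
print(len(rows))
```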
Note 2: You could discard the row_count variable if you dynamically count the number of rows like so:
def add_row(row_dict):
    row_count = 0
    for first_key, first_val in output.items():
        row_count = len(first_val)
        break  # We can just break out here because all keys should have the same number of values

    # Create keys that do not yet exist in output but do exist in the new row
    existing_keys = set(output.keys())
    new_row_keys = set(row_dict.keys())
    keys_to_create = new_row_keys - existing_keys
    for key in keys_to_create:
        output[key] = [None] * row_count

    # Append to each column in output
    for key in output:
        output[key].append(row_dict.get(key, None))  # If the key doesn't exist in the current row, append None
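As a usage sketch of this dynamic-count approach, the snippet below feeds two hand-written rows with different keys into a compact variant of add_row. The rows are made up for illustration, and the condensed function body is my own shorthand for the same logic, not a required form.

```python
output = {}

def add_row(row_dict):
    # Current number of rows: length of any existing column, or 0 if there are none
    row_count = len(next(iter(output.values()), []))
    # Create columns that appear in this row but not yet in output, padded with None
    for key in row_dict.keys() - output.keys():
        output[key] = [None] * row_count
    # Append to every column; columns missing from this row get None
    for key in output:
        output[key].append(row_dict.get(key))

# Hypothetical rows with differing columns
add_row({'OsmID': '1', 'NewName': 'NN1'})
add_row({'OsmID': '2', 'IdLocal': 'Id0002'})
print(output)
```

After both calls every column has two entries: NewName ends with None because the second row lacks it, and IdLocal starts with None because the column did not exist when the first row was added.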