Convert multi dimensional array to dict without any imports

Question

Suppose I have the following csv data:

first_name,last_name
tom,hanks
tom,cruise

I would like to convert this data as follows:

data = {
    'first_name': ['tom','tom'],
    'last_name': ['hanks', 'cruise']
}

What would be the best way to do the above without using a library such as pandas, numpy, or csv?

@AlexanderReynolds yea I'd agree but I'm looking to implement a pure python solution as the data may not be an actual valid csv. — David542, Dec 17 '18 at 20:34
I mean those are pure python, they're standard library. Still, if you don't want to use the `csv` module, you can parse each line with a `line.split(',')` to split on commas, and then append each value in the split list to the corresponding list in the dictionary. — alkasm, Dec 17 '18 at 20:36
@timgeb no I'm just saying a solution without using the csv module. — David542, Dec 17 '18 at 20:38
Why, **why** can't you use the csv module? And what exactly is the issue you are encountering when you attempt to do this? — juanpa.arrivillaga, Dec 17 '18 at 20:40

score 6 · Answer 1 · answered Dec 17 '18 at 20:41

Personally, I'd go with pandas or csv but this is fairly easy to implement without any imports:

header = None
data = {}
for line in myfile:  # myfile: an open file or any other iterable of lines
    fields = line.strip().split(",")
    if header is None:
        # the first line holds the column names
        header = fields
        data = {k: [] for k in header}
    else:
        for i, value in enumerate(fields):
            data[header[i]].append(value)

print(data)
# {'first_name': ['tom', 'tom'], 'last_name': ['hanks', 'cruise']}

pault

The only thing I'd do differently is create an iterator for `myfile` and use `next` to get the header row, and then iterate through normally after that so you wouldn't have to check a condition on each loop. – alkasm Dec 17 '18 at 20:42
If `myfile` is a file-like object, it should already be an iterator. – Patrick Haugh Dec 17 '18 at 20:43
@AlexanderReynolds `myfile` is an iterator, but yeah generally I agree – juanpa.arrivillaga Dec 17 '18 at 20:44
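
A minimal sketch of the variation suggested in these comments: take the header with `next` and then loop over the remaining lines, so no condition is checked on each iteration. It assumes `myfile` is any iterable of lines (a plain list stands in for the file here), and the function name `columns_from_lines` is hypothetical.

```python
def columns_from_lines(lines):
    """Build {column_name: [values, ...]} from comma-separated lines."""
    rows = iter(lines)  # works for lists as well as open file objects
    header = next(rows).strip().split(",")  # first line holds the column names
    data = {name: [] for name in header}
    for line in rows:
        # zip pairs each column name with the value in the same position
        for name, value in zip(header, line.strip().split(",")):
            data[name].append(value)
    return data


myfile = ["first_name,last_name\n", "tom,hanks\n", "tom,cruise\n"]
print(columns_from_lines(myfile))
# {'first_name': ['tom', 'tom'], 'last_name': ['hanks', 'cruise']}
```

Note that `zip` silently drops fields beyond the number of header columns, whereas indexing with `header[i]` would raise an `IndexError` on such a line.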

timgeb · Accepted Answer · 2018-12-17T21:04:10.097

Faking your file:

>>> from io import StringIO
>>> file = StringIO('''first_name,last_name
... tom,hanks
... tom,cruise''')

Creating the dict:

>>> data = [(k, []) for k in next(file).strip().split(',')]
>>> for line in file:
...     for i, field in enumerate(line.strip().split(',')):
...         data[i][1].append(field)
...
>>> data = dict(data)
>>> data
{'first_name': ['tom', 'tom'], 'last_name': ['hanks', 'cruise']}

This is more of a programming exercise than a solution you should use in the real world. It's not robust at all and will fail in all kinds of common cases, such as quoted fields that contain commas.
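
To make that failure mode concrete, here is a small sketch using a made-up input line with a quoted field. It compares plain `split(',')` with the standard-library `csv.reader`; the sample line is an assumption for illustration only.

```python
import csv
from io import StringIO

line = '"hanks, jr",tom'  # hypothetical row with a comma inside a quoted field

# Naive splitting cuts the quoted field in two.
naive_fields = line.split(",")
print(naive_fields)  # ['"hanks', ' jr"', 'tom']

# csv.reader understands the quoting and keeps the field whole.
csv_fields = next(csv.reader(StringIO(line)))
print(csv_fields)  # ['hanks, jr', 'tom']
```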

With csv, for other readers:

>>> import csv
>>> reader = csv.reader(file)  # assume fresh StringIO instance
>>> dict(zip(next(reader), zip(*reader)))
{'first_name': ('tom', 'tom'), 'last_name': ('hanks', 'cruise')}

(Use `dict(zip(next(reader), map(list, zip(*reader))))` if having lists as values is important.)
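
For readers who want to run the list-valued variant as a script, here is a self-contained sketch with the one-liner split into named steps. The intermediate names `header` and `columns` are only for illustration.

```python
import csv
from io import StringIO

file = StringIO("first_name,last_name\ntom,hanks\ntom,cruise\n")
reader = csv.reader(file)

header = next(reader)   # consumes the first row: the column names
columns = zip(*reader)  # transposes the remaining rows into columns
data = dict(zip(header, map(list, columns)))

print(data)
# {'first_name': ['tom', 'tom'], 'last_name': ['hanks', 'cruise']}
```

One thing to be aware of: if the input has a header but no data rows, `zip(*reader)` yields nothing, so the result is an empty dict rather than a dict of empty lists.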
