5

In Python I have a list of dictionaries like this:

[
    {
        "col2": "2",
        "id": "1",
        "col3": "3",
        "col1": "1"
    },
    {
        "col2": "4",
        "id": "2",
        "col3": "6",
        "col1": "2"
    },
    {
        "col1": "1",
        "col2": "4",
        "id": "3",
        "col3": "7"
    }
]

and I need to convert this to a string in csv format including a header row. (For starters let's not care about column and row delimiters...) So, ideally the result would be:

id,col1,col2,col3
1,1,2,3
2,2,4,6
3,1,4,7

("ideally" because the column order does not really matter; having the "id" column first would be nice though...)

I have searched SOF and there are a number of similar questions but the answers always involve creating a csv file using csv.DictWriter. I don't want to create a file, I just want that string!

Of course, I could loop over the list and inside this loop loop over the dictionary keys and in this way create the csv string using string operations. But surely there must be some more elegant and efficient way of doing this?

Also, I'm aware of the Pandas library but I am trying to do this in a very limited environment where I would prefer to use only built-in modules.

Kai Roesner
  • 429
  • 3
  • 17

4 Answers4

7

You can use io.StringIO to write to a 'string' instead of a file. Using the example of csv.DictWriter we get the following code:

import csv
import io

data = [...]  # your list of dicts

with io.StringIO() as csvfile:
    fieldnames = ['id', 'col1', 'col2', 'col3']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

    writer.writeheader()
    for row in data:
        writer.writerow(row)
    print(csvfile.getvalue())
ikkuh
  • 4,473
  • 3
  • 24
  • 39
3

The easiest way would be to use pandas:

import pandas as pd
df = pd.DataFrame.from_dict(your_list_of_dicts)
print(df.to_csv(index=False))

Result:

col1,col2,col3,id
1,2,3,1
2,4,6,2
1,4,7,3

If you want to reorder columns, nothing easier:

col_order = ['id', 'col1', 'col2', 'col3']
df[col_order].to_csv(index=False)

or, to just ensure the id column is first:

df.set_index('id', inplace=True) # the index is always printed first
df.to_csv() # leave the index to True this time
Derlin
  • 9,572
  • 2
  • 32
  • 53
2

With built-in features:

from collections import OrderedDict

ord_d = OrderedDict().fromkeys(('id', 'col1', 'col2', 'col3'))
s = ','.join(ord_d.keys()) + '\n'
for d in lst:
    ord_d.update(d)
    s += ','.join(ord_d.values()) + '\n'

print(s)

The output:

id,col1,col2,col3
1,1,2,3
2,2,4,6
3,1,4,7
RomanPerekhrest
  • 88,541
  • 4
  • 65
  • 105
0

The idea is to get all possible keys and the get all values. Supposing data is the list of dicts that you have. This should work:

output = ''
all_keys = set().union(*(d.keys() for d in data))
output += ",".split(all_keys) + '\n'
for item in data:
    item_str = ",".split([data[key] for key in all_keys if key in data else ''])
    output += item_str + '\n'

source

pabsldn
  • 46
  • 5