1

Data input:

[
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1}
]

CSV output (column order does not matter):

a, b, c, d, e
1, 2, 3
1, 2, ,  4, 5
1, 2, ,  4

The standard library csv module cannot handle this kind of input.

Is there a package or library that can do this export in a single call? Or a good way to deal with the column discrepancies?

martineau
  • 119,623
  • 25
  • 170
  • 301
Kirby
  • 2,847
  • 2
  • 32
  • 42
  • 1
    Check out `pandas.DataFrame.from_dict().to_csv()`. You'll probably need to fill the blanks with `NaN`s or something, but that would be my first suggestion. Otherwise `xlwings` probably has some functionality for this, but I haven't used it much. – havingaball Oct 29 '21 at 18:28

3 Answers

2

It can be done fairly easily using the included csv module with a little preliminary processing.

import csv

data = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1}
]

fields = sorted(set().union(*data))  # Union of all keys, sorted, gives the columns.

with open('output.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=fields)
    writer.writeheader()
    writer.writerows(data)

print('-fini-')

Contents of file produced:

a,b,c,d,e
1,2,3,,
1,2,,4,5
1,2,,4,
martineau
  • 119,623
  • 25
  • 170
  • 301
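If the columns should appear in the order they are first encountered rather than sorted, the same idea can be sketched with a small helper. This is only an illustration: `collect_fields` is a hypothetical name, not part of the csv module, and the output goes to an in-memory buffer instead of a file so the result is easy to inspect.

```python
import csv
import io

data = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1},
]


def collect_fields(rows):
    """Return every key seen in rows, in first-seen order (hypothetical helper)."""
    seen = {}
    for row in rows:
        # dict.fromkeys keeps insertion order; keys already present keep their slot.
        seen.update(dict.fromkeys(row))
    return list(seen)


fields = collect_fields(data)

# Write to an in-memory buffer; swap in open('output.csv', 'w', newline='') for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fields)
writer.writeheader()
writer.writerows(data)
csv_text = buffer.getvalue()
print(csv_text)
```

For the sample data this yields the same column list as the sorted version, because the keys happen to be first seen in alphabetical order.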
1

Straightforward with pandas:

import pandas as pd

lst = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1}
]

df = pd.DataFrame(lst)
print(df.to_csv(index=False))

Output:

a,b,c,d,e
1,2,3.0,,
1,2,,4.0,5.0
1,2,,4.0,
Tranbi
  • 11,407
  • 6
  • 16
  • 33
  • It seems to do the job. Why is there a `.0`? – Kirby Oct 29 '21 at 18:49
  • It's because pandas stores those cells as float (the missing values are NaN, which is a float). You can convert every cell to int with the following: `df = pd.DataFrame(lst).fillna(pd.NA).applymap(lambda x: x if pd.isna(x) else int(x))` – Tranbi Oct 29 '21 at 20:04
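Another way to avoid the `.0` suffix, offered here as a sketch rather than as the answerer's method, is to cast the frame to pandas' nullable integer dtype `Int64`. It assumes every present value is a whole number, as in the sample data; missing cells become `<NA>` and are written as empty fields.

```python
import pandas as pd

lst = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1},
]

# 'Int64' (capital I) is the nullable integer dtype: it can hold missing
# values without falling back to float. Assumes all values are whole numbers.
df = pd.DataFrame(lst).astype('Int64')

csv_text = df.to_csv(index=False)
print(csv_text)
```

The cast raises an error if a column holds non-integral values, so it only fits data that is integer-valued to begin with.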
1

You can pass a restval argument to DictWriter. It is the value written for keys that are missing from a dictionary, and it already defaults to an empty string:

writer = csv.DictWriter(file, list('abcde'), restval='')
nadapez
  • 2,603
  • 2
  • 20
  • 26
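To make the effect of `restval` concrete, here is a minimal sketch that writes the question's data twice: once with the default and once with a visible placeholder. The helper `to_csv_text` and the placeholder `'NA'` are illustrative choices, not part of the csv module.

```python
import csv
import io

data = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1},
]
fields = list('abcde')


def to_csv_text(rows, fieldnames, restval=''):
    """Render rows as CSV text, filling missing keys with restval (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval=restval)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


default_text = to_csv_text(data, fields)             # missing cells left empty
na_text = to_csv_text(data, fields, restval='NA')    # missing cells written as NA
print(default_text)
print(na_text)
```

With the default, the output matches the empty-cell layout asked for in the question, so `restval` only needs to be set when a different placeholder is wanted.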
  • Yes, it seems I missed it. But it still requires listing all possible columns, doesn't it? `fieldnames=...` – Kirby Oct 29 '21 at 18:48
  • Yes, that is the second argument of DictWriter. – nadapez Oct 29 '21 at 18:56