Export data to CSV when rows have different sets of columns. Can it be done via a library?

Question

Data input:

[
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1}
]

CSV output (column order does not matter):

a, b, c, d, e
1, 2, 3
1, 2, ,  4, 5
1, 2, ,  4

The standard library csv module does not seem to handle this kind of input.

Is there some package or library for a single-method export? Or a good solution to deal with column discrepancies?

Check out `pandas.DataFrame.from_dict().to_csv()` You'll probably need to fill the blanks with `NaN`s or something but that would be my first intuition. Otherwise `xlwings` probably has some functionality for this but I've not really used it that much. — havingaball, Oct 29 '21 at 18:28

martineau · Answer 1 · 2021-10-29T19:08:50.760

It can be done fairly easily using the included csv module with a little preliminary processing.

import csv

data = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1}
]

fields = sorted(set().union(*data))  # Determine columns: the union of all keys.

with open('output.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=fields)
    writer.writeheader()
    writer.writerows(data)

print('-fini-')

Contents of file produced:

a,b,c,d,e
1,2,3,,
1,2,,4,5
1,2,,4,
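If the columns should appear in the order they are first seen rather than sorted, the same idea could be wrapped in a small helper. This is only a sketch: `dicts_to_csv` is a hypothetical name, and it writes to an in-memory `io.StringIO` instead of a file so the result is easy to inspect.

```python
import csv
import io


def dicts_to_csv(rows, restval=''):
    """Return CSV text for a list of dicts whose keys may differ.

    Hypothetical helper for illustration, not part of the csv module.
    """
    # dict.fromkeys drops duplicate keys while keeping first-seen order.
    fields = list(dict.fromkeys(key for row in rows for key in row))
    buffer = io.StringIO(newline='')
    writer = csv.DictWriter(buffer, fieldnames=fields, restval=restval)
    writer.writeheader()
    writer.writerows(rows)  # Missing keys are filled with restval.
    return buffer.getvalue()


data = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1}
]

text = dicts_to_csv(data)
print(text)
```

For this input the first-seen order happens to match the sorted order, so the header is still a,b,c,d,e.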

score 1 · Answer 2 · answered Oct 29 '21 at 18:37

Straightforward with pandas:

import pandas as pd

lst = [
{'a': 1, 'b': 2, 'c': 3},
{'b': 2, 'd': 4, 'e': 5, 'a': 1},
{'b': 2, 'd': 4, 'a': 1}
]

df = pd.DataFrame(lst)
print(df.to_csv(index=False))  # index=False omits the row index column.

Output:

a,b,c,d,e
1,2,3.0,,
1,2,,4.0,5.0
1,2,,4.0,

Tranbi

It seems to do the job. Why is there a `.0`? – Kirby Oct 29 '21 at 18:49
It's because pandas upcasts integer columns that contain missing values to float, since `NaN` is a float. You can convert every cell to int with the following: `df = pd.DataFrame(lst).fillna(pd.NA).applymap(lambda x: x if pd.isna(x) else int(x))` – Tranbi Oct 29 '21 at 20:04
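Another way to avoid the `.0`, sketched here under the assumption that every value in the data is an integer, is to cast the frame to pandas' nullable `Int64` dtype before exporting. The variable name `csv_text` is just for illustration.

```python
import pandas as pd

lst = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'e': 5, 'a': 1},
    {'b': 2, 'd': 4, 'a': 1}
]

# 'Int64' (capital I) is the nullable integer dtype: missing cells become
# <NA> instead of forcing the whole column to float.
df = pd.DataFrame(lst).astype('Int64')

# Missing values are written as empty fields by default.
csv_text = df.to_csv(index=False)
print(csv_text)
```

This would fail if a column held non-integer values such as 2.5, so it is not a general replacement for the per-cell conversion above.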

score 1 · Answer 3 · answered Oct 29 '21 at 18:38

You can pass a restval argument to DictWriter; it is the value written for keys that are missing from a row dictionary (it already defaults to an empty string):

writer = csv.DictWriter(file, list('abcde'), restval='')

nadapez

Yes, it seems I missed it. But it still requires listing all possible columns, doesn't it? `fieldnames=...` – Kirby Oct 29 '21 at 18:48
Yes, that is the second argument of DictWriter. – nadapez Oct 29 '21 at 18:56
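To make the effect of `restval` visible, here is a minimal sketch that uses a placeholder other than the empty string. The `'NA'` marker, the sample rows, and the in-memory `io.StringIO` buffer are assumptions chosen for illustration; the field names are still listed explicitly, as discussed in the comments above.

```python
import csv
import io

rows = [
    {'a': 1, 'b': 2, 'c': 3},
    {'b': 2, 'd': 4, 'a': 1}
]

buffer = io.StringIO(newline='')
# Every column must be named up front; restval fills in the keys a row lacks.
writer = csv.DictWriter(buffer, fieldnames=list('abcde'), restval='NA')
writer.writeheader()
writer.writerows(rows)

output_lines = buffer.getvalue().splitlines()
for line in output_lines:
    print(line)
```

With the default `restval=''` the same cells would simply be left empty.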
