5

I have an input dict mapping strings to lists, where the lists may have different lengths.

d = {'b': [2,3], 'a': [1]}

When I do df = pd.DataFrame(data=d), I get ValueError: arrays must all be same length

Question: How do I fill the missing values with a default (e.g. 0) when creating the df?


The reason for creating the df is to get the final result: {'b': 3}

where 3 is the max of all the numbers in the lists.

XoXo
  • You can look at a similar question for multiple methods: https://stackoverflow.com/questions/19736080/creating-dataframe-from-a-dictionary-where-entries-have-different-lengths In addition, you need `fillna` after creating the dataframe. – niraj Feb 12 '19 at 20:12
  • [df.max().max()](https://stackoverflow.com/a/45681145/3380951) gets the max value 3 once the df is created – XoXo Feb 14 '19 at 12:13

2 Answers

7

You can use DataFrame.from_dict with orient='index', so that the dictionary keys become the row index and the shorter lists are padded with NaN. Then fill the NaNs using .fillna and transpose so that the keys become the columns:

pd.DataFrame.from_dict(d, orient='index').fillna(0).T

     b    a
0  2.0  1.0
1  3.0  0.0
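
As a minimal sketch of how this could be carried through to the {'b': 3} result the question asks for, assuming the goal is the column holding the overall maximum (the names col_max, best and result are illustrative, not from the answer):

```python
import pandas as pd

d = {'b': [2, 3], 'a': [1]}

# Keys become rows, shorter lists are padded with NaN, NaN is replaced by 0,
# and the transpose turns the keys back into columns.
df = pd.DataFrame.from_dict(d, orient='index').fillna(0).T

# Largest value per column, then the label of the column with the largest one.
col_max = df.max()
best = col_max.idxmax()

# fillna(0) turned the columns into floats, so cast back to int for the result.
result = {best: int(col_max[best])}
print(result)
```

Note that 0 is only a safe default for this purpose when the data contains at least one non-negative value; with all-negative lists the filled zeros would win the max.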
yatu
1
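import pandas as pd
d = {'b': [2, 3], 'a': [1]}
# Wrapping each list in a Series lets pandas align them by index and pad with NaN
df = pd.DataFrame({k: pd.Series(v) for k, v in d.items()})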

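This gives the following output (the column order may differ depending on your pandas version). The missing entry is NaN; chain .fillna(0) to replace it with 0.

     a  b
0  1.0  2
1  NaN  3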
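
A short sketch of how the Series approach might be combined with the default value from the question. It also illustrates that, if only the overall maximum is needed, the fill step can probably be skipped, since max() ignores NaN by default (the names filled and overall_max are illustrative):

```python
import pandas as pd

d = {'b': [2, 3], 'a': [1]}

# Each list becomes a Series; pandas aligns them on the index and pads with NaN.
df = pd.DataFrame({k: pd.Series(v) for k, v in d.items()})

# Replace the padding with the default value 0, as asked in the question.
filled = df.fillna(0)

# max() skips NaN by default (skipna=True), so the unfilled frame works here too.
overall_max = df.max().max()

print(filled)
print(overall_max)
```

Working on the unfilled frame avoids the case where a filled 0 would exceed every real value, for example when all the lists hold negative numbers.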
Vinoth