I am trying to create a pandas DataFrame from a list of dictionaries that looks like this:


[{'3600': '12', '7600': '1212343'}, {'0': 0.0, '3600': 0.0, '7200': 0.0, '10800': 0.0, '14400': 0.0, '18000': 0.0, '21600': 0.0, '25200': 116.93828280645994} .... ]

My column names are given as a list: ["col1", "col2" ...]

What I want is for the keys of the dicts to become the index, and for the values of each dict to fill one column. In this example:

                       col1                  col2
0                       0/NaN                0.0
3600                    12                   0.0
7600                    1212343              NaN
7200                    NaN                  0.0
10800                   NaN                  0.0
14400                   NaN                  0.0
18000                   NaN                  0.0
21600                   NaN                  0.0
25200                   NaN                  116.93828280645994

So each dictionary supplies the values of one column. Since the dicts can have different sizes, the missing entries need to be filled with NaN.

I thought I had already figured this out with the help of another question ("Create a Dataframe from list of Dictionaries"), like this:


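    from collections import defaultdict

    import pandas as pd

    columns = ["col1", "col2" ...]
    df_data = mydataasabove

    # Collect the values of every key across all dicts.
    # Note: append() does not record which dict a value came from.
    final_dict = defaultdict(list)

    for data in df_data:
        for key, value in data.items():
            final_dict[key].append(value)

    final_dict = dict(final_dict)

    df = pd.DataFrame.from_dict(final_dict, orient='index', columns=columns)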

But this gives me a df like this:

                     col1                     col2
3600                   12                    0.0
7600              1212343                    NaN
0                       0                    NaN
7200                    0                    NaN
10800                   0                    NaN
14400                   0                    NaN
18000                   0                    NaN
21600                   0                    NaN
25200             116.938                    NaN

As you can see the values do not correspond correctly to my columns. The output of printing final_dict is:

{'3600': ['12', 0.0], '7600': ['1212343'], '0': [0.0], '7200': [0.0], '10800': [0.0], '14400': [0.0], '18000': [0.0], '21600': [0.0], '25200': [116.93828280645994]}

I also tried something along these lines with ChainMap:

    df = pd.DataFrame.from_dict(ChainMap(*nec_data), orient='index', columns=['col1'])

but I couldn't add multiple columns.

Maybe someone can lend me a hand? It would be very much appreciated! Thanks in advance.

Micromegas
  • `pd.DataFrame(your_dictionary).T` ? – anky Aug 20 '20 at 17:33
  • Does this answer your question? [Convert list of dictionaries to a pandas DataFrame](https://stackoverflow.com/questions/20638006/convert-list-of-dictionaries-to-a-pandas-dataframe) – Puneet Singh Aug 20 '20 at 17:38
  • `pd.DataFrame.from_dict(your_dictionary, orient='index')` – ansev Aug 20 '20 at 17:58
  • @anky, thank you, but this transposes my dictionary, but I need the keys to remain the index. – Micromegas Aug 20 '20 at 18:25
  • @ansev thank you, but as you can see, this is what I am already trying after my loop and it gives me the df which I describe - which unfortunately - populates the df with incorrect values – Micromegas Aug 20 '20 at 18:25
  • @PuneetSingh thank you, but no it doesn't since it doesn't give me the keys as index with the correct column values – Micromegas Aug 20 '20 at 18:26
  • The keys will be the index with that code. Can you try it? Please let me know what you get and how it differs from what you want – anky Aug 20 '20 at 18:28
  • @anky My apologies, I was trying your suggestion before but specified the kwarg columns and that's why it got switched. Your suggestion does indeed give me the correct df which I want to get, but it doesn't give me the correct column names I want (but it gives me 0,1...). If I specify the columns it gets switched around.... Might you have an idea how to adjust that? – Micromegas Aug 20 '20 at 18:36
  • Figured it out, had to use index instead of columns – Micromegas Aug 20 '20 at 18:37
  • @anky thank you so much for this! If you want to formulate an answer I'd be happy to accept it to give you credit for it! – Micromegas Aug 20 '20 at 18:38
  • hmm, do you mean columns should be `col1` and `col2` instead of 0 and 1? `pd.DataFrame(d).T.rename(columns=lambda x: f"col{x+1}")` ? – anky Aug 20 '20 at 18:39
  • @anky Yes, that's exactly what I meant. See my comments above: I achieved it by specifying index=["col1", "col2"] instead of columns=.. Thank you very much, really helped me out. As I said above, if you want to formulate an answer I'd be happy to accept it to give you credit for it! – Micromegas Aug 20 '20 at 18:41
  • Okay :) don't specify when you can do dynamic :-) – anky Aug 20 '20 at 18:43
  • Yess agreed :) !! – Micromegas Aug 20 '20 at 18:45
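
The `index=` variant mentioned in the comments above can be sketched as follows. This is only an illustration under assumptions: `d` is a trimmed copy of the question's sample data, and the list `columns` must contain exactly one name per dictionary.

```python
import pandas as pd

# Trimmed copy of the sample data from the question (assumption: shortened for brevity).
d = [
    {'3600': '12', '7600': '1212343'},
    {'0': 0.0, '3600': 0.0, '25200': 116.93828280645994},
]
columns = ["col1", "col2"]  # one name per dictionary in `d`

# index= labels the rows (one row per dict); transposing turns those
# row labels into column names and the dict keys into the index.
df = pd.DataFrame(d, index=columns).T
print(df)
```

Passing the names through `columns=` instead would select dictionary keys rather than label the dictionaries, which is why the names have to go in as `index=` before the transpose.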

1 Answer

You can read the list of dictionaries (here `d`) into a DataFrame, transpose it, and then rename the columns with `df.rename` and an f-string:

pd.DataFrame(d).T.rename(columns=lambda x: f"col{x+1}")

          col1     col2
3600        12        0
7600   1212343      NaN
0          NaN        0
7200       NaN        0
10800      NaN        0
14400      NaN        0
18000      NaN        0
21600      NaN        0
25200      NaN  116.938
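
For anyone who wants to run this end to end, here is a self-contained sketch of the one-liner above. It assumes `d` is the list of dictionaries from the question; the intermediate name `wide` is introduced only for illustration.

```python
import pandas as pd

# Sample data copied from the question; `d` is the name used in the one-liner above.
d = [
    {'3600': '12', '7600': '1212343'},
    {'0': 0.0, '3600': 0.0, '7200': 0.0, '10800': 0.0, '14400': 0.0,
     '18000': 0.0, '21600': 0.0, '25200': 116.93828280645994},
]

# Each dict becomes one row (labelled 0, 1, ...); keys missing from a dict become NaN.
wide = pd.DataFrame(d)

# Transpose so the dict keys form the index, then rename 0 -> col1, 1 -> col2.
df = wide.T.rename(columns=lambda x: f"col{x + 1}")
print(df)
```

Because the column names are derived from the position of each dictionary, the same line works for any number of dictionaries without a hand-written list of names.
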
anky
  • Fantastic, that did the trick! Thank you so much, very much appreciated!! – Micromegas Aug 20 '20 at 18:42
  • Maybe out of place here, but a small follow-up question: sorting this by the actual numeric value of the index (0, 3600, 7200, 7600 etc...) does not seem to work. Is that because the DataFrame is created from 2 dictionaries and the order of creation determines the index order? Or can I specify this somehow? – Micromegas Aug 20 '20 at 19:10
  • @Micromegas hmm, so there are 2 dicts which influences the creation of a dataframe? really can't guess :-) I am okay if you edit the question and unaccept my answer or ask a new one. Whatever suits you :) feel free please – anky Aug 20 '20 at 19:14
  • Thank you! I ended up just overwriting the index via df.index = ["..", "..."] – Micromegas Aug 20 '20 at 20:31
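
As an alternative to overwriting the index by hand, here is a hedged sketch of sorting it numerically. It assumes the index labels are the string keys from the question's sample data and that every key parses as an integer. With string labels, a plain `sort_index()` would order them lexicographically (for example `'10800'` before `'3600'`), which may be what the follow-up comment ran into.

```python
import pandas as pd

# Sample data copied from the question.
d = [
    {'3600': '12', '7600': '1212343'},
    {'0': 0.0, '3600': 0.0, '7200': 0.0, '10800': 0.0, '14400': 0.0,
     '18000': 0.0, '21600': 0.0, '25200': 116.93828280645994},
]
df = pd.DataFrame(d).T.rename(columns=lambda x: f"col{x + 1}")

# Convert the string labels to integers so that sorting is numeric,
# then sort the rows by the index.
df.index = df.index.astype(int)
df = df.sort_index()

print(df.index.tolist())
# → [0, 3600, 7200, 7600, 10800, 14400, 18000, 21600, 25200]
```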