1

Hi I have the following data

data = {
    "a": 1,
    "b": 2,
    "c": 3,
    "d": 4,
    "efgh": [
        {
            "e": 5,
            "f": 6,
            "g": 7,
            "h": 8
        }
    ]
}

I would like to convert it to pandas data frame with the following format

image

I have tried with following method

df = pd.DataFrame(data)
df

image

Timus
  • 10,974
  • 5
  • 14
  • 28
Saw Ko
  • 13
  • 1
  • 2
  • 1
    You can first prepare the `data` dictionary as required by [extracting](https://stackoverflow.com/a/11277439/7283201) and [concatenating](https://stackoverflow.com/q/1781571/7283201) items. – Sadman Sakib Mar 27 '22 at 08:57
  • Updated my answer to a more correct and optimal one. – inquirer Mar 27 '22 at 11:59

2 Answers2

0

I create two dictionaries before key "efgh" and after. I attach the dictionary "dict1" to the dictionary "dict2".
In pandas, we convert keys to columns, values to rows.

import pandas as pd

data = { "a": 1, "b": 2, "c": 3, "d": 4, "efgh": [ { "e": 5, "f": 6, "g": 7, "h": 8 } ] }
dict1 = {k: v for k, v in data.items() if k!="efgh"}
dict2 = {k: v for k, v in data['efgh'][0].items()}

dict1.update(dict2)

df = pd.DataFrame([dict1])

print(df)
inquirer
  • 4,286
  • 2
  • 9
  • 16
0

This looks a bit like you are looking for pd.json_normalize? If you do

df = pd.json_normalize(data, record_path="efgh", meta=["a", "b", "c", "d"])

you'll get

   e  f  g  h  a  b  c  d
0  5  6  7  8  1  2  3  4

which is essentially what you want, just with the columns in a different order. You could adjust that by:

df = df.sort_index(axis=1)
   a  b  c  d  e  f  g  h
0  1  2  3  4  5  6  7  8
Timus
  • 10,974
  • 5
  • 14
  • 28