Arrays to row in pandas

Question

I have following dict which I want to convert into pandas. this dict have nested list which can appear for one node but not other.

dis={"companies": [{"object_id": 123,
                           "name": "Abd ",
                           "contact_name": ["xxxx",
                                                                       "yyyy"],
                           "contact_id":[1234,
                                                                     33455]
                           },
                          {"object_id": 654,
                           "name": "DDSPP"},
                          {"object_id": 987,
                           "name": "CCD"}
                          ]}

AS

object_id, name, contact_name, contact_id
123,Abd,xxxx,1234
123,Abd,yyyy,
654,DDSPP,,
987,CCD,,

How can i achive this

I was trying to do like

abc = pd.DataFrame(dis).set_index['object_id','contact_name']

but it says

'method' object is not subscriptable

What have you tried so far? – FHTMitchell Aug 24 '18 at 09:13 — FHTMitchell, Aug 24 '18 at 09:13
Please check the answers provided. Could help. – anky Aug 27 '18 at 16:35 — anky, Aug 27 '18 at 16:35

anky · Accepted Answer · 2018-08-27T11:00:34.700

This is inspired from @jezrael answer in this link: Splitting multiple columns into rows in pandas dataframe

Use:

s = {"companies": [{"object_id": 123,
                       "name": "Abd ",
                       "contact_name": ["xxxx",
                                                                   "yyyy"],
                       "contact_id":[1234,
                                                                 33455]
                       },
                      {"object_id": 654,
                       "name": "DDSPP"},
                      {"object_id": 987,
                       "name": "CCD"}
                      ]}
df = pd.DataFrame(s) #convert into DF
df = df['companies'].apply(pd.Series) #this splits the internal keys and values into columns
split1 = df.apply(lambda x: pd.Series(x['contact_id']), axis=1).stack().reset_index(level=1, drop=True)
split2 = df.apply(lambda x: pd.Series(x['contact_name']), axis=1).stack().reset_index(level=1, drop=True)
df1 = pd.concat([split1,split2], axis=1, keys=['contact_id','contact_name'])
pd.options.display.float_format = '{:.0f}'.format
print (df.drop(['contact_id','contact_name'], axis=1).join(df1).reset_index(drop=True))

Output with regular index:

    name  object_id  contact_id contact_name
0   Abd         123        1234         xxxx
1   Abd         123       33455         yyyy
2   DDSPP       654         nan          NaN
3   CCD         987         nan          NaN

Is this something you were looking for?

Anna Iliukovich-Strakovskaia · Answer 2 · 2018-08-24T10:54:23.100

0

If you have just only one column needs to convert, then you can use something more shortly, like this:

df = pd.DataFrame(d['companies'])
d = df.loc[0].apply(pd.Series)
d[1].fillna(d[0], inplace=True)
df.drop([0],0).append(d.T)

Otherwise, if you need to do this with more then one raw, you can use it, but it have to be modified.

edited Aug 24 '18 at 10:54

answered Aug 24 '18 at 10:47

Anna Iliukovich-Strakovskaia

1,383
1
9
20

Arrays to row in pandas

2 Answers2

Linked