The dataframe has a column containing lists of dictionaries that all share the same key names. How can I convert it into a tall dataframe? The dataframe is shown below.

A       B
1   [{"name":"john","age":"28","salary":"50000"},{"name":"Todd","age":"36","salary":"54000"}]
2   [{"name":"Alex","age":"48","salary":"70000"},{"name":"Mark","age":"89","salary":"150000"}]
3   [{"name":"jane","age":"36","salary":"20000"},{"name":"Rose","age":"28","salary":"90000"}]

How can I convert the dataframe above into the one below?

A    name   age    salary
1    john   28     50000
1    Todd   36     54000
2    Alex   48     70000
2    Mark   89     150000
3    jane   36     20000
3    Rose   28     90000
Vamsi Nimmala

1 Answer

You need to unnest the lists first, then use the same method I provided before.

newdf=unnesting(df,['B'])
pd.concat([newdf,pd.DataFrame(newdf.pop('B').tolist(),index=newdf.index)],axis=1)
   A age  name  salary
0  1  28  john   50000
0  1  36  Todd   54000
1  2  48  Alex   70000
1  2  89  Mark  150000
2  3  36  jane   20000
2  3  28  Rose   90000

More info: I have attached my self-defined function below; you can also find it on the page I linked.

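import numpy as np
import pandas as pd
def unnesting(df, explode):
    # Repeat each index label once per element in the first list column
    idx = df.index.repeat(df[explode[0]].str.len())
    df1 = pd.concat([pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode], axis=1)
    df1.index = idx
    # Re-attach the remaining columns; the positional axis argument to drop() no longer works in pandas 2.x
    return df1.join(df.drop(columns=explode), how='left')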

Data Input

df.B.to_dict()
{0: [{'name': 'john', 'age': '28', 'salary': '50000'}, {'name': 'Todd', 'age': '36', 'salary': '54000'}], 1: [{'name': 'Alex', 'age': '48', 'salary': '70000'}, {'name': 'Mark', 'age': '89', 'salary': '150000'}], 2: [{'name': 'jane', 'age': '36', 'salary': '20000'}, {'name': 'Rose', 'age': '28', 'salary': '90000'}]}
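
As an alternative to the helper above, newer pandas versions (0.25 and later) ship a built-in DataFrame.explode that does the unnesting step. The sketch below is not the method from the linked page; it rebuilds the sample data from the question, and the function name to_tall is only an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [
        [{"name": "john", "age": "28", "salary": "50000"},
         {"name": "Todd", "age": "36", "salary": "54000"}],
        [{"name": "Alex", "age": "48", "salary": "70000"},
         {"name": "Mark", "age": "89", "salary": "150000"}],
        [{"name": "jane", "age": "36", "salary": "20000"},
         {"name": "Rose", "age": "28", "salary": "90000"}],
    ],
})

def to_tall(frame, column):
    """Hypothetical helper: one output row per dict in `column`."""
    # One row per list element; reset the index so it is unique again.
    exploded = frame.explode(column).reset_index(drop=True)
    # Turn each dict into its own set of columns, aligned on the new index.
    expanded = pd.DataFrame(exploded[column].tolist(), index=exploded.index)
    return pd.concat([exploded.drop(columns=column), expanded], axis=1)

tall = to_tall(df, "B")
print(tall)
```

Resetting the index before the concat is a deliberate choice here: it gives the result a plain 0..n-1 index instead of the repeated labels shown in the output above.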
BENY
  • I like the (your) `chain` method in your link :D – ALollz Jan 21 '19 at 05:00
  • Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'.... This is the error I get. When I debugged, it occurs at line 2 of the function definition. Could you please help me with that? – Vamsi Nimmala Jan 21 '19 at 19:49
  • @VamsiNimmala What does your real data look like, and why is there a float here? Shouldn't it all be object? – BENY Jan 21 '19 at 20:07
  • My data is extracted from mongo, I even converted the dtype of the dataframe using astype(object). – Vamsi Nimmala Jan 22 '19 at 03:17
  • @VamsiNimmala I have attached the data I am using here – BENY Jan 22 '19 at 03:27
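
Regarding the casting error discussed in the comments: one possible cause, and this is only an assumption since the real Mongo data is not shown, is that some rows of B hold NaN instead of a list. In that case .str.len() returns floats (NaN for the missing rows), and a float repeat count is likely what index.repeat refuses to cast. A minimal sketch of checking for and filtering out such rows, with made-up data:

```python
import numpy as np
import pandas as pd

# Made-up data: the middle row is missing its list (assumed scenario).
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [
        [{"name": "john", "age": "28", "salary": "50000"}],
        np.nan,
        [{"name": "jane", "age": "36", "salary": "20000"}],
    ],
})

# With a missing entry, the lengths contain NaN and so cannot be integers.
lengths = df["B"].str.len()
n_missing = int(lengths.isna().sum())

# Keep only the rows whose B value really is a list before unnesting.
is_list = df["B"].apply(lambda v: isinstance(v, list))
clean = df[is_list]
clean_lengths = clean["B"].str.len().tolist()

print(n_missing, clean_lengths)
```

If the check shows missing rows, passing the filtered frame to unnesting should avoid the float lengths. Whether dropping those rows is acceptable depends on the data.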