I have three data frames like the ones below, and they do not all have the same number of rows.

df
fruits           priceyear_2010
orange                15
apple               10
watermelon            3
melon                 7
strawberry           11

df1
fruits           priceyear_2011
watermelon           5
apple                4
strawberry          19

df2
fruits           priceyear_2012
apple                12
orange               16
watermelon           14
melon                18

I would like to match the rows on the `fruits` string to get a result like this:

df_result
fruits           priceyear_2010  priceyear_2011  priceyear_2012
apple                10               4              12
orange               15             NaN              16
watermelon            3               5              14
melon                 7             NaN              18
strawberry           11              19             NaN

Sorry for asking, but I have no idea how to do this.
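
For reference, the desired `df_result` above corresponds to an outer join on `fruits`. Here is a minimal sketch, assuming the three sample frames are built exactly as shown, that `fruits` is a regular column (not the index), and that each fruit appears at most once per frame. Using `functools.reduce` to chain the merges is just one way to write it:

```python
from functools import reduce

import pandas as pd

# Sample frames copied from the question.
df = pd.DataFrame({
    "fruits": ["orange", "apple", "watermelon", "melon", "strawberry"],
    "priceyear_2010": [15, 10, 3, 7, 11],
})
df1 = pd.DataFrame({
    "fruits": ["watermelon", "apple", "strawberry"],
    "priceyear_2011": [5, 4, 19],
})
df2 = pd.DataFrame({
    "fruits": ["apple", "orange", "watermelon", "melon"],
    "priceyear_2012": [12, 16, 14, 18],
})

dfs = [df, df1, df2]

# Outer-merge the frames pairwise on 'fruits'; a fruit missing from
# one frame gets NaN in that frame's price column.
df_result = reduce(
    lambda left, right: left.merge(right, on="fruits", how="outer"),
    dfs,
)
print(df_result)
```

The row order of the merged frame may differ from the table above. The price columns that contain `NaN` become floats.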

I followed the suggestion in the comments below and got this ValueError:

----> 2 df = pd.concat([x.set_index(['fruits']) for x in dfs], axis = 1)

~/anaconda3/lib/python3.6/site-packages/pandas/core/reshape/concat.py in concat(objs, axis, join, join_axes, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    224                        verify_integrity=verify_integrity,
    225                        copy=copy, sort=sort)
--> 226     return op.get_result()
    227 
    228 


ValueError: Shape of passed values is (3, 2285), indices imply (3, 2284)
Sujin
  • Use `df = pd.concat([df1, df2, df3],axis=1)` – jezrael Nov 21 '18 at 12:06
  • Thank you for your comment. I have updated the post: if the `fruits` column is not in the same order in the 3 data frames, how can I map this? T T – Sujin Nov 21 '18 at 12:11
  • So you need `dfs = [df1, df2, df3]` followed by `df = pd.concat([x.set_index(['fruits']) for x in dfs], axis=1)`? – jezrael Nov 21 '18 at 12:11
  • I have updated the post to say that the 3 data frames are not the same size ;-;. Thank you for your help, but the error occurred, apparently because of their different sizes T T – Sujin Nov 21 '18 at 12:15
  • So my solution is not working? – jezrael Nov 21 '18 at 12:15
  • yes, it's not working T T – Sujin Nov 21 '18 at 12:16
  • Is there an error? Can you add the output from your sample data to the question? – jezrael Nov 21 '18 at 12:17
  • yes, i already updated the post ;-; many thanks for your help T T – Sujin Nov 21 '18 at 12:20
  • I found the problem: the values in `fruits` are not unique; some are duplicated within the same DataFrame. – jezrael Nov 21 '18 at 12:26
  • This should do the trick: `df_f = df.reset_index().merge(df1.reset_index(), on='fruits', how='outer').merge(df2.reset_index(), on='fruits', how='outer').drop_duplicates()`. Only use `reset_index()` if the column `fruits` is the index, as it appears to be based on your post. – Jorge Nov 21 '18 at 12:28
  • So you can check it with `for df in dfs: print(df[df.duplicated('fruits', keep=False)])`, and if you want to remove the duplicated rows, use `df = pd.concat([x.drop_duplicates('fruits').set_index(['fruits']) for x in dfs], axis=1)` – jezrael Nov 21 '18 at 12:37
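
The duplicate check and de-duplication suggested in the comments can be sketched as follows. The two frames here are hypothetical stand-ins with one repeated `apple` row added for illustration; they are not the asker's real data, which is not shown in the post.

```python
import pandas as pd

# Hypothetical data: 'apple' is deliberately repeated in the first frame
# to imitate the non-unique 'fruits' values described in the comments.
df = pd.DataFrame({
    "fruits": ["orange", "apple", "apple", "melon"],
    "priceyear_2010": [15, 10, 10, 7],
})
df1 = pd.DataFrame({
    "fruits": ["apple", "melon"],
    "priceyear_2011": [4, 9],
})
dfs = [df, df1]

# Show every row whose 'fruits' value occurs more than once in its frame.
for frame in dfs:
    print(frame[frame.duplicated(subset="fruits", keep=False)])

# Keep the first row per fruit, index by 'fruits', then align the frames
# side by side; fruits absent from a frame get NaN in its column.
deduped = [x.drop_duplicates(subset="fruits").set_index("fruits") for x in dfs]
df_result = pd.concat(deduped, axis=1)
print(df_result)
```

Note that `drop_duplicates` keeps only the first row for each fruit. If the duplicated rows carry different prices, you would need to decide how to combine them (for example with a `groupby` aggregation) rather than drop them.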
