-1

I have two DataFrames that roughly look like

(ID) (Category) (Value1)  (Value2)

111   1          5          7
112   1          3          8
113   2          6          9
114   3          2          6

and

(Category)  (Value1 Average for Category) (Value2 Average for Category)

1              4                              5 
2              6                              7
3              9                              2

Ultimately, I'd like to join the two DataFrames so that each ID can have the average value for its category in the row with it. I'm having trouble finding the right way to join/merge/etc. that will fill in columns by checking the category from the other DateFrame. Does anyone have any idea where to start?

ALollz
  • 57,915
  • 7
  • 66
  • 89
nostradukemas
  • 317
  • 3
  • 10

2 Answers2

0

Just do: df1.groupby(['ID', 'Category']).transform(func='mean') on the first dataframe to get the desired dataframe.

hacker315
  • 1,996
  • 2
  • 13
  • 23
  • 1
    This does not work in this case, since his average is not the same as grouping by as we see in his data. – Erfan Mar 01 '19 at 18:51
0

You are simply looking for a join, in pandas we use pd.merge for that like the following:

df3 = pd.merge(df1, df2, on='Category')

    ID  Category    Value1  Value2  Value 1 Average Value 2 Average
0   111 1           5       7       4               5
1   112 1           3       8       4               5
2   113 2           6       9       6               7
3   114 3           2       6       9               2

Official documentation of pandas on merging:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html

Here is a good explanation on joins: Pandas Merging 101

Erfan
  • 40,971
  • 8
  • 66
  • 78