Create a Python dataframe from an existing dataframe, grouped by a column, with min, max, mean and median values

Question

This is what my dataframe looks like.

  A     B       C
  ice   solid   Kate
  ice   solid   Jake
  NaN   solid   Jake
  tea   liquid  Lilly
  tea   solid   Jake
  NaN   liquid  Kate
  tea   liquid  Apple
  ice   liquid  Apple

I would like to fill the missing values of A with the most frequent value of A within each group of B.
So the missing values would become ice where B is solid and tea where B is liquid, based on the most frequent value in each group.
How can I do this?

Thanks for posting sample data. Please see https://stackoverflow.com/help/mcve for tips on posting questions. What code have you tried so far? What was the output or error message? — Evan, Feb 01 '18 at 22:14
For example, can you try this solution: https://stackoverflow.com/questions/42789324/pandas-fillna-mode — Evan, Feb 01 '18 at 22:15
@Evan - I have edited the question. I had already tried the code in the link you mentioned, but it fills all the NaNs with the single most frequent value of the whole column. What I need is to fill each NaN with the "most frequent" value of A within its group of B. — Techflu, Feb 02 '18 at 09:43

score 1 · Answer 1 · edited Feb 02 '18 at 10:30

This is how I solved it. (Outlet_Type is the grouping column in my data, i.e. B in the sample above.)

df['A'] = df.groupby('Outlet_Type').A.transform(lambda x: x.fillna(x.mode()[0]))
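
For anyone who wants to try this on the sample data from the question, here is a minimal sketch of the same idea. It assumes the grouping column is named B, as in the sample, and that the blank cells in A are NaN. The helper name fill_with_group_mode is made up for illustration. It also guards against a group where every value of A is missing, since mode() returns an empty Series in that case and indexing it with [0] would fail.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; blanks in A are assumed to be NaN.
df = pd.DataFrame({
    "A": ["ice", "ice", np.nan, "tea", "tea", np.nan, "tea", "ice"],
    "B": ["solid", "solid", "solid", "liquid", "solid", "liquid", "liquid", "liquid"],
    "C": ["Kate", "Jake", "Jake", "Lilly", "Jake", "Kate", "Apple", "Apple"],
})


def fill_with_group_mode(s):
    """Fill NaNs in one group's values with that group's most frequent value."""
    modes = s.mode()  # NaN is ignored by default; ties come back sorted
    if modes.empty:
        # Every value in this group is missing, so there is nothing to fill with.
        return s
    return s.fillna(modes.iloc[0])


# transform returns a result aligned with the original index,
# so it can be assigned straight back to the column.
df["A"] = df.groupby("B")["A"].transform(fill_with_group_mode)
print(df)
```

On this data the missing value in the solid group becomes ice and the one in the liquid group becomes tea. If a group has a tie for most frequent value, this sketch picks the first one in sorted order, which may or may not be what you want.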

answered Feb 02 '18 at 10:15 by Techflu · edited Feb 02 '18 at 10:30 by ayhan
