-2

I have a pandas DataFrame with two columns: toy and color. The color column includes missing values.

How do I fill the missing color values with the most frequent color for that particular toy?

Here's the code to create a sample dataset:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'toy':['car'] * 4 + ['train'] * 5 + ['ball'] * 3 + ['truck'],
    'color':['red', 'blue', 'blue', nan, 'green', nan,
             'red', 'red', np, 'blue', 'red', nan, 'green']
    })
Timus
  • Why did you add the `np` module to the "color" values? – mkrieger1 Dec 10 '22 at 10:08
  • Does this answer your question? [How to Pandas fillna() with mode of column?](https://stackoverflow.com/questions/42789324/how-to-pandas-fillna-with-mode-of-column) – Timus Dec 12 '22 at 12:11

2 Answers

0

Instead of `nan` and `np`, you have to use `np.nan`:

>>> df = pd.DataFrame({
'toy':['car'] * 4 + ['train'] * 5 + ['ball'] * 3 + ['truck'],
'color':['red', 'blue', 'blue', np.nan, 'green', np.nan,
         'red', 'red', np.nan, 'blue', 'red', np.nan, 'green']
})
>>> df.color = df.color.fillna(method='mode')
      toy  color
0     car    red
1     car   blue
2     car   blue
3     car   mode
4   train  green
5   train   mode
6   train    red
7   train    red
8   train   mode
9    ball   blue
10   ball    red
11   ball   mode
12  truck  green
  • Are you sure a `method='mode'` is actually available? – Timus Dec 10 '22 at 17:49
  • Yes, please refer to pandas documentations – Priyanshu Shekhar Sinha Dec 11 '22 at 22:34
  • I [did](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.fillna.html), but wasn't able to find anything? – Timus Dec 11 '22 at 22:50
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Dec 14 '22 at 12:34
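As far as I can tell from the linked documentation, the `method` argument of `fillna` only covers fill directions such as `ffill` and `bfill`, so `'mode'` is not among them. One possible way to get the per-toy behaviour the question asks for is to compute each toy's most frequent color with `groupby(...).transform` and pass that to `fillna`. This is only a sketch: the helper name `most_frequent` is my own, and a tie (as for `ball`) resolves to the alphabetically first color because `mode()` returns its values sorted.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'toy': ['car'] * 4 + ['train'] * 5 + ['ball'] * 3 + ['truck'],
    'color': ['red', 'blue', 'blue', np.nan, 'green', np.nan,
              'red', 'red', np.nan, 'blue', 'red', np.nan, 'green'],
})

def most_frequent(colors):
    """Return the most frequent non-missing value, or NaN if there is none."""
    modes = colors.mode()  # mode() skips NaN by default and returns sorted values
    return modes.iloc[0] if not modes.empty else np.nan

# Broadcast each toy's most frequent color back to every row of that toy
group_mode = df.groupby('toy')['color'].transform(most_frequent)

# Only the missing entries are replaced; existing colors stay untouched
df['color'] = df['color'].fillna(group_mode)
```

With the sample data this should fill the missing `car` color with `blue` and the missing `train` colors with `red`. If a toy has no known color at all, the value stays missing.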
-3

To create a DataFrame, we first need to import pandas. A DataFrame is created with the `pd.DataFrame()` constructor. Its first parameter, `data`, holds the values that fill the table; further optional parameters such as `index` and `columns` set the row and column labels.
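As a small illustration of the construction step described here, the sketch below builds a DataFrame with `pd.DataFrame`, once from a dict alone and once with an explicit `index`. The variable names and values are made up for the example.

```python
import pandas as pd

# The first argument (data) can be a dict mapping column names to column values
toys = pd.DataFrame({'toy': ['car', 'train'], 'color': ['red', 'green']})

# The optional index argument sets the row labels instead of the default 0, 1, ...
labeled = pd.DataFrame(
    {'toy': ['car', 'train'], 'color': ['red', 'green']},
    index=['a', 'b'],
)
```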