I'm a Python beginner, so I'm practicing some data analysis with pandas on a DataFrame (restaurants_df) that lists restaurants with a Michelin star.
When I show the first 5 rows, for example, I notice that the "price" column (object type) has a blank value in row 4:
In [ ]: restaurants_df.head()
Out[ ]:
            name  year   latitude  longitude            city   region zipCode          cuisine price
0   Kilian Stuba  2019  47.348580   10.17114  Kleinwalsertal  Austria   87568         Creative     $
1  Pfefferschiff  2019  47.837870   13.07917        Hallwang  Austria    5300  Classic cuisine     $
2      Esszimmer  2019  47.806850   13.03409        Salzburg  Austria    5020         Creative     $
3     Carpe Diem  2019  47.800010   13.04006        Salzburg  Austria    5020   Market cuisine     $
4         Edvard  2019  48.216503   16.36852            Wien  Austria    1010   Modern cuisine
Then I check how many NaN values are in each column. The price column has 151 of them:
In [ ]: restaurants_df.isnull().sum()
Out[ ]: name 0
year 0
latitude 0
longitude 0
city 2
region 0
zipCode 149
cuisine 0
price 151
dtype: int64
Next, I replace those values with the string "No Price" and confirm that no NaN values remain:
In [ ]: restaurants_df["price"].fillna("No Price", inplace = True)
restaurants_df.isnull().sum()
Out[ ]: name 0
year 0
latitude 0
longitude 0
city 0
region 0
zipCode 0
cuisine 0
price 0
dtype: int64
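As an aside, the same replacement can be written as a plain assignment instead of an inplace call on a column selection, a pattern that recent pandas versions warn about. This is only a minimal sketch on a small made-up frame: demo_df and its values are illustrative and are not the real restaurant data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for restaurants_df; the values are made up.
demo_df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "price": ["$", np.nan, "$"],
})

# Assign the filled column back instead of using inplace=True
# on the result of demo_df["price"].
demo_df["price"] = demo_df["price"].fillna("No Price")

# Count the NaN values that are left in the column after filling.
remaining_nan = int(demo_df["price"].isna().sum())
print(demo_df)
print("NaN left in price:", remaining_nan)
```

This should behave the same as the inplace version for real NaN values; it does not touch cells that only look empty.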
However, when I show the first 5 rows again, the blank value in row 4 is still there.
In [ ]: restaurants_df.head()
Out[ ]:
            name  year   latitude  longitude            city   region zipCode          cuisine price
0   Kilian Stuba  2019  47.348580   10.17114  Kleinwalsertal  Austria   87568         Creative     $
1  Pfefferschiff  2019  47.837870   13.07917        Hallwang  Austria    5300  Classic cuisine     $
2      Esszimmer  2019  47.806850   13.03409        Salzburg  Austria    5020         Creative     $
3     Carpe Diem  2019  47.800010   13.04006        Salzburg  Austria    5020   Market cuisine     $
4         Edvard  2019  48.216503   16.36852            Wien  Austria    1010   Modern cuisine
Any idea why this is happening and how I can solve it? Thanks in advance!
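One way to narrow this down might be to look at the raw cell values with repr(). The sketch below assumes (a guess, not confirmed for the real data) that the blank cell holds an empty or whitespace-only string, which isnull() does not count as missing. The demo_df frame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one real NaN and two "blank" strings.
demo_df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "price": ["$", np.nan, "", "  "],
})

# isna() only counts the real NaN, not the blank strings.
nan_count = int(demo_df["price"].isna().sum())

# repr() makes blank strings visible as quoted (possibly empty) text.
raw_values = [repr(v) for v in demo_df["price"]]

# Strip whitespace, turn empty strings into NaN, then fill everything at once.
cleaned = demo_df["price"].str.strip().replace("", np.nan).fillna("No Price")
demo_df["price"] = cleaned

print("NaN count before cleaning:", nan_count)
print(raw_values)
print(demo_df)
```

If the real column shows quoted empty or whitespace-only strings in the repr() output, the same strip-and-replace step could be applied to restaurants_df["price"]; if it shows something else, the assumption above does not hold.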