
Given a pandas.DataFrame named hospitals that looks like this:

    hospital gender   age  height  ...  mri  xray children months
0    general      m  33.0   1.640  ...  NaN   NaN      NaN    NaN
1    general      m  48.0   1.930  ...  NaN   NaN      NaN    NaN
2    general      f  23.0   1.540  ...  NaN   NaN      NaN    NaN
3    general      m  27.0   1.940  ...  NaN   NaN      NaN    NaN
4    general      f  22.0   1.760  ...  NaN     f      NaN    NaN
..       ...    ...   ...     ...  ...  ...   ...      ...    ...
995   sports      m  22.0   6.777  ...    f     t      NaN    NaN
996   sports      m  20.0   5.400  ...    t     f      NaN    NaN
997   sports      m  17.0   6.089  ...    f     f      NaN    NaN
998   sports      f  16.0   6.176  ...    f     t      NaN    NaN
999   sports      f  18.0   6.692  ...    t     f      NaN    NaN

[1000 rows x 14 columns]

Why does this work:

self.hospitals = self.hospitals.loc[:, 'bmi':'months'].fillna(0)

but this doesn't? (That is, hospitals is not modified)

self.hospitals.loc[:, 'bmi':'months'].fillna(0, inplace=True)
ksnortum
  • DataFrame.fillna returns None if inplace=True as intended. See here https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html – Ozgur O. Jan 22 '22 at 17:58
  • The second form doesn't work because the slice with `.loc` returns a copy and you fill this copy in place – Corralien Jan 22 '22 at 17:58

1 Answer

Neither of your two statements does what you expect:

The first one extracts a slice of the hospitals dataframe with .loc. The returned copy contains only a subset of the original columns, so after the assignment hospitals has lost every column outside the bmi to months range.

self.hospitals = self.hospitals.loc[:, 'bmi':'months'].fillna(0)

Note: this may be what you want, if you only need to keep the columns from bmi to months.
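To make the column loss visible, here is a minimal sketch. The small frame below is a made-up stand-in for hospitals (the column names are borrowed from the question, the values are invented), and a plain variable is used instead of self.hospitals:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the hospitals frame; values are invented.
hospitals = pd.DataFrame({
    "hospital": ["general", "sports"],
    "age": [33.0, 22.0],
    "bmi": [np.nan, 21.5],
    "months": [np.nan, np.nan],
})

# Same pattern as the first statement: slice, fill, keep the result.
subset = hospitals.loc[:, "bmi":"months"].fillna(0)

print(list(hospitals.columns))  # ['hospital', 'age', 'bmi', 'months']
print(list(subset.columns))     # ['bmi', 'months']
```

Assigning subset back to hospitals would therefore drop hospital and age.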

The second one doesn't modify hospitals because the slice with .loc returns a copy: fillna fills that temporary copy in place, and the copy is then discarded.

self.hospitals.loc[:, 'bmi':'months'].fillna(0, inplace=True)
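If the goal is to fill the NaN values in bmi through months while keeping all the other columns, one possible approach is to assign the filled slice back to the same columns. This is only a sketch on an invented stand-in frame (a plain hospitals variable rather than self.hospitals), not necessarily what the original code needs:

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the hospitals frame; values are invented.
hospitals = pd.DataFrame({
    "hospital": ["general", "sports"],
    "age": [33.0, 22.0],
    "bmi": [np.nan, 21.5],
    "months": [np.nan, np.nan],
})

# Resolve the label range to explicit column names once.
cols = hospitals.loc[:, "bmi":"months"].columns

# Fill a copy of those columns, then write the result back
# into the original frame instead of relying on inplace=True.
hospitals[cols] = hospitals[cols].fillna(0)

print(hospitals)
```

All four columns are still present afterwards, and only the bmi to months range has been filled.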
Corralien