3

I'm trying to select all columns that start with a certain string and then fill all of their null values with a new value. What I'm doing now just turns the column headers into a list and leaves the nulls in place.

lifestyle_var = [col for col in list(df) if col.startswith('lifestyle')]

df[lifestyle_var].fillna(1, inplace=True)
Ron

2 Answers

1

I ran into the same problem; see https://github.com/pydata/pandas/issues/10342

You can assign the filled columns back with: df.loc[:, lifestyle_var] = df.loc[:, lifestyle_var].fillna(1)

The problem happens because df[lifestyle_var] returns a copy of the selected columns, so fillna(..., inplace=True) fills that copy and not the original dataframe.
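To make the copy-versus-original point concrete, here is a minimal sketch. The column names and values are made up for illustration; only the `lifestyle` prefix and the `df.loc` assignment come from the thread.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two 'lifestyle' columns plus one unrelated column.
df = pd.DataFrame({
    "lifestyle_a": [1.0, np.nan],
    "lifestyle_b": [np.nan, 4.0],
    "other": [np.nan, 6.0],
})
lifestyle_var = [col for col in df.columns if col.startswith("lifestyle")]

# Selecting with a list of column names builds a new DataFrame.
# Filling that new object leaves df itself unchanged.
filled = df[lifestyle_var].fillna(1)
nulls_after_fill = int(df[lifestyle_var].isna().sum().sum())

# Assigning the filled columns back is what changes df.
df.loc[:, lifestyle_var] = filled
nulls_after_assign = int(df[lifestyle_var].isna().sum().sum())

print(nulls_after_fill, nulls_after_assign)
```

The unrelated column keeps its missing value, since only the selected columns are assigned back.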

0

Try

df.update(df[lifestyle_var].fillna(1))

See this.

Example:

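import numpy as np
import pandas as pd

# Two rows; NaN marks the missing values
data = pd.DataFrame([[1, 2, np.nan], [np.nan, np.nan, 6]], columns=['a1', 'b', 'a2'])
# Columns whose names start with 'a' (note: 'vars' shadows the Python built-in)
vars = [col for col in data.columns if col.startswith('a')]
# fillna returns a filled copy; update writes those values back into data in place
data.update(data[vars].fillna(value=1))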
vmg
  • This ends up not working as it doesn't treat lifestyle_var as the column headers for whatever reason. – Ron Sep 17 '15 at 17:49
  • @Ron can you try the example and tell me the result? i.e. print the values for 'vars' and 'data' – vmg Sep 17 '15 at 17:58
  • It worked for your example: `>>> print vars ['a1', 'a2'] >>> print data a1 b a2 0 1 2 1 1 1 NaN 6` I've now got it to work on my data too; I had left the argument inplace=True in the fillna call when I first tested it. – Ron Sep 17 '15 at 18:41
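
The behaviour discussed in these comments can be checked with a short sketch, assuming the same toy frame as the answer's example. The name `a_cols` is a hypothetical stand-in for `vars`, used to avoid shadowing the built-in. Calling `fillna` without `inplace=True` returns the filled frame that `update` needs, and `update` itself modifies `data` in place and returns `None`.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    [[1, 2, np.nan], [np.nan, np.nan, 6]],
    columns=["a1", "b", "a2"],
)
a_cols = [col for col in data.columns if col.startswith("a")]

# fillna (without inplace=True) hands back a new, filled DataFrame.
filled = data[a_cols].fillna(value=1)

# update copies the non-null values of 'filled' into 'data' in place.
result = data.update(filled)

print(result)  # update has no return value
print(data)
```

Column `b` is not among the selected columns, so its missing value is left alone.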