Trying to select data from all columns that start with string from a pandas dataframe

Question

I'm trying to select all columns that start with a certain string and then fill all the null values in with a new value. What I'm doing now just turns all the column headers into a list instead though.

lifestyle_var = [col for col in list(df) if col.startswith('lifestyle')]

df[lifestyle_var].fillna(1, inplace=True)

score 1 · Accepted Answer · answered Sep 17 '15 at 16:53

I ran into the same problem; see https://github.com/pydata/pandas/issues/10342

You can use this command: df.loc[:,lifestyle_var] = df.loc[:,lifestyle_var].fillna(1)

The original attempt fails because df[lifestyle_var] returns a copy, so fillna(..., inplace=True) fills that copy and leaves the original dataframe unchanged.
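A minimal sketch of the difference, using made-up data (the dataframe contents and the column names `lifestyle_a`, `lifestyle_b` and `other` are assumptions for illustration, not taken from the question). It avoids inplace=True and instead counts the nulls before and after assigning the filled selection back:

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names are invented for this example.
df = pd.DataFrame({
    "lifestyle_a": [1.0, np.nan],
    "lifestyle_b": [np.nan, 4.0],
    "other": [np.nan, 6.0],
})
lifestyle_var = [col for col in df.columns if col.startswith("lifestyle")]

# fillna on the selection returns a new object; df itself is untouched.
filled = df[lifestyle_var].fillna(1)
nulls_before = int(df[lifestyle_var].isna().sum().sum())

# Assigning the result back through .loc writes the values into df.
df.loc[:, lifestyle_var] = filled
nulls_after = int(df[lifestyle_var].isna().sum().sum())

print(nulls_before, nulls_after)
```

Columns outside lifestyle_var, such as `other` here, keep their nulls.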

score 0 · Answer 2 · edited May 23 '17 at 12:31

Try

df.update(df[lifestyle_var].fillna(1))

See this.

Example:

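import numpy as np
import pandas as pd

data = pd.DataFrame([[1, 2, np.nan], [np.nan, np.nan, 6]], columns=['a1', 'b', 'a2'])
# Names of the columns that start with 'a' (note: 'vars' shadows the Python builtin)
vars = [col for col in list(data) if col.startswith('a')]
# update() modifies data in place, so no assignment and no inplace argument is needed
data.update(data[vars].fillna(value=1))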
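To check what the example does, here is a sketch that repeats it and inspects the outcome. The name `a_cols` is my own substitute for `vars`, chosen only to avoid shadowing the builtin; `result` and `remaining_nulls` are likewise illustrative names.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    [[1, 2, np.nan], [np.nan, np.nan, 6]],
    columns=["a1", "b", "a2"],
)
a_cols = [col for col in data.columns if col.startswith("a")]

# update() changes data in place and returns None,
# so its return value should not be assigned back to data.
result = data.update(data[a_cols].fillna(value=1))

# Count the nulls left in each column after the update.
remaining_nulls = data.isna().sum().to_dict()
print(result, remaining_nulls)
```

Only the columns starting with 'a' are filled; the null in `b` is left alone.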

answered Sep 17 '15 at 16:50

vmg

This ends up not working as it doesn't treat lifestyle_var as the column headers for whatever reason. – Ron Sep 17 '15 at 17:49
@Ron can you try the example and tell me the result? i.e. print the values for 'vars' and 'data' – vmg Sep 17 '15 at 17:58
Worked for your example:
>>> print vars
['a1', 'a2']
>>> print data
   a1    b  a2
0   1    2   1
1   1  NaN   6
Now got it to work; I had kept the argument inplace=True when I initially tested it. – Ron Sep 17 '15 at 18:41
