1

How can I select all columns whose header names start with "durations" or "shape", instead of writing out a long list of column names? I need to select these columns and replace blank fields with 0.

column_names = ['durations.blockMinutes_x',
                'durations.scheduledBlockMinutes_y']
data[column_names] = data[column_names].fillna(0)
Klausos Klausos

4 Answers

0

You could use the str.startswith method on the column index:

df = data[data.columns[data.columns.str.startswith('durations') | data.columns.str.startswith('shape')]]
df = df.fillna(0)

Or you could use the str.contains method with a regular expression anchored at the start of the name:

df = data.iloc[:, data.columns.str.contains('^(?:durations|shape)')]
df = df.fillna(0)
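
Both snippets above fill a separate frame, df, and leave data itself unchanged. To write the zeros back into data, the same boolean mask can be turned into a list of column labels and assigned to, as in the question. A minimal sketch, assuming a small made-up frame (the column names and values are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column-name prefixes matter here.
data = pd.DataFrame({
    "durations.blockMinutes_x": [10.0, np.nan],
    "shape.area": [np.nan, 2.0],
    "other": [np.nan, 1.0],
})

# Boolean mask over the column index: True where the name has a wanted prefix.
mask = data.columns.str.startswith("durations") | data.columns.str.startswith("shape")

# Turn the mask into column labels, then fill NaN only in those columns.
cols = data.columns[mask]
data[cols] = data[cols].fillna(0)
```

The column "other" keeps its NaN because its name matches neither prefix.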
Anton Protopopov
0

Use data.columns.values.tolist() to get the column names as a list (based on the question "Get list from pandas DataFrame column headers"), then keep only the names with the wanted prefixes:

column_names = [x for x in data.columns.values.tolist() if x.startswith("durations") or x.startswith("shape")]
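
The resulting list can then be used exactly like the hand-written one in the question. A short sketch, assuming every column name is a string and using made-up sample data; it also relies on the fact that Python's str.startswith accepts a tuple of prefixes:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
data = pd.DataFrame({
    "durations.scheduledBlockMinutes_y": [np.nan, 45.0],
    "shape.length": [3.0, np.nan],
    "carrier": ["AA", None],
})

# str.startswith takes a tuple, so both prefixes are tested in one call.
prefixes = ("durations", "shape")
column_names = [c for c in data.columns if c.startswith(prefixes)]

# Same assignment pattern as in the question, with the computed list.
data[column_names] = data[column_names].fillna(0)
```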
Mad Physicist
0

I would use the select method:

data.select(lambda c: c.startswith('durations') or c.startswith('shape'), axis=1)
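
Note that DataFrame.select was deprecated in pandas 0.21 and removed in pandas 1.0, so the line above only runs on older versions. On current versions, one possible equivalent is to evaluate the same predicate for each column name and pass the resulting booleans to .loc. A sketch under that assumption, with a hypothetical helper named wanted and made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
data = pd.DataFrame({
    "durations.blockMinutes_x": [np.nan, 30.0],
    "shape.width": [1.5, np.nan],
    "flightNumber": [100, 200],
})

def wanted(name):
    # Same predicate as the lambda passed to select above.
    return name.startswith("durations") or name.startswith("shape")

# One boolean per column, in column order, used as a column mask.
keep = [wanted(c) for c in data.columns]

# Select the matching columns and fill NaN in the resulting copy.
subset = data.loc[:, keep].fillna(0)
```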

Paul H
0

A simple and easy way is to use filter with a regular expression:

data[data.filter(regex='durations|shape').columns].fillna(0)
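
Two caveats with the one-liner: the pattern 'durations|shape' matches those words anywhere in a column name, not only at the start, and the filled result is returned rather than stored. A sketch of an anchored variant that assigns back, using made-up sample data (the column "total_durations" is hypothetical and included only to show the difference):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
data = pd.DataFrame({
    "durations.blockMinutes_x": [np.nan, 20.0],
    "shape.area": [4.0, np.nan],
    "total_durations": [np.nan, 60.0],
})

# '^' anchors the match at the start of the name, so "total_durations" is skipped.
cols = data.filter(regex=r"^(?:durations|shape)").columns

# Assign back so that data itself is updated.
data[cols] = data[cols].fillna(0)
```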


Noordeen