1

My data table looks like this. enter image description here With data.columns of following 60 columns

['WeatherHR0', 'WeatherHR1', 'WeatherHR2', 'WeatherHR3', 'WeatherHR4',
       'WeatherHR5', 'WeatherHR6', 'WeatherHR7', 'WeatherHR8', 'WeatherHR9',
       'WeatherHR10', 'WeatherHR11', 'WeatherHR12', 'WeatherHR13',
       'WeatherHR14', 'WeatherHR15', 'WeatherHR16', 'WeatherHR17',
       'WeatherHR18', 'WeatherHR19', 'WeatherHR20', 'WeatherHR21',
       'WeatherHR22', 'WeatherHR23', 'AvgDB', 'HDD0', 'HDD5', 'HDD10', 'HDD13',
       'HDD18', 'CDD15', 'CDD18', 'Peak Average', 'Day of Week', 'Holiday',
       'HR1', 'HR2', 'HR3', 'HR4', 'HR5', 'HR6', 'HR7', 'HR8', 'HR9', 'HR10',
       'HR11', 'HR12', 'HR13', 'HR14', 'HR15', 'HR16', 'HR17', 'HR18', 'HR19',
       'HR20', 'HR21', 'HR22', 'HR23', 'HR24', 'Max']

At times, I wanted to select data from multiple columns that are separated by unwanted columns (unwanted as for that moment, might needed for later).

I wanted to do something like df.loc[['WeatherHR0':WeatherHR23'+ 'Peak Average'+ 'HR0':'HR24']].

For example, I may want to select columns from WeatherHR0~WeatherHR23 + Peak Average + HR0~24 while keep other columns undropped.

I know I can drop the unselected columns/create new dataframe, but is there a pythonic way to selected discreted columns in pandas?

Community
  • 1
  • 1
Meruemu
  • 611
  • 1
  • 8
  • 28
  • Does this answer your question? [Selecting multiple columns in a pandas dataframe](https://stackoverflow.com/questions/11285613/selecting-multiple-columns-in-a-pandas-dataframe) – AMC Feb 28 '20 at 20:45
  • @AMC No it doesn't. I need to select multiple discrete columns which I can't use slice selection in loc[] operation to combine columns do to like df.loc[HR0:24 + Peak + Weather0:24]. So no, it's different question and answer. – Meruemu Feb 29 '20 at 15:40

1 Answers1

2

you can create a list of strings (pythonically) and then use that to select the columns:

columns_of_interest = ([f'WeatherHR{i}' for i in range(24)] + 
                       ['Peak Average'] + 
                       [f'HR{i}' for i in range(25)])
data[columns_of_interest]

Of course, this strategy works best if there are common patterns in the column names.

jfaccioni
  • 7,099
  • 1
  • 9
  • 25
  • Thank for the quick answer. I was expecting some .iloc[] or .loc[] tricks or lambda function that I can deal with data on the fly. But it works too. – Meruemu Feb 28 '20 at 17:34
  • If you want to use `loc` (for example to select all columns containing the string HR), you could do `df.loc[:, df.columns.str.contains('HR')]`, although you'd still need to manually add the Peak Average column. – jfaccioni Feb 28 '20 at 17:43