4

I know we can select only few columns using pandas dataframe filter, but can we also exclude only some columns?

Here is MWE:

import numpy as np
import pandas as pd


df = pd.DataFrame({'id': [1,2,3], 'num_1': [10,20,30], 'num_2': [20,30,40]})

df.filter(regex='num')

Can we select all columns not having 'num' in the columns:

Something like:

df.filter(regex='^(num)')

Required Output

   id
0   1
1   2
2   3

Note

# these already works, i am only looking regex way
df[['id']] # gives the required output

Reference:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.filter.html

BhishanPoudel
  • 15,974
  • 21
  • 108
  • 169
  • 1
    `regex='^((?!num).)*$'` (or perhaps `'^(?!num).*$'`)?, See https://stackoverflow.com/questions/406230/regular-expression-to-match-a-line-that-doesnt-contain-a-word,=. IMO, that's not very readable and really hard to know how it will break, so I'd stick with `str.contains` – ALollz May 06 '19 at 17:10

2 Answers2

5

Using contains

df.loc[:,~df.columns.str.contains('num')]
df.loc[:,~df.columns.str.startswith('num')]

Using difference

df[df.columns.difference(['num_1','num_2'])]
df[df.columns.difference([i for i in df.columns if i.startswith('num'])]
df[df.columns.difference([i for i in df.columns if 'num' in i]

Using drop

df[df.columns.drop(['num_1','num_2'])] # do not use axis=1 here.
df[df.columns.drop([i for i in df.columns if i.startswith('num'])]
df[df.columns.drop([i for i in df.columns if 'num' in i]

Credits to @ALollz
If you insist on using filter:

df.filter(regex='^((?!num).)*$')

But this is not only ugly and difficult to read/understand but also vulnerable to some pitfalls of regex search as discussed in Regular expression to match a line that doesn't contain a word .

I would suggest to stick with contains.

BhishanPoudel
  • 15,974
  • 21
  • 108
  • 169
4

You can get a list of columns not having num using list comprehension

cols_without_num = [x for x in list(df) if 'num' not in x]

Then subset the data

df[cols_without_num]
Adarsh Chavakula
  • 1,509
  • 19
  • 28