8

I want to get the numerical indexes of a selection of pandas dataframe columns.

With one column it's very simple:

np.nonzero(df.columns.values == 'conditionA')

But what about multiple columns? I have something that works, but it is verbose and ugly:

df = pd.DataFrame(columns=['conditionF', 'conditionB', 'conditionA', 'conditionD', 'conditionC'])

cols_to_find = ['conditionA', 'conditionB', 'conditionC']
[i for i in range(len(df.columns.values)) if df.columns.tolist()[i] in cols_to_find ]

Better ideas?

smci
Gioelelm

2 Answers

11

This works, and it also returns the indexes in the order of `cols_to_find`:

[df.columns.get_loc(col) for col in cols_to_find]
[2, 1, 4]

List comprehensions are your friend.
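As a runnable sketch of the same idea, the block below repeats the `get_loc` comprehension on the question's DataFrame and compares it with two alternatives that are not part of the original answer: `Index.get_indexer`, which looks up all labels in one call, and `Index.isin` combined with `np.flatnonzero`, which returns the positions in column order instead. The variable names are only illustrative, and the sketch assumes the column labels are unique.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(columns=['conditionF', 'conditionB', 'conditionA', 'conditionD', 'conditionC'])
cols_to_find = ['conditionA', 'conditionB', 'conditionC']

# One get_loc call per label; the result follows the order of cols_to_find
positions = [df.columns.get_loc(col) for col in cols_to_find]

# get_indexer performs the same lookup in a single call and returns a NumPy array;
# labels that are not present come back as -1 instead of raising KeyError
positions_array = df.columns.get_indexer(cols_to_find)
with_missing = df.columns.get_indexer(['conditionA', 'conditionZ'])

# isin + flatnonzero gives the positions in the order the columns appear in df,
# which matches what the list comprehension in the question produces
positions_in_column_order = np.flatnonzero(df.columns.isin(cols_to_find))

print(positions)
print(positions_array)
print(with_missing)
print(positions_in_column_order)
```

Which variant fits best probably depends on whether the order of `cols_to_find` or the order of the columns matters, and on whether a missing label should raise an error or be reported as -1.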

smci
  • @MTKnife: ok but AFAIK it works fine for the question which doesn't mention multiindex, can you post an example (with unique column names) where `get_loc` returns multiple values? Then we can see about handling that case. – smci Dec 20 '19 at 08:15
  • Actually, my apologies: I just noticed you were looking at columns, and with columns, you're unlikely to have duplicates. It would be a problem with identically named rows (including, say, multiple rows with the same 0-level value in a `MultiIndex`). – MTKnife Dec 20 '19 at 16:31
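To illustrate the duplicate-label case raised in the comments, here is a small sketch using a made-up index with a repeated label. With non-unique labels, `get_loc` may return a boolean mask or a slice rather than a single integer, so the list comprehension above would no longer yield plain integers. The helper `positions_of` is hypothetical, not a pandas function; it is just one way to always get integer positions.

```python
import numpy as np
import pandas as pd

# Hypothetical index with a repeated label, as could happen with row labels
idx = pd.Index(['a', 'b', 'a'])

# For a non-unique, non-monotonic index get_loc returns a boolean mask, not an int
loc = idx.get_loc('a')

def positions_of(index, label):
    """Return all integer positions where `label` occurs in `index` (illustrative helper)."""
    return np.flatnonzero(index == label)

print(loc)
print(positions_of(idx, 'a'))
print(positions_of(idx, 'b'))
```

For the question as asked, with unique column names, this extra handling should not be needed.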
0
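import pandas as pd

df = pd.DataFrame(columns=['conditionF', 'conditionB', 'conditionA', 'conditionD', 'conditionC'])

def search():
    # Labels to look for (named `wanted` so it does not shadow the function name)
    wanted = ['conditionA', 'conditionB', 'conditionC']
    remaining = len(wanted)  # how many labels are still unmatched
    for col in df.columns.values:
        print(col)  # show each column name as it is visited
        if col in wanted:
            remaining -= 1
    # True only if every wanted label occurs among the columns
    return remaining == 0

search()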
J. Doe