Python Pandas: Select data and ignore KeyErrors

Question

Note: my question isn't this one, but something a little more subtle.

Say I have a dataframe that looks like this:

df = 
    A     B    C
0   3     3    1
1   2     1    9

df[["A", "B", "D"]] will raise a KeyError.

Is there a pandas way to make df[["A", "B", "D"]] behave like df[["A", "B"]]? (I.e., just select the columns that exist.)

One solution might be

good_columns = list(set(df.columns).intersection(["A", "B", "D"]))
mydf = df[good_columns]

But this has two problems:

score 2 · Accepted Answer · edited Nov 30 '15 at 06:02

You can use filter; it simply ignores any keys that don't exist in the dataframe:

df.filter(["A", "B", "D"])
   A  B
0  3  3
1  2  1
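
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's dataframe and applies filter. The names `wanted` and `subset` are only illustrative, and the list is passed through the `items` keyword for explicitness:

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({"A": [3, 2], "B": [3, 1], "C": [1, 9]})

# "D" is not a column; filter(items=...) drops it instead of raising KeyError.
wanted = ["A", "B", "D"]
subset = df.filter(items=wanted)

print(subset)
```

The columns that survive come back in the order given in the list, so this should match df[["A", "B"]] here.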

edited Nov 30 '15 at 06:02

hlin117

answered Nov 30 '15 at 05:58

maxymoo

Thank you. I wish the pandas documentation had usage examples for each function, just like scikit-learn. – hlin117 Nov 30 '15 at 06:23
Why don't you consider submitting some yourself? Documentation is a great way to get started with contributing to open-source projects. – maxymoo Nov 30 '15 at 22:29

score 1 · Answer 2 · answered Nov 30 '15 at 06:01

You can use a conditional list comprehension:

>>> target_cols = ['A', 'B', 'D']
>>> df[[c for c in target_cols if c in df]]
   A  B
0  3  3
1  2  1
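
If this comes up often, the comprehension could be wrapped in a small helper. This is only a sketch: `select_existing` is a hypothetical name, not a pandas function, and it checks membership against `frame.columns` explicitly rather than against the dataframe itself.

```python
import pandas as pd


def select_existing(frame, wanted):
    # Hypothetical helper: keep only the requested labels that are
    # actual columns, in the order they were requested.
    present = [c for c in wanted if c in frame.columns]
    return frame[present]


df = pd.DataFrame({"A": [3, 2], "B": [3, 1], "C": [1, 9]})

# "D" is silently skipped; "B" comes before "A" because it was asked for first.
result = select_existing(df, ["B", "D", "A"])
print(result)
```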

answered Nov 30 '15 at 06:01

Alexander

Looks like that's an `O(n)` check to see if `c in df`. I'll stick with @maxymoo's answer. Thanks! – hlin117 Nov 30 '15 at 06:02
