Why is select_dtypes so slow?

Asked Nov 04 '16 at 16:46

Why is select_dtypes so slow?

%timeit [col for col in df.columns if np.issubdtype(df[col].dtype, np.number)]

453 microsecs per loop

%timeit df.select_dtypes(include=[np.number])

4.58 secs per loop

asked Nov 04 '16 at 16:46

simon

I'd post an issue on [GitHub](https://github.com/pandas-dev/pandas/issues), as this seems to be really inefficient. I get 1.59 ms vs 45.7 µs when comparing `select_dtypes` with the list comprehension. – EdChum Nov 04 '16 at 16:49

AFAIK, the idea behind `select_dtypes()` is to select a subset of the DataFrame (not just a subset of column labels), so it returns the data (all rows), which of course takes time... – MaxU - stand with Ukraine Nov 04 '16 at 17:25

Why does that have to take time? It does not have to actually move any data; it could just return pointers to the columns. – simon Nov 04 '16 at 20:01
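To make the point in the comments concrete, here is a minimal sketch of what the two expressions in the question actually return. The frame below is hypothetical (the question does not show how `df` was built), and `labels_from_dtypes` is an illustrative name, not something from the original post. The list comprehension only collects column labels, while `select_dtypes` builds and returns a new DataFrame, so the two timings are not measuring quite the same work.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the question does not show how `df` was built.
df = pd.DataFrame({
    "a": np.arange(5, dtype="int64"),
    "b": np.linspace(0.0, 1.0, 5),
    # Explicit object dtype so np.issubdtype can interpret it as a NumPy dtype.
    "c": pd.Series(["v", "w", "x", "y", "z"], dtype=object),
})

# The list comprehension from the question: returns column labels only.
numeric_labels = [
    col for col in df.columns if np.issubdtype(df[col].dtype, np.number)
]

# select_dtypes returns a new DataFrame holding the matching columns.
numeric_frame = df.select_dtypes(include=[np.number])

# The same labels can be read from df.dtypes without indexing each column.
labels_from_dtypes = [
    col for col, dt in df.dtypes.items() if np.issubdtype(dt, np.number)
]

print(numeric_labels)         # column labels (a plain list)
print(type(numeric_frame))    # a DataFrame, not a list of labels
```

If only the labels are needed, the `df.dtypes` variant avoids constructing a DataFrame at all. If the data is needed, `df[numeric_labels]` and `df.select_dtypes(...)` should give the same result for plain NumPy dtypes like these, so it may be fairer to time those two against each other.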

0 Answers