I'm trying to do the same thing as in this question, but I have a string-type column that I need to keep in the dataframe so I can identify which rows are which. (I guess I could do this by index, but I'd like to save a step.) Is there a way to leave a column out of the .any() test but keep it in the resulting dataframe? Thanks!

Here's the code that works on all columns:

df[(df > threshold).any(axis=1)]

Here's the hard-coded version I'm working with right now:

df[(df[list_of_selected_columns] > 3).any(axis=1)]

This seems a little clumsy to me, so I'm wondering if there's a better way.

semblable

1 Answer

You can use .select_dtypes to pick all columns of a given kind, say the numerical ones:

df[df.select_dtypes(include='number').gt(threshold).any(axis=1)]
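As a minimal sketch of the dtype-based approach (the column names and values below are made up for illustration, and the pandas method is spelled `select_dtypes`), the mask is computed on the numeric columns only and then used to index the full frame, so the string column plays no part in the test but still appears in the result:

```python
import pandas as pd

# Hypothetical data: "name" is the string column used to identify rows.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "x": [1, 5, 2],
    "y": [0, 1, 2],
})
threshold = 3

# Build the boolean mask from numeric columns only, then index the full
# frame with it, so "name" is ignored by the test but kept in the output.
mask = df.select_dtypes(include="number").gt(threshold).any(axis=1)
result = df[mask]
print(result)
```

Because the mask is a boolean Series aligned on the row index, it can filter the original frame regardless of which columns were used to build it.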

Or a block of contiguous columns with iloc:

df[df.iloc[:,3:6].gt(threshold).any(axis=1)]
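A small sketch of the positional variant, again on invented data: `iloc` slices by position and excludes the end of the slice, so the slice below covers the third through fifth columns only. The two leading string columns are assumptions made for the example.

```python
import pandas as pd

# Hypothetical frame with two leading string columns and three numeric ones.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "note": ["x", "y", "z"],
    "p": [1, 2, 3],
    "q": [9, 0, 0],
    "r": [0, 0, 4],
})
threshold = 3

# Positions 2, 3 and 4 (the end of the slice is exclusive): p, q and r.
mask = df.iloc[:, 2:5].gt(threshold).any(axis=1)
result = df[mask]
print(result)
```

This depends on column order, so it is likely to break silently if columns are added or rearranged upstream.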

If you want to select an arbitrary set of columns, a hard-coded list is your best option.
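For the explicit-list case, one possible sketch is below. It also shows an alternative that is not part of the answer above: naming only the identifier column and dropping it before the comparison, which assumes every remaining column is numeric. All column names here are hypothetical.

```python
import pandas as pd

# Hypothetical data: "name" identifies the rows, "x" and "y" are tested.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "x": [1, 5, 2],
    "y": [4, 1, 2],
})
threshold = 3

# Option 1: list the columns to test explicitly.
selected = ["x", "y"]
mask_listed = df[selected].gt(threshold).any(axis=1)

# Option 2 (assumes all other columns are numeric): exclude the
# identifier column instead of listing the columns to include.
mask_dropped = df.drop(columns=["name"]).gt(threshold).any(axis=1)

result = df[mask_listed]
print(result)
```

Dropping the identifier can be shorter when only one or two columns need to be excluded, while the explicit list is safer when the frame may gain unrelated columns later.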

Quang Hoang