I want to get only those rows of a DataFrame where a particular column's value is less than some upper bound.

I tried this:

final_data[final_data['Time']<30.000000]

It gives me the error:

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match

I then tried:

final_data.loc[:,final_data['Time']<30.000000]

and I get:

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match

How can I implement the filter in pandas based on column value?

david nadal
  • Can you give an example of the input data? – nimrodz Jun 08 '18 at 06:09
  • Not sure about the error tbh since, as jezrael pointed out, it should work. One thing I thought of is the use of "," and "." as the decimal separator in numbers; maybe that could make a difference (although it shouldn't in this case, but just to throw it out there). – Ivo Jun 08 '18 at 06:11
  • also, you may want to check this one too https://stackoverflow.com/questions/17071871/select-rows-from-a-dataframe-based-on-values-in-a-column-in-pandas?rq=1 – user96564 Jun 08 '18 at 06:18
  • I don't know who upvoted this question but this is a typical one I would downvote for the lack of a data sample. Please try to provide one in the future using `print(df.head(5).to_dict())` for instance. – Anton vBR Jun 08 '18 at 06:32
  • @AntonvBR it's a normal dataframe... I should have included a dataframe sample though.. but this question just blocked me for two days.. :( don't downvote.. several people facing the same issue will benefit.. what's wrong.. alternative approaches solved the problem? – david nadal Jun 08 '18 at 06:41

1 Answer

In my opinion, the first solution should work perfectly in pandas 0.23.0.

Another solution is to convert the column to a NumPy array, which avoids index alignment between the mask and the DataFrame:

final_data[final_data['Time'].values < 30.000000]

final_data.loc[final_data['Time'].values < 30.000000]
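
As a rough illustration of why the NumPy mask helps, here is a minimal sketch. The sample data and the misaligned mask are hypothetical (the question did not include a data sample), and a mismatched index is only one plausible way to hit this error, not a confirmed diagnosis of the asker's case.

```python
import pandas as pd

# Hypothetical sample data; the question did not include any.
final_data = pd.DataFrame(
    {"Time": [12.5, 45.0, 29.9, 30.0], "Value": ["a", "b", "c", "d"]},
    index=[10, 11, 12, 13],
)

# A boolean Series whose index labels do not match final_data's index.
# pandas tries to align the mask on labels and fails.
misaligned_mask = pd.Series([True, False, True, False], index=[0, 1, 2, 3])

error_name = None
try:
    final_data[misaligned_mask]
except Exception as exc:  # expected to be pandas' IndexingError
    error_name = type(exc).__name__

# A plain NumPy boolean array carries no index, so it is applied by position.
mask = final_data["Time"].values < 30.0
filtered = final_data[mask]

# .loc with the mask in the row position selects the same rows.
filtered_loc = final_data.loc[mask]

print(error_name)
print(filtered)
```

Note that with `.loc` the mask goes in the row position (`.loc[mask]`); `.loc[:, mask]` would apply it to the columns instead.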

Or use query:

final_data.query('Time < 30.000000')
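
A small sketch of the `query` approach, again on hypothetical sample data. The `upper_bound` variable is an illustrative name, not something from the question; `@` lets the query string refer to a local Python variable.

```python
import pandas as pd

# Hypothetical sample data; the question did not include any.
final_data = pd.DataFrame(
    {"Time": [12.5, 45.0, 29.9, 30.0], "Value": ["a", "b", "c", "d"]}
)

# Hard-coded bound, as in the answer above.
below_literal = final_data.query("Time < 30.0")

# The same filter with the bound taken from a local variable via '@'.
upper_bound = 30.0
below_variable = final_data.query("Time < @upper_bound")

print(below_variable)
```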
jezrael