I am trying to do an interval comparison similar to what is described in this question as 10000 <= number <= 30000, but on a data frame. For example, given the sample data below, I want to get all rows where the latitude is within 1 of my predefined coordinates.

import pandas as pd
import numpy as np

df = pd.DataFrame([[5,7, 'wolf'],
              [5,6,'cow'],
              [8, 2, 'rabbit'],
              [5, 3, 'rabbit'],
              [3, 2, 'cow'],
              [7, 5, 'rabbit']],
              columns = ['lat', 'long', 'type'])

coords = [4,7]

viewShort = df[(coords[0] - 1) <= df['lat'] <= (coords[0] + 1)]

Unfortunately, when I write it that way I get ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I realize that I could write it like this instead:

viewLong = df[((coords[0] - 1) <= df['lat']) & (df['lat'] <= (coords[0] + 1))]

but I have to write a lot of these things, so I was trying to make it a bit more compact. What am I doing wrong in the viewShort example? Or is this just not possible with pandas and I have to write it the long way?

Thank you!

Sidenote: the correct viewShort data frame should have four rows:

[5,7,'wolf'],
[5,6,'cow'],
[5,3,'rabbit'],
[3,2,'cow']
seth127

1 Answer

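Chained comparisons are not supported on a Series. Python evaluates a <= x <= b as (a <= x) and (x <= b), and the boolean "and" asks the first Series for a single truth value, which raises the ValueError. Use between instead: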

df[df['lat'].between(coords[0] - 1, coords[0] + 1)]  # both endpoints are included by default
Out: 
   lat  long    type
0    5     7    wolf
1    5     6     cow
3    5     3  rabbit
4    3     2     cow
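
Since you mention having to write a lot of these filters, here is a minimal sketch of two ways to keep them compact. The helper name within and its radius parameter are my own and purely illustrative, not part of pandas. The second variant relies on DataFrame.query, whose expression parser does accept a chained comparison on a column name.

```python
import pandas as pd

df = pd.DataFrame([[5, 7, 'wolf'],
                   [5, 6, 'cow'],
                   [8, 2, 'rabbit'],
                   [5, 3, 'rabbit'],
                   [3, 2, 'cow'],
                   [7, 5, 'rabbit']],
                  columns=['lat', 'long', 'type'])
coords = [4, 7]


def within(series, center, radius=1):
    """Hypothetical helper: mask for center - radius <= series <= center + radius."""
    return series.between(center - radius, center + radius)


# Latitude only, same rows as the between call above.
view_lat = df[within(df['lat'], coords[0])]

# Masks are ordinary boolean Series, so they combine with & as usual.
view_both = df[within(df['lat'], coords[0]) & within(df['long'], coords[1])]

# Alternative: query() parses the chained comparison itself.
lo, hi = coords[0] - 1, coords[0] + 1
view_query = df.query(f"{lo} <= lat <= {hi}")
```

Here view_lat and view_query select the same four rows as the output above, and view_both keeps only the rows whose longitude is also within 1 of coords[1].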
ayhan