2

I am using GeopositionField in Django to store the coordinates of my users. Now I want to find the 20 users closest to the current user. Can that be achieved with GeopositionField? I know that GeoDjango makes distance searches easy, but since I am using Heroku and PostgreSQL, I want to keep costs down, and with PostgreSQL, installing PostGIS seems to be the only alternative.

Any suggestions?

Jonathan

3 Answers

8

For the distance between two points you can use Geopy.

From the documentation: Here's an example usage of distance.distance:

>>> from geopy import distance
>>> # g is assumed to be a geopy geocoder instance created beforehand
>>> _, ne = g.geocode('Newport, RI')
>>> _, cl = g.geocode('Cleveland, OH')
>>> distance.distance(ne, cl).miles
538.37173614757057

To implement this in a Django project, create a normal model in models.py:

from django.db import models


class User(models.Model):
    name = models.CharField(max_length=255)  # CharField requires max_length
    lat = models.FloatField()
    lng = models.FloatField()

To optimize a bit, you can first filter the user objects to get a rough set of nearby users, so you don't have to loop over every user in the database. This rough estimate is optional, and you may have to write some extra logic to meet all your project requirements:

# The location of your user.
lat, lng = 41.512107999999998, -81.607044999999999

# Calculate these offsets based on the user's location, because the
# distance covered by one degree varies over the planet.
min_lat = lat - 1
max_lat = lat + 1
min_lng = lng - 1
max_lng = lng + 1

users = User.objects.filter(
    lat__gt=min_lat, lat__lt=max_lat,
    lng__gt=min_lng, lng__lt=max_lng,
)

# If the box holds fewer than 20 users, fall back to all users.
if users.count() < 20:
    users = User.objects.all()
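
The fixed offset of one degree can be replaced by offsets derived from a search radius. The following is only a rough sketch: the function name bounding_box, the constant of about 111.32 km per degree of latitude, and the 50 km radius are assumptions made for illustration, and the calculation ignores the poles and the 180th meridian.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # approximate value, assumed constant for this sketch


def bounding_box(lat, lng, radius_km):
    """Return (min_lat, max_lat, min_lng, max_lng) around a point.

    Rough approximation: ignores the poles and the 180th meridian.
    """
    d_lat = radius_km / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    d_lng = d_lat / math.cos(math.radians(lat))
    return lat - d_lat, lat + d_lat, lng - d_lng, lng + d_lng


# Hypothetical 50 km search radius around the example user.
min_lat, max_lat, min_lng, max_lng = bounding_box(41.512108, -81.607045, 50)
```

The four values can then be passed to the lat/lng range filter in place of the hard-coded offsets.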

Calculate the distance between your user and each user in users, sort them by distance and get the first 20.

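results = []
for user in users:
    # .km gives a plain float, which sorts reliably.
    d = distance.distance((lat, lng), (user.lat, user.lng)).km
    results.append({'distance': d, 'user': user})

# Sort once, after the loop, then keep the 20 nearest.
results = sorted(results, key=lambda k: k['distance'])
results = results[:20]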
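
For reference, here is a dependency-free sketch of the same "measure, sort, take the nearest" step. It swaps geopy for a hand-written haversine formula and uses heapq.nsmallest instead of a full sort; the function names and the sample data are hypothetical, and the spherical-Earth radius makes the distances slightly less accurate than geopy's geodesic calculation.

```python
import heapq
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two (lat, lng) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lng2 - lng1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def closest(origin, candidates, n=20):
    """Return the n nearest candidates as (distance_km, name) pairs.

    candidates is an iterable of (name, lat, lng) tuples.
    """
    lat, lng = origin
    scored = ((haversine_km(lat, lng, c_lat, c_lng), name)
              for name, c_lat, c_lng in candidates)
    return heapq.nsmallest(n, scored)


# Hypothetical sample data.
people = [
    ("far", 48.8566, 2.3522),
    ("near", 41.4993, -81.6944),
    ("mid", 41.8781, -87.6298),
]
nearest = closest((41.512108, -81.607045), people, n=2)
```

heapq.nsmallest avoids sorting the whole list when only a few results are needed, but every candidate is still measured, so the cost grows linearly with the number of users.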
allcaps
  • But the requirement is to find the 20 CLOSEST users without using GeoDjango. – Jonathan Aug 05 '13 at 05:34
  • Yes: results = sorted(results, key=lambda k: k['distance']) gives you all users by distance. [:20] gives the first 20. Maybe you have too many users to loop through each time. The solution is to get a rough estimate of nearby users first. I'll update the answer and put the rough estimate code before the 20 closest. – allcaps Aug 05 '13 at 08:32
  • This is quite smart! But won't this be slower as it scales, compared to PostGIS? – Jonathan Aug 05 '13 at 16:43
  • It's not as efficient as a spatial/geo database. I guess you should find out what queries and calculations are costing you (memory/speed/developing time/money) and base your decision on that. A good 'rough' estimate will speed things up. Maybe .count() and dipolar search? And https://docs.djangoproject.com/en/1.2/topics/db/optimization/ – allcaps Aug 13 '13 at 20:09
  • I had similar requirements and ended up using geopy also. I found this gist to be useful: https://gist.github.com/renyi/3385043. It includes a rough distance calculation to make the query more efficient. – Jordan Jan 03 '17 at 04:49
1

I think you have 2 options here:

  1. There is no efficient way to do it with GeopositionField, because it has no spatial index (as used by PostGIS and GeoDjango with PointField). The only way I found to deal with this issue is:

    • You have to find all distances from the source user to all users (this is really heavy).
    • Then sort all the distances and take the top 20 you are looking for.

    GeopositionField stores the coordinates as text, but they can be retrieved using .latitude and .longitude on the field.

  2. There seems to be support for the K-Nearest-Neighbors problem in PostgreSQL 9.1+ (http://wiki.postgresql.org/images/4/46/Knn.pdf). But I think you will have to either add another column to your table to store points (http://www.postgresql.org/docs/9.2/static/datatype-geometric.html) or implement a distance function for GeopositionField.

If you are using the basic Heroku setup just for development and plan to move to a higher plan, I would suggest the first approach: other Heroku plans support PostGIS, so you can implement this approach easily now and later replace it with a simple PostGIS function call.

However, if this is the only case in which you will deal with spatial data, I would recommend using a point column with KNN support, so you won't need PostGIS in the future.
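
As a rough illustration of the second option, the sketch below builds a KNN query using PostgreSQL's <-> distance operator on a point column. The table name app_user, the column names, and the helper knn_params are all hypothetical. It assumes the point column stores (longitude, latitude) and has a GiST index. Note that <-> on points measures planar distance in degrees, so the ordering is only approximate; fetching a few extra rows and re-ranking them with a real distance function is one way to compensate.

```python
# Hypothetical table and column names. "location" is assumed to be a
# PostgreSQL point column holding (longitude, latitude) with a GiST index:
#   CREATE INDEX app_user_location_idx ON app_user USING gist (location);
KNN_SQL = (
    "SELECT id, name FROM app_user "
    "WHERE id <> %s "
    "ORDER BY location <-> point(%s, %s) "
    "LIMIT %s"
)


def knn_params(user_id, lat, lng, limit=20):
    """Build the query parameters; a point is (x, y), so longitude comes first."""
    return [user_id, lng, lat, limit]


params = knn_params(1, 41.512108, -81.607045)
# With Django this could be run as: User.objects.raw(KNN_SQL, params)
```

The WHERE clause excludes the current user, who would otherwise always be the nearest match.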

manu
  • Yes, I do have a column that saves the latitude and longitude, and I have a bounding box function derived from http://janmatuschek.de/LatitudeLongitudeBoundingCoordinates. I can get a box pretty easily that tells me the range of my maximum and minimum latitude and longitude, but getting the CLOSEST 20 users efficiently is the problem. Maybe K-Nearest-Neighbors would solve that for me. I will read into it more. But this approach is substantially slower than PostGIS, isn't it? – Jonathan Aug 05 '13 at 05:39
  • If you use the K-Nearest-Neighbors solution implemented in Postgres 9.1+, it should be as efficient as PostGIS because it uses a spatial index (a GiST index). – manu Aug 19 '13 at 02:55
0

A quick peek in the source code shows that the GeopositionField just stores the coördinates as plain text (<latitude>,<longitude>), so there's not gonna be an efficient way to extract the right data from the database. If you want efficient database queries, you'll have to use either GeoDjango or PostGIS (or find another alternative that provides spatial data search).
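
To make the storage format concrete, here is a minimal sketch of splitting such a stored string into numbers, for example when backfilling separate latitude and longitude columns that the database can filter on. The helper name parse_geoposition is hypothetical and not part of the field's API.

```python
def parse_geoposition(value):
    """Split a '<latitude>,<longitude>' string into a (lat, lng) float pair."""
    lat_text, lng_text = value.split(",")
    return float(lat_text), float(lng_text)


lat, lng = parse_geoposition("41.512108,-81.607045")
```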

knbk