25

I want to implement a service that, given users' geo coordinates, can detect whether two users are at the very same location in real time.

In order to do this in real time and to scale, it seems I should go with a distributed in-memory datastore like Redis. I have researched using geohashing, but the problem is that points close to each other may not always share the same hash prefix. And geohashing may be overkill since I'm interested in finding whether two users are close enough where they are standing next to each other.

The simple solution of course is just to test whether pairs of geo coordinates fall within a small distance of each other. But AFAIK, Redis and other in-memory datastorse don't have geospatial indexing to support that kind of look-up.

What is the best way to go about implementing this?

Simian
  • 1,622
  • 3
  • 17
  • 32
  • 1
    As much as I love redis, I think the best option is to use something else for this lookup. There are several good tools out there which do have support for geospatial indexing, including elasticsearch, mongodb, and PostgreSQL (with PostGIS). Even MySQL has support for GIS extensions. All of these would be better than redis in this particular use case. – Carl Zulauf Oct 01 '13 at 08:26
  • But are those tools fast enough to do the look up in real time? I want something that can complete the lookup almost instantly. – Simian Oct 01 '13 at 14:29
  • 1
    When properly configured and indexed all of the mentioned solutions provide near-instant lookup. Elasticsearch is very very fast and has built-in support for clustering, allowing it to scale to incredible loads without much effort. – Carl Zulauf Oct 01 '13 at 17:44
  • Yes but are those suitable for a real time system? How fast is the write if it needs to go to disk and index the entry? That's why I was looking at in-memory solutions for fast writes. – Simian Oct 01 '13 at 18:50
  • Actually ElasticSearch does not support proximity searches for distances less than 1km. See original post. – Simian Oct 01 '13 at 19:43
  • Yes they are suitable for this task – Tommaso Barbugli Oct 02 '13 at 12:06
  • I need a proximity search that can find points less than < 1km away. I want to detect whether two users are at the same location. ElasticSearch does not support proximity searches for distances less than 1km. See the original post. – Simian Oct 02 '13 at 14:18
  • 1
    It does support proximity searches less than 1km. The syntax is just a bit difficult. You have to use 0.2km to represent 200m. I have not attempted anything below 100m though. – Carl Zulauf Oct 02 '13 at 16:50

7 Answers7

18

This functionality is baked into Redis 3.2+.

But for older versions the problem still exists. I've taken Yin Qiwen's answer and created a module for Node, and you can see how it uses Redis by examining the code. His instructions are perfect and I was able to follow them for great results. https://github.com/arjunmehta/node-georedis

The same algorithm is essentially what is used for the native commands.

It is very fast, and avoids any kind of intersections/haversine type operations. The coolest thing (I think) about Yin Qiwen's method is that the most computationally intense parts of the algorithm can be distributed to clients (instead of all happening in the DB or on the server).

It's not 100% precise and uses preconfigured distance steps, but for most applications you won't need exact precision I'd imagine.

I've also paraphrased Yin Qiwen's article at the GIS stack exchange.

Sorry for all the linkage. :P

Community
  • 1
  • 1
Arjun Mehta
  • 2,500
  • 1
  • 24
  • 40
  • 2
    I am sorry I meant to up this answer but pressed down by accident now StackOverflow does not allow me to up it. – user2924127 May 01 '15 at 16:39
15

Generally, this could be done by GeoHash and Redis's sorted set. There is a design I wrote before talking about how to implement a spatial index service on redis.

https://github.com/yinqiwen/ardb/wiki/Spatial-Index

yinqiwen
  • 604
  • 6
  • 6
7

Maybe you can try this one:

Redis Geography Edition

You really want to try it, it works awesome. :)

mzalazar
  • 6,206
  • 3
  • 34
  • 31
  • As mentioned in a newer answer, Redis v3.2+ includes Geo functionality: http://stackoverflow.com/a/31359397/3160475 – Itamar Haber Aug 22 '15 at 10:32
5

I realize this doesn't answer your question... but I don't think that it's the correct tool.

PostgreSQL + PostGIS can perform really, really well. You can configure PostgreSQL to pretty much run as much of the database as it can fit in memory.

PostGIS uses (I think) rtree indexes, so it's incredibly fast to perform the kind of lookup you are interested in.

Using a backend that fires off websocket requests would allow you to perform pretty much real-time. Anytime your backend receives a persons GPS coordinates; perform the spatial lookup; and notify applicable clients through websockets.

jreid42
  • 1,240
  • 1
  • 12
  • 19
5

The Redis geography edition mentioned by other answers in this thread has been integrated into Redis since version 3.2 (also see this earlier comment).

You can find the new commands here (in beta for now) :

Itamar Haber
  • 47,336
  • 7
  • 91
  • 117
nha
  • 17,623
  • 13
  • 87
  • 133
1

Tarantool database keeps data in memory, pushes them to disk as transaction logs, has RTree-type spatial index (not only 2-dimensional) and a number of nice operations on such index (containment, overlapping, distance).

I use it in a commercial project for storing and querying records which describe objects in 3D space.

http://tarantool.org/doc/book/box/box_index.html

https://github.com/tarantool/tarantool/wiki/R-tree-index-quick-start-and-usage

Standard client and examples are in Lua, but there are couple of other clients developed by the database authors. I use Java client in an Scala application with success.

The database is also very fast - here's scientific comparison with other databases (putting aside an aspect of being a spatial db): http://airccse.org/journal/ijdms/papers/6314ijdms01.pdf

Wojciech Kaczmarek
  • 2,173
  • 4
  • 23
  • 40
  • Tarantool is limited by the available memory though, not a real distributed DB like Cassandra. So I think it is unfair to compare it with the latter as was done in the paper you linked. – Roland Aug 20 '16 at 12:52
  • 1
    Yes Tarantool has memory limitation, still it's posters decision whether it is relevant to him or not. Also I'm not catching the argument "not a real distributed DB" - it is. Speaking of linked paper - scientific comparisons have the desired property that their authors have done lot of work most of us just doesn't do. I strongly believe that poster of the question can decide for himself whether the set of features tested by the scientists is helpful for him or not - no need to call anything "unfair". – Wojciech Kaczmarek Aug 22 '16 at 08:41
  • Sorry, I didn't mean to be offensive, I apologise if that was the case. By distributed I meant that you can have things like sharding, really big datasets, etc... AFAIK Tarantool is limited by the amount of memory available on *one* machine. So you can't have huge databases. Only the same data replicated on several machines. Please correct me if I'm wrong. – Roland Aug 22 '16 at 08:59
  • 1
    Ok I will attempt to address your questions: (a) distributed - it does replication in a master-slave mode so it counts as "HA-purposed" replication I'd say, no sharding (b) mem limit, you're right, at the same time this may be non-issue for the original poster. At our company we use it to hold the data about millions of objects moving in a 3D space and practically using not much than 6G of RAM so it's not a big deal. Please also note our case seems to be more complicated than using it for geospatial (2D) data - of course it depends on the number of objects needed (still we have millions) – Wojciech Kaczmarek Aug 23 '16 at 12:16
  • Upvote for the reply! How many millions, tens or hundreds of millions? Could it handle 800 million objects? – Roland Aug 23 '16 at 12:54
  • 1
    (Excuse late reply, didn't see your comment) Tens of millions of objects which are 3D boxes with some data attached. They have spatial 3D index and two additional indices (one on ints, second on strings). The reported memory usage is for both the data and indices. – Wojciech Kaczmarek Sep 07 '16 at 08:57
  • Thanks, I asked a question regarding nearest neighbor queries: http://stackoverflow.com/questions/39644902/spatial-search-for-neighbor-with-distance-limit – Roland Sep 22 '16 at 17:04
0

I would like to share a sample Java code for Redis Geography edition.

public void geoadd(String objectId, BigDecimal latitude, BigDecimal longitude) {
    log.info("geoadd(): {} {} {}", objectId, latitude, longitude);
    try (Jedis jedis = jedisPool.getResource()) {
        if (geoaddSha == null) {
            String script = "return redis.call('geoadd','" + GEOSET + "', ARGV[1], ARGV[2], KEYS[1])";
            geoaddSha = jedis.scriptLoad(script);
        }
        log.info("geoaddSha: {}", geoaddSha);
        log.info(jedis.evalsha(geoaddSha, 1, objectId, latitude.toString(), longitude.toString()).toString());
    }
}

@SuppressWarnings("unchecked")
public List<String> georadius(BigDecimal latitude, BigDecimal longitude, int radius, Unit unit) {
    log.info("georadius(): {} {} {} {}", latitude, longitude, radius, unit);
    try (Jedis jedis = jedisPool.getResource()) {
        if (georadiusSha == null) {
            String script = "return redis.call('georadius','" + GEOSET + "', ARGV[1], ARGV[2], ARGV[3], ARGV[4])";
            georadiusSha = jedis.scriptLoad(script);
        }
        log.info("georadiusSha: {}", georadiusSha);
        List<String> objectIdList = (List<String>) jedis.evalsha(georadiusSha, 0, latitude.toString(), longitude.toString(), String.valueOf(radius), unit.toString());
        log.info("objectIdList: {}", objectIdList);
        return objectIdList;
    }
}

public void remove(String objectId) {
    log.info("remove(): {}", objectId);
    try (Jedis jedis = jedisPool.getResource()) {
        jedis.zrem(GEOSET, objectId);
    }
}
  • lettuce has native GEO commands support without the need of tunneling it using Lua: http://redis.paluch.biz/docs/api/snapshots/3.3.Beta1/com/lambdaworks/redis/RedisGeoConnection.html – mp911de Aug 20 '15 at 06:20
  • @mp911de Thank you for pointing out. I couldn't find this when I first needed it. – Trinsit Wasinudomrod Aug 20 '15 at 10:17