4

Here's a sample of my PostgreSQL in CSV format.

row,latitude,longitude
1,42.082513,-72.621498
2,42.058588,-72.633386
3,42.061118,-72.631541
4,42.06035,-72.634145

I have thousands more rows like these spanning coordinates across the world.

I want to query the table only for coordinates within a certain radius. How do I do this with PostGIS and PostgreSQL?

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Username
  • 3,463
  • 11
  • 68
  • 111
  • You want a K-nearest-neighbour (KNN) search using a GiST or GIN index. – Craig Ringer Sep 15 '15 at 00:43
  • @CraigRinger I've never done anything involving either of those. Where can I learn more? – Username Sep 15 '15 at 00:50
  • You have your data in a CSV formatted file or in a table with data like this from a file in CSV format? You are looking for "coordinates within a certain radius". From a specific location or do you want a solution for any location (e.g. a function with parameters)? – Patrick Sep 15 '15 at 02:06

3 Answers3

11

You want "all rows within a 5-mile radius of a coordinate", so this is not exactly a K-nearest-neighbour (KNN) problem. Related, but your case is simpler. "Find the 10 rows closest to my coordinates" would be a KNN problem.

Convert your coordinates to geography values:

ST_SetSRID(ST_MakePoint(longitude, latitude),4326)::geography

Alternatively you could use the simpler geometry type. Consider:
4.2.2. When to use Geography Data type over Geometry data type

Then we have a table like:

CREATE TABLE tbl (
  tbl_id serial PRIMARY KEY
, geog geography NOT NULL
);

All you need is ST_DWithin() - and a spatial index to make it fast:

CREATE INDEX tbl_geog_gist ON tbl USING gist(geog);

Query:

SELECT *, ST_Distance(c.x, geog) AS distance  -- distance is optional
FROM   tbl t, (SELECT ST_GeographyFromText('SRID=4326;POINT(-72.63 42.06)')) AS c(x)
WHERE  ST_DWithin(c.x, geog, 8045)  -- distance in meter
ORDER  BY distance; -- order is optional, you did not ask for that

Or you can use your original columns and create a functional index ... This and other details in this closely related answer on dba.SE:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
4

You should first create a table from your CSV formatted file, using the COPY command (if the file is accessible to the PostgreSQL server) or the \copy command in psql if the file is not local to the server. See other Q+A on SO for examples if you have any trouble.

Once you have your data in a table, you should convert your longitude and latitude columns to the PostGIS geography type by adding a column to your table of type geography(POINT, 4326) and then populating that column (here called gps) with the appropriate values:

UPDATE my_table SET gps = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);

Add an index on that column to allow for efficient searches:

CREATE INDEX my_table_gps ON my_table USING gist(gps);

You can now find the rows within 5 miles from a given location, e.g. (-72.657, 42.0657), as follows:

SELECT *
FROM my_table
WHERE ST_DWithin(gps, ST_SetSRID(ST_MakePoint(-72.657, 42.0657), 4326), 5 * 1609);

Note that ST_DWithin() on a geography column will work in meters, so you have to multiply your radius in miles with the 1,609 meters in a mile.

Patrick
  • 29,357
  • 6
  • 62
  • 90
2

I did a combo of Erwin's and Patrick's answers.

-- Add geography column
ALTER TABLE googleplaces ADD COLUMN gps geography;
UPDATE googleplaces SET gps = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);
CREATE INDEX googleplaces_gps ON googleplaces USING gist(gps);

SELECT *
FROM my_table
WHERE ST_DWithin(gps, ST_SetSRID(ST_MakePoint(-72.657, 42.0657), 4326), 5 * 1609);
Username
  • 3,463
  • 11
  • 68
  • 111