-2

I have three SQL tables:

  1. A list of 100k weather stations with a latitude and longitude coordinate
  2. A list of 15 cities of interest with a latitude and longitude coordinate
  3. A list of weather data for each weather station

My interest right now is only with the first two tables. How do I filter the list of weather stations to those within e.g. 100km of each city of interest?

I have a Microsoft SQL Server and I'd prefer to do it within SQL if possible.

Villahousut
  • 115
  • 1
  • 11
  • Does this answer your question? [Calculating distance between two points (Latitude, Longitude)](https://stackoverflow.com/questions/13026675/calculating-distance-between-two-points-latitude-longitude) – SMor Jan 31 '21 at 17:16
  • It helps - thanks - but it's only a small part of the solution. The bigger question is how to even approach the problem. Can I do it in T-SQL or do I need Python or Go code for this? If I do it in T-SQL, is there a nice set-based solution or do I need to use a cursor and iterate over each row, which is super slow? – Villahousut Jan 31 '21 at 17:55

2 Answers2

2

Basically, if you try to do this yourself, you end up with a Cartesian product:

select c.*, ws.*
from cities c cross apply
     (select ws.*
      from (select ws.*, 
                    <complicated expression to calculate distance> as distance
            from weather_station ws 
           ) ws
      where distance < 100
     ) ws;

In order to get the list of weather stations, all cities and weather stations have to be compared. The distance calculation is often rather expensive, so you can cut down on this by "prefiltering". For instance, in most inhabited places, 100 km is within 1 degree latitude and 2 degrees longitude:

select c.*, ws.*
from cities c cross apply
     (select ws.*
      from (select ws.*, 
                    <complicated expression to calculate distance> as distance
            from weather_station ws 
            where ws.latitutde between c.latitude - 1 and c.latitude + 1 and
                  ws.longitude between c.longitude - 2 and c.longitude + 2
           ) ws
      where distance < 100
     ) ws;

Although that helps, this is still essentially a filtered Cartesian product.

So, what should you really do? If you care about coordinates as spatial data, you should look into SQL Server's spatial extensions (the documentation is here, particularly the geography type because that is most relevant to your needs).

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

As Gordon mentioned, you can define spatial geography datatype for your needs. You can follow below steps to achieve the goal.

  1. Store the latitude, longitude data in the Point
  2. Now, use the STDistance to calculate the distance between two points
  3. You can leverage common scenario of finding nearest neighbor
Venkataraman R
  • 12,181
  • 2
  • 31
  • 58