0

I have table like this:

lat lon time
51.04425273 6.26225570 2022-04-01 11:29:15.103
51.04424076 6.26200474 2022-04-01 11:29:09.106
51.04424345 6.26200506 2022-04-01 11:29:05.304
51.04427988 6.26203456 2022-04-01 11:29:03.231

I need to calculate speed (in km/h) between each two consecutive points (lat, lon). It would be ideal to also have differences in time and distance in separate columns. The first value should be empty, expected result:

lat lon time diff_time distance speed
51.04425273 6.26225570 2022-04-01 11:29:15.103
51.04424076 6.26200474 2022-04-01 11:29:09.106 ... ... ...
51.04424345 6.26200506 2022-04-01 11:29:05.304 ... ... ...
51.04427988 6.26203456 2022-04-01 11:29:03.231 ... ... ...

I have already looked at some hints (e.g. SQL Server calculate distances between 2 sets of columns of latitude / longitude), but none of them fit my table form.

I was thinking about something like a loop, in Jupyter Python I did it like this:

df_sample['speed']=0.0

# determine how many rows are in the dataframe

nrrows=df_sample['lat'].count()
i=1
while i < nrrows:

    # determine the time passed between measured positions
    timedist=df_sample['time'][i-1]-df_sample['time'][i]

    # put timedifferenc into seconds
    timed=timedist.total_seconds()


    # determine the distance passed between measured positions with distance function
    dist=distance((df_sample['lat'][i-1],df_sample['lon'][i-1]),(df_sample['lat'][i],df_sample['longitude'][i])).km

    # if there two datapoints with the same time, put speed from the last calculation
    if timed==0.0:
        df_sample['speed'][i]=df_sample['speed'][i-1]
    else:
        # calculated speed in km/h
        df_sample['speed'][i]=(dist/timed)*3600.0

    # first data gets the speed from first to second datapoint
    if i==1:
        df_sample['speed'][i-1]=(dist/timed)*3600.0
    i=i+1

Now, I need to move this logic to MS SQL and I can't cope with such data manipulation in this environment. Is someone able to help?

sdom
  • 319
  • 1
  • 9
  • 1
    In T-SQL, you could use a set-based query ordered by time, with a `LAG` windowing function to get values from the previous row. – Dan Guzman Apr 21 '22 at 18:38

1 Answers1

0

I had the same idea as @dan-guzman in the comment. Here's the implementation:

with cte as (
    select * from (values 
        (51.04425273, 6.26225570, '2022-04-01 11:29:15.103'),
        (51.04424076, 6.26200474, '2022-04-01 11:29:09.106'),
        (51.04424345, 6.26200506, '2022-04-01 11:29:05.304'),
        (51.04427988, 6.26203456, '2022-04-01 11:29:03.231')
    ) as x(lat, lon, ts)
), i1 as (
    select *,
        point = geography::Point(lat, lon, 4326)
    from cte
), i2 as (
    select *,
        ts_prev = lag(ts, 1) over (order by ts),
        point_prev = lag(point, 1) over (order by ts)
    from i1
), i3 as (
    select delta_m = point_prev.STDistance(point) , 
        delta_t = datediff(second, ts_prev, ts), 
        *
    from i2
)
select speed = delta_m / delta_t * 3.6,
    *
from i3;

I think the only "funny business" going on here is the multiplication by 3.6. Because I chose 4326 as the SRID, distances are calculated in meters. And since I'm doing the time calculation in seconds, delta_m / delta_t will have units of "meters per second"; 3.6 is the conversion factor to get it into "kilometers per hour".

Ben Thul
  • 31,080
  • 4
  • 45
  • 68