
I am trying to find the intersection point of two flight trajectories, and once I have that point, I need its index in each of the two trajectories. I wrote the following code for one pair of flights, fa and fb:

import time

t0 = time.time()

for i in range(len(fa)):
    for j in range(len(fb)):
        if fa.iloc[i]['longitude'] == fb.iloc[j]['longitude'] and fa.iloc[i]['latitude'] == fb.iloc[j]['latitude']:
            print(i,j,fa.iloc[i]['longitude'], fa.iloc[i]['latitude'])           
t1 = time.time()

This is computationally very expensive (almost 20 seconds for a single pair) and will take far too long when I have hundreds of flight pairs. I'm sure there are faster ways to do this, and I would really appreciate any input.

The two DataFrames have the same structure:

fa = pd.DataFrame([['14:07:05', 106.535, 2.524], ['14:07:10', 106.525, 2.526]], columns=['time', 'longitude', 'latitude'])
fb = pd.DataFrame([['14:00:05', 107.306, 3.722], ['14:00:10', 107.296, 3.718]], columns=['time', 'longitude', 'latitude'])

I have just included two rows of the data to give a better picture of the problem.

yash
  • Please include a _small_ subset of your data as a __copyable__ piece of code that can be used for testing as well as your expected output for the __provided__ data. See [MRE - Minimal, Reproducible, Example](https://stackoverflow.com/help/minimal-reproducible-example), and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888). – Henry Ecker May 25 '21 at 03:25
  • You could [merge](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html) your DataFrames on the longitude and latitude fields. That will get rid of the for loops but it's still a quadratic growth comparison. – Jan Wilamowski May 25 '21 at 03:32
  • @JanWilamowski, yes, I tried it, but how can I get the indices of the two DataFrames from this? mergedStuff = pd.merge(fa, fb, on=['longitude', 'latitude'], how='inner'). It is no doubt much faster than the loops. – yash May 25 '21 at 03:37
  • I elaborated in an answer below – Jan Wilamowski May 25 '21 at 04:54

3 Answers


I'm assuming that fa and fb are both DataFrames and that you need the indices where latitude and longitude are equal. In that case you can try np.where:

import numpy as np
fa['match'] = np.where((fa['latitude'] == fb['latitude']) & (fa['longitude'] == fb['longitude']), 'True', 'False')

This inserts 'True' where both coordinates match and 'False' otherwise; you can then print the rows where the match is 'True'. Note that this compares the two DataFrames row by row, so it only works when they have identical indexes.
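If every row of fa has to be compared against every row of fb, as in the question's nested loops, one possible variant is to let NumPy broadcasting build the full comparison grid. This is only a sketch: the sample data is borrowed from the question (with one extra row in fb so that a match exists), the names a, b, equal and matches are illustrative, and the grid needs memory proportional to len(fa) * len(fb).

```python
import numpy as np
import pandas as pd

# Sample data from the question; the third row of fb is added so a match exists.
fa = pd.DataFrame([['14:07:05', 106.535, 2.524], ['14:07:10', 106.525, 2.526]],
                  columns=['time', 'longitude', 'latitude'])
fb = pd.DataFrame([['14:00:05', 107.306, 3.722], ['14:00:10', 107.296, 3.718],
                   ['14:07:05', 106.535, 2.524]],
                  columns=['time', 'longitude', 'latitude'])

a = fa[['longitude', 'latitude']].to_numpy()  # shape (len(fa), 2)
b = fb[['longitude', 'latitude']].to_numpy()  # shape (len(fb), 2)

# equal[i, j] is True when row i of fa and row j of fb share both coordinates.
equal = (a[:, None, :] == b[None, :, :]).all(axis=2)

# Row positions (not index labels) of every matching pair.
i_idx, j_idx = np.nonzero(equal)
matches = list(zip(i_idx.tolist(), j_idx.tolist()))
print(matches)  # [(0, 2)]
```

The pairs are positional, so they line up with fa.iloc[i] and fb.iloc[j] from the original loops.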

venkatesh

I can see why your algorithm runs slowly: it is essentially a brute-force search that checks whether any point on one flight path matches a point on the other. However, to fully answer whether a more efficient method exists, you may need to clarify some assumptions.

For example, are you assuming each aircraft flies the shortest path (i.e., the great circle route between two points) with no deviations? If so, here's a good summary of how to calculate the intersection: http://www.boeing-727.com/Data/fly%20odds/distance.html
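Under that great-circle assumption, one common way to find the crossing is with vector cross products. The sketch below is not taken from the linked page. It assumes a spherical Earth, takes points as (latitude, longitude) tuples in degrees, and uses hypothetical helper names. Two distinct great circles always meet at two antipodal points, so the caller still has to check which candidate (if any) lies on both flown segments.

```python
import numpy as np

def to_unit_vector(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D unit vector."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def to_lat_lon(v):
    """Convert a 3D vector back to (latitude, longitude) in degrees."""
    v = v / np.linalg.norm(v)
    return (float(np.degrees(np.arcsin(v[2]))),
            float(np.degrees(np.arctan2(v[1], v[0]))))

def great_circle_intersections(a1, a2, b1, b2):
    """Return the two antipodal points where great circle a1-a2 meets b1-b2."""
    # Each great circle lies in a plane through the Earth's centre;
    # the cross product of two points on it is that plane's normal.
    n_a = np.cross(to_unit_vector(*a1), to_unit_vector(*a2))
    n_b = np.cross(to_unit_vector(*b1), to_unit_vector(*b2))
    # The two planes meet along a line perpendicular to both normals.
    line = np.cross(n_a, n_b)
    if np.allclose(line, 0.0):
        raise ValueError("great circles coincide or an input pair is degenerate")
    return to_lat_lon(line), to_lat_lon(-line)

# Equator (lon 0 to 90) against the 45 degree meridian (lat -10 to 10).
candidates = great_circle_intersections((0, 0), (0, 90), (-10, 45), (10, 45))
print(candidates)
```

For this example the two candidates are approximately (0, 45) and its antipode (0, -135); only the first lies on both segments.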


You can merge the DataFrames on their longitude and latitude columns:

import pandas as pd

fa = pd.DataFrame([['14:07:05', 106.535, 2.524], ['14:07:10', 106.525, 2.526]], columns=['time', 'longitude', 'latitude'])
fb = pd.DataFrame([['14:00:05', 107.306, 3.722], ['14:00:10', 107.296, 3.718], ['14:07:05', 106.535, 2.524]], columns=['time', 'longitude', 'latitude'])
fa.merge(fb, on=['longitude', 'latitude'])

gives (note that I added another row to fb to get a match)

     time_x  longitude  latitude    time_y
0  14:07:05    106.535     2.524  14:07:05

If you want to keep the indices around, reset them (to turn them into regular columns) and use an outer merge, then drop unmatched rows:

fa.reset_index().merge(fb.reset_index(), on=['longitude', 'latitude'], how='outer').dropna()

gives

   index_x    time_x  longitude  latitude  index_y    time_y
0      0.0  14:07:05    106.535     2.524      2.0  14:07:05
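The outer merge followed by dropna() turns the index columns into floats (0.0 and 2.0 above). A possible variant, sketched here with illustrative column names index_a and index_b, is to rename the reset index columns and use an inner merge, which keeps the indices as integers and skips the dropna() step. It still relies on exact float equality of the coordinates.

```python
import pandas as pd

fa = pd.DataFrame([['14:07:05', 106.535, 2.524], ['14:07:10', 106.525, 2.526]],
                  columns=['time', 'longitude', 'latitude'])
fb = pd.DataFrame([['14:00:05', 107.306, 3.722], ['14:00:10', 107.296, 3.718],
                   ['14:07:05', 106.535, 2.524]],
                  columns=['time', 'longitude', 'latitude'])

# Turn each index into a regular column with a distinct name.
left = fa.reset_index().rename(columns={'index': 'index_a'})
right = fb.reset_index().rename(columns={'index': 'index_b'})

# An inner merge keeps only rows whose coordinates appear in both frames,
# so no NaN values are introduced and the index columns stay integers.
matches = left.merge(right, on=['longitude', 'latitude'],
                     how='inner', suffixes=('_a', '_b'))

index_pairs = list(zip(matches['index_a'].tolist(), matches['index_b'].tolist()))
print(index_pairs)  # [(0, 2)]
```

If the recorded coordinates can differ by tiny amounts, rounding both frames to a fixed number of decimals before merging is one option to consider.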
Jan Wilamowski