Python, efficient way to operate on pair of coordinates

Question

I have a data file which has latitude and longitude information which I have stored as a list of tuples of the form

[(lat1, lon1), (lat1, lon1), (lat2, lon2), (lat3, lon3), (lat3, lon3)  ...]

As shown above, consecutive locations (lat, lon) may be the same if the location in the data file has not changed, so the order is very important here. What I am interested in is a fairly efficient way to detect when the coordinates change (lat1, lon1 -> lat2, lon2, and so on) and then get the distance between the two coordinates.

I already have a function to get the distance of the form getDistance(lat1, lon1, lat2, lon2) which returns the calculated distance between these locations. I want to store these distances in a list from which I can do some plots later on.

A. Please include any relevant code. B. If I understand correctly, you want to create a list from the results of the differences between different tuples in another list based on the results of a function of yours? C. Did you mean that there would be duplicates of certain tuples (1,1,2,3,3) ? or should it be (1,2,3,4,5) ? — Inbar Rose, Apr 04 '13 at 08:03
I have only gone as far as reading the data file and storing it in a list of tuples as shown. And yes, you are right, the function is already there. What I want is a way to go through that list and, when the location changes, pass the previous and the new location into this function. — sfactor, Apr 04 '13 at 08:07
Personally I like dealing with coordinates as imaginary numbers. Makes the math simpler. — Lennart Regebro, Apr 04 '13 at 08:10
You want to put the result of `getDistance()` for each consecutive pair of coordinates into a list, where the order is the same? — Inbar Rose, Apr 04 '13 at 08:14

score 5 · Accepted Answer · edited May 23 '17 at 12:13

You could combine a function that filters out duplicates with one that iterates over pairs:

First, let's take care of eliminating duplicate subsequent entries in the list. Since we wish to preserve order, as well as allow duplicates that are not next to each other, we cannot use a simple set. So if we have a list of coordinates such as [(0, 0), (4, 4), (4, 4), (1, 1), (0, 0)], the correct output would be [(0, 0), (4, 4), (1, 1), (0, 0)]. A simple function that accomplishes this is:

def filter_duplicates(items):
    """A generator that ignores subsequent entries that are duplicates

    >>> items = [0, 1, 1, 2, 3, 3, 3, 4, 1]
    >>> list(filter_duplicates(items))
    [0, 1, 2, 3, 4, 1]

    """
    prev = None  # assumes None never appears as an item
    for item in items:
        if item != prev:
            yield item
            prev = item

The yield statement is like a return that does not end the function. Each time execution reaches it, the value is passed back to the caller and the function pauses until the next value is requested. See What does the "yield" keyword do in Python? for a better explanation.
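As a small illustration of that pausing behaviour, here is a sketch with a made-up generator (`count_up` is not part of the answer, just a hypothetical name for the demo):

```python
def count_up(limit):
    """Yield 0, 1, ..., limit - 1, one value per request."""
    n = 0
    while n < limit:
        yield n  # hand n to the caller and pause here
        n += 1   # execution resumes from this line on the next request


gen = count_up(3)
first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes after the yield, loops, yields again
rest = list(gen)    # drains whatever is left
print(first, second, rest)  # 0 1 [2]
```

Nothing inside `count_up` runs until the first value is requested, which is why a generator never needs to hold all of its results at once.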

The filter_duplicates function simply iterates through each item and compares it to the previous one. If the item is different, it yields it back to the caller and stores it as the new previous item. Another way to write this function would have been:

def filter_duplicates_2(items):
    result = []
    prev = None
    for item in items:
        if item != prev:
            result.append(item)
            prev = item
    return result

Though they accomplish the same thing, this version requires more memory, because it has to build a complete new list before it can return anything.
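One way to see the difference is to feed the generator version a stream that never ends and take only a few results. This is a sketch, not part of the original answer: `itertools.cycle` stands in for an unbounded data source, and `filter_duplicates` is repeated so the block runs on its own.

```python
from itertools import cycle, islice


def filter_duplicates(items):
    prev = None  # assumes None never appears as an item
    for item in items:
        if item != prev:
            yield item
            prev = item


endless = cycle([(0, 0), (0, 0), (4, 4)])  # repeats these three items forever
first_four = list(islice(filter_duplicates(endless), 4))
print(first_four)  # [(0, 0), (4, 4), (0, 0), (4, 4)]
```

The list-building version would never return on this input, because it tries to collect every item before handing anything back.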

Now that we have a way to ensure that every item is different from its neighbors, we need to calculate the distance between subsequent pairs. A simple way to do this is:

def pairs(iterable):
    """A generator over pairs of items in iterable

    >>> list(pairs([0, 8, 2, 1, 3]))
    [(0, 8), (8, 2), (2, 1), (1, 3)]

    """
    iterator = iter(iterable)
    try:
        prev = next(iterator)
    except StopIteration:
        return  # empty input: there is nothing to pair up
    for j in iterator:
        yield prev, j
        prev = j

This function is similar to the filter_duplicates function. It simply keeps track of the previous item that it observed, and for each item that it processes it yields the previous item together with that item. The only trick it uses is that it assigns prev to the very first item in the list using the next() function call before the loop starts.
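To make the iter() and next() part more concrete, here are the same steps written out by hand on a short list. This is only a sketch; the names `coords` and `collected` are chosen for the example.

```python
coords = [(0, 0), (4, 4), (1, 1)]

iterator = iter(coords)  # an iterator remembers its position in coords
prev = next(iterator)    # consumes the first item: (0, 0)

collected = []
for item in iterator:    # the loop continues from the second item
    collected.append((prev, item))
    prev = item

print(collected)  # [((0, 0), (4, 4)), ((4, 4), (1, 1))]
```

Because the for loop uses the same iterator that next() already advanced, the first item is never seen twice.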

If we combine the two functions we end up with:

for (x1, y1), (x2, y2) in pairs(filter_duplicates(coords)):
    distance = getDistance(x1, y1, x2, y2)
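Since the question asks for the distances to be stored in a list, the loop can be turned into a list comprehension. The sketch below is self-contained so it can be run as is. The asker's getDistance is not shown, so the version here is only a stand-in that assumes a haversine distance in kilometres with a mean Earth radius of 6371 km.

```python
import math


def getDistance(lat1, lon1, lat2, lon2):
    """Stand-in for the asker's function: haversine distance in km (assumed)."""
    radius_km = 6371.0  # mean Earth radius, an assumption of this sketch
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def filter_duplicates(items):
    prev = None  # assumes None never appears as an item
    for item in items:
        if item != prev:
            yield item
            prev = item


def pairs(iterable):
    iterator = iter(iterable)
    try:
        prev = next(iterator)
    except StopIteration:
        return  # empty input: nothing to pair up
    for item in iterator:
        yield prev, item
        prev = item


coords = [(0.0, 0.0), (0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 1.0)]
distances = [
    getDistance(lat1, lon1, lat2, lon2)
    for (lat1, lon1), (lat2, lon2) in pairs(filter_duplicates(coords))
]
print(len(distances))  # 2: one distance per change of location
```

The resulting list has one entry per change of location, in the same order as the changes occur in the data.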

Thanks Nathan, this is what I wanted :). A couple of questions, as I am quite the beginner in Python. 1. Is the order of the coordinates preserved here? 2. Can you explain briefly what the pairs() function does here, especially the iter(iterable) part? — sfactor, Apr 04 '13 at 08:41
The order is preserved. I will update the answer with an explanation. : ) — Nathan Villaescusa, Apr 04 '13 at 17:43

score 0 · Answer 2 · answered Apr 04 '13 at 08:16

Here's a way to do it using just functions from itertools:

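from itertools import groupby, tee

l = [...]
ks = (k for k, g in groupby(l))  # one key per run of equal adjacent items
t1, t2 = tee(ks)
next(t2, None)  # advance so we get adjacent pairs
for k1, k2 in zip(t1, t2):  # use itertools.izip on Python 2
    distance = getDistance(k1[0], k1[1], k2[0], k2[1])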

This groups adjacent equal elements, then uses a pair of tee'd iterators to pull out adjacent pairs from the group list.

Using just groupby:

import itertools

l = [...]
gs = itertools.groupby(l)
last, _ = next(gs)  # first key; raises StopIteration if l is empty
for k, g in gs:
    distance = getDistance(last[0], last[1], k[0], k[1])
    last = k
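On Python 3.10 or later, the tee step can probably be replaced by itertools.pairwise. This is a sketch of that variant rather than part of the original answer. The fallback definition is only there so the block also runs on older versions, and the sample `coords` list is made up for the demo.

```python
from itertools import groupby, tee

try:
    from itertools import pairwise  # available from Python 3.10
except ImportError:
    def pairwise(iterable):
        # Fallback with the same behaviour, built from tee and zip.
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

coords = [(0, 0), (4, 4), (4, 4), (1, 1), (0, 0)]

# groupby yields one key per run of equal adjacent items;
# pairwise then yields each key together with its successor.
changes = list(pairwise(key for key, _ in groupby(coords)))
print(changes)  # [((0, 0), (4, 4)), ((4, 4), (1, 1)), ((1, 1), (0, 0))]
```

Each element of `changes` is a (previous, new) pair of locations that can be unpacked and passed to getDistance.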
