5

My program generates the following list (excerpt):

my_list = [{'x': 1764, 'y': 18320, 'class': 'note', 'id': 'd1e2443'},
           {'x': 1764, 'y': 20030, 'class': 'note', 'id': 'd1e2591'},
           {'x': 1807, 'y': 12650, 'class': 'note', 'id': 'd1e1362'},
           {'x': 2243, 'y': 20120, 'class': 'note', 'id': 'd1e2609'},
           {'x': 2243, 'y': 22685, 'class': 'note', 'id': 'd1e2769'},
           {'x': 2257, 'y': 12560, 'class': 'note', 'id': 'd1e1380'},
           {'x': 2688, 'y': 20210, 'class': 'note', 'id': 'd1e2625'},
           {'x': 2707, 'y': 10040, 'class': 'note', 'id': 'd1e1194'},
           {'x': 2707, 'y': 12650, 'class': 'note', 'id': 'd1e1398'},
           {'x': 2707, 'y': 14720, 'class': 'note', 'id': 'd1e1571'},
           {'x': 2901, 'y': 18140, 'class': 'note', 'id': 'd1e2475'}]

It is already sorted by the value of the 'x'-key. I am trying to write a method, that returns a tuple of two elements of this list for a given coordinate (xPos, yPos):

  • The nearest element to the left (x <= xPos)
  • The nearest element to the right (x > xPos)

The distance is simply the euclidean distance ("Pythagoras"). A second parameter for the function is the maximum distance allowed:

def getNearest(noteList, posX, posY, maxDistance):
    [...]
    return leftElement, rightElement

I have tried to use the bisect function to get the insertion point of the closest element to xPos as well as for xPos - maxDistance (case 'left') and xPos + maxDistance (case 'right) respectively in order to narrow down the search area. Then I calculated the distance for every remaining element in this sliced list

Somehow this feels very unelegant. Is there any better way of doing this?

EDIT: Maybe I was not very clear with my intention: I need two elements of the list. The nearest element in the '2D pane' to the left and one to the right. Thus I need to consider the y-coordinate as well.

It might happen (acutally almost every time) that the closest element in regard to its x-coordinate is way more far away than an element with a close-by y-coordinate.

sonovice
  • 813
  • 2
  • 13
  • 27

3 Answers3

1

That seems like a good solution, but from the way you described it, I don't see a need to do more than a single bisect.

bisect_left already returns the index of the element such that your first condition is satisfied (x <= xPos - maxDistance). From there, I'd simply iterate the elements to the right one by one until you reach x > xPos + maxDistance.

This will probably yield better performance considering the CPU cache, because you're iterating adjacent positions instead of jumping around

If you start processing a large number of points or maxDistance, this algorithm will probably not suffice. In that case, you should look into using a kd-tree, as wenzul suggested.

loopbackbee
  • 21,962
  • 10
  • 62
  • 97
  • Yes, but `bisect_left` only considers the `x`coordinate. I need the closest elements (one to the left, one to the right) in regard to `x` and `y`. – sonovice Jul 24 '15 at 09:42
  • @sonovice Gotcha. Same principle applies - I've amended the answer – loopbackbee Jul 24 '15 at 09:56
1

you could use min() to find the nearest elements on both sides of posX, in regards to their Euclidean distance.

>>>import math
>>>def getNearest(noteList, posX, posY):
...    leftElement = min([i for i in my_list if i['x'] <= posX], key=lambda x:abs(math.sqrt((x['x']- posX)**2+(x['y']- posY)**2)))
...    rightElement = min([i for i in my_list if i['x'] > posX], key=lambda x:abs(math.sqrt((x['x']- posX)**2+(x['y']- posY)**2)))
...    return (leftElement, rightElement)


>>> getNearest(my_list, 2000, 2000)
({'y': 12650, 'x': 1807, 'class': 'note', 'id': 'd1e1362'}, {'y': 10040, 'x': 2707, 'class': 'note', 'id': 'd1e1194'})

>>> getNearest(my_list, 2000, 20000)
({'y': 20030, 'x': 1764, 'class': 'note', 'id': 'd1e2591'}, {'y': 20120, 'x': 2243, 'class': 'note', 'id': 'd1e2609'})

Where the Key is the Euclidean distance between each element on the 2D pane, and the element passed to the function, ie (posX, PosY).

user3636636
  • 2,409
  • 2
  • 16
  • 31
0

I have tried to merge my initial idea with a few suggestions from the answers. This is what I came up with:

class translatedDictList(object):
    def __init__(self, dictList, key):
        self.dictList = dictList
        self.key = key

    def __getitem__(self, item):
        return self.dictList[item][self.key]

    def __len__(self):
        return self.dictList.__len__()

def getNearest(self, symbolList, pos, maxDistance):
    translatedList = translatedDictList(symbolList, 'x')

    splitIndex = bisect.bisect(translatedList, pos[0])
    minIndex = bisect.bisect(translatedList, pos[0] - maxDistance)
    maxIndex = bisect.bisect(translatedList, pos[0] + maxDistance)

    # Euclidean distance acutally not needed anymore!
    leftElement = min(symbolList[minIndex:splitIndex],
                      key=lambda n: abs(n['x'] - pos[0]) +
                                    abs(n['y'] - pos[1]))
    rightElement = min(symbolList[splitIndex:maxIndex],
                       key=lambda n: abs(n['x'] - pos[0]) +
                                     abs(n['y'] - pos[1]))

    return leftElement, rightElement

print(getNearest(self.symbolsSorted, (1200, 1500), 1000))

Maybe there is a smarter way of translating the symbolList in order to use bisect()?

It should be o(log*n) and as far as I can tell I don't even need to calculate the euclidean distance anymore because I am only comparing and not interested in the actual distance.

sonovice
  • 813
  • 2
  • 13
  • 27