Get closest element from list of dictionaries

Question

My program generates the following list (excerpt):

my_list = [{'x': 1764, 'y': 18320, 'class': 'note', 'id': 'd1e2443'},
           {'x': 1764, 'y': 20030, 'class': 'note', 'id': 'd1e2591'},
           {'x': 1807, 'y': 12650, 'class': 'note', 'id': 'd1e1362'},
           {'x': 2243, 'y': 20120, 'class': 'note', 'id': 'd1e2609'},
           {'x': 2243, 'y': 22685, 'class': 'note', 'id': 'd1e2769'},
           {'x': 2257, 'y': 12560, 'class': 'note', 'id': 'd1e1380'},
           {'x': 2688, 'y': 20210, 'class': 'note', 'id': 'd1e2625'},
           {'x': 2707, 'y': 10040, 'class': 'note', 'id': 'd1e1194'},
           {'x': 2707, 'y': 12650, 'class': 'note', 'id': 'd1e1398'},
           {'x': 2707, 'y': 14720, 'class': 'note', 'id': 'd1e1571'},
           {'x': 2901, 'y': 18140, 'class': 'note', 'id': 'd1e2475'}]

It is already sorted by the value of the 'x'-key. I am trying to write a method, that returns a tuple of two elements of this list for a given coordinate (xPos, yPos):

The nearest element to the left (x <= xPos)
The nearest element to the right (x > xPos)

The distance is simply the euclidean distance ("Pythagoras"). A second parameter for the function is the maximum distance allowed:

def getNearest(noteList, posX, posY, maxDistance):
    [...]
    return leftElement, rightElement

I have tried to use the bisect function to get the insertion point of the closest element to xPos as well as for xPos - maxDistance (case 'left') and xPos + maxDistance (case 'right) respectively in order to narrow down the search area. Then I calculated the distance for every remaining element in this sliced list

Somehow this feels very unelegant. Is there any better way of doing this?

EDIT: Maybe I was not very clear with my intention: I need two elements of the list. The nearest element in the '2D pane' to the left and one to the right. Thus I need to consider the y-coordinate as well.

It might happen (acutally almost every time) that the closest element in regard to its x-coordinate is way more far away than an element with a close-by y-coordinate.

Do you want go get the `id` as result or just the coordinates? — wenzul, Jul 24 '15 at 10:11
I would need the entire element, means the whole dictionary. — sonovice, Jul 24 '15 at 10:17

loopbackbee · Answer 1 · 2015-07-24T10:14:56.410

1

That seems like a good solution, but from the way you described it, I don't see a need to do more than a single bisect.

bisect_left already returns the index of the element such that your first condition is satisfied (x <= xPos - maxDistance). From there, I'd simply iterate the elements to the right one by one until you reach x > xPos + maxDistance.

This will probably yield better performance considering the CPU cache, because you're iterating adjacent positions instead of jumping around

If you start processing a large number of points or maxDistance, this algorithm will probably not suffice. In that case, you should look into using a kd-tree, as wenzul suggested.

edited Jul 24 '15 at 10:14

answered Jul 24 '15 at 09:38

loopbackbee

21,962
10
62
97

Yes, but `bisect_left` only considers the `x`coordinate. I need the closest elements (one to the left, one to the right) in regard to `x` and `y`. – sonovice Jul 24 '15 at 09:42
@sonovice Gotcha. Same principle applies - I've amended the answer – loopbackbee Jul 24 '15 at 09:56

user3636636 · Answer 2 · 2015-07-24T12:19:25.777

you could use min() to find the nearest elements on both sides of posX, in regards to their Euclidean distance.

>>>import math
>>>def getNearest(noteList, posX, posY):
...    leftElement = min([i for i in my_list if i['x'] <= posX], key=lambda x:abs(math.sqrt((x['x']- posX)**2+(x['y']- posY)**2)))
...    rightElement = min([i for i in my_list if i['x'] > posX], key=lambda x:abs(math.sqrt((x['x']- posX)**2+(x['y']- posY)**2)))
...    return (leftElement, rightElement)


>>> getNearest(my_list, 2000, 2000)
({'y': 12650, 'x': 1807, 'class': 'note', 'id': 'd1e1362'}, {'y': 10040, 'x': 2707, 'class': 'note', 'id': 'd1e1194'})

>>> getNearest(my_list, 2000, 20000)
({'y': 20030, 'x': 1764, 'class': 'note', 'id': 'd1e2591'}, {'y': 20120, 'x': 2243, 'class': 'note', 'id': 'd1e2609'})

Where the Key is the Euclidean distance between each element on the 2D pane, and the element passed to the function, ie (posX, PosY).

Yes that is true, but no other solution had actually accounted for the Euclidean distance. They only took distance from `x` into account — user3636636, Jul 24 '15 at 23:47

sonovice · Accepted Answer · 2015-07-24T11:49:48.287

I have tried to merge my initial idea with a few suggestions from the answers. This is what I came up with:

class translatedDictList(object):
    def __init__(self, dictList, key):
        self.dictList = dictList
        self.key = key

    def __getitem__(self, item):
        return self.dictList[item][self.key]

    def __len__(self):
        return self.dictList.__len__()

def getNearest(self, symbolList, pos, maxDistance):
    translatedList = translatedDictList(symbolList, 'x')

    splitIndex = bisect.bisect(translatedList, pos[0])
    minIndex = bisect.bisect(translatedList, pos[0] - maxDistance)
    maxIndex = bisect.bisect(translatedList, pos[0] + maxDistance)

    # Euclidean distance acutally not needed anymore!
    leftElement = min(symbolList[minIndex:splitIndex],
                      key=lambda n: abs(n['x'] - pos[0]) +
                                    abs(n['y'] - pos[1]))
    rightElement = min(symbolList[splitIndex:maxIndex],
                       key=lambda n: abs(n['x'] - pos[0]) +
                                     abs(n['y'] - pos[1]))

    return leftElement, rightElement

print(getNearest(self.symbolsSorted, (1200, 1500), 1000))

Maybe there is a smarter way of translating the symbolList in order to use bisect()?

It should be o(log*n) and as far as I can tell I don't even need to calculate the euclidean distance anymore because I am only comparing and not interested in the actual distance.

Get closest element from list of dictionaries

3 Answers3