Find closest value algorithm

Question

def find_closest(data, target, key = lambda x:f(x))

This is my function definition where data is set of values, and I want to find the value that evaluates the closest to target in as few evaluations as possible, i.e. abs(target-f(x)) is minimum. f(x) is monotonic.

I've heard that binary search can do this in O(log(n)) time, is there a library implementation in python? Are there more efficient search algorithms?

EDIT: I'm looking to minimize complexity in terms of evaluating f(x) because that's the expensive part. I want to find the x in data that when evaluated with f(x), comes closest to the target. data is in the domain of f, target is in the range of f. Yes, data can be sorted quickly.

You'll need `data` to be sorted to have O(log(n)) complexity. Describe `data` in your question — Basilevs, Aug 26 '14 at 16:08

score 0 · Answer 1 · answered Aug 26 '14 at 16:13

You can use the utilities in the bisect module. You will have to evaluate x on data though, i.e. list(f(x) for x in data) to get a monotonic / sorted list to bisect.

I am not aware of a binary search in the standard library that works directly on f and data.

Abhijit · Answer 2 · 2014-08-26T16:23:27.513

0

If the data presented is already sorted and the function is strctly monotonic, apply the function f on the data and then perform a binary search using bisect.bisect

import bisect
def find_closest(data, target, key = f):

    data = map(f, data)
    if f(0) > f(1):
        data = [-e for e in data]
    try:
        return data[bisect.bisect_left(data, target)]
    except IndexError:
        return data[-1]

edited Aug 26 '14 at 16:23

answered Aug 26 '14 at 16:17

Abhijit

62,056
18
131
204

Finds first not greater than target, while closest is requested. – Basilevs Aug 26 '14 at 16:26
undefined f variable is used – Basilevs Aug 26 '14 at 16:28

score 0 · Answer 3 · edited May 23 '17 at 11:50

Use bisect_left() method to find lower bound. Bisect_left accepts a random-access list of elements, to avoid calculating all of them you can use lazy collection of calculated function values with __len__ and __getitem__ methods defined. Carefully check return value for border conditions. Your heavy calculation will be called O(log(N) + 1) = O(log(N)) times.

from bisect import bisect_left
from collections import defaultdict

class Cache(defaultdict):
    def __init__(self, method):
        self.method = method
    def __missing__(self, key):
        return self.method(key)

class MappedList(object):
    def __init__(self, method, input):
        self.method = method
        self.input = input
        self.cache = Cache(method)
    def __len__(self):
        return len(self.input)
    def __getitem__(self, i):
        return self.cache[input[i]]

def find_closest(data, target, key = lambda x:x):
    s = sorted(data)
    evaluated = MappedList(key, s)
    index = bisect_left(evaluated, target)
    if index == 0:
        return data[0]
    if index == len(data):
        return data[index-1]
    if target - evaluated[index-1] <= evaluated[index] - target:
        return data[index-1]
    else:
        return data[index]

This looks good, didn't know that map was lazy. Here you're returning `f(x)`, right, instead of `x`. Just change it to `return data[index]`? — siamii, Aug 26 '14 at 16:44
[Check out mine](http://ideone.com/krSas4). Now with caching! :) Only two or three calculations for our 6-element input data. — Basilevs, Aug 26 '14 at 17:39

Find closest value algorithm

3 Answers3