262

Given a list of integers, I want to find which number is the closest to a number I give in input:

>>> myList = [4, 1, 88, 44, 3]
>>> myNumber = 5
>>> takeClosest(myList, myNumber)
...
4

Is there any quick way to do this?

buhtz
  • 10,774
  • 18
  • 76
  • 149
Ricky Robinson
  • 21,798
  • 42
  • 129
  • 185
  • 5
    what about also returning the index that this happened in the list. – Charlie Parker Jul 10 '16 at 20:27
  • 2
    Possible duplicate of [finding index of an item closest to the value in a list that's not entirely sorted](https://stackoverflow.com/questions/9706041/finding-index-of-an-item-closest-to-the-value-in-a-list-thats-not-entirely-sort) – sancho.s ReinstateMonicaCellio Nov 14 '17 at 08:57
  • 1
    @sancho.s Nicely spotted. Though the answers to this question are way better than the ones on that other question. So I'm going to vote to close the other one as duplicate of this one. – Jean-François Corbett Nov 14 '17 at 09:21

10 Answers10

498

If we are not sure that the list is sorted, we could use the built-in min() function, to find the element which has the minimum distance from the specified number.

>>> min(myList, key=lambda x:abs(x-myNumber))
4

Note that it also works with dicts with int keys, like {1: "a", 2: "b"}. This method takes O(n) time.


If the list is already sorted, or you could pay the price of sorting the array once only, use the bisection method illustrated in @Lauritz's answer which only takes O(log n) time (note however checking if a list is already sorted is O(n) and sorting is O(n log n).)

Community
  • 1
  • 1
kennytm
  • 510,854
  • 105
  • 1,084
  • 1,005
  • 17
    Speaking in complexity, this is `O(n)`, where a little hacking with `bisect` will give you a massive improvement to `O(log n)` (if your input array is sorted). – mic_e Jun 15 '14 at 04:34
  • 5
    @mic_e: That's just [Lauritz's answer](http://stackoverflow.com/a/12141511/224671). – kennytm Jun 15 '14 at 08:11
  • 3
    what about also returning the index that this happened in the list? – Charlie Parker Jul 10 '16 at 20:26
  • @CharlieParker Create your own implementation of `min`, run it over a dictionary (`items()`) instead of a list, and return the key instead of the value in the end. – Dustin Oprea Nov 13 '16 at 18:24
  • 4
    Or use `numpy.argmin` instead of `min` to get the index instead of the value. –  Mar 09 '18 at 10:02
201

I'll rename the function take_closest to conform with PEP8 naming conventions.

If you mean quick-to-execute as opposed to quick-to-write, min should not be your weapon of choice, except in one very narrow use case. The min solution needs to examine every number in the list and do a calculation for each number. Using bisect.bisect_left instead is almost always faster.

The "almost" comes from the fact that bisect_left requires the list to be sorted to work. Hopefully, your use case is such that you can sort the list once and then leave it alone. Even if not, as long as you don't need to sort before every time you call take_closest, the bisect module will likely come out on top. If you're in doubt, try both and look at the real-world difference.

from bisect import bisect_left

def take_closest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.

    If two numbers are equally close, return the smallest number.
    """
    pos = bisect_left(myList, myNumber)
    if pos == 0:
        return myList[0]
    if pos == len(myList):
        return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
        return after
    else:
        return before

Bisect works by repeatedly halving a list and finding out which half myNumber has to be in by looking at the middle value. This means it has a running time of O(log n) as opposed to the O(n) running time of the highest voted answer. If we compare the two methods and supply both with a sorted myList, these are the results:

$ python -m timeit -s "
from closest import take_closest
from random import randint
a = range(-1000, 1000, 10)" "take_closest(a, randint(-1100, 1100))"

100000 loops, best of 3: 2.22 usec per loop

$ python -m timeit -s "
from closest import with_min
from random import randint
a = range(-1000, 1000, 10)" "with_min(a, randint(-1100, 1100))"

10000 loops, best of 3: 43.9 usec per loop

So in this particular test, bisect is almost 20 times faster. For longer lists, the difference will be greater.

What if we level the playing field by removing the precondition that myList must be sorted? Let's say we sort a copy of the list every time take_closest is called, while leaving the min solution unaltered. Using the 200-item list in the above test, the bisect solution is still the fastest, though only by about 30%.

This is a strange result, considering that the sorting step is O(n log(n))! The only reason min is still losing is that the sorting is done in highly optimalized c code, while min has to plod along calling a lambda function for every item. As myList grows in size, the min solution will eventually be faster. Note that we had to stack everything in its favour for the min solution to win.

KetZoomer
  • 2,701
  • 3
  • 15
  • 43
Lauritz V. Thaulow
  • 49,139
  • 12
  • 73
  • 92
  • 2
    Sorting itself needs O(N log N), so it will be slower when N is becoming large. For instance, if you use `a=range(-1000,1000,2);random.shuffle(a)` you'll find that `takeClosest(sorted(a), b)` would become slower. – kennytm Aug 27 '12 at 12:43
  • 3
    @KennyTM I'll grant you that, and I'll point it out in my answer. But as long `getClosest` may be called more than once for every sort, this will be faster, and for the sort-once use case, it's a no-brainer. – Lauritz V. Thaulow Aug 27 '12 at 12:52
  • what about also returning the index that this happened in the list? – Charlie Parker Jul 10 '16 at 20:27
  • 3
    If `myList` is already an `np.array` then using `np.searchsorted` in place of `bisect` is faster. – Michael Hall Jan 21 '19 at 11:19
  • What if I would like to return not closes value, but it's ID? – AAAA Sep 09 '20 at 09:14
  • @LauritzV.Thaulow I think you made a typo in the end of your answer, as you say "As mylist grows in size, the *min* solution will eventually be faster". However from the rest of your answer I am deducing that *min* should become slower and slower instead – Leander Jul 27 '22 at 12:31
12
>>> takeClosest = lambda num,collection:min(collection,key=lambda x:abs(x-num))
>>> takeClosest(5,[4,1,88,44,3])
4

A lambda is a special way of writing an "anonymous" function (a function that doesn't have a name). You can assign it any name you want because a lambda is an expression.

The "long" way of writing the the above would be:

def takeClosest(num,collection):
   return min(collection,key=lambda x:abs(x-num))
Burhan Khalid
  • 169,990
  • 18
  • 245
  • 284
  • 3
    Note however, that assigning lambda's to names is discouraged according to [PEP 8](https://www.python.org/dev/peps/pep-0008/#programming-recommendations). – Evert Heylen Jun 08 '17 at 22:48
7
def closest(list, Number):
    aux = []
    for valor in list:
        aux.append(abs(Number-valor))

    return aux.index(min(aux))

This code will give you the index of the closest number of Number in the list.

The solution given by KennyTM is the best overall, but in the cases you cannot use it (like brython), this function will do the work

Gustavo Lima
  • 83
  • 2
  • 6
6

Iterate over the list and compare the current closest number with abs(currentNumber - myNumber):

def takeClosest(myList, myNumber):
    closest = myList[0]
    for i in range(1, len(myList)):
        if abs(i - myNumber) < closest:
            closest = i
    return closest
nbro
  • 15,395
  • 32
  • 113
  • 196
João Silva
  • 89,303
  • 29
  • 152
  • 158
  • 1
    you could also return the index. – Charlie Parker Jul 10 '16 at 20:32
  • 1
    ! Incorrect ! Should be `if abs(myList[i] - myNumber) < abs(closest - myNumber): closest = myList[i];`. Better store that value beforehand though. – lk_vc Feb 10 '18 at 03:42
  • Surely the function as it stands already returns the index of the closest. For it to satisfy the requirements of the OP shouldn't the second last line read closest = myList[i] – Paula Livingstone Mar 07 '19 at 08:31
3
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

run it by using

price_near_to=find_nearest(df['Close'], df['Close'][-2])
2

It's important to note that Lauritz's suggestion idea of using bisect does not actually find the closest value in MyList to MyNumber. Instead, bisect finds the next value in order after MyNumber in MyList. So in OP's case you'd actually get the position of 44 returned instead of the position of 4.

>>> myList = [1, 3, 4, 44, 88] 
>>> myNumber = 5
>>> pos = (bisect_left(myList, myNumber))
>>> myList[pos]
...
44

To get the value that's closest to 5 you could try converting the list to an array and using argmin from numpy like so.

>>> import numpy as np
>>> myNumber = 5   
>>> myList = [1, 3, 4, 44, 88] 
>>> myArray = np.array(myList)
>>> pos = (np.abs(myArray-myNumber)).argmin()
>>> myArray[pos]
...
4

I don't know how fast this would be though, my guess would be "not very".

jmdeamer
  • 337
  • 2
  • 16
  • 2
    Lauritz's function works correctly. You just using bisect_left only but Lauritz suggested a function takeClosest(...) that makes additional check. – Kanat Apr 27 '16 at 09:25
  • If you're going to use NumPy, you could use `np.searchsorted` instead of `bisect_left`. And @Kanat is right - Lauritz's solution *does* include code which picks which of the two candidates is closer. – John Y Oct 04 '17 at 22:03
1

Expanding upon Gustavo Lima's answer. The same thing can be done without creating an entirely new list. The values in the list can be replaced with the differentials as the FOR loop progresses.

def f_ClosestVal(v_List, v_Number):
"""Takes an unsorted LIST of INTs and RETURNS INDEX of value closest to an INT"""
for _index, i in enumerate(v_List):
    v_List[_index] = abs(v_Number - i)
return v_List.index(min(v_List))

myList = [1, 88, 44, 4, 4, -2, 3]
v_Num = 5
print(f_ClosestVal(myList, v_Num)) ## Gives "3," the index of the first "4" in the list.
JayJay123
  • 315
  • 1
  • 5
  • 12
1

If I may add to @Lauritz's answer

In order not to have a run error don't forget to add a condition before the bisect_left line:

if (myNumber > myList[-1] or myNumber < myList[0]):
    return False

so the full code will look like:

from bisect import bisect_left

def takeClosest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.
    If two numbers are equally close, return the smallest number.
    If number is outside of min or max return False
    """
    if (myNumber > myList[-1] or myNumber < myList[0]):
        return False
    pos = bisect_left(myList, myNumber)
    if pos == 0:
            return myList[0]
    if pos == len(myList):
            return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
       return after
    else:
       return before
umn
  • 431
  • 6
  • 17
0
def takeClosest(myList, myNumber):
    newlst = []
    for i in myList:
        newlst.append(i - myNumber)
    lstt = [abs(ele) for ele in newlst]
    print(myList[lstt.index(min(lstt))])

myList = [4, 1, 88, 44, 3]
myNumber = 5
takeClosest(myList,myNumber)
Sawndes
  • 1
  • 1