问题:从整数列表中,获取最接近给定值的数字

给定一个整数列表,我想找到哪个数字与我在输入中提供的数字最接近:

>>> myList = [4, 1, 88, 44, 3]
>>> myNumber = 5
>>> takeClosest(myList, myNumber)
...
4

有什么快速的方法可以做到这一点吗?

Given a list of integers, I want to find which number is the closest to a number I give in input:

>>> myList = [4, 1, 88, 44, 3]
>>> myNumber = 5
>>> takeClosest(myList, myNumber)
...
4

Is there any quick way to do this?


回答 0

如果不确定列表是否已排序,则可以使用内置min()函数,查找与指定数字之间的最小距离的元素。

>>> min(myList, key=lambda x:abs(x-myNumber))
4

请注意,它也可用于带有int键的字典,例如{1: "a", 2: "b"}。此方法花费O(n)时间。


如果列表已经排序,或者您可以只对数组进行一次排序,则使用@Lauritz答案中所示的二等分方法,该方法只需要O(log n)时间(但请注意,检查列表是否已排序为O) (n),排序为O(n log n)。)

If we are not sure that the list is sorted, we could use the built-in min() function, to find the element which has the minimum distance from the specified number.

>>> min(myList, key=lambda x:abs(x-myNumber))
4

Note that it also works with dicts with int keys, like {1: "a", 2: "b"}. This method takes O(n) time.


If the list is already sorted, or you could pay the price of sorting the array once only, use the bisection method illustrated in @Lauritz’s answer which only takes O(log n) time (note however checking if a list is already sorted is O(n) and sorting is O(n log n).)


回答 1

我将重命名该函数take_closest以符合PEP8命名约定。

如果您的意思是快速执行而不是快速编写,min那么除非是在一个非常狭窄的用例中,否则不应其作为选择的武器。该min解决方案需要检查列表中的每一个数字,并做到每个号码的计算。使用bisect.bisect_left替代几乎总是更快。

“几乎”来自bisect_left要求对列表进行排序才能工作的事实。希望您的用例能够对列表进行一次排序,然后再将其保留。即使不是,只要您不需要在每次调用之前进行排序take_closest,该bisect模块就可能排在最前面。如果您有疑问,请尝试两者并查看实际差异。

from bisect import bisect_left

def take_closest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.

    If two numbers are equally close, return the smallest number.
    """
    pos = bisect_left(myList, myNumber)
    if pos == 0:
        return myList[0]
    if pos == len(myList):
        return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
       return after
    else:
       return before

Bisect的工作方式是反复将列表减半,并myNumber通过查看中间值找出必须放入的一半。这意味着它的运行时间为O(log n),而不是最高投票答案O(n)运行时间。如果我们比较这两种方法并同时提供两种myList,则结果如下:

$ python -m timeit -s“
从最近的导入take_closest
来自随机进口randint
a = range(-1000,1000,10)“” take_closest(a,randint(-1100,1100))“

100000次循环,每循环3:2.22最佳

$ python -m timeit -s“
最接近的导入with_min
来自随机进口randint
a = range(-1000,1000,10)“” with_min(a,randint(-1100,1100))“

10000次循环,最好为3次:每个循环43.9微秒

因此,在此特定测试中,bisect速度快了将近20倍。对于更长的列表,差异会更大。

如果我们通过消除myList必须排序的前提条件来公平地竞争该怎么办?假设我们在每次 take_closest调用时对列表的副本进行排序,而min解决方案保持不变。使用上述测试中的200个项目列表,该bisect解决方案仍然是最快的,尽管只有30%。

考虑到排序步骤为O(n log(n)),这是一个奇怪的结果!唯一min仍然丢失的原因是,排序是在高度优化的C代码中完成的,而min必须为每个项目调用lambda函数。随着myList规模的增长,min解决方案最终将更快。请注意,min为了赢得解决方案,我们必须堆叠所有有利条件。

I’ll rename the function take_closest to conform with PEP8 naming conventions.

If you mean quick-to-execute as opposed to quick-to-write, min should not be your weapon of choice, except in one very narrow use case. The min solution needs to examine every number in the list and do a calculation for each number. Using bisect.bisect_left instead is almost always faster.

The “almost” comes from the fact that bisect_left requires the list to be sorted to work. Hopefully, your use case is such that you can sort the list once and then leave it alone. Even if not, as long as you don’t need to sort before every time you call take_closest, the bisect module will likely come out on top. If you’re in doubt, try both and look at the real-world difference.

from bisect import bisect_left

def take_closest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.

    If two numbers are equally close, return the smallest number.
    """
    pos = bisect_left(myList, myNumber)
    if pos == 0:
        return myList[0]
    if pos == len(myList):
        return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
       return after
    else:
       return before

Bisect works by repeatedly halving a list and finding out which half myNumber has to be in by looking at the middle value. This means it has a running time of O(log n) as opposed to the O(n) running time of the highest voted answer. If we compare the two methods and supply both with a sorted myList, these are the results:

$ python -m timeit -s "
from closest import take_closest
from random import randint
a = range(-1000, 1000, 10)" "take_closest(a, randint(-1100, 1100))"

100000 loops, best of 3: 2.22 usec per loop

$ python -m timeit -s "
from closest import with_min
from random import randint
a = range(-1000, 1000, 10)" "with_min(a, randint(-1100, 1100))"

10000 loops, best of 3: 43.9 usec per loop

So in this particular test, bisect is almost 20 times faster. For longer lists, the difference will be greater.

What if we level the playing field by removing the precondition that myList must be sorted? Let’s say we sort a copy of the list every time take_closest is called, while leaving the min solution unaltered. Using the 200-item list in the above test, the bisect solution is still the fastest, though only by about 30%.

This is a strange result, considering that the sorting step is O(n log(n))! The only reason min is still losing is that the sorting is done in highly optimalized c code, while min has to plod along calling a lambda function for every item. As myList grows in size, the min solution will eventually be faster. Note that we had to stack everything in its favour for the min solution to win.


回答 2

>>> takeClosest = lambda num,collection:min(collection,key=lambda x:abs(x-num))
>>> takeClosest(5,[4,1,88,44,3])
4

一个拉姆达是写一个“匿名”功能(即没有名称的功能)的一种特殊方式。您可以为它分配任何名称,因为lambda是一个表达式。

上面的“长篇”写法是:

def takeClosest(num,collection):
   return min(collection,key=lambda x:abs(x-num))
>>> takeClosest = lambda num,collection:min(collection,key=lambda x:abs(x-num))
>>> takeClosest(5,[4,1,88,44,3])
4

A lambda is a special way of writing an “anonymous” function (a function that doesn’t have a name). You can assign it any name you want because a lambda is an expression.

The “long” way of writing the the above would be:

def takeClosest(num,collection):
   return min(collection,key=lambda x:abs(x-num))

回答 3

def closest(list, Number):
    aux = []
    for valor in list:
        aux.append(abs(Number-valor))

    return aux.index(min(aux))

此代码将为您提供列表中最接近的Number的索引。

KennyTM提供的解决方案是最好的整体解决方案,但是在您无法使用它的情况下(例如brython),此功能可以完成工作

def closest(list, Number):
    aux = []
    for valor in list:
        aux.append(abs(Number-valor))

    return aux.index(min(aux))

This code will give you the index of the closest number of Number in the list.

The solution given by KennyTM is the best overall, but in the cases you cannot use it (like brython), this function will do the work


回答 4

遍历列表,然后将当前最接近的数字与进行比较abs(currentNumber - myNumber)

def takeClosest(myList, myNumber):
    closest = myList[0]
    for i in range(1, len(myList)):
        if abs(i - myNumber) < closest:
            closest = i
    return closest

Iterate over the list and compare the current closest number with abs(currentNumber - myNumber):

def takeClosest(myList, myNumber):
    closest = myList[0]
    for i in range(1, len(myList)):
        if abs(i - myNumber) < closest:
            closest = i
    return closest

回答 5

重要的是要注意,Lauritz的使用bisect的建议思想实际上并未在MyList中找到与MyNumber最接近的值。相反,bisect会在MyList中的MyNumber之后按顺序查找下一个值。因此,在OP的情况下,您实际上得到的是返回的位置44而不是位置4。

>>> myList = [1, 3, 4, 44, 88] 
>>> myNumber = 5
>>> pos = (bisect_left(myList, myNumber))
>>> myList[pos]
...
44

要获得最接近5的值,您可以尝试将列表转换为数组,并使用numpy的argmin这样。

>>> import numpy as np
>>> myNumber = 5   
>>> myList = [1, 3, 4, 44, 88] 
>>> myArray = np.array(myList)
>>> pos = (np.abs(myArray-myNumber)).argmin()
>>> myArray[pos]
...
4

我不知道这会有多快,我的猜测是“不太”。

It’s important to note that Lauritz’s suggestion idea of using bisect does not actually find the closest value in MyList to MyNumber. Instead, bisect finds the next value in order after MyNumber in MyList. So in OP’s case you’d actually get the position of 44 returned instead of the position of 4.

>>> myList = [1, 3, 4, 44, 88] 
>>> myNumber = 5
>>> pos = (bisect_left(myList, myNumber))
>>> myList[pos]
...
44

To get the value that’s closest to 5 you could try converting the list to an array and using argmin from numpy like so.

>>> import numpy as np
>>> myNumber = 5   
>>> myList = [1, 3, 4, 44, 88] 
>>> myArray = np.array(myList)
>>> pos = (np.abs(myArray-myNumber)).argmin()
>>> myArray[pos]
...
4

I don’t know how fast this would be though, my guess would be “not very”.


回答 6

扩展了古斯塔沃·利马(Gustavo Lima)的答案。无需创建全新的列表即可完成相同的操作。随着FOR循环的进行,列表中的值可以用差分代替。

def f_ClosestVal(v_List, v_Number):
"""Takes an unsorted LIST of INTs and RETURNS INDEX of value closest to an INT"""
for _index, i in enumerate(v_List):
    v_List[_index] = abs(v_Number - i)
return v_List.index(min(v_List))

myList = [1, 88, 44, 4, 4, -2, 3]
v_Num = 5
print(f_ClosestVal(myList, v_Num)) ## Gives "3," the index of the first "4" in the list.

Expanding upon Gustavo Lima’s answer. The same thing can be done without creating an entirely new list. The values in the list can be replaced with the differentials as the FOR loop progresses.

def f_ClosestVal(v_List, v_Number):
"""Takes an unsorted LIST of INTs and RETURNS INDEX of value closest to an INT"""
for _index, i in enumerate(v_List):
    v_List[_index] = abs(v_Number - i)
return v_List.index(min(v_List))

myList = [1, 88, 44, 4, 4, -2, 3]
v_Num = 5
print(f_ClosestVal(myList, v_Num)) ## Gives "3," the index of the first "4" in the list.

回答 7

如果我可以补充@Lauritz的答案

为了避免出现运行错误,请不要忘记在该bisect_left行之前添加一个条件:

if (myNumber > myList[-1] or myNumber < myList[0]):
    return False

因此完整的代码如下所示:

from bisect import bisect_left

def takeClosest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.
    If two numbers are equally close, return the smallest number.
    If number is outside of min or max return False
    """
    if (myNumber > myList[-1] or myNumber < myList[0]):
        return False
    pos = bisect_left(myList, myNumber)
    if pos == 0:
            return myList[0]
    if pos == len(myList):
            return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
       return after
    else:
       return before

If I may add to @Lauritz’s answer

In order not to have a run error don’t forget to add a condition before the bisect_left line:

if (myNumber > myList[-1] or myNumber < myList[0]):
    return False

so the full code will look like:

from bisect import bisect_left

def takeClosest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.
    If two numbers are equally close, return the smallest number.
    If number is outside of min or max return False
    """
    if (myNumber > myList[-1] or myNumber < myList[0]):
        return False
    pos = bisect_left(myList, myNumber)
    if pos == 0:
            return myList[0]
    if pos == len(myList):
            return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
       return after
    else:
       return before

声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。