Python: Find in list

Question: Python: Find in list

I have come across this:

item = someSortOfSelection()
if item in myList:
    doMySpecialFunction(item)

but sometimes it does not work with all my items, as if they weren’t recognized in the list (when it’s a list of strings).

Is this the most ‘pythonic’ way of finding an item in a list: if x in l:?


Answer 0

As for your first question: that code is perfectly fine and should work if item equals one of the elements inside myList. Maybe you are trying to find a string that does not exactly match one of the items, or maybe you are using a float value that suffers from inaccuracy.
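
To make those two failure modes concrete, here is a minimal sketch. The sample lists and the choice of a case-insensitive comparison are assumptions for illustration, not part of the original question:

```python
import math

names = ["Apple", "banana", "cherry"]  # sample data, assumed for illustration

# `in` compares with ==, so a different case means no match.
print("apple" in names)                       # False
print("apple" in [n.lower() for n in names])  # True

# Floats: 0.1 + 0.2 is not exactly 0.3, so an exact lookup fails.
values = [0.3, 0.5]
total = 0.1 + 0.2
print(total in values)                             # False
print(any(math.isclose(total, v) for v in values)) # True
```

Whether a tolerant comparison such as math.isclose is appropriate depends on your data; the point is only that in itself always uses exact equality.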

As for your second question: There are actually several possible ways of “finding” things in lists.

Checking if something is inside

This is the use case you describe: Checking whether something is inside a list or not. As you know, you can use the in operator for that:

3 in [1, 2, 3] # => True

Filtering a collection

That is, finding all elements in a sequence that meet a certain condition. You can use a list comprehension or a generator expression for that:

matches = [x for x in lst if fulfills_some_condition(x)]
matches = (x for x in lst if x > 6)

The latter will return a generator, which you can think of as a sort of lazy list that is only built as you iterate through it. By the way, the first one is exactly equivalent to

matches = filter(fulfills_some_condition, lst)

in Python 2. Here you can see higher-order functions at work. In Python 3, filter doesn’t return a list, but a lazy iterator that behaves much like a generator.
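
A small sketch of what that laziness means in Python 3. The predicate is_large is a hypothetical stand-in for fulfills_some_condition:

```python
lst = [3, 8, 1, 9, 7]

def is_large(x):
    # Hypothetical predicate standing in for fulfills_some_condition.
    return x > 6

lazy = filter(is_large, lst)   # nothing is evaluated yet
first_pass = list(lazy)        # consumes the iterator
second_pass = list(lazy)       # the iterator is now exhausted

print(first_pass)    # [8, 9, 7]
print(second_pass)   # []
```

If you need to go over the matches more than once, wrap the call in list(...) or use the list comprehension form.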

Finding the first occurrence

If you only want the first thing that matches a condition (but you don’t know what it is yet), it’s fine to use a for loop (possibly using the else clause as well, which is not really well-known). You can also use

next(x for x in lst if ...)

which will return the first match or raise a StopIteration if none is found. Alternatively, you can use

next((x for x in lst if ...), [default value])
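
For comparison, the for loop with an else clause mentioned above could look roughly like this. The helper name find_first and its parameters are made up for the example:

```python
def find_first(lst, predicate, default=None):
    """Return the first element satisfying predicate, or default."""
    for x in lst:
        if predicate(x):
            result = x
            break
    else:
        # The else branch runs only when the loop finished without a break,
        # that is, when no element matched.
        result = default
    return result

print(find_first([1, 4, 7, 10], lambda x: x > 5))  # 7
print(find_first([1, 4], lambda x: x > 5))         # None
```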

Finding the location of an item

For lists, there’s also the index method that can sometimes be useful if you want to know where a certain element is in the list:

[1,2,3].index(2) # => 1
[1,2,3].index(4) # => ValueError

However, note that if you have duplicates, .index always returns the lowest index:

[1,2,3,2].index(2) # => 1

If there are duplicates and you want all the indexes then you can use enumerate() instead:

[i for i,x in enumerate([1,2,3,2]) if x==2] # => [1, 3]

Answer 1

If you want to find one element or get None, pass a default to next; it won’t raise StopIteration if the item is not found in the list:

first_or_default = next((x for x in lst if ...), None)

Answer 2

While the answer from Niklas B. is pretty comprehensive, when we want to find an item in a list it is sometimes useful to get its index:

next((i for i, x in enumerate(lst) if [condition on x]), [default value])
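
With the placeholders filled in, this might look as follows. The sample list, the length condition and the choice of None as the default are assumptions for illustration:

```python
lst = ["pear", "fig", "banana", "kiwi"]

# Index of the first string longer than four characters, or None.
idx = next((i for i, x in enumerate(lst) if len(x) > 4), None)
print(idx)  # 2

# No element starts with "z", so the default is returned.
missing = next((i for i, x in enumerate(lst) if x.startswith("z")), None)
print(missing)  # None
```

None is used as the default here rather than -1, because -1 is itself a valid list index in Python and could silently select the last element.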

Answer 3

Finding the first occurrence

There’s a recipe for that in the itertools documentation:

def first_true(iterable, default=False, pred=None):
    """Returns the first true value in the iterable.

    If no true value is found, returns *default*

    If *pred* is not None, returns the first item
    for which pred(item) is true.

    """
    # first_true([a,b,c], x) --> a or b or c or x
    # first_true([a,b], x, f) --> a if f(a) else b if f(b) else x
    return next(filter(pred, iterable), default)

For example, the following code finds the first odd number in a list:

>>> first_true([2,3,4,5], None, lambda x: x%2==1)
3  

Answer 4

Another alternative: you can check if an item is in a list with if item in list:, but this is O(n). If you are dealing with big lists of items and all you need to know is whether something is a member of your list, you can convert the list to a set first and take advantage of constant-time set lookup:

my_set = set(my_list)
if item in my_set:  # much faster on average than using a list
    pass  # do something

This is not going to be the correct solution in every case, but for some cases it might give you better performance.

Note that creating the set with set(my_list) is also O(n), so if you only need to do this once then it isn’t any faster to do it this way. If you need to repeatedly check membership though, then this will be O(1) for every lookup after that initial set creation.
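
A rough sketch of the repeated-lookup case. The data and the list of queries are invented for the example, and the approach assumes the items are hashable:

```python
my_list = list(range(100_000))       # sample data, assumed for illustration
queries = [5, 99_999, 250_000, -1]   # values we want to look up repeatedly

my_set = set(my_list)                # O(n), paid once

# Each membership test below is O(1) on average.
found = [q for q in queries if q in my_set]
print(found)  # [5, 99999]
```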


Answer 5

You may want to use one of two possible searches when working with a list of strings:

  1. Check whether a list element is equal to the item ('example' is in ['one', 'example', 'two']):

    if item in your_list: some_function_on_true()

    'ex' in ['one', 'ex', 'two'] => True

    'ex_1' in ['one', 'ex', 'two'] => False

  2. Check whether a list element is like the item, that is, one is a substring of the other ('ex' matches 'example' in ['one', 'example', 'two'], or 'example_1' matches 'example' in ['one', 'example', 'two']):

    matches = [el for el in your_list if item in el]

    or

    matches = [el for el in your_list if el in item]

    Then just check len(matches), or read the matches if needed.
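
Both kinds of search, run on the sample list from this answer. The variable names are only for illustration:

```python
your_list = ["one", "example", "two"]

# 1. Exact match: the item must equal a whole element.
print("example" in your_list)  # True
print("ex" in your_list)       # False

# 2a. Elements that contain the item as a substring.
contains_item = [el for el in your_list if "ex" in el]
print(contains_item)           # ['example']

# 2b. Elements that are themselves a substring of the item.
inside_item = [el for el in your_list if el in "example_1"]
print(inside_item)             # ['example']

# If only a yes/no answer is needed, any() stops at the first hit.
print(any("ex" in el for el in your_list))  # True
```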


Answer 6

Definition and Usage

The count() method returns the number of elements with the specified value.

Syntax

list.count(value)

Example:

fruits = ['apple', 'banana', 'cherry']

x = fruits.count("cherry")

Question’s example:

item = someSortOfSelection()
if myList.count(item) >= 1:
    doMySpecialFunction(item)

Answer 7

Instead of using list.index(x), which returns the index of x if it is found in the list or raises a ValueError if x is not found, you could use list.count(x), which returns the number of occurrences of x in the list (confirming that x is indeed in the list) or 0 otherwise (in the absence of x). The cool thing about count() is that it doesn’t break your code or require you to handle an exception when x is not found.
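
For reference, here is a sketch that puts the three options side by side. The helper safe_index is hypothetical, written only for this example. Note that count() always scans the whole list, while in stops at the first match:

```python
my_list = ["a", "b", "a"]

print("a" in my_list)       # True
print(my_list.count("a"))   # 2
print(my_list.count("z"))   # 0

def safe_index(items, value, default=None):
    """Hypothetical helper: like list.index, but returns default if absent."""
    try:
        return items.index(value)
    except ValueError:
        return default

print(safe_index(my_list, "b"))  # 1
print(safe_index(my_list, "z"))  # None
```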


Answer 8

If you are going to check whether a value exists in the collection only once, then using the ‘in’ operator is fine. However, if you are going to check more than once, then I recommend using the bisect module. Keep in mind that the data must be sorted for the bisect module to work, so you sort the data once and then you can use bisect. On my machine, using the bisect module is about 12 times faster than using the ‘in’ operator.

Here is a code example using syntax from Python 3.8 and above:

import bisect
from timeit import timeit

def bisect_search(container, value):
    return (
      (index := bisect.bisect_left(container, value)) < len(container) 
      and container[index] == value
    )

data = list(range(1000))
# value to search
true_value = 666
false_value = 66666

# times to test
ttt = 1000

print(f"{bisect_search(data, true_value)=} {bisect_search(data, false_value)=}")

t1 = timeit(lambda: true_value in data, number=ttt)
t2 = timeit(lambda: bisect_search(data, true_value), number=ttt)

print("Performance:", f"{t1=:.4f}, {t2=:.4f}, diffs {t1/t2=:.2f}")

Output:

bisect_search(data, true_value)=True bisect_search(data, false_value)=False
Performance: t1=0.0220, t2=0.0019, diffs t1/t2=11.71
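
The benchmark above starts from data that is already sorted. For an unsorted list, the sort-once step could be sketched like this; the sample data and the name contains_sorted are assumptions for illustration:

```python
import bisect

def contains_sorted(sorted_items, value):
    """Binary search for value; sorted_items must already be sorted."""
    i = bisect.bisect_left(sorted_items, value)
    return i < len(sorted_items) and sorted_items[i] == value

data = [42, 7, 19, 3, 88]    # unsorted sample data
sorted_data = sorted(data)   # O(n log n), done once

print(contains_sorted(sorted_data, 19))  # True
print(contains_sorted(sorted_data, 20))  # False
```

Running bisect on the unsorted list would not raise an error; it could simply give a wrong answer, so the sort step is not optional.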

Answer 9

Check that there is no additional or unwanted whitespace in the items of the list of strings. That can be a reason why the items cannot be found.
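
A quick way to check for this, with sample data assumed for illustration, is to print the repr() of the items and then compare against stripped copies:

```python
my_list = ["apple\n", " banana", "cherry"]  # e.g. lines read from a file
item = "apple"

print(item in my_list)    # False
print(repr(my_list[0]))   # 'apple\n' -- repr makes the newline visible

cleaned = [s.strip() for s in my_list]
print(item in cleaned)    # True
```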