Get the first item from an iterable that matches a condition

Question: Get the first item from an iterable that matches a condition

I would like to get the first item from a list matching a condition. It’s important that the resulting method not process the entire list, which could be quite large. For example, the following function is adequate:

def first(the_iterable, condition = lambda x: True):
    for i in the_iterable:
        if condition(i):
            return i

This function could be used like this:

>>> first(range(10))
0
>>> first(range(10), lambda i: i > 3)
4

However, I can’t think of a good built-in / one-liner to let me do this. I don’t particularly want to copy this function around if I don’t have to. Is there a built-in way to get the first item matching a condition?


Answer 0

In Python 2.6 or newer:

If you want StopIteration to be raised if no matching element is found:

next(x for x in the_iterable if x > 3)

If you want default_value (e.g. None) to be returned instead:

next((x for x in the_iterable if x > 3), default_value)

Note that you need an extra pair of parentheses around the generator expression in this case: they are needed whenever the generator expression isn't the only argument.
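
A minimal sketch of both forms side by side, written for Python 3. The sample list `numbers` and the thresholds are illustrative assumptions, not part of the original answer.

```python
numbers = [1, 2, 5, 8, 13]  # hypothetical sample data

# Form 1: raises StopIteration when nothing matches.
first_big = next(x for x in numbers if x > 3)

# Form 2: returns the default when nothing matches.
# The generator expression needs its own parentheses here
# because it is no longer the only argument to next().
no_match = next((x for x in numbers if x > 100), None)

# Without a default, an unmatched search raises StopIteration.
try:
    next(x for x in numbers if x > 100)
    raised = False
except StopIteration:
    raised = True

print(first_big, no_match, raised)
```

If `None` can itself be a legitimate match, pick a default that cannot occur in the data.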

I see most answers resolutely ignore the next built-in and so I assume that for some mysterious reason they’re 100% focused on versions 2.5 and older — without mentioning the Python-version issue (but then I don’t see that mention in the answers that do mention the next built-in, which is why I thought it necessary to provide an answer myself — at least the “correct version” issue gets on record this way;-).

In 2.5, the .next() method of iterators immediately raises StopIteration if the iterator immediately finishes, i.e., for your use case, if no item in the iterable satisfies the condition. If you don't care (i.e., you know there must be at least one satisfactory item) then just use .next() (best on a genexp, as with the next built-in in Python 2.6 and later).

If you do care, wrapping things in a function as you had first indicated in your Q seems best, and while the function implementation you proposed is just fine, you could alternatively use itertools, a for...: break loop, or a genexp, or a try/except StopIteration as the function’s body, as various answers suggested. There’s not much added value in any of these alternatives so I’d go for the starkly-simple version you first proposed.


Answer 1

As a reusable, documented and tested function

def first(iterable, condition = lambda x: True):
    """
    Returns the first item in the `iterable` that
    satisfies the `condition`.

    If the condition is not given, returns the first item of
    the iterable.

    Raises `StopIteration` if no item satisfying the condition is found.

    >>> first( (1,2,3), condition=lambda x: x % 2 == 0)
    2
    >>> first(range(3, 100))
    3
    >>> first( () )
    Traceback (most recent call last):
    ...
    StopIteration
    """

    return next(x for x in iterable if condition(x))

Version with default argument

@zorf suggested a version of this function where you can have a predefined return value if the iterable is empty or has no items matching the condition:

def first(iterable, default = None, condition = lambda x: True):
    """
    Returns the first item in the `iterable` that
    satisfies the `condition`.

    If the condition is not given, returns the first item of
    the iterable.

    If the `default` argument is given and the iterable is empty,
    or if it has no items matching the condition, the `default` argument
    is returned if it matches the condition.

    The `default` argument being None is the same as it not being given.

    Raises `StopIteration` if no item satisfying the condition is found
    and default is not given or doesn't satisfy the condition.

    >>> first( (1,2,3), condition=lambda x: x % 2 == 0)
    2
    >>> first(range(3, 100))
    3
    >>> first( () )
    Traceback (most recent call last):
    ...
    StopIteration
    >>> first([], default=1)
    1
    >>> first([], default=1, condition=lambda x: x % 2 == 0)
    Traceback (most recent call last):
    ...
    StopIteration
    >>> first([1,3,5], default=1, condition=lambda x: x % 2 == 0)
    Traceback (most recent call last):
    ...
    StopIteration
    """

    try:
        return next(x for x in iterable if condition(x))
    except StopIteration:
        if default is not None and condition(default):
            return default
        else:
            raise
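
One limitation of the version above is that `default=None` cannot be told apart from "no default given". A possible variation, sketched here under stated assumptions, uses a private sentinel object instead. The names `_MISSING` and `first_or_default` are hypothetical, and this sketch deliberately differs from the function above: it does not test the default against the condition, and it raises `ValueError` rather than `StopIteration`.

```python
_MISSING = object()  # hypothetical sentinel meaning "no default supplied"


def first_or_default(iterable, condition=lambda x: True, default=_MISSING):
    """Return the first item of `iterable` satisfying `condition`.

    If nothing matches, return `default` when one was supplied
    (even if it is None); otherwise raise ValueError.
    """
    for item in iterable:
        if condition(item):
            return item  # stop at the first match; the rest is never visited
    if default is _MISSING:
        raise ValueError("no item satisfies the condition")
    return default


print(first_or_default([1, 2, 3], lambda x: x % 2 == 0))
print(first_or_default([], default=None))
```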

Answer 2

Damn Exceptions!

I love this answer. However, since next() raises a StopIteration exception when there are no items, I would use the following snippet to avoid an exception:

a = []
item = next((x for x in a), None)

For example,

a = []
item = next(x for x in a)

will raise a StopIteration exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Answer 3

Similar to using ifilter, you could use a generator expression:

>>> (x for x in xrange(10) if x > 5).next()
6

In either case, you probably want to catch StopIteration though, in case no elements satisfy your condition.

Technically speaking, I suppose you could do something like this:

>>> foo = None
>>> for foo in (x for x in xrange(10) if x > 5): break
... 
>>> foo
6

It would avoid having to write a try/except block, but it seems like an obscure abuse of the syntax.
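
For reference, the same loop-and-break idea can be sketched in Python 3 with the `else` clause of `for`, which runs only when the loop finishes without hitting `break`. The function name `first_over` is a hypothetical choice for this illustration.

```python
def first_over(values, threshold):
    """Return the first value greater than `threshold`, or None if there is none."""
    for value in values:
        if value > threshold:
            break  # stop at the first match; later items are never visited
    else:
        # Reached only when the loop ran to completion (or never ran at all).
        value = None
    return value


print(first_over(range(10), 5))
print(first_over(range(10), 50))
```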


Answer 4

The most efficient way in Python 3 is one of the following (using a similar example):

With “comprehension” style:

next(i for i in range(100000000) if i == 1000)

WARNING: The expression also works in Python 2, but the example uses range, which returns a lazy iterable in Python 3 and a full list in Python 2 (in Python 2, use xrange instead to get a lazy iterable).

Note that the expression uses a generator expression rather than a list comprehension ([i for ...]). A list comprehension would build a list of all matching elements before the first one could be taken, processing the entire range instead of stopping the iteration once i == 1000.

With “functional” style:

next(filter(lambda i: i == 1000, range(100000000)))

WARNING: This doesn't work in Python 2, even if you replace range with xrange, because filter there creates a list instead of an iterator (inefficient), and the next function only works with iterators.

Default value

As mentioned in other responses, you must pass an extra argument to next if you want to avoid an exception when the condition is not fulfilled.

“functional” style:

next(filter(lambda i: i == 1000, range(100000000)), False)

“comprehension” style:

With this style you need to surround the comprehension expression with () to avoid a SyntaxError: Generator expression must be parenthesized if not sole argument:

next((i for i in range(100000000) if i == 1000), False)
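
To make the difference between the lazy and the eager form concrete, here is a small sketch that counts how many elements the predicate is actually asked about. The helper `is_target` and the list `calls` are hypothetical names introduced only for this illustration, and the range is kept small so the eager variant stays cheap.

```python
calls = []


def is_target(i):
    calls.append(i)  # record every element the predicate is asked about
    return i == 3


# Lazy: the generator expression stops as soon as next() has its answer.
lazy_result = next(i for i in range(10) if is_target(i))
lazy_calls = len(calls)

calls.clear()

# Eager: the list comprehension filters the whole range first.
# iter() is needed because next() does not accept a plain list.
eager_result = next(iter([i for i in range(10) if is_target(i)]))
eager_calls = len(calls)

print(lazy_result, lazy_calls)
print(eager_result, eager_calls)
```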

Answer 5

I would write this:

next(x for x in xrange(10) if x > 3)

Answer 6

The itertools module contains a filter function for iterators, ifilter (Python 2 only; in Python 3 the built-in filter is already lazy). The first element of the filtered iterator can be obtained by calling next() on it:

from itertools import ifilter

print ifilter((lambda i: i > 3), range(10)).next()

Answer 7

For older versions of Python where the next built-in doesn’t exist:

(x for x in range(10) if x > 3).next()

Answer 8

By using

(index for index, value in enumerate(the_iterable) if condition(value))

one can test the condition on each item in the_iterable in turn and obtain the index of the first match, without evaluating all of the items in the_iterable.

The complete expression to use is

first_index = next(index for index, value in enumerate(the_iterable) if condition(value))

Here first_index is the index of the first value that satisfies the condition.
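
A minimal sketch of this expression wrapped in a function, with a fallback when nothing matches. The function name `first_index` and the default of -1 are assumptions made for this illustration.

```python
def first_index(the_iterable, condition, default=-1):
    """Return the index of the first item satisfying `condition`, else `default`."""
    return next(
        (index for index, value in enumerate(the_iterable) if condition(value)),
        default,
    )


print(first_index("helloworld", lambda c: c == "l"))
print(first_index([1, 3, 5], lambda x: x % 2 == 0))
```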


Answer 9

This question already has great answers. I’m only adding my two cents because I landed here trying to find a solution to my own problem, which is very similar to the OP.

If you want to find the INDEX of the first item matching a criterion using generators, you can simply do:

next(index for index, value in enumerate(iterable) if condition(value))

Answer 10

You could also use the argwhere function in NumPy. Note that it evaluates the condition over the whole array, so it does not stop at the first match. For example:

i) Find the first “l” in “helloworld”:

import numpy as np
l = list("helloworld") # Create list
i = np.argwhere(np.array(l)=="l") # i = array([[2],[3],[8]])
index_of_first = i.min()

ii) Find first random number > 0.1

import numpy as np
r = np.random.rand(50) # Create random numbers
i = np.argwhere(r>0.1)
index_of_first = i.min()

iii) Find the last random number > 0.1

import numpy as np
r = np.random.rand(50) # Create random numbers
i = np.argwhere(r>0.1)
index_of_last = i.max()

Answer 11

In Python 3:

a = (None, False, 0, 1)
assert next(filter(None, a)) == 1

In Python 2.6:

a = (None, False, 0, 1)
assert next(iter(filter(None, a))) == 1

EDIT: I thought it was obvious, but apparently not: instead of None you can pass a function (or a lambda) with a check for the condition:

a = [2,3,4,5,6,7,8]
assert next(filter(lambda x: x%2, a)) == 3

Answer 12

One-liner:

thefirst = [i for i in range(10) if i > 3][0]

If you're not sure that any element satisfies the criteria, wrap this in try/except, since [0] can raise an IndexError. Note that this builds the entire filtered list before taking its first element.
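
A short sketch of that try/except wrapping, assuming `None` is an acceptable "not found" result. The function name `first_via_list` is hypothetical.

```python
def first_via_list(values, condition):
    """Take element [0] of the filtered list, or None if the list is empty.

    The whole input is filtered before the first element is taken.
    """
    try:
        return [v for v in values if condition(v)][0]
    except IndexError:
        return None


print(first_via_list(range(10), lambda i: i > 3))
print(first_via_list(range(3), lambda i: i > 3))
```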