Tag archive: iterator

Python list iterator behavior and next(iterator)

Question: Python list iterator behavior and next(iterator)

Consider:

>>> lst = iter([1,2,3])
>>> next(lst)
1
>>> next(lst)
2

So, advancing the iterator is, as expected, handled by mutating that same object.

This being the case, I would expect:

a = iter(list(range(10)))
for i in a:
   print(i)
   next(a)

to skip every second element: the call to next should advance the iterator once, then the implicit call made by the loop should advance it a second time – and the result of this second call would be assigned to i.

It doesn’t. The loop prints all of the items in the list, without skipping any.

My first thought was that this might happen because the loop calls iter on what it is passed, and this might give an independent iterator – this isn’t the case, as we have iter(a) is a.

So, why does next not appear to advance the iterator in this case?


Answer 0

What you see is the interpreter echoing back the return value of next() in addition to i being printed each iteration:

>>> a = iter(list(range(10)))
>>> for i in a:
...    print(i)
...    next(a)
... 
0
1
2
3
4
5
6
7
8
9

So 0 is the output of print(i), 1 the return value from next(), echoed by the interactive interpreter, etc. There are just 5 iterations, each iteration resulting in 2 lines being written to the terminal.

If you assign the output of next() things work as expected:

>>> a = iter(list(range(10)))
>>> for i in a:
...    print(i)
...    _ = next(a)
... 
0
2
4
6
8

or print extra information to differentiate the print() output from the interactive interpreter echo:

>>> a = iter(list(range(10)))
>>> for i in a:
...    print('Printing: {}'.format(i))
...    next(a)
... 
Printing: 0
1
Printing: 2
3
Printing: 4
5
Printing: 6
7
Printing: 8
9

In other words, next() is working as expected, but because it returns the next value from the iterator, echoed by the interactive interpreter, you are led to believe that the loop has its own iterator copy somehow.
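To see the same effect without the interactive echo getting in the way, a small sketch like the following collects the loop variable's values in a list instead of printing them. The helper name skip_every_second is made up for illustration, and the None default passed to next() is an addition so that odd-length inputs do not raise StopIteration inside the loop.

```python
def skip_every_second(values):
    """Return the values seen by the loop variable when the body also calls next()."""
    it = iter(values)
    seen = []
    for i in it:
        seen.append(i)   # what print(i) would have shown
        next(it, None)   # advance once more; nothing is echoed in a script
    return seen


print(skip_every_second(range(10)))  # [0, 2, 4, 6, 8]
```

Run as a script, only the print() call writes to the terminal, so the skipping is visible directly.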


Answer 1

What is happening is that next(a) returns the next value of a, which is echoed to the console because it is not assigned to anything.

What you can do is assign this value to a variable:

>>> a = iter(list(range(10)))
>>> for i in a:
...    print(i)
...    b=next(a)
...
0
2
4
6
8

Answer 2

I find the existing answers a little confusing, because they only indirectly indicate the essential mystifying thing in the code example: both the “print i” and the “next(a)” are causing their results to be printed.

Since they’re printing alternating elements of the original sequence, and it’s unexpected that the “next(a)” statement is printing, it appears as if the “print i” statement is printing all the values.

In that light, it becomes clearer that assigning the result of “next(a)” to a variable inhibits the printing of its result, so that just the alternate values of the “i” loop variable are printed. Similarly, making the “print” statement emit something more distinctive disambiguates it as well.

(One of the existing answers refutes the others because that answer is having the example code evaluated as a block, so that the interpreter is not reporting the intermediate values for “next(a)”.)

The beguiling thing in answering questions, in general, is being explicit about what is obvious once you know the answer. It can be elusive. Likewise critiquing answers once you understand them. It’s interesting…


Answer 3

Something is wrong with your Python/Computer.

a = iter(list(range(10)))
for i in a:
   print(i)
   next(a)

>>> 
0
2
4
6
8

Works as expected.

Tested in Python 2.7 and in Python 3+. Works properly in both.


Answer 4

For those who still do not understand.

>>> a = iter(list(range(10)))
>>> for i in a:
...    print(i)
...    next(a)
... 
0 # print(i) printed this
1 # next(a) printed this
2 # print(i) printed this
3 # next(a) printed this
4 # print(i) printed this
5 # next(a) printed this
6 # print(i) printed this
7 # next(a) printed this
8 # print(i) printed this
9 # next(a) printed this

As others have already said, next advances the iterator by 1 as expected. Assigning its returned value to a variable doesn’t magically change its behaviour.


Answer 5

It behaves the way you want if called as a function:

>>> def test():
...     a = iter(list(range(10)))
...     for i in a:
...         print(i)
...         next(a)
... 
>>> test()
0
2
4
6
8

How to pick just one item from a generator (in Python)?

Question: How to pick just one item from a generator (in Python)?

I have a generator function like the following:

def myfunct():
  ...
  yield result

The usual way to call this function would be:

for r in myfunct():
  dostuff(r)

My question, is there a way to get just one element from the generator whenever I like? For example, I’d like to do something like:

while True:
  ...
  if something:
      my_element = pick_just_one_element(myfunct())
      dostuff(my_element)
  ...

Answer 0

Create a generator using

g = myfunct()

Every time you would like an item, use

next(g)

(or g.next() in Python 2.5 or below).

If the generator exits, it will raise StopIteration. You can either catch this exception if necessary, or use the default argument to next():

next(g, default_value)

Answer 1

To pick just one element of a generator, use break in a for statement, or list(itertools.islice(gen, 1)).

According to your example (literally) you can do something like:

while True:
  ...
  if something:
      for my_element in myfunct():
          dostuff(my_element)
          break
      else:
          do_generator_empty()

If you want to “get just one element from the [once generated] generator whenever I like” (I suppose there is a 50% chance that’s the original intention, and it is the most common intention) then:

gen = myfunct()
while True:
  ...
  if something:
      for my_element in gen:
          dostuff(my_element)
          break
      else:
          do_generator_empty()

This way explicit use of generator.next() can be avoided, and end-of-input handling doesn’t require (cryptic) StopIteration exception handling or extra default value comparisons.

The else: section of the for statement is only needed if you want to do something special when the generator is exhausted.

Note on next() / .next():

In Python 3 the .next() method was renamed to .__next__() for good reason: it’s considered low-level (PEP 3114). Before Python 2.6 the builtin function next() did not exist. It was even discussed whether to move next() to the operator module (which would have been wise), because it is rarely needed and inflates the builtin namespace.

Using next() without a default is still very low-level practice: it throws the cryptic StopIteration like a bolt out of the blue in normal application code. And using next() with a default sentinel (which ideally would be the only form of next() available directly in builtins) is limited and often leads to odd, non-pythonic logic and poor readability.

Bottom line: using next() should be very rare, like using functions of the operator module. Using for x in iterator, islice, list(iterator) and other functions that accept an iterator is the natural way of using iterators at the application level, and is almost always possible. next() is low-level, an extra concept, and unobvious, as the question of this thread shows, whereas using break in a for loop is conventional.
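The islice alternative mentioned at the top of this answer can be sketched as follows. The numbers generator is invented for the example; the point is only that each islice(gen, 1) call takes at most one item from the same generator and yields an empty list at the end instead of raising.

```python
from itertools import islice


def numbers():
    # Hypothetical generator used only for this illustration.
    yield from (10, 20, 30)


gen = numbers()
first = list(islice(gen, 1))      # [10]
second = list(islice(gen, 1))     # [20], the same generator keeps its position
rest = list(gen)                  # [30]
after_end = list(islice(gen, 1))  # [] once the generator is exhausted
```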


Answer 2

I don’t believe there’s a convenient way to retrieve an arbitrary value from a generator. The generator will provide a next() method to traverse itself, but the full sequence is not produced immediately to save memory. That’s the functional difference between a generator and a list.


Answer 3

For those of you scanning through these answers for a complete working example for Python3… well here ya go:

def numgen():
    x = 1000
    while True:
        x += 1
        yield x

nums = numgen() # because it must be the _same_ generator

for n in range(3):
    numnext = next(nums)
    print(numnext)

This outputs:

1001
1002
1003

Answer 4

A generator is a function that produces an iterator. Therefore, once you have an iterator instance, use next() to fetch the next item from the iterator. As an example, use the next() function to fetch the first item, and later use for in to process the remaining items:

# create new instance of iterator by calling a generator function
items = generator_function()

# fetch and print first item
first = next(items)
print('first item:', first)

# process remaining items:
for item in items:
    print('next item:', item)

Answer 5

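generator = myfunct()
while True:
    try:
        my_element = next(generator)  # generator.next() in Python 2.5 and older
    except StopIteration:
        break  # raised after the last element is taken

Make sure to catch the StopIteration exception raised after the last element is taken.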


Answer 6

I believe the only way is to get a list from the iterator then get the element you want from that list.

l = list(myfunct())
l[4]

Fastest way to convert an iterator to a list

Question: Fastest way to convert an iterator to a list

Having an iterator object, is there something faster, better or more correct than a list comprehension to get a list of the objects returned by the iterator?

user_list = [user for user in user_iterator]

Answer 0

list(your_iterator)

Answer 1

Since Python 3.5 you can use the * iterable unpacking operator:

user_list = [*your_iterator]

but the Pythonic way to do it is:

user_list = list(your_iterator)
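For comparison, here is a small sketch showing that the three spellings produce the same list, and that any of them consumes the iterator. The source data is made up for the example.

```python
source = ["ann", "bob", "cy"]  # made-up data

by_list = list(iter(source))
by_unpack = [*iter(source)]
by_comprehension = [user for user in iter(source)]

it = iter(source)
first_pass = list(it)   # consumes the iterator
second_pass = list(it)  # [] because nothing is left
```

The second point is the practical one: after converting, keep the list and do not expect the iterator to produce the items again.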

Get the first item from an iterable that matches a condition

Question: Get the first item from an iterable that matches a condition

I would like to get the first item from a list matching a condition. It’s important that the resulting method not process the entire list, which could be quite large. For example, the following function is adequate:

def first(the_iterable, condition = lambda x: True):
    for i in the_iterable:
        if condition(i):
            return i

This function could be used something like this:

>>> first(range(10))
0
>>> first(range(10), lambda i: i > 3)
4

However, I can’t think of a good built-in / one-liner to let me do this. I don’t particularly want to copy this function around if I don’t have to. Is there a built-in way to get the first item matching a condition?


Answer 0

In Python 2.6 or newer:

If you want StopIteration to be raised if no matching element is found:

next(x for x in the_iterable if x > 3)

If you want default_value (e.g. None) to be returned instead:

next((x for x in the_iterable if x > 3), default_value)

Note that you need an extra pair of parentheses around the generator expression in this case − they are needed whenever the generator expression isn’t the only argument.

I see most answers resolutely ignore the next built-in and so I assume that for some mysterious reason they’re 100% focused on versions 2.5 and older — without mentioning the Python-version issue (but then I don’t see that mention in the answers that do mention the next built-in, which is why I thought it necessary to provide an answer myself — at least the “correct version” issue gets on record this way;-).

In 2.5, the .next() method of iterators immediately raises StopIteration if the iterator immediately finishes, i.e., for your use case, if no item in the iterable satisfies the condition. If you don’t care (i.e., you know there must be at least one satisfactory item) then just use .next() (best on a genexp, as with the next built-in in Python 2.6 and better).

If you do care, wrapping things in a function as you had first indicated in your Q seems best, and while the function implementation you proposed is just fine, you could alternatively use itertools, a for...: break loop, or a genexp, or a try/except StopIteration as the function’s body, as various answers suggested. There’s not much added value in any of these alternatives so I’d go for the starkly-simple version you first proposed.
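To illustrate that the generator-expression form really stops at the first match, the sketch below records which values the predicate was asked about. The names first_match, over_three and checked are invented for this example, and it assumes Python 3, where range is lazy.

```python
def first_match(iterable, predicate, default=None):
    """Return the first item satisfying predicate, or default if none does."""
    return next((x for x in iterable if predicate(x)), default)


checked = []  # records every value the predicate was asked about


def over_three(x):
    checked.append(x)
    return x > 3


result = first_match(range(10**9), over_three)
print(result, checked)  # 4 [0, 1, 2, 3, 4]
```

Only five values are examined even though the range nominally holds a billion items.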


Answer 1

As a reusable, documented and tested function

def first(iterable, condition = lambda x: True):
    """
    Returns the first item in the `iterable` that
    satisfies the `condition`.

    If the condition is not given, returns the first item of
    the iterable.

    Raises `StopIteration` if no item satisfying the condition is found.

    >>> first( (1,2,3), condition=lambda x: x % 2 == 0)
    2
    >>> first(range(3, 100))
    3
    >>> first( () )
    Traceback (most recent call last):
    ...
    StopIteration
    """

    return next(x for x in iterable if condition(x))

Version with default argument

@zorf suggested a version of this function where you can have a predefined return value if the iterable is empty or has no items matching the condition:

def first(iterable, default = None, condition = lambda x: True):
    """
    Returns the first item in the `iterable` that
    satisfies the `condition`.

    If the condition is not given, returns the first item of
    the iterable.

    If the `default` argument is given and the iterable is empty,
    or if it has no items matching the condition, the `default` argument
    is returned if it matches the condition.

    The `default` argument being None is the same as it not being given.

    Raises `StopIteration` if no item satisfying the condition is found
    and default is not given or doesn't satisfy the condition.

    >>> first( (1,2,3), condition=lambda x: x % 2 == 0)
    2
    >>> first(range(3, 100))
    3
    >>> first( () )
    Traceback (most recent call last):
    ...
    StopIteration
    >>> first([], default=1)
    1
    >>> first([], default=1, condition=lambda x: x % 2 == 0)
    Traceback (most recent call last):
    ...
    StopIteration
    >>> first([1,3,5], default=1, condition=lambda x: x % 2 == 0)
    Traceback (most recent call last):
    ...
    StopIteration
    """

    try:
        return next(x for x in iterable if condition(x))
    except StopIteration:
        if default is not None and condition(default):
            return default
        else:
            raise

Answer 2

Damn Exceptions!

I love this answer. However, since next() raises a StopIteration exception when there are no items, I would use the following snippet to avoid an exception:

a = []
item = next((x for x in a), None)

For example,

a = []
item = next(x for x in a)

will raise a StopIteration exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Answer 3

Similar to using ifilter, you could use a generator expression:

>>> (x for x in xrange(10) if x > 5).next()
6

In either case, you probably want to catch StopIteration though, in case no elements satisfy your condition.

Technically speaking, I suppose you could do something like this:

>>> foo = None
>>> for foo in (x for x in xrange(10) if x > 5): break
... 
>>> foo
6

It would avoid having to make a try/except block. But that seems kind of obscure and abusive to the syntax.


Answer 4

The most efficient way in Python 3 is one of the following (using a similar example):

With “comprehension” style:

next(i for i in range(100000000) if i == 1000)

WARNING: The expression also works in Python 2, but the example uses range, which returns a lazy iterable object in Python 3 instead of a list as in Python 2 (if you want a lazy iterable in Python 2, use xrange instead).

Note that the expression avoids building a list inside the call, as in next([i for ...]): that would first build a list of every matching element, processing the entire input instead of stopping once i == 1000 (and next() would then raise TypeError, because a list is not an iterator).

With “functional” style:

next(filter(lambda i: i == 1000, range(100000000)))

WARNING: This doesn’t work in Python 2, even when replacing range with xrange, because there filter creates a list instead of an iterator (inefficient), and the next function only works with iterators.

Default value

As mentioned in other responses, you must pass an extra argument to next if you want to avoid an exception being raised when the condition is not fulfilled.

“functional” style:

next(filter(lambda i: i == 1000, range(100000000)), False)

“comprehension” style:

With this style you need to surround the comprehension expression with () to avoid a SyntaxError: Generator expression must be parenthesized if not sole argument:

next((i for i in range(100000000) if i == 1000), False)

Answer 5

I would write this

next(x for x in xrange(10) if x > 3)

Answer 6

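The itertools module contains a filter function for iterators (ifilter, in Python 2; in Python 3 the built-in filter is already lazy and ifilter no longer exists). The first element of the filtered iterator can be obtained by calling next() on it: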

from itertools import ifilter

print ifilter((lambda i: i > 3), range(10)).next()

Answer 7

For older versions of Python where the next built-in doesn’t exist:

(x for x in range(10) if x > 3).next()

Answer 8

By using

(index for index, value in enumerate(the_iterable) if condition(value))

one can check the condition of the value of the first item in the_iterable, and obtain its index without the need to evaluate all of the items in the_iterable.

The complete expression to use is

first_index = next(index for index, value in enumerate(the_iterable) if condition(value))

Here first_index is assigned the index of the first item that satisfies the condition in the expression discussed above.
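A concrete instance of this expression, as a sketch with made-up data: the condition is "word longer than five characters", and the -1 default is an assumption added so that a missing match returns a value instead of raising StopIteration.

```python
words = ["apple", "kiwi", "banana"]  # made-up data

first_long = next((i for i, w in enumerate(words) if len(w) > 5), -1)
no_match = next((i for i, w in enumerate(words) if len(w) > 10), -1)

print(first_long, no_match)  # 2 -1
```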


Answer 9

This question already has great answers. I’m only adding my two cents because I landed here trying to find a solution to my own problem, which is very similar to the OP.

If you want to find the INDEX of the first item matching a criterion using generators, you can simply do (where condition stands for an expression that tests value):

next(index for index, value in enumerate(iterable) if condition)

Answer 10

You could also use the argwhere function in Numpy. For example:

i) Find the first “l” in “helloworld”:

import numpy as np
l = list("helloworld") # Create list
i = np.argwhere(np.array(l)=="l") # i = array([[2],[3],[8]])
index_of_first = i.min()

ii) Find first random number > 0.1

import numpy as np
r = np.random.rand(50) # Create random numbers
i = np.argwhere(r>0.1)
index_of_first = i.min()

iii) Find the last random number > 0.1

import numpy as np
r = np.random.rand(50) # Create random numbers
i = np.argwhere(r>0.1)
index_of_last = i.max()

Answer 11

In Python 3:

a = (None, False, 0, 1)
assert next(filter(None, a)) == 1

In Python 2.6:

a = (None, False, 0, 1)
assert next(iter(filter(None, a))) == 1

EDIT: I thought it was obvious, but apparently not: instead of None you can pass a function (or a lambda) with a check for the condition:

a = [2,3,4,5,6,7,8]
assert next(filter(lambda x: x%2, a)) == 3

Answer 12

Oneliner:

thefirst = [i for i in range(10) if i > 3][0]

If you’re not sure that any element will be valid according to the criteria, you should enclose this with try/except since that [0] can raise an IndexError. Note that this builds the whole filtered list before taking its first element.


Build a basic Python iterator

Question: Build a basic Python iterator

How would one create an iterative function (or iterator object) in Python?


Answer 0

Iterator objects in Python conform to the iterator protocol, which basically means they provide two methods: __iter__() and __next__().

  • The __iter__() method returns the iterator object and is implicitly called at the start of loops.

  • The __next__() method returns the next value and is implicitly called at each loop increment. This method raises a StopIteration exception when there are no more values to return, which is implicitly captured by looping constructs to stop iterating.

Here’s a simple example of a counter:

class Counter:
    def __init__(self, low, high):
        self.current = low - 1
        self.high = high

    def __iter__(self):
        return self

    def __next__(self): # Python 2: def next(self)
        self.current += 1
        if self.current < self.high:
            return self.current
        raise StopIteration


for c in Counter(3, 9):
    print(c)

This will print:

3
4
5
6
7
8

This is easier to write using a generator, as covered in a previous answer:

def counter(low, high):
    current = low
    while current < high:
        yield current
        current += 1

for c in counter(3, 9):
    print(c)

The printed output will be the same. Under the hood, the generator object supports the iterator protocol and does something roughly similar to the class Counter.

David Mertz’s article, Iterators and Simple Generators, is a pretty good introduction.
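One consequence of __iter__ returning self is that a Counter instance is single-pass. The sketch below repeats the class from this answer so that it runs on its own, and shows the second pass coming back empty.

```python
class Counter:
    def __init__(self, low, high):
        self.current = low - 1
        self.high = high

    def __iter__(self):
        return self

    def __next__(self):
        self.current += 1
        if self.current < self.high:
            return self.current
        raise StopIteration


c = Counter(3, 6)
same_object = iter(c) is c   # True: the iterator is its own iterable
first_pass = list(c)         # [3, 4, 5]
second_pass = list(c)        # [] because the state was used up
```

If repeated iteration is needed, create a fresh Counter for each loop.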


Answer 1

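There are four ways to build an iterative function:

  • create a generator (uses the yield keyword)

  • use a generator expression (genexp)

  • create an iterator (defines __iter__ and __next__)

  • create a class that Python can iterate over on its own (defines __getitem__)

Examples: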

# generator
def uc_gen(text):
    for char in text.upper():
        yield char

# generator expression
def uc_genexp(text):
    return (char for char in text.upper())

# iterator protocol
class uc_iter():
    def __init__(self, text):
        self.text = text.upper()
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            result = self.text[self.index]
        except IndexError:
            raise StopIteration
        self.index += 1
        return result

# getitem method
class uc_getitem():
    def __init__(self, text):
        self.text = text.upper()
    def __getitem__(self, index):
        return self.text[index]

To see all four methods in action:

for iterator in uc_gen, uc_genexp, uc_iter, uc_getitem:
    for ch in iterator('abcde'):
        print(ch, end=' ')
    print()

结果是:

A B C D E
A B C D E
A B C D E
A B C D E

注意事项

两种生成器类型(uc_genuc_genexp)不能为reversed(); 普通的iterator(uc_iter)将需要__reversed__magic方法(根据docs,它必须返回一个新的iterator,但返回self工作结果(至少在CPython中));并且getitem iteratable(uc_getitem)必须具有__len__魔术方法:

    # for uc_iter we add __reversed__ and update __next__
    def __reversed__(self):
        self.index = -1
        return self
    def __next__(self):
        try:
            result = self.text[self.index]
        except IndexError:
            raise StopIteration
        self.index += -1 if self.index < 0 else +1
        return result

    # for uc_getitem
    def __len__(self)
        return len(self.text)

为了回答上校Panic关于无限懒惰求值的迭代器的第二个问题,以下是使用上述四种方法中的每一个的示例:

# generator
def even_gen():
    result = 0
    while True:
        yield result
        result += 2


# generator expression
def even_genexp():
    return (num for num in even_gen())  # or even_iter or even_getitem
                                        # not much value under these circumstances

# iterator protocol
class even_iter():
    def __init__(self):
        self.value = 0
    def __iter__(self):
        return self
    def __next__(self):
        next_value = self.value
        self.value += 2
        return next_value

# getitem method
class even_getitem():
    def __getitem__(self, index):
        return index * 2

import random
for iterator in even_gen, even_genexp, even_iter, even_getitem:
    limit = random.randint(15, 30)
    count = 0
    for even in iterator():
        print even,
        count += 1
        if count >= limit:
            break
    print

结果(至少在我的示例运行中):

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32

如何选择使用哪一个?这主要是一个品味问题。我最常看到的两种方法是生成器和迭代器协议,以及混合方法(__iter__返回生成器)。

生成器表达式可用于替换列表推导(它们很懒,因此可以节省资源)。

如果需要与早期的Python 2.x版本兼容,请使用__getitem__

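There are four ways to build an iterative function: a generator (a function that uses yield), a generator expression, an iterator class (defining __iter__ and __next__), and a class that defines __getitem__.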

Examples:

# generator
def uc_gen(text):
    for char in text.upper():
        yield char

# generator expression
def uc_genexp(text):
    return (char for char in text.upper())

# iterator protocol
class uc_iter():
    def __init__(self, text):
        self.text = text.upper()
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            result = self.text[self.index]
        except IndexError:
            raise StopIteration
        self.index += 1
        return result

# getitem method
class uc_getitem():
    def __init__(self, text):
        self.text = text.upper()
    def __getitem__(self, index):
        return self.text[index]

To see all four methods in action:

for iterator in uc_gen, uc_genexp, uc_iter, uc_getitem:
    for ch in iterator('abcde'):
        print(ch, end=' ')
    print()

Which results in:

A B C D E
A B C D E
A B C D E
A B C D E

Note:

The two generator types (uc_gen and uc_genexp) cannot be reversed(); the plain iterator (uc_iter) would need the __reversed__ magic method (which, according to the docs, must return a new iterator, although returning self works, at least in CPython); and the getitem iterable (uc_getitem) must have the __len__ magic method:

    # for uc_iter we add __reversed__ and update __next__
    def __reversed__(self):
        self.index = -1
        return self
    def __next__(self):
        try:
            result = self.text[self.index]
        except IndexError:
            raise StopIteration
        self.index += -1 if self.index < 0 else +1
        return result

    # for uc_getitem
    def __len__(self):
        return len(self.text)
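For reference, here is a minimal self-contained sketch of the __getitem__ variant with __len__ added, which is enough for reversed() to work through the sequence protocol. The class name UpperText is hypothetical and not part of the original answer.

```python
class UpperText:
    """Sequence-style iterable: defines only __getitem__ and __len__."""
    def __init__(self, text):
        self.text = text.upper()

    def __getitem__(self, index):
        # The IndexError raised by the underlying str ends forward iteration.
        return self.text[index]

    def __len__(self):
        return len(self.text)

forward = list(UpperText('abcde'))
backward = list(reversed(UpperText('abcde')))
print(forward)
print(backward)
```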

To answer Colonel Panic’s secondary question about an infinite lazily evaluated iterator, here are those examples, using each of the four methods above:

# generator
def even_gen():
    result = 0
    while True:
        yield result
        result += 2


# generator expression
def even_genexp():
    return (num for num in even_gen())  # or even_iter or even_getitem
                                        # not much value under these circumstances

# iterator protocol
class even_iter():
    def __init__(self):
        self.value = 0
    def __iter__(self):
        return self
    def __next__(self):
        next_value = self.value
        self.value += 2
        return next_value

# getitem method
class even_getitem():
    def __getitem__(self, index):
        return index * 2

import random
for iterator in even_gen, even_genexp, even_iter, even_getitem:
    limit = random.randint(15, 30)
    count = 0
    for even in iterator():
        print(even, end=' ')
        count += 1
        if count >= limit:
            break
    print()

Which results in (at least for my sample run):

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30
0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32

How to choose which one to use? This is mostly a matter of taste. The two methods I see most often are generators and the iterator protocol, as well as a hybrid (__iter__ returning a generator).

Generator expressions are useful for replacing list comprehensions (they are lazy and so can save on resources).
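To make the laziness point concrete, here is a small sketch comparing the two forms. sys.getsizeof only measures the container object itself, and the exact numbers vary by Python version and platform, so treat the sizes as indicative only.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # builds every element up front
squares_gen = (i * i for i in range(n))    # produces elements on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

total = sum(squares_gen)  # consuming the generator still yields every value
```

The trade-off is that the generator can only be consumed once, whereas the list can be indexed and traversed repeatedly.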

If one needs compatibility with earlier Python 2.x versions, use __getitem__.


Answer 2

First of all, the itertools module is incredibly useful for all sorts of cases in which an iterator would be useful, but here is all you need to create an iterator in Python:

yield

Isn’t that cool? Yield can be used to replace a normal return in a function. It returns the object just the same, but instead of destroying state and exiting, it saves state for when you want to execute the next iteration. Here is an example of it in action pulled directly from the itertools function list:

def count(n=0):
    while True:
        yield n
        n += 1

As stated in the function’s description (it’s the count() function from the itertools module), it produces an iterator that returns consecutive integers starting with n.

Generator expressions are a whole other can of worms (awesome worms!). They may be used in place of a list comprehension to save memory: a list comprehension builds the whole list in memory (and the list is destroyed after use if it is not assigned to a variable), whereas a generator expression creates a generator object, which is a kind of iterator. Here is an example of a generator expression definition:

gen = (n for n in xrange(0,11))

This is very similar to our iterator definition above except the full range is predetermined to be between 0 and 10.

I just found xrange() (surprised I hadn’t seen it before…) and added it to the above example. xrange() is a lazy version of range() which has the advantage of not prebuilding the list. It would be very useful if you had a giant corpus of data to iterate over and only had so much memory to do it in. Note that xrange() exists only in Python 2; in Python 3, range() itself is lazy and xrange() was removed.
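Because a generator like count() never stops on its own, you need some way to take only a finite slice of it. One option, sketched here for Python 3, is itertools.islice. The name my_count is used only to avoid shadowing itertools.count.

```python
from itertools import count, islice

def my_count(n=0):
    while True:
        yield n
        n += 1

# islice stops pulling values after five items, so the loop inside
# the generator never gets a chance to run forever.
first_five = list(islice(my_count(10), 5))
same_with_itertools = list(islice(count(10), 5))
print(first_five)
```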


Answer 3

I see some of you doing return self in __iter__. I just wanted to note that __iter__ itself can be a generator (thus removing the need for __next__ and for raising StopIteration exceptions):

class range:
  def __init__(self,a,b):
    self.a = a
    self.b = b
  def __iter__(self):
    i = self.a
    while i < self.b:
      yield i
      i+=1

Of course here one might as well directly make a generator, but for more complex classes it can be useful.
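One practical consequence of this pattern, sketched below, is that every call to __iter__ creates a fresh generator, so the same object can be looped over more than once (and even nested), which is not the case for a class whose __iter__ returns self. The class name Span is hypothetical; it is the same class as above, renamed to avoid shadowing the built-in range.

```python
class Span:
    """Iterable whose __iter__ is a generator function."""
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __iter__(self):
        i = self.a
        while i < self.b:
            yield i
            i += 1

span = Span(0, 3)
first_pass = list(span)
second_pass = list(span)   # a new generator is created, so this is not empty
pairs = [(x, y) for x in span for y in span]  # nested loops do not interfere
print(first_pass, second_pass, len(pairs))
```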


Answer 4

This question is about iterable objects, not about iterators. In Python, sequences are iterable too so one way to make an iterable class is to make it behave like a sequence, i.e. give it __getitem__ and __len__ methods. I have tested this on Python 2 and 3.

class CustomRange:

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __getitem__(self, item):
        if item >= len(self):
            raise IndexError("CustomRange index out of range")
        return self.low + item

    def __len__(self):
        return self.high - self.low


cr = CustomRange(0, 10)
for i in cr:
    print(i)

Answer 5

All answers on this page are really great for a complex object. But for classes that hold a built-in iterable type as an attribute, like str, list, set or dict, or any implementation of collections.abc.Iterable, you can omit certain things in your class.

class Test(object):
    def __init__(self, string):
        self.string = string

    def __iter__(self):
        # since your string is already iterable
        return (ch for ch in self.string)
        # or simply:
        #     return self.string.__iter__()
        # also:
        #     return iter(self.string)

It can be used like:

for x in Test("abcde"):
    print(x)

# prints
# a
# b
# c
# d
# e

Answer 6

This is an iterable function without yield. It makes use of the iter function and a closure which keeps its state in a mutable object (a list) in the enclosing scope, for Python 2.

def count(low, high):
    counter = [0]
    def tmp():
        val = low + counter[0]
        if val < high:
            counter[0] += 1
            return val
        return None
    return iter(tmp, None)

For Python 3, the closure state is kept in an immutable object (an int) in the enclosing scope, and nonlocal is used in the local scope to rebind the state variable.

def count(low, high):
    counter = 0
    def tmp():
        nonlocal counter
        val = low + counter
        if val < high:
            counter += 1
            return val
        return None
    return iter(tmp, None)  

Test:

for i in count(1,10):
    print(i)
1
2
3
4
5
6
7
8
9
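The two-argument form iter(callable, sentinel) used above is not limited to counters. As a small sketch of the same technique, here it reads a stream in fixed-size chunks; the in-memory BytesIO stream and the chunk size of 4 are just stand-ins for illustration.

```python
import io
from functools import partial

stream = io.BytesIO(b'abcdefghij')

# iter(callable, sentinel) keeps calling the callable until it returns the sentinel.
# read(4) returns b'' at end of stream, which stops the iteration.
chunks = list(iter(partial(stream.read, 4), b''))
print(chunks)
```

One caveat that applies to the count() examples too: the sentinel must be a value the callable can never legitimately return, otherwise iteration stops early.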

Answer 7

If you are looking for something short and simple, maybe this will be enough for you:

class A(object):
    def __init__(self, l):
        self.data = l

    def __iter__(self):
        return iter(self.data)

Example of usage:

In [3]: a = A([2,3,4])

In [4]: [i for i in a]
Out[4]: [2, 3, 4]

Answer 8

Inspired by Matt Gregory’s answer, here is a slightly more complicated iterator that will return a, b, …, z, aa, ab, …, zz, aaa, aab, …, zzy, zzz:

class AlphaCounter:
    def __init__(self, low, high):
        self.current = low
        self.high = high

    def __iter__(self):
        return self

    def __next__(self): # Python 2: def next(self)
        alpha = ' abcdefghijklmnopqrstuvwxyz'
        n_current = sum([(alpha.find(self.current[x])* 26**(len(self.current)-x-1)) for x in range(len(self.current))])
        n_high = sum([(alpha.find(self.high[x])* 26**(len(self.high)-x-1)) for x in range(len(self.high))])
        if n_current > n_high:
            raise StopIteration
        else:
            increment = True
            ret = ''
            for x in self.current[::-1]:
                if 'z' == x:
                    if increment:
                        ret += 'a'
                    else:
                        ret += 'z'
                else:
                    if increment:
                        ret += alpha[alpha.find(x)+1]
                        increment = False
                    else:
                        ret += x
            if increment:
                ret += 'a'
            tmp = self.current
            self.current = ret[::-1]
            return tmp

for c in AlphaCounter('a', 'zzz'):
    print(c)
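For comparison, here is a sketch of a different technique that produces the same a … zzz sequence with a generator and itertools.product. It is not the author’s approach: instead of low and high strings it takes a maximum length, and the name alpha_counter is made up for this example.

```python
from itertools import product
from string import ascii_lowercase

def alpha_counter(max_len):
    """Yield 'a'..'z', then 'aa'..'zz', and so on up to max_len letters."""
    for length in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=length):
            yield ''.join(letters)

labels = list(alpha_counter(3))
print(labels[:3], labels[25:28], labels[-1], len(labels))
```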

How do I iterate over the files in a given directory?

Question: How do I iterate over the files in a given directory?

I need to iterate through all .asm files inside a given directory and do some actions on them.

How can this be done in an efficient way?


Answer 0

Original answer:

import os

for filename in os.listdir(directory):
    if filename.endswith(".asm") or filename.endswith(".py"):
        # print(os.path.join(directory, filename))
        continue
    else:
        continue

Python 3.6 version of the above answer, using os – assuming that you have the directory path as a str object in a variable called directory_in_str:

import os

directory = os.fsencode(directory_in_str)

for file in os.listdir(directory):
    filename = os.fsdecode(file)
    if filename.endswith(".asm") or filename.endswith(".py"):
        # directory is bytes here, so join with the str path instead
        # print(os.path.join(directory_in_str, filename))
        continue
    else:
        continue

Or recursively, using pathlib:

from pathlib import Path

pathlist = Path(directory_in_str).glob('**/*.asm')
for path in pathlist:
    # because path is a Path object, not a string
    path_in_str = str(path)
    # print(path_in_str)
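If you want to try the recursive pathlib variant without touching real files, here is a self-contained sketch that builds a throwaway directory tree first. The file and folder names are invented for the example; Path.rglob('*.asm') is shorthand for glob('**/*.asm').

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'sub').mkdir()
    (root / 'a.asm').write_text('; top level\n')
    (root / 'sub' / 'b.asm').write_text('; nested\n')
    (root / 'notes.txt').write_text('ignored\n')

    # Collect matches relative to the root; as_posix() keeps '/' on every platform.
    found = sorted(p.relative_to(root).as_posix() for p in root.rglob('*.asm'))

print(found)
```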

Answer 1

This will iterate over all descendant files, not just the immediate children of the directory:

import os

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        filepath = os.path.join(subdir, file)

        if filepath.endswith(".asm"):
            print(filepath)

Answer 2

You can try using the glob module:

import glob

for filepath in glob.iglob('my_dir/*.asm'):
    print(filepath)

Since Python 3.5 you can search subdirectories as well:

glob.glob('**/*.txt', recursive=True) # => ['2.txt', 'sub/3.txt']

From the docs:

The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order. No tilde expansion is done, but *, ?, and character ranges expressed with [] will be correctly matched.


Answer 3

Since Python 3.5, things are much easier with os.scandir() (using it as a context manager, as below, requires Python 3.6):

import os

with os.scandir(path) as it:
    for entry in it:
        if entry.name.endswith(".asm") and entry.is_file():
            print(entry.name, entry.path)

Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.
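Here is a small runnable sketch of that idea: it filters on entry.is_file() and then reads the size from entry.stat(), all from the same DirEntry object. The temporary directory and the file names are made up so the example does not depend on your file system.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, 'prog.asm'), 'w') as f:
        f.write('nop\n')
    os.mkdir(os.path.join(tmp, 'looks_like.asm'))  # a directory, not a file

    sizes = {}
    with os.scandir(tmp) as entries:
        for entry in entries:
            # is_file() can usually be answered from data gathered during the scan
            if entry.is_file() and entry.name.endswith('.asm'):
                sizes[entry.name] = entry.stat().st_size

print(sizes)
```

Checking is_file() matters here because a name ending in .asm can just as well belong to a directory.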


Answer 4

Python 3.4 and later offer pathlib in the standard library. You could do:

from pathlib import Path

asm_pths = [pth for pth in Path.cwd().iterdir()
            if pth.suffix == '.asm']

Or if you don’t like list comprehensions:

asm_pths = []
for pth in Path.cwd().iterdir():
    if pth.suffix == '.asm':
        asm_pths.append(pth)

Path objects can easily be converted to strings.


Answer 5

Here’s how I iterate through files in Python:

import os

path = 'the/name/of/your/path'

folder = os.fsencode(path)

filenames = []

for file in os.listdir(folder):
    filename = os.fsdecode(file)
    if filename.endswith( ('.jpeg', '.png', '.gif') ): # whatever file types you're using...
        filenames.append(filename)

filenames.sort() # now you have the filenames and can do something with them

NONE OF THESE TECHNIQUES GUARANTEE ANY ITERATION ORDERING

Yup, super unpredictable. Notice that I sort the filenames, which is important if the order of the files matters, e.g. for video frames or time-dependent data collection. Be sure to put indices in your filenames though!
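One thing to watch for when sorting indexed filenames: a plain sort() is lexicographic, so frame10 lands before frame2 unless the indices are zero-padded. A possible workaround, sketched here with a hypothetical natural_key helper, is to sort on the numeric parts as integers.

```python
import re

def natural_key(name):
    """Split a name into text and integer parts so 'frame2' sorts before 'frame10'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', name)]

names = ['frame10.png', 'frame2.png', 'frame1.png']
plain = sorted(names)                      # lexicographic order
natural = sorted(names, key=natural_key)   # numeric-aware order
print(plain)
print(natural)
```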


Answer 6

You can use glob to refer to the directory and list its contents:

import glob
import os

# get the current working directory name
cwd = os.getcwd()
# load the images from the images folder
for f in glob.glob(os.path.join('images', '*.jpg')):
    # get_dir_name is not defined in the original answer; it is assumed to be a user-supplied helper
    dir_name = get_dir_name(f)
    image_file_name = dir_name + '.jpg'
    # print the file name with its path (as a string)
    print(image_file_name)

To get a list of all entries in a directory, you can use os:

os.listdir(directory)

Answer 7

I’m not quite happy with this implementation yet; I wanted to have a custom constructor that does DirectoryIndex._make(next(os.walk(input_path))) so that you can just pass the path you want a file listing for. Edits welcome!

import collections
import os

DirectoryIndex = collections.namedtuple('DirectoryIndex', ['root', 'dirs', 'files'])

index = DirectoryIndex(*next(os.walk('.')))
for file_name in index.files:
    file_path = os.path.join(index.root, file_name)

Answer 8

I really like using the scandir function that is built into the os module. Here is a working example:

import os

i = 0
with os.scandir('/usr/local/bin') as root_dir:
    for path in root_dir:
        if path.is_file():
            i += 1
            print(f"Full path is: {path.path} and just the name is: {path.name}")
print(f"{i} files scanned successfully.")

Difference between Python’s generators and iterators

Question: Difference between Python’s generators and iterators

What is the difference between iterators and generators? Some examples for when you would use each case would be helpful.


Answer 0

An iterator is a more general concept: any object whose class has a next method (__next__ in Python 3) and an __iter__ method that does return self.

Every generator is an iterator, but not vice versa. A generator is built by calling a function that has one or more yield expressions (yield statements, in Python 2.5 and earlier), and is an object that meets the previous paragraph’s definition of an iterator.

You may want to use a custom iterator, rather than a generator, when you need a class with somewhat complex state-maintaining behavior, or want to expose other methods besides next (and __iter__ and __init__). Most often, a generator (sometimes, for sufficiently simple needs, a generator expression) is sufficient, and it’s simpler to code because state maintenance (within reasonable limits) is basically “done for you” by the frame getting suspended and resumed.

For example, a generator such as:

def squares(start, stop):
    for i in range(start, stop):
        yield i * i

generator = squares(a, b)

or the equivalent generator expression (genexp)

generator = (i*i for i in range(a, b))

would take more code to build as a custom iterator:

class Squares(object):
    def __init__(self, start, stop):
       self.start = start
       self.stop = stop
    def __iter__(self): return self
    def next(self): # __next__ in Python 3
       if self.start >= self.stop:
           raise StopIteration
       current = self.start * self.start
       self.start += 1
       return current

iterator = Squares(a, b)

But, of course, with class Squares you could easily offer extra methods, e.g.

    def current(self):
       return self.start

if you have any actual need for such extra functionality in your application.


Answer 1

What is the difference between iterators and generators? Some examples for when you would use each case would be helpful.

In summary: Iterators are objects that have an __iter__ and a __next__ (next in Python 2) method. Generators provide an easy, built-in way to create instances of Iterators.

A function with yield in it is still a function, that, when called, returns an instance of a generator object:

def a_function():
    "when called, returns generator object"
    yield

A generator expression also returns a generator:

a_generator = (i for i in range(0))

For a more in-depth exposition and examples, keep reading.

A Generator is an Iterator

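Specifically, generator is a subtype of iterator. (The sessions in this answer use the collections.Iterator and collections.Iterable aliases; since Python 3.3 these ABCs live in collections.abc, and the collections aliases were removed in Python 3.10.)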

>>> import collections, types
>>> issubclass(types.GeneratorType, collections.Iterator)
True

We can create a generator in several ways. A very common and simple way to do so is with a function.

Specifically, a function with yield in it is a function, that, when called, returns a generator:

>>> def a_function():
        "just a function definition with yield in it"
        yield
>>> type(a_function)
<class 'function'>
>>> a_generator = a_function()  # when called
>>> type(a_generator)           # returns a generator
<class 'generator'>

And a generator, again, is an Iterator:

>>> isinstance(a_generator, collections.Iterator)
True

An Iterator is an Iterable

An Iterator is an Iterable,

>>> issubclass(collections.Iterator, collections.Iterable)
True

which requires an __iter__ method that returns an Iterator:

>>> collections.Iterable()
Traceback (most recent call last):
  File "<pyshell#79>", line 1, in <module>
    collections.Iterable()
TypeError: Can't instantiate abstract class Iterable with abstract methods __iter__

Some examples of iterables are the built-in tuples, lists, dictionaries, sets, frozen sets, strings, byte strings, byte arrays, ranges and memoryviews:

>>> all(isinstance(element, collections.abc.Iterable) for element in (
...         (), [], {}, set(), frozenset(), '', b'', bytearray(), range(0), memoryview(b'')))
True

Iterators require a next or __next__ method

In Python 2:

>>> collections.Iterator()
Traceback (most recent call last):
  File "<pyshell#80>", line 1, in <module>
    collections.Iterator()
TypeError: Can't instantiate abstract class Iterator with abstract methods next

And in Python 3:

>>> collections.abc.Iterator()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Iterator with abstract methods __next__

We can get the iterators from the built-in objects (or custom objects) with the iter function:

>>> all(isinstance(iter(element), collections.abc.Iterator) for element in (
...         (), [], {}, set(), frozenset(), '', b'', bytearray(), range(0), memoryview(b'')))
True

The __iter__ method is called when you attempt to use an object with a for-loop. Then the __next__ method is called on the iterator object to get each item out for the loop. The iterator raises StopIteration when you have exhausted it, and it cannot be reused at that point.
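As a rough sketch of what a for-loop does with these two methods (the helper name manual_for is made up for illustration, and the real loop is implemented inside the interpreter rather than in Python):

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)          # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)      # calls iterator.__next__()
        except StopIteration:          # raised once the iterator is exhausted
            break
        body(item)

collected = []
manual_for([1, 2, 3], collected.append)
print(collected)  # [1, 2, 3]
```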

From the documentation

From the Generator Types section of the Iterator Types section of the Built-in Types documentation:

Python’s generators provide a convenient way to implement the iterator protocol. If a container object’s __iter__() method is implemented as a generator, it will automatically return an iterator object (technically, a generator object) supplying the __iter__() and next() [__next__() in Python 3] methods. More information about generators can be found in the documentation for the yield expression.

(Emphasis added.)

So from this we learn that Generators are a (convenient) type of Iterator.

Example Iterator Objects

You might create an object that implements the Iterator protocol by writing your own class or extending an existing one.

import collections.abc  # Python 3.3+; on Python 2 this class lives at collections.Iterator

class Yes(collections.abc.Iterator):

    def __init__(self, stop):
        self.x = 0
        self.stop = stop

    def __iter__(self):
        return self

    def next(self):
        if self.x < self.stop:
            self.x += 1
            return 'yes'
        else:
            # Iterators must raise when done, else considered broken
            raise StopIteration

    __next__ = next # Python 3 compatibility

But it’s easier to simply use a Generator to do this:

def yes(stop):
    for _ in range(stop):
        yield 'yes'

Or perhaps simpler, a Generator Expression (works similarly to list comprehensions):

yes_expr = ('yes' for _ in range(stop))

They can all be used in the same way:

>>> stop = 4
>>> for i, y1, y2, y3 in zip(range(stop), Yes(stop), yes(stop),
...                          ('yes' for _ in range(stop))):
...     print('{0}: {1} == {2} == {3}'.format(i, y1, y2, y3))
...
0: yes == yes == yes
1: yes == yes == yes
2: yes == yes == yes
3: yes == yes == yes

Conclusion

You can use the Iterator protocol directly when you need to extend a Python object as an object that can be iterated over.

However, in the vast majority of cases, you are best suited to use yield to define a function that returns a Generator Iterator or consider Generator Expressions.

Finally, note that generators provide even more functionality as coroutines. I explain generators, along with the yield statement, in depth in my answer to “What does the ‘yield’ keyword do?”.


回答 2

迭代器:

迭代器是使用next()方法获取序列的下一个值的对象。

生成器:

生成器是一种使用yield方法生成或产生值序列的函数。

生成器函数(如以下示例中的ex:函数)返回的生成next()器对象(如ex f中的示例)的每个方法调用都将按foo()顺序生成下一个值。

调用生成器函数时,它甚至不开始执行函数就返回生成器对象。当next()方法被称为首次,函数开始执行,直到它到达它返回产生值yield语句。收益跟踪(即记住上一次执行)。第二个next()调用从先前的值继续。

下面的示例演示了yield和生成器对象上的next方法的调用之间的相互作用。

>>> def foo():
...     print "begin"
...     for i in range(3):
...         print "before yield", i
...         yield i
...         print "after yield", i
...     print "end"
...
>>> f = foo()
>>> f.next()
begin
before yield 0            # Control is in for loop
0
>>> f.next()
after yield 0             
before yield 1            # Continue for loop
1
>>> f.next()
after yield 1
before yield 2
2
>>> f.next()
after yield 2
end
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

Iterators:

Iterators are objects that use a next() method to get the next value of a sequence.

Generators:

A generator is a function that produces, or yields, a sequence of values using the yield statement.

Every next() call on the generator object (for example, f in the example below) returned by a generator function (for example, foo() in the example below) generates the next value in the sequence.

When a generator function is called, it returns a generator object without even beginning execution of the function. When the next() method is called for the first time, the function starts executing until it reaches a yield statement, which returns the yielded value. The generator keeps track of where execution paused, and the second next() call resumes from that point.

The following example demonstrates the interplay between yield and calls to the next method on a generator object.

>>> def foo():
...     print "begin"
...     for i in range(3):
...         print "before yield", i
...         yield i
...         print "after yield", i
...     print "end"
...
>>> f = foo()
>>> f.next()
begin
before yield 0            # Control is in for loop
0
>>> f.next()
after yield 0             
before yield 1            # Continue for loop
1
>>> f.next()
after yield 1
before yield 2
2
>>> f.next()
after yield 2
end
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>
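The session above is Python 2 (print statements and f.next()). A minimal Python 3 sketch of the same interplay follows; it is an adaptation rather than the original author's code, it uses the built-in next(f), and it records events in a list (the log parameter is an addition for illustration) so the order of execution can be inspected:

```python
def foo(log):
    log.append("begin")
    for i in range(3):
        log.append(f"before yield {i}")
        yield i
        log.append(f"after yield {i}")
    log.append("end")

log = []
f = foo(log)          # nothing runs yet
print(log)            # []
print(next(f), log)   # 0 ['begin', 'before yield 0']
print(next(f), log)   # 1 ['begin', 'before yield 0', 'after yield 0', 'before yield 1']
```

Each next(f) runs the body only up to the following yield, which is why "after yield 0" appears during the second call, not the first.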

回答 3

添加答案是因为现有答案中没有一个专门解决官方文献中的混乱。

生成器函数是使用yield代替定义的普通函数return。调用时,生成器函数将返回一个生成器对象,它是一种迭代器-它具有一个next()方法。当您调用时next(),将返回生成器函数产生的下一个值。

函数或对象都可以称为“生成器”,具体取决于您阅读的是哪个Python源文档。在Python的词汇说发生器功能,而Python的维基意味着生成器对象。在Python的教程非常设法暗示用三句话的空间用法:

生成器是用于创建迭代器的简单而强大的工具。它们的编写方式与常规函数类似,但是只要要返回数据就使用yield语句。每次在其上调用next()时,生成器都会从上次中断的地方继续(它会记住所有数据值以及最后执行的语句)。

前两个句子用生成器函数标识生成器,而第三句话用生成器对象标识它们。

尽管存在所有这些困惑,但您仍然可以找到Python语言参考来获得清晰明确的词:

yield表达式仅在定义生成器函数时使用,并且只能在函数定义的主体中使用。在函数定义中使用yield表达式足以使该定义创建一个生成器函数,而不是普通函数。

调用生成器函数时,它将返回称为生成器的迭代器。然后,该生成器控制生成器功能的执行。

因此,在正式和精确的用法中,“生成器”不合格表示生成器对象,而不是生成器功能。

上面的参考是针对Python 2的,但是Python 3语言参考却说了同样的话。但是,Python 3词汇表指出

generator …通常是指生成器函数,但在某些情况下可能是指生成器迭代器。在预期含义不明确的情况下,使用完整术语可以避免歧义。

Adding an answer because none of the existing answers specifically address the confusion in the official literature.

Generator functions are ordinary functions defined using yield instead of return. When called, a generator function returns a generator object, which is a kind of iterator: it has a next() method (__next__() in Python 3). When you call next(), the next value yielded by the generator function is returned.

Either the function or the object may be called the “generator” depending on which Python source document you read. The Python glossary says generator functions, while the Python wiki implies generator objects. The Python tutorial remarkably manages to imply both usages in the space of three sentences:

Generators are a simple and powerful tool for creating iterators. They are written like regular functions but use the yield statement whenever they want to return data. Each time next() is called on it, the generator resumes where it left off (it remembers all the data values and which statement was last executed).

The first two sentences identify generators with generator functions, while the third sentence identifies them with generator objects.

Despite all this confusion, one can seek out the Python language reference for the clear and final word:

The yield expression is only used when defining a generator function, and can only be used in the body of a function definition. Using a yield expression in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.

When a generator function is called, it returns an iterator known as a generator. That generator then controls the execution of a generator function.

So, in formal and precise usage, “generator” unqualified means generator object, not generator function.

The above references are for Python 2, but the Python 3 language reference says the same thing. However, the Python 3 glossary states that

generator … Usually refers to a generator function, but may refer to a generator iterator in some contexts. In cases where the intended meaning isn’t clear, using the full terms avoids ambiguity.
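To make the two meanings concrete, here is a small sketch using the standard inspect module (the countdown function is a hypothetical example):

```python
import inspect

def countdown(n):      # a generator function
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)     # a generator object (generator iterator)

print(inspect.isgeneratorfunction(countdown))  # True
print(inspect.isgenerator(countdown))          # False
print(inspect.isgenerator(gen))                # True
print(list(gen))                               # [3, 2, 1]
```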


回答 4

每个人都有一个非常好的示例冗长的答案,我对此表示感谢。我只是想为在概念上仍不太清楚的人提供简短的答案:

如果创建自己的迭代器,则涉及到一点点-您必须创建一个类并至少实现iter和next方法。但是,如果您不想经历这种麻烦并想快速创建迭代器,该怎么办。幸运的是,Python提供了一种定义迭代器的捷径。您需要做的就是定义一个至少调用一次yield的函数,现在当您调用该函数时,它将返回“ something ”,其作用类似于迭代器(您可以调用next方法并在for循环中使用它)。这个东西在Python中有一个名字叫做Generator

希望可以澄清一下。

Everybody has a really nice and verbose answer with examples and I really appreciate it. I just wanted to give a short answer of a few lines for people who are still not quite clear conceptually:

If you create your own iterator, it is a little bit involved: you have to create a class and implement at least the __iter__ and __next__ methods. But what if you don’t want to go through this hassle and want to quickly create an iterator? Fortunately, Python provides a shortcut for defining an iterator. All you need to do is define a function containing at least one yield, and now when you call that function it will return “something” which will act like an iterator (you can call next on it and use it in a for loop). This something has a name in Python: it is called a generator.

Hope that clarifies a bit.
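The shortcut described above, as a minimal sketch (the names CountUpTo and count_up_to are hypothetical):

```python
class CountUpTo:
    """Hand-written iterator: needs a class with __iter__ and __next__."""
    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current

def count_up_to(limit):
    """Generator function: the same behaviour with a single yield."""
    for value in range(1, limit + 1):
        yield value

print(list(CountUpTo(3)))    # [1, 2, 3]
print(list(count_up_to(3)))  # [1, 2, 3]
```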


回答 5

先前的答案未添加此功能:生成器具有close方法,而典型的迭代器则没有。该close方法StopIteration在生成器中触发一个异常,该异常可能会finally在该迭代器的子句中捕获,从而有机会运行一些清理操作。这种抽象使它比简单的迭代器在大型迭代器中最有用。一个人可以关闭一个生成器,就像一个人可以关闭一个文件一样,而不必担心底层内容。

也就是说,我对第一个问题的个人回答是:iteratable __iter__仅具有一个方法,典型的迭代器__next__仅具有一个方法,生成器具有an __iter__和a __next__以及一个extra close

对于第二个问题,我个人的回答是:在公共界面中,我倾向于偏爱生成器,因为它更具弹性:该close方法具有更大的可组合性yield from。在本地,我可以使用迭代器,但前提是它是一个平面且简单的结构(迭代器不容易编写),并且有理由相信该序列很短,尤其是在序列结束之前可以将其停止的情况。我倾向于将迭代器视为低级原语,而不是文字。

对于控制流而言,生成器是一个与承诺一样重要的概念:两者都是抽象的且可组合的。

Previous answers missed this addition: a generator has a close method, while typical iterators don’t. The close method raises a GeneratorExit exception inside the generator at the point where it is paused, so a finally clause in the generator gets a chance to run some clean‑up. This abstraction makes generators more usable at scale than simple iterators. One can close a generator as one could close a file, without having to bother about what’s underneath.
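A small sketch of close() giving a generator the chance to clean up (the numbers function and the cleanup_log list are hypothetical). Calling close() raises GeneratorExit inside the generator at the paused yield, so the finally block runs:

```python
cleanup_log = []

def numbers():
    try:
        n = 0
        while True:
            yield n
            n += 1
    finally:
        # Runs when the generator is closed while paused at the yield.
        cleanup_log.append("cleaned up")

gen = numbers()
print(next(gen), next(gen))  # 0 1
gen.close()                  # GeneratorExit is raised at the paused yield
print(cleanup_log)           # ['cleaned up']
print(list(gen))             # [] because a closed generator is exhausted
```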

That said, my personal answer to the first question would be: an iterable needs only an __iter__ method, a typical iterator has a __next__ method (plus an __iter__ that returns itself), and a generator has both an __iter__ and a __next__ and an additional close.

For the second question, my personal answer would be: in a public interface, I tend to favor generators a lot, since they are more resilient: they offer the close method and greater composability with yield from. Locally, I may use iterators, but only if it’s a flat and simple structure (iterators do not compose easily) and if there are reasons to believe the sequence is rather short, especially if it may be stopped before it reaches the end. I tend to look at iterators as a low-level primitive, except as literals.

For control flow matters, generators are as important a concept as promises: both are abstract and composable.


回答 6

生成器功能,生成器对象,生成器:

一个生成器的功能就像Python中的常规功能,但它包含一个或多个yield语句。生成器函数是一个很好的工具,它可以尽可能轻松地创建 Iterator对象。通过generator函数返回的Iterator对象也称为Generator对象Generator

在此示例中,我创建了一个Generator函数,该函数返回Generator对象<generator object fib at 0x01342480>。就像其他迭代器一样,Generator对象可以在for循环中使用,也可以与内置函数一起使用,该 函数next()从generator返回下一个值。

def fib(max):
    a, b = 0, 1
    for i in range(max):
        yield a
        a, b = b, a + b
print(fib(10))             #<generator object fib at 0x01342480>

for i in fib(10):
    print(i)               # 0 1 1 2 3 5 8 13 21 34


myfib = fib(10)            # create a generator object to step through manually
print(next(myfib))         #0
print(next(myfib))         #1
print(next(myfib))         #1
print(next(myfib))         #2

因此,生成器函数是创建Iterator对象的最简单方法。

迭代器

每个生成器对象都是一个迭代器,但反之则不是。如果自定义迭代器对象的类实现__iter____next__方法(也称为迭代器协议),则可以创建该对象 。

但是,使用生成器函数来创建迭代器要容易得多,因为它们可以简化迭代器的创建,但是自定义迭代器为您提供了更大的自由度,并且您还可以根据需要实现其他方法,如下例所示。

class Fib:
    def __init__(self,max):
        self.current=0
        self.next=1
        self.max=max
        self.count=0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.max:   # stop after max items, so Fib(4) yields 0 1 1 2
            raise StopIteration
        else:
            self.current,self.next=self.next,(self.current+self.next)
            self.count+=1
            return self.next-self.current

    def __str__(self):
        return "Generator object"

itobj=Fib(4)
print(itobj)               #Generator object

for i in Fib(4):  
    print(i)               #0 1 1 2

print(next(itobj))         #0
print(next(itobj))         #1
print(next(itobj))         #1

Generator Function, Generator Object, Generator:

A generator function is just like a regular function in Python, but it contains one or more yield statements. Generator functions are a great tool for creating iterator objects as easily as possible. The iterator object returned by a generator function is also called a generator object or simply a generator.

In this example I have created a generator function which returns a generator object <generator object fib at 0x01342480>. Just like other iterators, generator objects can be used in a for loop or with the built-in function next(), which returns the next value from the generator.

def fib(max):
    a, b = 0, 1
    for i in range(max):
        yield a
        a, b = b, a + b
print(fib(10))             #<generator object fib at 0x01342480>

for i in fib(10):
    print(i)               # 0 1 1 2 3 5 8 13 21 34


myfib = fib(10)            # create a generator object to step through manually
print(next(myfib))         #0
print(next(myfib))         #1
print(next(myfib))         #1
print(next(myfib))         #2

So a generator function is the easiest way to create an Iterator object.

Iterator:

Every generator object is an iterator, but not vice versa. A custom iterator object can be created if its class implements the __iter__ and __next__ methods (together called the iterator protocol).

However, it is much easier to use generator functions to create iterators because they simplify their creation. A custom iterator gives you more freedom, though, and you can also implement other methods according to your requirements, as shown in the example below.

class Fib:
    def __init__(self,max):
        self.current=0
        self.next=1
        self.max=max
        self.count=0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.max:   # stop after max items, so Fib(4) yields 0 1 1 2
            raise StopIteration
        else:
            self.current,self.next=self.next,(self.current+self.next)
            self.count+=1
            return self.next-self.current

    def __str__(self):
        return "Generator object"

itobj=Fib(4)
print(itobj)               #Generator object

for i in Fib(4):  
    print(i)               #0 1 1 2

print(next(itobj))         #0
print(next(itobj))         #1
print(next(itobj))         #1

回答 7

强烈推荐Ned Batchelder的示例 用于迭代器和生成器

没有生成器的方法会做一些偶数运算

def evens(stream):
   them = []
   for n in stream:
      if n % 2 == 0:
         them.append(n)
   return them

而使用生成器

def evens(stream):
    for n in stream:
        if n % 2 == 0:
            yield n
  • 我们不需要任何清单return声明
  • 对于大/无限长的流有效…它只是走并产生价值

evens照常调用方法(生成器)

num = [...]
for n in evens(num):
   do_smth(n)
  • 生成器也用于打破双回路

迭代器

一整页的书是可迭代的,书签是 迭代器

这个书签除了移动外别无其他 next

litr = iter([1,2,3])
next(litr) ## 1
next(litr) ## 2
next(litr) ## 3
next(litr) ## StopIteration  (Exception) as we got end of the iterator

要使用Generator …我们需要一个函数

要使用迭代器……我们需要nextiter

如前所述:

Generator函数返回迭代器对象

迭代器的全部优点:

一次将一个元素存储在内存中

Ned Batchelder’s examples are highly recommended for iterators and generators.

A function without generators that collects the even numbers:

def evens(stream):
   them = []
   for n in stream:
      if n % 2 == 0:
         them.append(n)
   return them

whereas with a generator:

def evens(stream):
    for n in stream:
        if n % 2 == 0:
            yield n
  • We don’t need a list or a return statement
  • Efficient for large or infinite streams … it just walks the stream and yields each value

Calling the evens function (a generator) works as usual

num = [...]
for n in evens(num):
   do_smth(n)
  • Generators can also be used to break out of a double (nested) loop
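A minimal sketch of that last point, assuming a hypothetical cells helper and a made-up grid: the generator flattens two nested loops into a single stream, so one break is enough to stop the whole search.

```python
def cells(grid):
    """Flatten the two nested loops into one stream of (row, col, value)."""
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            yield r, c, value

grid = [[1, 3, 5], [7, 8, 9], [10, 11, 12]]

found = None
for r, c, value in cells(grid):
    if value % 2 == 0:
        found = (r, c, value)
        break                 # one break leaves what used to be two loops

print(found)  # (1, 1, 8)
```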

Iterator

A book full of pages is an iterable; a bookmark is an iterator,

and this bookmark has nothing to do except move to the next page

litr = iter([1,2,3])
next(litr) ## 1
next(litr) ## 2
next(litr) ## 3
next(litr) ## raises StopIteration because the iterator is exhausted

To use a generator … we need a function

To use an iterator … we need next and iter

As has been said:

A generator function returns an iterator object

The whole benefit of an iterator:

It keeps only one element in memory at a time


回答 8

您可以将两种方法比较相同的数据:

def myGeneratorList(n):
    for i in range(n):
        yield i

def myIterableList(n):
    ll = n*[None]
    for i in range(n):
        ll[i] = i
    return ll

# Same values
ll1 = myGeneratorList(10)
ll2 = myIterableList(10)
for i1, i2 in zip(ll1, ll2):
    print("{} {}".format(i1, i2))

# Generator can only be read once
ll1 = myGeneratorList(10)
ll2 = myIterableList(10)

print("{} {}".format(len(list(ll1)), len(ll2)))
print("{} {}".format(len(list(ll1)), len(ll2)))

# Generator contents can be read several times once stored in a list
ll1 = list(myGeneratorList(10))
ll2 = myIterableList(10)

print("{} {}".format(len(list(ll1)), len(ll2)))
print("{} {}".format(len(list(ll1)), len(ll2)))

此外,如果检查内存占用量,生成器将占用更少的内存,因为它不需要同时将所有值存储在内存中。

You can compare both approaches for the same data:

def myGeneratorList(n):
    for i in range(n):
        yield i

def myIterableList(n):
    ll = n*[None]
    for i in range(n):
        ll[i] = i
    return ll

# Same values
ll1 = myGeneratorList(10)
ll2 = myIterableList(10)
for i1, i2 in zip(ll1, ll2):
    print("{} {}".format(i1, i2))

# Generator can only be read once
ll1 = myGeneratorList(10)
ll2 = myIterableList(10)

print("{} {}".format(len(list(ll1)), len(ll2)))
print("{} {}".format(len(list(ll1)), len(ll2)))

# Generator contents can be read several times once stored in a list
ll1 = list(myGeneratorList(10))
ll2 = myIterableList(10)

print("{} {}".format(len(list(ll1)), len(ll2)))
print("{} {}".format(len(list(ll1)), len(ll2)))

Besides, if you check the memory footprint, the generator takes much less memory as it doesn’t need to store all the values in memory at the same time.
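One rough way to check this is sys.getsizeof. The sketch below is only indicative: getsizeof reports the size of the container object itself, not of the integers it refers to, and the snake_case function name is a stand-in for myGeneratorList above.

```python
import sys

def my_generator_list(n):
    for i in range(n):
        yield i

n = 100_000
as_generator = my_generator_list(n)
as_list = list(range(n))

print(sys.getsizeof(as_generator))  # small, and does not depend on n
print(sys.getsizeof(as_list))       # grows with n
```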


回答 9

我用一种非常简单的方式专门为Python新手编写了代码,尽管Python深入人心地做了很多事情。

让我们从最基本的开始:

考虑一个清单,

l = [1,2,3]

让我们编写一个等效的函数:

def f():
    return [1,2,3]

o / p为print(l): [1,2,3]&o / p为print(f()) : [1,2,3]

让列表l变得可迭代:在python中,列表始终是可迭代的,这意味着您可以随时使用迭代器。

让我们在列表上应用迭代器:

iter_l = iter(l) # iterator applied explicitly

让我们迭代一个函数,即编写一个等效的生成器函数。 在python中,一旦您引入了关键字yield;它成为生成器函数,并且迭代器将被隐式应用。

注意:每个生成器始终可以应用隐式迭代器进行迭代,此处隐式迭代器是关键, 因此生成器函数将是:

def f():
  yield 1 
  yield 2
  yield 3

iter_f = f() # which is iter(f) as iterator is already applied implicitly

因此,如果您观察到,一旦创建函数fa generator,它已经是iter(f)

现在,

l是列表,应用迭代器方法“ iter”后,它变为iter(l)

f已经是iter(f),在应用迭代器方法“ iter”之后,它变为iter(iter(f)),再次是iter(f)

您将int强制转换为已经为int的int(x)并保留为int(x)。

例如:

print(type(iter(iter(l))))

<class 'list_iterator'>

永远不要忘记这是Python,而不是C或C ++

因此,以上解释得出的结论是:

列出l〜= iter(l)

生成器函数f == iter(f)

I am writing specifically for Python newbies in a very simple way, though deep down Python does so many things.

Let’s start with the very basics:

Consider a list,

l = [1,2,3]

Let’s write an equivalent function:

def f():
    return [1,2,3]

The output of print(l) is [1, 2, 3], and the output of print(f()) is [1, 2, 3].

Let’s make list l iterable: in Python a list is always iterable, which means you can get an iterator from it whenever you want.

Let’s apply an iterator to the list:

iter_l = iter(l) # iterator applied explicitly

Let’s make a function iterable, i.e. write an equivalent generator function. In Python, as soon as a function body contains the keyword yield, it becomes a generator function, and calling it returns an object that is already an iterator.

Note: every generator object is iterable and is its own iterator, and that is the crux here. So the generator function will be:

def f():
  yield 1 
  yield 2
  yield 3

iter_f = f() # already an iterator: iter(iter_f) returns iter_f itself

So, as you may have observed, as soon as you make f a generator function, calling it gives you an object that is already an iterator.

Now,

l is the list; after applying the built-in function iter it becomes iter(l), a separate iterator object.

iter_f is already an iterator; after applying iter it becomes iter(iter_f), which is again iter_f.

It is a bit like casting to int a value x that is already an int: int(x) remains the same int.

For example, the output of:

print(type(iter(iter(l))))

is

<class 'list_iterator'>

Never forget this is Python and not C or C++.

Hence the conclusion from the above explanation is:

list l: iter(l) creates a new iterator over l

generator object f(): it is its own iterator, so calling iter on it returns that same object
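These identities can be checked directly. A short sketch reusing l and f from above (the names iter_l and gen are introduced here only for illustration):

```python
l = [1, 2, 3]

def f():
    yield 1
    yield 2
    yield 3

iter_l = iter(l)
gen = f()

print(iter_l is l)             # False: a list needs a separate iterator
print(iter(iter_l) is iter_l)  # True: an iterator returns itself
print(iter(gen) is gen)        # True: a generator object is already an iterator
print(type(iter(iter(l))))     # <class 'list_iterator'>
```

Note that iter(f), applied to the function itself rather than to the object it returns, raises TypeError, because a function object is not iterable.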


迭代器,可迭代和迭代到底是什么?

问题:迭代器,可迭代和迭代到底是什么?

Python中“可迭代”,“迭代器”和“迭代”的最基本定义是什么?

我已经阅读了多个定义,但是我无法确定确切的含义,因为它仍然不会陷入。

有人可以在外行方面为我提供3个定义的帮助吗?

What is the most basic definition of “iterable”, “iterator” and “iteration” in Python?

I have read multiple definitions but I am unable to identify the exact meaning as it still won’t sink in.

Can someone please help me with the 3 definitions in layman terms?


回答 0

迭代是一个总称,表示一件一件一件一件一件接一件的物品。每当您使用循环(显式或隐式)遍历一组项目时,即迭代。

在Python中,iterableiterator具有特定的含义。

一个迭代是具有对象__iter__返回一个方法迭代,或者其限定__getitem__,可以采取顺序索引从零启动方法(并发出IndexError时,索引不再有效)。因此,可迭代对象是可以从中获取迭代器的对象。

一个迭代器是具有一个对象next(Python的2)或__next__(Python 3的)方法。

每当在Python中使用for循环或map或列表理解等时,next都会自动调用该方法以从迭代器获取每个项,从而进行迭代过程。

一个开始学习的好地方是本教程迭代器部分和标准类型页面迭代器类型部分。了解基础知识之后,请尝试“功能编程HOWTO”的“ 迭代器”部分

Iteration is a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is iteration.

In Python, iterable and iterator have specific meanings.

An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.

An iterator is an object with a next (Python 2) or __next__ (Python 3) method.

Whenever you use a for loop, or map, or a list comprehension, etc. in Python, the next method is called automatically to get each item from the iterator, thus going through the process of iteration.

A good place to start learning would be the iterators section of the tutorial and the iterator types section of the standard types page. After you understand the basics, try the iterators section of the Functional Programming HOWTO.
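The __getitem__ fallback mentioned above is the less familiar case. A minimal sketch, assuming a hypothetical Squares class that defines no __iter__ at all:

```python
class Squares:
    """Iterable through __getitem__ only: no __iter__ is defined."""
    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            raise IndexError(index)   # tells the iterator the sequence is over
        return index * index

squares = Squares(4)
print(hasattr(squares, "__iter__"))  # False
print(list(squares))                 # [0, 1, 4, 9]

it = iter(squares)                   # Python builds an iterator from __getitem__
print(next(it), next(it))            # 0 1
```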


回答 1

这是我在教授Python类时使用的解释:

一个ITERABLE是:

  • 任何可以循环播放的内容(例如,您可以循环播放字符串或文件)或
  • 任何可能出现在for循环右侧的内容: for x in iterable: ...
  • 您可以呼叫的任何内容iter()都会传回ITERATOR: iter(obj)
  • 一个定义的对象,该对象__iter__返回一个新鲜的ITERATOR,或者它可能具有__getitem__适合于索引查找的方法。

ITERATOR是一个对象:

  • 状态会记住迭代过程中的位置,
  • 使用以下__next__方法:
    • 返回迭代中的下一个值
    • 更新状态以指向下一个值
    • 通过提高发出信号 StopIteration
  • 并且这是可自我迭代的(意味着它具有__iter__返回的方法self)。

笔记:

  • __next__Python 3中的方法是Python 2中的拼写next,并且
  • 内置函数next()在传递给它的对象上调用该方法。

例如:

>>> s = 'cat'      # s is an ITERABLE
                   # s is a str object that is immutable
                   # s has no state
                   # s has a __getitem__() method 

>>> t = iter(s)    # t is an ITERATOR
                   # t has state (it starts by pointing at the "c")
                   # t has a __next__() method and an __iter__() method

>>> next(t)        # the next() function returns the next value and advances the state
'c'
>>> next(t)        # the next() function returns the next value and advances
'a'
>>> next(t)        # the next() function returns the next value and advances
't'
>>> next(t)        # next() raises StopIteration to signal that iteration is complete
Traceback (most recent call last):
...
StopIteration

>>> iter(t) is t   # the iterator is self-iterable
True

Here’s the explanation I use in teaching Python classes:

An ITERABLE is:

  • anything that can be looped over (e.g. you can loop over a string or file) or
  • anything that can appear on the right-side of a for-loop: for x in iterable: ... or
  • anything you can call with iter() that will return an ITERATOR: iter(obj) or
  • an object that defines __iter__ that returns a fresh ITERATOR, or it may have a __getitem__ method suitable for indexed lookup.

An ITERATOR is an object:

  • with state that remembers where it is during iteration,
  • with a __next__ method that:
    • returns the next value in the iteration
    • updates the state to point at the next value
    • signals when it is done by raising StopIteration
  • and that is self-iterable (meaning that it has an __iter__ method that returns self).

Notes:

  • The __next__ method in Python 3 is spelt next in Python 2, and
  • The builtin function next() calls that method on the object passed to it.

For example:

>>> s = 'cat'      # s is an ITERABLE
                   # s is a str object that is immutable
                   # s has no state
                   # s has a __getitem__() method 

>>> t = iter(s)    # t is an ITERATOR
                   # t has state (it starts by pointing at the "c")
                   # t has a __next__() method and an __iter__() method

>>> next(t)        # the next() function returns the next value and advances the state
'c'
>>> next(t)        # the next() function returns the next value and advances
'a'
>>> next(t)        # the next() function returns the next value and advances
't'
>>> next(t)        # next() raises StopIteration to signal that iteration is complete
Traceback (most recent call last):
...
StopIteration

>>> iter(t) is t   # the iterator is self-iterable
True
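The same distinction can be checked with the abstract base classes in collections.abc. This is only a sketch: the Iterable check looks for an __iter__ method, so it does not recognise objects that are iterable purely through __getitem__.

```python
from collections.abc import Iterable, Iterator

s = 'cat'
t = iter(s)

print(isinstance(s, Iterable), isinstance(s, Iterator))  # True False
print(isinstance(t, Iterable), isinstance(t, Iterator))  # True True
print(hasattr(s, '__next__'), hasattr(t, '__next__'))    # False True
```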

回答 2

上面的答案很棒,但是正如我所见到的大多数一样,对于像我这样的人来说,不要强调这种区别

同样,人们倾向于通过在__foo__()前面放置诸如“ X是具有方法的对象”之类的定义来获得“ Python风格” 。这样的定义是正确的-它们基于鸭子式的哲学,但是当试图以简单的方式理解概念时,对方法的关注往往会介于两者之间。

因此,我添加了我的版本。


用自然语言

  • 迭代是在一行元素中一次获取一个元素的过程。

在Python中,

  • Iterable是一个很好的可迭代对象,简单地说,意味着可以在迭代中使用它,例如使用for循环。怎么样?通过使用迭代器。我会在下面解释。

  • …,而迭代器是一个对象,它定义了如何实际执行迭代-特别是下一个元素是什么。这就是为什么它必须有next()方法的原因 。

迭代器本身也是可迭代的,区别在于它们的__iter__()方法返回相同的object(self),而不管其先前的调用是否已消耗其项目next()


那么,Python解释器看到for x in obj:语句时会怎么想?

看,for循环。看起来像是一个迭代器的工作…让我们得到一个。…有obj一个人,让我们问他。

“先生obj,您有迭代器吗?” (…调用iter(obj),这些调用 obj.__iter__()愉快地发出了一个闪亮的新迭代器_i。)

好的,那很简单…让我们开始迭代。(x = _i.next()x = _i.next()…)

由于Mr. Mr obj成功地通过了某种测试(通过某种方法返回有效的迭代器),因此我们用形容词来奖励他:您现在可以称他为“ Iterable Mr. obj”。

但是,在简单的情况下,通常不会从分别拥有Iterator和Iterable中受益。因此,您定义一个对象,这也是它自己的迭代器。(Python并不真正在乎_i发出的obj不是那么闪亮,而仅仅是obj它本身。)

这就是为什么在我见过的大多数示例中(以及一遍又一遍使我困惑的原因)中,您可以看到:

class IterableExample(object):

    def __iter__(self):
        return self

    def next(self):
        pass

代替

class Iterator(object):
    def next(self):
        pass

class Iterable(object):
    def __iter__(self):
        return Iterator()

但是,在某些情况下,可以从使迭代器与可迭代的对象分离中受益,例如,当您希望有一行项目,但需要更多的“游标”时。例如,当您要使用“当前”和“即将到来”的元素时,可以为这两个元素使用单独的迭代器。或从庞大列表中提取多个线程:每个线程都可以具有自己的迭代器以遍历所有项目。见@雷蒙德@ glglgl的上述回答。

想象一下您可以做什么:

class SmartIterableExample(object):

    def create_iterator(self):
        # An amazingly powerful yet simple way to create arbitrary
        # iterator, utilizing object state (or not, if you are fan
        # of functional), magic and nuclear waste--no kittens hurt.
        pass    # don't forget to add the next() method

    def __iter__(self):
        return self.create_iterator()

笔记:

  • 我将再次重复:迭代器不可迭代。迭代器不能用作for循环中的“源” 。什么for环路主要需要的是__iter__() (即返回与事next())。

  • 当然,for这不是唯一的迭代循环,因此上述内容同样适用于其他一些构造(while…)。

  • 迭代器next()可以抛出StopIteration来停止迭代。但是,它不必永久地迭代或使用其他方式。

  • 在上面的“思想过程”中,_i并不真正存在。我叫这个名字。

  • Python 3.x有一个小的变化:next()现在必须调用方法(不是内置方法)__next__()。是的,一直以来都是这样。

  • 您也可以这样想:可迭代拥有数据,迭代器提取下一项

免责声明:我不是任何Python解释器的开发人员,所以我真的不知道解释器的想法。上面的想法只是从其他解释,实验和Python新手的实际经验中展示了我如何理解该主题。

The above answers are great, but like most of what I’ve seen, they don’t stress the distinction enough for people like me.

Also, people tend to get “too Pythonic” by leading with definitions like “X is an object that has a __foo__() method”. Such definitions are correct (they are based on the duck-typing philosophy), but the focus on methods tends to get in the way when trying to understand the concept in its simplicity.

So I add my version.


In natural language,

  • iteration is the process of taking one element at a time in a row of elements.

In Python,

  • an iterable is an object that is, well, iterable, which, simply put, means that it can be used in iteration, e.g. with a for loop. How? By using an iterator. I’ll explain below.

  • … while an iterator is an object that defines how to actually do the iteration, specifically what the next element is. That’s why it must have a next() method.

Iterators are themselves also iterable, with the distinction that their __iter__() method returns the same object (self), regardless of whether or not its items have been consumed by previous calls to next().


So what does Python interpreter think when it sees for x in obj: statement?

Look, a for loop. Looks like a job for an iterator… Let’s get one. … There’s this obj guy, so let’s ask him.

“Mr. obj, do you have your iterator?” (… calls iter(obj), which calls obj.__iter__(), which happily hands out a shiny new iterator _i.)

OK, that was easy… Let’s start iterating then. (x = _i.next() … x = _i.next() …)

Since Mr. obj succeeded in this test (by having a certain method returning a valid iterator), we reward him with an adjective: you can now call him “iterable Mr. obj”.

However, in simple cases, you don’t normally benefit from having iterator and iterable separately. So you define only one object, which is also its own iterator. (Python does not really care that _i handed out by obj wasn’t all that shiny, but just the obj itself.)

This is why in most examples I’ve seen (and what had been confusing me over and over), you can see:

class IterableExample(object):

    def __iter__(self):
        return self

    def next(self):
        pass

instead of

class Iterator(object):
    def next(self):
        pass

class Iterable(object):
    def __iter__(self):
        return Iterator()

There are cases, though, when you can benefit from having iterator separated from the iterable, such as when you want to have one row of items, but more “cursors”. For example when you want to work with “current” and “forthcoming” elements, you can have separate iterators for both. Or multiple threads pulling from a huge list: each can have its own iterator to traverse over all items. See @Raymond’s and @glglgl’s answers above.
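A minimal sketch of the “one row of items, several cursors” idea, using hypothetical Row and RowCursor classes and Python 3 method names: the iterable owns the data, and each call to iter() hands out an independent iterator.

```python
class Row:
    """Iterable: owns the data and hands out a fresh cursor each time."""
    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        return RowCursor(self.items)

class RowCursor:
    """Iterator: remembers one position in the shared data."""
    def __init__(self, items):
        self.items = items
        self.position = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= len(self.items):
            raise StopIteration
        item = self.items[self.position]
        self.position += 1
        return item

row = Row("abcd")
current = iter(row)
upcoming = iter(row)
next(upcoming)                         # move the second cursor one step ahead
print(list(zip(current, upcoming)))    # [('a', 'b'), ('b', 'c'), ('c', 'd')]
```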

Imagine what you could do:

class SmartIterableExample(object):

    def create_iterator(self):
        # An amazingly powerful yet simple way to create arbitrary
        # iterator, utilizing object state (or not, if you are fan
        # of functional), magic and nuclear waste--no kittens hurt.
        pass    # don't forget to add the next() method

    def __iter__(self):
        return self.create_iterator()

Notes:

  • I’ll repeat again: a bare iterator (an object that only has next()) is not iterable by itself, so it cannot be used as a “source” in a for loop. What the for loop primarily needs is __iter__() (which returns something with next()).

  • Of course, for is not the only construct that iterates, so the above applies to some other constructs as well (comprehensions, unpacking, …).

  • An iterator’s next() can raise StopIteration to stop iteration. It does not have to, though: it can iterate forever or use other means.

  • In the above “thought process”, _i does not really exist. I’ve made up that name.

  • There’s a small change in Python 3.x: next() method (not the built-in) now must be called __next__(). Yes, it should have been like that all along.

  • You can also think of it like this: the iterable has the data, and the iterator pulls the next item.

Disclaimer: I’m not a developer of any Python interpreter, so I don’t really know what the interpreter “thinks”. The musings above are solely demonstration of how I understand the topic from other explanations, experiments and real-life experience of a Python newbie.


Answer 3

An iterable is an object which has an __iter__() method. It can possibly be iterated over several times, as is the case for lists and tuples.

An iterator is the object which iterates. It is returned by an __iter__() method, returns itself via its own __iter__() method and has a next() method (__next__() in 3.x).

Iteration is the process of calling this next() (or __next__(), respectively) until it raises StopIteration.

Example:

>>> a = [1, 2, 3] # iterable
>>> b1 = iter(a) # iterator 1
>>> b2 = iter(a) # iterator 2, independent of b1
>>> next(b1)
1
>>> next(b1)
2
>>> next(b2) # start over, as it is the first call to b2
1
>>> next(b1)
3
>>> next(b1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> b1 = iter(a) # new one, start over
>>> next(b1)
1
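
The definition of iteration above can be written out by hand. The following is only a sketch of roughly what a for loop does, using Python 3 spelling; manual_for is a made-up helper name, not a standard function.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)       # ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)   # calls iterator.__next__()
        except StopIteration:
            break                   # the iterator is exhausted
        body(item)


seen = []
manual_for([1, 2, 3], seen.append)
```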

Answer 4

Here’s my cheat sheet:

 sequence
  +
  |
  v
   def __getitem__(self, index: int):
  +    ...
  |    raise IndexError
  |
  |
  |              def __iter__(self):
  |             +     ...
  |             |     return <iterator>
  |             |
  |             |
  +--> or <-----+        def __next__(self):
       +        |       +    ...
       |        |       |    raise StopIteration
       v        |       |
    iterable    |       |
           +    |       |
           |    |       v
           |    +----> and +-------> iterator
           |                               ^
           v                               |
   iter(<iterable>) +----------------------+
                                           |
   def generator():                        |
  +    yield 1                             |
  |                 generator_expression +-+
  |                                        |
  +-> generator() +-> generator_iterator +-+

Quiz: Do you see how…

  1. every iterator is an iterable?
  2. a container object’s __iter__() method can be implemented as a generator?
  3. an iterable that has a __next__ method is not necessarily an iterator?

Answers:

  1. Every iterator must have an __iter__ method. Having __iter__ is enough to be an iterable. Therefore every iterator is an iterable.
  2. When __iter__ is called it should return an iterator (return <iterator> in the diagram above). Calling a generator returns a generator iterator which is a type of iterator.

    class Iterable1:
        def __iter__(self):
            # a method (which is a function defined inside a class body)
            # calling iter() converts iterable (tuple) to iterator
            return iter((1,2,3))
    
    class Iterable2:
        def __iter__(self):
            # a generator
            for i in (1, 2, 3):
                yield i
    
    class Iterable3:
        def __iter__(self):
            # with PEP 380 syntax
            yield from (1, 2, 3)
    
    # passes
    assert list(Iterable1()) == list(Iterable2()) == list(Iterable3()) == [1, 2, 3]
    
  3. Here is an example:

    class MyIterable:
    
        def __init__(self):
            self.n = 0
    
        def __getitem__(self, index: int):
            return (1, 2, 3)[index]
    
        def __next__(self):
            n = self.n = self.n + 1
            if n > 3:
                raise StopIteration
            return n
    
    # if you can iter it without raising a TypeError, then it's an iterable.
    iter(MyIterable())
    
    # but obviously `MyIterable()` is not an iterator since it does not have
    # an `__iter__` method.
    from collections.abc import Iterator
    assert isinstance(MyIterable(), Iterator)  # AssertionError
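
A minimal sketch of why iter() accepts an object like the one above: when there is no __iter__ method, iter() falls back to calling __getitem__ with 0, 1, 2, … until IndexError is raised. The class name Squares is made up for illustration.

```python
from collections.abc import Iterable


class Squares:
    """Hypothetical class with __getitem__ only (no __iter__, no __next__)."""

    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)
        return index * index


# iter() falls back to __getitem__(0), __getitem__(1), ... until IndexError.
values = list(Squares())

# The ABC check only looks for __iter__, so it does not detect this case.
abc_says_iterable = isinstance(Squares(), Iterable)
```

This is why the comment in the example recommends calling iter() as the test for iterability rather than relying on an isinstance check.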
    

Answer 5

I don’t know if it helps anybody but I always like to visualize concepts in my head to better understand them. So as I have a little son I visualize iterable/iterator concept with bricks and white paper.

Suppose we are in the dark room and on the floor we have bricks for my son. Bricks of different size, color, does not matter now. Suppose we have 5 bricks like those. Those 5 bricks can be described as an object – let’s say bricks kit. We can do many things with this bricks kit – can take one and then take second and then third, can change places of bricks, put first brick above the second. We can do many sorts of things with those. Therefore this bricks kit is an iterable object or sequence as we can go through each brick and do something with it. We can only do it like my little son – we can play with one brick at a time. So again I imagine myself this bricks kit to be an iterable.

Now remember that we are in the dark room. Or almost dark. The thing is that we don’t clearly see those bricks, what color they are, what shape etc. So even if we want to do something with them – aka iterate through them – we don’t really know what and how because it is too dark.

What we can do is put a piece of white fluorescent paper next to the first brick, as an element of the bricks kit, so that we can see where the first brick-element is. Each time we take a brick from the kit, we move the white piece of paper to the next brick so that we can see it in the dark room. This white piece of paper is nothing more than an iterator. It is an object as well, but an object that lets us work and play with the elements of our iterable object, the bricks kit.

That, by the way, explains my early mistake when I tried the following in IDLE and got a TypeError:

 >>> X = [1,2,3,4,5]
 >>> next(X)
 Traceback (most recent call last):
    File "<pyshell#19>", line 1, in <module>
      next(X)
 TypeError: 'list' object is not an iterator

List X here was our bricks kit but NOT a white piece of paper. I needed to get an iterator first:

>>> X = [1,2,3,4,5]
>>> bricks_kit = [1,2,3,4,5]
>>> white_piece_of_paper = iter(bricks_kit)
>>> next(white_piece_of_paper)
1
>>> next(white_piece_of_paper)
2
>>>

Don’t know if it helps, but it helped me. If someone could confirm/correct visualization of the concept, I would be grateful. It would help me to learn more.


Answer 6

Iterable: anything that can be iterated over, such as sequences like lists and strings. It has either a __getitem__ method or an __iter__ method. If we call the iter() function on such an object, we get an iterator.

Iterator: the object we get back from the iter() function. We call its __next__() method (in Python 3) or simply next() (in Python 2) to get the elements one by one. This class, or an instance of this class, is called an iterator.

From the docs:

The use of iterators pervades and unifies Python. Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method __next__() which accesses elements in the container one at a time. When there are no more elements, __next__() raises a StopIteration exception which tells the for loop to terminate. You can call the __next__() method using the next() built-in function; this example shows how it all works:

>>> s = 'abc'
>>> it = iter(s)
>>> it
<iterator object at 0x00A1DB50>
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    next(it)
StopIteration

Example of a class:

class Reverse:
    """Iterator for looping over a sequence backwards."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]


>>> rev = Reverse('spam')
>>> iter(rev)
<__main__.Reverse object at 0x00A1DB50>
>>> for char in rev:
...     print(char)
...
m
a
p
s
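
One consequence worth noting: because Reverse returns itself from __iter__, each instance can be looped over only once. The sketch below repeats the class and contrasts it with a hypothetical reusable variant, ReversedView, whose __iter__ is a generator function; that second class is an illustration and not part of the quoted documentation.

```python
class Reverse:
    """Iterator from the example above: it is its own iterator."""

    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]


class ReversedView:
    """Hypothetical reusable iterable: each __iter__ call starts afresh."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        for index in range(len(self.data) - 1, -1, -1):
            yield self.data[index]


rev = Reverse('spam')
once = ''.join(rev)
again = ''.join(rev)        # the iterator is already exhausted

view = ReversedView('spam')
view_once = ''.join(view)
view_again = ''.join(view)  # a new generator is created each time
```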

Answer 7

I don’t think that you can get it much simpler than the documentation, however I’ll try:

  • An iterable is something that can be iterated over. In practice it usually means a sequence, that is, something that has a beginning and an end and some way to go through all the items in it.
  • You can think of an iterator as a helper pseudo-method (or pseudo-attribute) that gives (or holds) the next (or first) item in the iterable. (In practice it is just an object that defines the method next(), or __next__() in Python 3.)

  • Iteration is probably best explained by the Merriam-Webster definition of the word :

b : the repetition of a sequence of computer instructions a specified number of times or until a condition is met — compare recursion


Answer 8

iterable = [1, 2] 

iterator = iter(iterable)

print(iterator.__next__())   

print(iterator.__next__())   

So:

  1. An iterable is an object that can be looped over, e.g. a list, string or tuple.

  2. Using the iter function on our iterable object returns an iterator object.

  3. This iterator object has a method named __next__ (in Python 3, or just next in Python 2) by which you can access each element of the iterable.

So the output of the above code will be:

1

2
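
In everyday code the built-in next() is usually preferred over calling __next__ directly. A small sketch, assuming Python 3: the optional second argument is a default that is returned instead of raising StopIteration once the iterator is exhausted.

```python
iterable = [1, 2]
iterator = iter(iterable)

first = next(iterator)        # same effect as iterator.__next__()
second = next(iterator)
third = next(iterator, None)  # exhausted: the default is returned instead
```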


Answer 9

Iterables have a __iter__ method that instantiates a new iterator every time.

Iterators implement a __next__ method that returns individual items, and a __iter__ method that returns self .

Therefore, iterators are also iterable, but iterables are not iterators.

Luciano Ramalho, Fluent Python.
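
These three statements can be checked directly. The sketch below uses the abstract base classes from collections.abc on a plain list; the variable names are only for illustration.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
it = iter(numbers)

list_is_iterable = isinstance(numbers, Iterable)  # a list has __iter__
list_is_iterator = isinstance(numbers, Iterator)  # but it has no __next__
iterator_is_iterable = isinstance(it, Iterable)   # iterators have __iter__ too
iter_returns_self = iter(it) is it                # ... and it returns self
fresh_each_time = iter(numbers) is not iter(numbers)
```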


Answer 10

Before dealing with iterables and iterators, note that the concept underlying both is the sequence.

Sequence: a sequence is a collection of data.

Iterable: an iterable is an object, such as a sequence, that supports the __iter__ method.

iter function: iter takes an iterable as input and creates an object known as an iterator.

Iterator: an iterator is the object on which the next method is called to traverse the sequence. Each call to next returns the element it has just moved past.

Example:

x=[1,2,3,4]

x is a sequence, which consists of a collection of data.

y=iter(x)

Calling iter(x) returns an iterator only when the x object can be iterated over (it has an __iter__ method); otherwise it raises a TypeError. Here y is not a copy of the list: it is a list iterator that refers to x and remembers how far it has got.

As y is an iterator, it supports the next() method (spelled __next__() in Python 3, where it is usually invoked through the built-in next(y)).

Each call to the next method returns the next element of the list, one by one.

After the last element of the sequence has been returned, calling the next method again raises a StopIteration exception.

example:

>>> y.next()
1
>>> y.next()
2
>>> y.next()
3
>>> y.next()
4
>>> y.next()
StopIteration
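
The session above uses the Python 2 spelling y.next(). A sketch of the same steps in Python 3, where the built-in next(y) is used instead:

```python
x = [1, 2, 3, 4]
y = iter(x)

y_is_a_list = isinstance(y, list)        # False: y is an iterator over x
collected = [next(y) for _ in range(4)]  # one element per call

try:
    next(y)                              # nothing left to return
    exhausted = False
except StopIteration:
    exhausted = True
```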

Answer 11

In Python everything is an object. When an object is said to be iterable, it means that you can step through (i.e. iterate over) the object as a collection.

Arrays (lists, in Python) are iterable, for example. You can step through them with a for loop, going from index 0 to index n, n being the length of the array object minus 1.

Dictionaries (pairs of key/value, also called associative arrays) are also iterable. You can step through their keys.

Obviously, objects which are not collections are not iterable. A bool object, for example, has only one value, True or False. It is not iterable (it wouldn’t make sense for it to be an iterable object).

Read more. http://www.lepus.org.uk/ref/companion/Iterator.xml
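
A short sketch of the two cases described above, with made-up data: iterating a dictionary walks its keys, and asking for an iterator over a bool raises TypeError.

```python
ages = {'ann': 31, 'bob': 27}   # made-up example data
keys = [name for name in ages]  # iterating a dict steps through its keys

try:
    iter(True)                  # a bool offers nothing to step through
    bool_is_iterable = True
except TypeError:
    bool_is_iterable = False
```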


How to iterate through two lists in parallel?

Question: How to iterate through two lists in parallel?

I have two iterables in Python, and I want to go over them in pairs:

foo = (1, 2, 3)
bar = (4, 5, 6)

for (f, b) in some_iterator(foo, bar):
    print "f: ", f, "; b: ", b

It should result in:

f: 1; b: 4
f: 2; b: 5
f: 3; b: 6

One way to do it is to iterate over the indices:

for i in xrange(len(foo)):
    print "f: ", foo[i], "; b: ", bar[i]

But that seems somewhat unpythonic to me. Is there a better way to do it?


Answer 0

Python 3

for f, b in zip(foo, bar):
    print(f, b)

zip stops when the shorter of foo or bar stops.

In Python 3, zip returns an iterator of tuples, like itertools.izip in Python 2. To get a list of tuples, use list(zip(foo, bar)). And to zip until both iterators are exhausted, you would use itertools.zip_longest.

Python 2

In Python 2, zip returns a list of tuples. This is fine when foo and bar are not massive. If they are both massive then forming zip(foo,bar) creates an unnecessarily massive temporary list, which can be avoided by using itertools.izip or itertools.izip_longest instead; both return an iterator rather than a list.

import itertools
for f,b in itertools.izip(foo,bar):
    print(f,b)
for f,b in itertools.izip_longest(foo,bar):
    print(f,b)

izip stops when either foo or bar is exhausted. izip_longest stops when both foo and bar are exhausted. When the shorter iterator(s) are exhausted, izip_longest yields a tuple with None in the position corresponding to that iterator. You can also set a different fillvalue besides None if you wish. See here for the full story.
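
The same behaviour in Python 3, where the function is called itertools.zip_longest. This is a small sketch with made-up inputs of unequal length:

```python
from itertools import zip_longest

foo = (1, 2, 3)
bar = (4, 5)

shortest = list(zip(foo, bar))                     # stops at the shorter input
padded = list(zip_longest(foo, bar))               # pads with None
custom = list(zip_longest(foo, bar, fillvalue=0))  # pads with a chosen value
```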


Note also that zip and its zip-like brethren can accept an arbitrary number of iterables as arguments. For example,

for num, cheese, color in zip([1,2,3], ['manchego', 'stilton', 'brie'], 
                              ['red', 'blue', 'green']):
    print('{} {} {}'.format(num, color, cheese))

prints

1 red manchego
2 blue stilton
3 green brie

Answer 1

You want the zip function.

for (f,b) in zip(foo, bar):
    print "f: ", f ,"; b: ", b

Answer 2

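You should use the zip function. Here is an example of what your own zip function could look like:

def custom_zip(seq1, seq2):
    it1 = iter(seq1)
    it2 = iter(seq2)
    while True:
        try:
            pair = next(it1), next(it2)
        except StopIteration:
            # Python 3.7+ (PEP 479): end the generator explicitly.
            return
        yield pair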

Answer 3

You can bundle the nth elements into a tuple or list using comprehension, then pass them out with a generator function.

def iterate_multi(*lists):
    for i in range(min(map(len,lists))):
        yield tuple(l[i] for l in lists)

for l1, l2, l3 in iterate_multi([1,2,3],[4,5,6],[7,8,9]):
    print(str(l1)+","+str(l2)+","+str(l3))

Answer 4

In case someone is looking for something like this, I found it very simple and easy:

list_1 = ["Hello", "World"]
list_2 = [1, 2, 3]

for a,b in [(list_1, list_2)]:
    for element_a in a:
        ...
    for element_b in b:
        ...

>> Hello
World
1
2
3

The lists will be iterated with their full content, unlike zip() which only iterates up to the minimum content length.


Answer 5

Here’s how to do it with a list comprehension:

a = (1, 2, 3)
b = (4, 5, 6)
[print('f:', i, '; b', j) for i, j in zip(a, b)]

prints:

f: 1 ; b 4
f: 2 ; b 5
f: 3 ; b 6
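
Note that the comprehension above is used only for its side effect and builds a throwaway list of None values. A plain loop is an equivalent sketch that prints the same lines; collecting them in the list lines is an addition made here purely so the result can be inspected.

```python
a = (1, 2, 3)
b = (4, 5, 6)

lines = []
for i, j in zip(a, b):
    line = 'f: {} ; b {}'.format(i, j)
    lines.append(line)  # kept only so the result can be inspected
    print(line)
```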

What does the “yield” keyword do?

Question: What does the “yield” keyword do?

What is the use of the yield keyword in Python, and what does it do?

For example, I’m trying to understand this code1:

def _get_child_candidates(self, distance, min_dist, max_dist):
    if self._leftchild and distance - max_dist < self._median:
        yield self._leftchild
    if self._rightchild and distance + max_dist >= self._median:
        yield self._rightchild  

And this is the caller:

result, candidates = [], [self]
while candidates:
    node = candidates.pop()
    distance = node._get_dist(obj)
    if distance <= max_dist and distance >= min_dist:
        result.extend(node._values)
    candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))
return result

What happens when the method _get_child_candidates is called? Is a list returned? A single element? Is it called again? When will subsequent calls stop?


1. This piece of code was written by Jochen Schulz (jrschulz), who made a great Python library for metric spaces. This is the link to the complete source: Module mspace.


Answer 0

To understand what yield does, you must understand what generators are. And before you can understand generators, you must understand iterables.

Iterables

When you create a list, you can read its items one by one. Reading its items one by one is called iteration:

>>> mylist = [1, 2, 3]
>>> for i in mylist:
...    print(i)
1
2
3

mylist is an iterable. When you use a list comprehension, you create a list, and so an iterable:

>>> mylist = [x*x for x in range(3)]
>>> for i in mylist:
...    print(i)
0
1
4

Everything you can use “for... in...” on is an iterable; lists, strings, files…

These iterables are handy because you can read them as much as you wish, but you store all the values in memory and this is not always what you want when you have a lot of values.

Generators

Generators are iterators, a kind of iterable you can only iterate over once. Generators do not store all the values in memory, they generate the values on the fly:

>>> mygenerator = (x*x for x in range(3))
>>> for i in mygenerator:
...    print(i)
0
1
4

It is just the same except you used () instead of []. BUT, you cannot perform for i in mygenerator a second time since generators can only be used once: they calculate 0, then forget about it and calculate 1, and end calculating 4, one by one.
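
A quick sketch of the single-use behaviour: a second pass over the same generator produces nothing, while a list can be read again and again.

```python
mygenerator = (x * x for x in range(3))
first_pass = list(mygenerator)
second_pass = list(mygenerator)  # the generator is already exhausted

mylist = [x * x for x in range(3)]
list_first = list(mylist)
list_second = list(mylist)       # a list can be read as often as you wish
```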

Yield

yield is a keyword that is used like return, except the function will return a generator.

>>> def createGenerator():
...    mylist = range(3)
...    for i in mylist:
...        yield i*i
...
>>> mygenerator = createGenerator() # create a generator
>>> print(mygenerator) # mygenerator is an object!
<generator object createGenerator at 0xb7555c34>
>>> for i in mygenerator:
...     print(i)
0
1
4

Here it’s a useless example, but it’s handy when you know your function will return a huge set of values that you will only need to read once.

To master yield, you must understand that when you call the function, the code you have written in the function body does not run. The function only returns the generator object; this is a bit tricky :-)

Then, your code will continue from where it left off each time for uses the generator.

Now the hard part:

The first time the for calls the generator object created from your function, it will run the code in your function from the beginning until it hits yield, then it’ll return the first value of the loop. Then, each subsequent call will run another iteration of the loop you have written in the function and return the next value. This will continue until the generator is considered empty, which happens when the function runs without hitting yield. That can be because the loop has come to an end, or because you no longer satisfy an "if/else".
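If that is hard to picture, the following sketch may help. It assumes Python 3 and uses a hypothetical log list (not part of the original code) to record exactly when the function body runs:

```python
log = []  # hypothetical helper list used to record when the body runs

def create_generator():
    log.append("body started")
    for i in range(3):
        log.append(f"about to yield {i * i}")
        yield i * i
    log.append("body finished")

gen = create_generator()
log_after_call = list(log)   # calling the function ran none of the body

first = next(gen)            # runs from the top up to the first yield
log_after_first = list(log)

rest = list(gen)             # resumes after the yield until the body ends
```

Here log_after_call is empty, which shows that calling the function only builds the generator object; the body starts on the first next().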


Your code explained

Generator:

# Here you create the method of the node object that will return the generator
def _get_child_candidates(self, distance, min_dist, max_dist):

    # Here is the code that will be called each time you use the generator object:

    # If there is still a child of the node object on its left
    # AND if the distance is ok, return the next child
    if self._leftchild and distance - max_dist < self._median:
        yield self._leftchild

    # If there is still a child of the node object on its right
    # AND if the distance is ok, return the next child
    if self._rightchild and distance + max_dist >= self._median:
        yield self._rightchild

    # If the function arrives here, the generator will be considered empty
    # there is no more than two values: the left and the right children

Caller:

# Create an empty list and a list with the current object reference
result, candidates = list(), [self]

# Loop on candidates (they contain only one element at the beginning)
while candidates:

    # Get the last candidate and remove it from the list
    node = candidates.pop()

    # Get the distance between obj and the candidate
    distance = node._get_dist(obj)

    # If distance is ok, then you can fill the result
    if distance <= max_dist and distance >= min_dist:
        result.extend(node._values)

    # Add the children of the candidate in the candidate's list
    # so the loop will keep running until it will have looked
    # at all the children of the children of the children, etc. of the candidate
    candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))

return result

This code contains several smart parts:

  • The loop iterates on a list, but the list expands while the loop is being iterated :-) It’s a concise way to go through all this nested data, even if it’s a bit dangerous since you can end up with an infinite loop. In this case, candidates.extend(node._get_child_candidates(distance, min_dist, max_dist)) exhausts all the values of the generator, but while keeps creating new generator objects, which produce different values from the previous ones because they are not applied to the same node.

  • The extend() method is a list object method that expects an iterable and adds its values to the list.

Usually we pass a list to it:

>>> a = [1, 2]
>>> b = [3, 4]
>>> a.extend(b)
>>> print(a)
[1, 2, 3, 4]

But in your code, it gets a generator, which is good because:

  1. You don’t need to read the values twice.
  2. You may have a lot of children and you don’t want them all stored in memory.

And it works because Python does not care if the argument of a method is a list or not. Python expects iterables so it will work with strings, lists, tuples, and generators! This is called duck typing and is one of the reasons why Python is so cool. But this is another story, for another question…
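As a small illustration of that point (Python 3 assumed; children() is a hypothetical stand-in for a method such as _get_child_candidates), extend() accepts a generator, a tuple, or a string in exactly the same way:

```python
def children():
    # Hypothetical stand-in for a generator method like _get_child_candidates
    yield "left"
    yield "right"

candidates = ["root"]
candidates.extend(children())   # a generator
candidates.extend(("a", "b"))   # a tuple
candidates.extend("xy")         # a string: one item per character

print(candidates)
```

extend() never checks the type of its argument; it just asks for an iterator and pulls items until there are none left.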

You can stop here, or read a little bit to see an advanced use of a generator:

Controlling a generator exhaustion

>>> class Bank(): # Let's create a bank, building ATMs
...    crisis = False
...    def create_atm(self):
...        while not self.crisis:
...            yield "$100"
>>> hsbc = Bank() # When everything's ok the ATM gives you as much as you want
>>> corner_street_atm = hsbc.create_atm()
>>> print(corner_street_atm.next())
$100
>>> print(corner_street_atm.next())
$100
>>> print([corner_street_atm.next() for cash in range(5)])
['$100', '$100', '$100', '$100', '$100']
>>> hsbc.crisis = True # Crisis is coming, no more money!
>>> print(corner_street_atm.next())
<type 'exceptions.StopIteration'>
>>> wall_street_atm = hsbc.create_atm() # It's even true for new ATMs
>>> print(wall_street_atm.next())
<type 'exceptions.StopIteration'>
>>> hsbc.crisis = False # The trouble is, even post-crisis the ATM remains empty
>>> print(corner_street_atm.next())
<type 'exceptions.StopIteration'>
>>> brand_new_atm = hsbc.create_atm() # Build a new one to get back in business
>>> for cash in brand_new_atm:
...    print(cash)
$100
$100
$100
$100
$100
$100
$100
$100
$100
...

Note: For Python 3, use print(corner_street_atm.__next__()) or print(next(corner_street_atm))

It can be useful for various things like controlling access to a resource.
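For Python 3 readers, here is a sketch of the same ATM session written with the built-in next(). The variable names are illustrative, and the two-argument form next(iterator, default) is used to avoid the exception once the generator is exhausted:

```python
class Bank:
    crisis = False

    def create_atm(self):
        while not self.crisis:
            yield "$100"

bank = Bank()
atm = bank.create_atm()
withdrawals = [next(atm) for _ in range(3)]

bank.crisis = True
try:
    next(atm)               # the while condition is now false
    exhausted = False
except StopIteration:
    exhausted = True

bank.crisis = False
# A finished generator stays finished, even after the crisis is over.
still_empty = next(atm, None) is None
# A brand new generator works again.
new_atm_works = next(bank.create_atm()) == "$100"
```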

Itertools, your best friend

The itertools module contains special functions to manipulate iterables. Ever wish to duplicate a generator? Chain two generators? Group values in a nested list with a one-liner? Map / Zip without creating another list?

Then just import itertools.

An example? Let’s see the possible orders of arrival for a four-horse race:

>>> import itertools
>>> horses = [1, 2, 3, 4]
>>> races = itertools.permutations(horses)
>>> print(races)
<itertools.permutations object at 0xb754f1dc>
>>> print(list(itertools.permutations(horses)))
[(1, 2, 3, 4),
 (1, 2, 4, 3),
 (1, 3, 2, 4),
 (1, 3, 4, 2),
 (1, 4, 2, 3),
 (1, 4, 3, 2),
 (2, 1, 3, 4),
 (2, 1, 4, 3),
 (2, 3, 1, 4),
 (2, 3, 4, 1),
 (2, 4, 1, 3),
 (2, 4, 3, 1),
 (3, 1, 2, 4),
 (3, 1, 4, 2),
 (3, 2, 1, 4),
 (3, 2, 4, 1),
 (3, 4, 1, 2),
 (3, 4, 2, 1),
 (4, 1, 2, 3),
 (4, 1, 3, 2),
 (4, 2, 1, 3),
 (4, 2, 3, 1),
 (4, 3, 1, 2),
 (4, 3, 2, 1)]
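A couple of the other uses mentioned above can be sketched the same way. This assumes Python 3; evens() and letters() are made-up generators for the demonstration:

```python
import itertools

def evens():
    n = 0
    while True:          # an infinite generator
        yield n
        n += 2

def letters():
    yield from "ab"

# Take only the first 4 values of an infinite generator, lazily.
first_evens = list(itertools.islice(evens(), 4))

# Chain two iterables one after the other without building a list first.
chained = list(itertools.chain(letters(), range(2)))

# Possible podiums (first three places) for the four-horse race.
podiums = list(itertools.permutations([1, 2, 3, 4], 3))
```

None of these calls copies the underlying data; each one returns an iterator that produces values on demand.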

Understanding the inner mechanisms of iteration

Iteration is a process involving iterables (implementing the __iter__() method) and iterators (implementing the __next__() method). Iterables are any objects you can get an iterator from. Iterators are objects that let you iterate on iterables.

There is more about it in this article about how for loops work.
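To make the two roles concrete, here is a minimal sketch with hand-written classes (Python 3 assumed; Countdown and CountdownIterator are hypothetical names, not from any library):

```python
class Countdown:
    """Iterable: hands out a fresh iterator each time it is asked."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)

class CountdownIterator:
    """Iterator: remembers the position and produces the values."""
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

countdown = Countdown(3)
once = list(countdown)    # [3, 2, 1]
twice = list(countdown)   # [3, 2, 1] again: a new iterator was created
```

Because the iterable and the iterator are separate objects, the iterable can be looped over as many times as you like.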


Answer 1

Shortcut to understanding yield

When you see a function with yield statements, apply this easy trick to understand what will happen:

  1. Insert a line result = [] at the start of the function.
  2. Replace each yield expr with result.append(expr).
  3. Insert a line return result at the bottom of the function.
  4. Yay – no more yield statements! Read the code and figure out what it does.
  5. Compare the function to the original definition.

This trick may give you an idea of the logic behind the function, but what actually happens with yield is significantly different than what happens in the list based approach. In many cases, the yield approach will be a lot more memory efficient and faster too. In other cases, this trick will get you stuck in an infinite loop, even though the original function works just fine. Read on to learn more…
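Here is a rough sketch of the trick and of the case where it breaks down. It assumes Python 3, and all function names are invented for the example:

```python
from itertools import islice

def squares_gen(limit):
    for i in range(limit):
        yield i * i

def squares_list(limit):
    result = []                 # step 1: start with an empty list
    for i in range(limit):
        result.append(i * i)    # step 2: yield expr -> result.append(expr)
    return result               # step 3: return the list at the end

def naturals():
    n = 0
    while True:    # the list rewrite of this function would never return
        yield n
        n += 1

same = list(squares_gen(4)) == squares_list(4)
first_three = list(islice(naturals(), 3))
```

The generator version of naturals() is perfectly usable because the caller decides how many values to take; the list version would loop forever before returning anything.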

Don’t confuse your Iterables, Iterators, and Generators

First, the iterator protocol – when you write

for x in mylist:
    ...loop body...

Python performs the following two steps:

  1. Gets an iterator for mylist:

    Call iter(mylist) -> this returns an object with a next() method (or __next__() in Python 3).

    [This is the step most people forget to tell you about]

  2. Uses the iterator to loop over items:

    Keep calling the next() method on the iterator returned from step 1. The return value from next() is assigned to x and the loop body is executed. If an exception StopIteration is raised from within next(), it means there are no more values in the iterator and the loop is exited.

The truth is Python performs the above two steps anytime it wants to loop over the contents of an object – so it could be a for loop, but it could also be code like otherlist.extend(mylist) (where otherlist is a Python list).
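Written out by hand in Python 3 (where the method is spelled __next__() and is normally reached through the built-in next()), the two steps look roughly like this sketch:

```python
mylist = ["a", "b", "c"]
seen = []

iterator = iter(mylist)        # step 1: get an iterator
while True:
    try:
        x = next(iterator)     # step 2: ask for the next item
    except StopIteration:
        break                  # no more values: leave the loop
    seen.append(x)             # the loop body

# extend() goes through the same two steps internally.
otherlist = [0]
otherlist.extend(mylist)
```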

Here mylist is an iterable because it implements the iterator protocol. In a user-defined class, you can implement the __iter__() method to make instances of your class iterable. This method should return an iterator. An iterator is an object with a next() method. It is possible to implement both __iter__() and next() on the same class, and have __iter__() return self. This will work for simple cases, but not when you want two iterators looping over the same object at the same time.
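The limitation in the last sentence is easy to run into with nested loops. The following sketch (Python 3 assumed, SelfIterating is a hypothetical class) compares such an object with a plain string:

```python
class SelfIterating:
    """Iterable and iterator in one class: __iter__ returns self."""
    def __init__(self, items):
        self.items = items
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.items):
            raise StopIteration
        value = self.items[self.index]
        self.index += 1
        return value

shared = SelfIterating("ab")
# Both loops advance the same index, so most pairs are lost.
pairs_shared = [(x, y) for x in shared for y in shared]
# A str hands out an independent iterator to each loop.
pairs_str = [(x, y) for x in "ab" for y in "ab"]
```

With the shared object only a single pair comes out, while the string gives all four combinations.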

So that’s the iterator protocol; many objects implement it:

  1. Built-in lists, dictionaries, tuples, sets, files.
  2. User-defined classes that implement __iter__().
  3. Generators.

Note that a for loop doesn’t know what kind of object it’s dealing with – it just follows the iterator protocol, and is happy to get item after item as it calls next(). Built-in lists return their items one by one, dictionaries return the keys one by one, files return the lines one by one, etc. And generators return… well that’s where yield comes in:

def f123():
    yield 1
    yield 2
    yield 3

for item in f123():
    print(item)

Instead of yield statements, if you had three return statements in f123() only the first would get executed, and the function would exit. But f123() is no ordinary function. When f123() is called, it does not return any of the values in the yield statements! It returns a generator object. Also, the function does not really exit – it goes into a suspended state. When the for loop tries to loop over the generator object, the function resumes from its suspended state at the very next line after the yield it previously returned from, executes the next line of code, in this case, a yield statement, and returns that as the next item. This happens until the function exits, at which point the generator raises StopIteration, and the loop exits.
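The suspended state can be observed with the standard library's inspect.getgeneratorstate() (available since Python 3.2). This is only a sketch for poking at the behaviour described above:

```python
import inspect

def f123():
    yield 1
    yield 2
    yield 3

gen = f123()
states = [inspect.getgeneratorstate(gen)]          # nothing has run yet
values = []
for item in gen:
    values.append(item)
    states.append(inspect.getgeneratorstate(gen))  # paused at a yield
states.append(inspect.getgeneratorstate(gen))      # the function has exited

print(states)
```

The generator reports GEN_CREATED before the first item is requested, GEN_SUSPENDED each time the loop body runs, and GEN_CLOSED once the function body has finished.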

So the generator object is sort of like an adapter – at one end it exhibits the iterator protocol, by exposing __iter__() and next() methods to keep the for loop happy. At the other end, however, it runs the function just enough to get the next value out of it, and puts it back in suspended mode.

Why Use Generators?

Usually, you can write code that doesn’t use generators but implements the same logic. One option is to use the temporary list ‘trick’ I mentioned before. That will not work in all cases, e.g. if you have infinite loops, or it may make inefficient use of memory when you have a really long list. The other approach is to implement a new iterable class SomethingIter that keeps the state in instance members and performs the next logical step in its next() (or __next__() in Python 3) method. Depending on the logic, the code inside the next() method may end up looking very complex and be prone to bugs. Here generators provide a clean and easy solution.


Answer 2

Think of it this way:

An iterator is just a fancy sounding term for an object that has a next() method. So a yield-ed function ends up being something like this:

Original version:

def some_function():
    for i in xrange(4):
        yield i

for i in some_function():
    print i

This is basically what the Python interpreter does with the above code:

class it:
    def __init__(self):
        # Start at -1 so that we get 0 when we add 1 below.
        self.count = -1

    # The __iter__ method will be called once by the 'for' loop.
    # The rest of the magic happens on the object returned by this method.
    # In this case it is the object itself.
    def __iter__(self):
        return self

    # The next method will be called repeatedly by the 'for' loop
    # until it raises StopIteration.
    def next(self):
        self.count += 1
        if self.count < 4:
            return self.count
        else:
            # A StopIteration exception is raised
            # to signal that the iterator is done.
            # This is caught implicitly by the 'for' loop.
            raise StopIteration

def some_func():
    return it()

for i in some_func():
    print i

For more insight as to what’s happening behind the scenes, the for loop can be rewritten to this:

iterator = some_func()
try:
    while 1:
        print iterator.next()
except StopIteration:
    pass

Does that make more sense or just confuse you more? :)

I should note that this is an oversimplification for illustrative purposes. :)
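The code in this answer is Python 2 (xrange, the print statement, a next() method). For comparison, here is a sketch of the same idea in Python 3, where the method is named __next__(); the class is renamed It only for readability:

```python
class It:
    def __init__(self):
        # Start at -1 so that the first increment gives 0.
        self.count = -1

    def __iter__(self):
        return self

    def __next__(self):          # spelled next() in Python 2
        self.count += 1
        if self.count < 4:
            return self.count
        raise StopIteration      # caught implicitly by for loops and list()

def some_function():
    for i in range(4):
        yield i

from_class = list(It())
from_generator = list(some_function())
```

Both produce the same four values; the generator simply lets Python keep track of the state for you.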


Answer 3

The yield keyword is reduced to two simple facts:

  1. If the compiler detects the yield keyword anywhere inside a function, that function no longer returns via the return statement. Instead, it immediately returns a lazy “pending list” object called a generator.
  2. A generator is iterable. What is an iterable? It’s anything like a list or set or range or dict-view, with a built-in protocol for visiting each element in a certain order.

In a nutshell: a generator is a lazy, incrementally-pending list, and yield statements allow you to use function notation to program the list values the generator should incrementally spit out.

generator = myYieldingFunction(...)
x = list(generator)

   generator
       v
[x[0], ..., ???]

         generator
             v
[x[0], x[1], ..., ???]

               generator
                   v
[x[0], x[1], x[2], ..., ???]

                       StopIteration exception
[x[0], x[1], x[2]]     done

list==[x[0], x[1], x[2]]

Example

Let’s define a function makeRange that’s just like Python’s range. Calling makeRange(n) RETURNS A GENERATOR:

def makeRange(n):
    # return 0,1,2,...,n-1
    i = 0
    while i < n:
        yield i
        i += 1

>>> makeRange(5)
<generator object makeRange at 0x19e4aa0>

To force the generator to immediately return its pending values, you can pass it into list() (just like you could any iterable):

>>> list(makeRange(5))
[0, 1, 2, 3, 4]

Comparing example to “just returning a list”

The above example can be thought of as merely creating a list which you append to and return:

# list-version                   #  # generator-version
def makeRange(n):                #  def makeRange(n):
    """return [0,1,2,...,n-1]""" #~     """return 0,1,2,...,n-1"""
    TO_RETURN = []               #>
    i = 0                        #      i = 0
    while i < n:                 #      while i < n:
        TO_RETURN += [i]         #~         yield i
        i += 1                   #          i += 1  ## indented
    return TO_RETURN             #>

>>> makeRange(5)
[0, 1, 2, 3, 4]

There is one major difference, though; see the last section.


How you might use generators

An iterable is the last part of a list comprehension, and all generators are iterable, so they’re often used like so:

#                   _ITERABLE_
>>> [x+10 for x in makeRange(5)]
[10, 11, 12, 13, 14]

To get a better feel for generators, you can play around with the itertools module (be sure to use chain.from_iterable rather than chain when warranted). For example, you might even use generators to implement infinitely-long lazy lists like itertools.count(). You could implement your own def enumerate(iterable): return zip(count(), iterable), or alternatively do so with the yield keyword in a while-loop.
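Both ways of writing your own enumerate might look like the sketch below (Python 3 assumed). The names enumerate_zip and enumerate_yield are made up so that the built-in enumerate is not shadowed:

```python
from itertools import count

def enumerate_zip(iterable):
    # Pair an infinite counter with the items; zip stops at the shorter one.
    return zip(count(), iterable)

def enumerate_yield(iterable):
    # The same thing written with yield inside a while loop.
    index = 0
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return               # ends the generator cleanly
        yield index, item
        index += 1

expected = list(enumerate("abc"))
via_zip = list(enumerate_zip("abc"))
via_yield = list(enumerate_yield("abc"))
```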

Please note: generators can actually be used for many more things, such as implementing coroutines or non-deterministic programming or other elegant things. However, the “lazy lists” viewpoint I present here is the most common use you will find.


Behind the scenes

This is how the “Python iteration protocol” works, that is, what is going on when you do list(makeRange(5)). This is what I described earlier as a “lazy, incremental list”.

>>> x=iter(range(5))
>>> next(x)
0
>>> next(x)
1
>>> next(x)
2
>>> next(x)
3
>>> next(x)
4
>>> next(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

The built-in function next() just calls the object’s .next() method (.__next__() in Python 3), which is a part of the “iteration protocol” and is found on all iterators. You can manually use the next() function (and other parts of the iteration protocol) to implement fancy things, usually at the expense of readability, so try to avoid doing that…
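One fairly tame example of using next() by hand is skipping a header before looping over the rest. This is only a sketch with made-up data (Python 3 assumed); it also shows the optional default argument of next():

```python
lines = iter(["name,qty", "apple,3", "pear,5"])

header = next(lines)                        # consume the first item by hand
rows = [line.split(",") for line in lines]  # the loop sees only the rest

# With a default, next() returns it instead of raising StopIteration.
after_end = next(lines, "no more")
```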


Minutiae

Normally, most people would not care about the following distinctions and probably want to stop reading here.

In Python-speak, an iterable is any object which “understands the concept of a for-loop” like a list [1,2,3], and an iterator is a specific instance of the requested for-loop like [1,2,3].__iter__(). A generator is exactly the same as any iterator, except for the way it was written (with function syntax).

When you request an iterator from a list, it creates a new iterator. However, when you request an iterator from an iterator (which you would rarely do), it just gives you back the very same object, not a copy.
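A quick way to check this yourself (Python 3 assumed, names are illustrative):

```python
numbers = [1, 2, 3]

first_iter = iter(numbers)
second_iter = iter(numbers)
independent = first_iter is not second_iter    # two separate iterators
same_object = iter(first_iter) is first_iter   # an iterator returns itself

next(first_iter)                      # advance only the first one
remaining_first = list(first_iter)    # [2, 3]
remaining_second = list(second_iter)  # [1, 2, 3]
```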

Thus, in the unlikely event that you are failing to do something like this…

>>> x = makeRange(5)
>>> list(x)
[0, 1, 2, 3, 4]
>>> list(x)
[]

… then remember that a generator is an iterator; that is, it is one-time-use. If you want to reuse it, you should call makeRange(...) again. If you need to use the result twice, convert the result to a list and store it in a variable: x = list(makeRange(5)). Those who absolutely need to clone a generator (for example, who are doing terrifyingly hackish metaprogramming) can use itertools.tee, since the copyable iterator Python PEP standards proposal has been deferred.
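For completeness, a sketch of what itertools.tee looks like in use (Python 3 assumed; make_range is a hypothetical snake_case stand-in for the generator function defined earlier):

```python
from itertools import tee

def make_range(n):
    # Hypothetical stand-in for the generator function from the example above
    i = 0
    while i < n:
        yield i
        i += 1

gen = make_range(5)
copy_a, copy_b = tee(gen)   # gen itself should not be used after this point

total = sum(copy_a)         # consumes the first copy
as_list = list(copy_b)      # the second copy still sees every value
```

Note that tee has to buffer values that one copy has seen and the other has not, so for a full second pass a plain list is often simpler.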


Answer 4

什么是yield关键词在Python呢?

答案大纲/摘要

  • 具有的函数yield在被调用时将返回Generator
  • 生成器是迭代器,因为它们实现了迭代器协议,因此您可以对其进行迭代。
  • 也可以生成器发送信息,使其在概念上成为协程
  • 在Python 3中,您可以使用双向一个生成器委托给另一个生成器yield from
  • (附录对几个答案进行了评论,包括最上面的一个,并讨论了return在生成器中的用法。)

生成器:

yield仅在函数定义内部合法,并且函数定义中包含yield使其返回生成器。

生成器的想法来自具有不同实现方式的其他语言(请参见脚注1)。在Python的Generators中,代码的执行会在收益率点冻结。调用生成器时(下面将讨论方法),恢复执行,然后冻结下一个Yield。

yield提供了一种实现迭代器协议的简便方法,该协议由以下两种方法定义: __iter__next(Python 2)或__next__(Python 3)。这两种方法都使对象成为迭代器,您可以使用模块中的IteratorAbstract Base Class对其进行类型检查collections

>>> def func():
...     yield 'I am'
...     yield 'a generator!'
... 
>>> type(func)                 # A function with yield is still a function
<type 'function'>
>>> gen = func()
>>> type(gen)                  # but it returns a generator
<type 'generator'>
>>> hasattr(gen, '__iter__')   # that's an iterable
True
>>> hasattr(gen, 'next')       # and with .next (.__next__ in Python 3)
True                           # implements the iterator protocol.

生成器类型是迭代器的子类型:

>>> import collections, types
>>> issubclass(types.GeneratorType, collections.Iterator)
True

并且如有必要,我们可以像这样进行类型检查:

>>> isinstance(gen, types.GeneratorType)
True
>>> isinstance(gen, collections.Iterator)
True

的一个功能Iterator 是,一旦用尽,您将无法重复使用或重置它:

>>> list(gen)
['I am', 'a generator!']
>>> list(gen)
[]

如果要再次使用其功能,则必须另做一个(请参见脚注2):

>>> list(func())
['I am', 'a generator!']

一个人可以通过编程方式产生数据,例如:

def func(an_iterable):
    for item in an_iterable:
        yield item

上面的简单生成器也等效于下面的生成器-从Python 3.3开始(在Python 2中不可用),您可以使用yield from

def func(an_iterable):
    yield from an_iterable

但是,yield from还允许委派给子生成器,这将在以下有关使用子协程进行合作委派的部分中进行解释。

协程:

yield 形成一个表达式,该表达式允许将数据发送到生成器中(请参见脚注3)

这是一个示例,请注意该received变量,该变量将指向发送到生成器的数据:

def bank_account(deposited, interest_rate):
    while True:
        calculated_interest = interest_rate * deposited 
        received = yield calculated_interest
        if received:
            deposited += received


>>> my_account = bank_account(1000, .05)

首先,我们必须使内置函数生成器排队next。它将调用适当的next__next__方法,具体取决于您所使用的Python版本:

>>> first_year_interest = next(my_account)
>>> first_year_interest
50.0

现在我们可以将数据发送到生成器中。(发送None与呼叫相同next。):

>>> next_year_interest = my_account.send(first_year_interest + 1000)
>>> next_year_interest
102.5

合作协办小组 yield from

现在,回想一下yield fromPython 3中可用的功能。这使我们可以将协程委托给子协程:

def money_manager(expected_rate):
    under_management = yield     # must receive deposited value
    while True:
        try:
            additional_investment = yield expected_rate * under_management 
            if additional_investment:
                under_management += additional_investment
        except GeneratorExit:
            '''TODO: write function to send unclaimed funds to state'''
        finally:
            '''TODO: write function to mail tax info to client'''


def investment_account(deposited, manager):
    '''very simple model of an investment account that delegates to a manager'''
    next(manager) # must queue up manager
    manager.send(deposited)
    while True:
        try:
            yield from manager
        except GeneratorExit:
            return manager.close()

现在我们可以将功能委派给子生成器,并且生成器可以像上面一样使用它:

>>> my_manager = money_manager(.06)
>>> my_account = investment_account(1000, my_manager)
>>> first_year_return = next(my_account)
>>> first_year_return
60.0
>>> next_year_return = my_account.send(first_year_return + 1000)
>>> next_year_return
123.6

你可以阅读更多的精确语义yield fromPEP 380。

其他方法:关闭并抛出

close方法GeneratorExit在函数执行被冻结的时候引发。这也将由调用,__del__因此您可以将任何清理代码放在处理位置GeneratorExit

>>> my_account.close()

您还可以引发异常,该异常可以在生成器中处理或传播回用户:

>>> import sys
>>> try:
...     raise ValueError
... except:
...     my_manager.throw(*sys.exc_info())
... 
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "<stdin>", line 2, in <module>
ValueError

结论

我相信我已经涵盖了以下问题的各个方面:

什么是yield关键词在Python呢?

事实证明,这样yield做确实很有帮助。我相信我可以为此添加更详尽的示例。如果您想要更多或有建设性的批评,请在下面评论中告诉我。


附录:

对最佳/可接受答案的评论**

  • 仅以列表为例,它对使可迭代的内容感到困惑。请参阅上面的参考资料,但总而言之:iterable具有__iter__返回iterator的方法。一个迭代器提供了一个.next(Python 2里或.__next__(Python 3的)方法,它是隐式由称为for循环,直到它提出StopIteration,并且一旦这样做,将继续这样做。
  • 然后,它使用生成器表达式来描述什么是生成器。由于生成器只是创建迭代器的一种简便方法,因此它只会使事情变得混乱,而我们仍然没有涉及到这一yield部分。
  • 控制生成器的排气中,他调用了.next方法,而应该使用内置函数next。这将是一个适当的间接层,因为他的代码在Python 3中不起作用。
  • Itertools?这根本与做什么无关yield
  • 没有讨论yieldyield fromPython 3中的新功能一起提供的方法。最高/可接受的答案是非常不完整的答案。

yield生成器表达或理解中提出的答案的评论。

该语法当前允许列表理解中的任何表达式。

expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) |
                     ('=' (yield_expr|testlist_star_expr))*)
...
yield_expr: 'yield' [yield_arg]
yield_arg: 'from' test | testlist

由于yield是一种表达,因此尽管没有特别好的用例,但有人认为它可以用于理解或生成器表达中。

CPython核心开发人员正在讨论弃用其津贴。这是邮件列表中的相关帖子:

2017年1月30日19:05,布雷特·坎农写道:

2017年1月29日星期日,克雷格·罗德里格斯(Craig Rodrigues)在星期日写道:

两种方法我都可以。恕我直言,把事情留在Python 3中是不好的。

我的投票是SyntaxError,因为您没有从语法中得到期望。

我同意这对我们来说是一个明智的选择,因为依赖当前行为的任何代码确实太聪明了,无法维护。

在到达目的地方面,我们可能需要:

  • 3.7中的语法警告或弃用警告
  • 2.7.x中的Py3k警告
  • 3.8中的SyntaxError

干杯,尼克。

-Nick Coghlan | gmail.com上的ncoghlan | 澳大利亚布里斯班

此外,还有一个悬而未决的问题(10544),似乎正说明这绝不是一个好主意(PyPy,用Python编写的Python实现,已经在发出语法警告。)

最重要的是,直到CPython的开发人员另行告诉我们为止:不要放入yield生成器表达式或理解。

return生成器中的语句

Python 2中

在生成器函数中,该return语句不允许包含expression_list。在这种情况下,裸露return表示生成器已完成并且将引起StopIteration提升。

An expression_list基本上是由逗号分隔的任意数量的表达式-本质上,在Python 2中,您可以使用停止生成器return,但不能返回值。

Python 3中

在生成器函数中,该return语句指示生成器完成并且将引起StopIteration提升。返回的值(如果有)用作构造的参数,StopIteration并成为StopIteration.value属性。

脚注

  1. 提案中引用了CLU,Sather和Icon语言,以将生成器的概念引入Python。总体思路是,一个函数可以维护内部状态并根据用户的需要产生中间数据点。这有望在性能上优于其他方法,包括Python线程,该方法甚至在某些系统上不可用。

  2. 例如,这意味着xrange对象(range在Python 3中)不是Iterator,即使它们是可迭代的,因为它们可以被重用。像列表一样,它们的__iter__方法返回迭代器对象。

  3. yield最初是作为语句引入的,这意味着它只能出现在代码块的一行的开头。现在yield创建一个yield表达式。 https://docs.python.org/2/reference/simple_stmts.html#grammar-token-yield_stmt 提出 此更改是为了允许用户将数据发送到生成器中,就像接收数据一样。要发送数据,必须能够将其分配给某物,为此,一条语句就行不通了。

What does the yield keyword do in Python?

Answer Outline/Summary

  • A function with yield, when called, returns a Generator.
  • Generators are iterators because they implement the iterator protocol, so you can iterate over them.
  • A generator can also be sent information, making it conceptually a coroutine.
  • In Python 3, you can delegate from one generator to another in both directions with yield from.
  • (Appendix critiques a couple of answers, including the top one, and discusses the use of return in a generator.)

Generators:

yield is only legal inside of a function definition, and the inclusion of yield in a function definition makes it return a generator.

The idea for generators comes from other languages (see footnote 1) with varying implementations. In Python’s Generators, the execution of the code is frozen at the point of the yield. When the generator is called (methods are discussed below) execution resumes and then freezes at the next yield.

yield provides an easy way of implementing the iterator protocol, defined by the following two methods: __iter__ and next (Python 2) or __next__ (Python 3). Together, those methods make an object an iterator that you could type-check with the Iterator Abstract Base Class from the collections module (collections.abc in Python 3).

>>> def func():
...     yield 'I am'
...     yield 'a generator!'
... 
>>> type(func)                 # A function with yield is still a function
<type 'function'>
>>> gen = func()
>>> type(gen)                  # but it returns a generator
<type 'generator'>
>>> hasattr(gen, '__iter__')   # that's an iterable
True
>>> hasattr(gen, 'next')       # and with .next (.__next__ in Python 3)
True                           # implements the iterator protocol.

The generator type is a sub-type of iterator:

>>> import collections, types
>>> issubclass(types.GeneratorType, collections.Iterator)
True

And if necessary, we can type-check like this:

>>> isinstance(gen, types.GeneratorType)
True
>>> isinstance(gen, collections.Iterator)
True
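The session above is Python 2. In Python 3 the abstract base classes live in collections.abc, so the equivalent checks might look like this sketch (the checks dictionary is just a convenient way to collect the results):

```python
import types
from collections.abc import Iterable, Iterator

def func():
    yield 'I am'
    yield 'a generator!'

gen = func()
checks = {
    "func_is_function": isinstance(func, types.FunctionType),
    "gen_is_generator": isinstance(gen, types.GeneratorType),
    "gen_is_iterator": isinstance(gen, Iterator),
    "generator_subclass": issubclass(types.GeneratorType, Iterator),
    "list_is_iterable": isinstance([], Iterable),
    "list_is_iterator": isinstance([], Iterator),  # lists are not iterators
}
```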

A feature of an Iterator is that once exhausted, you can’t reuse or reset it:

>>> list(gen)
['I am', 'a generator!']
>>> list(gen)
[]

You’ll have to make another if you want to use its functionality again (see footnote 2):

>>> list(func())
['I am', 'a generator!']

One can yield data programmatically, for example:

def func(an_iterable):
    for item in an_iterable:
        yield item

The above simple generator is also equivalent to the below – as of Python 3.3 (and not available in Python 2), you can use yield from:

def func(an_iterable):
    yield from an_iterable

However, yield from also allows for delegation to subgenerators, which will be explained in the following section on cooperative delegation with sub-coroutines.
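For plain iteration the two spellings behave the same, and yield from is especially convenient for recursion. A small sketch, assuming Python 3.3 or later (flatten is a hypothetical helper added for illustration):

```python
def with_loop(an_iterable):
    for item in an_iterable:
        yield item

def with_yield_from(an_iterable):
    yield from an_iterable

def flatten(nested):
    # Hypothetical helper: yield from hands the inner values straight through.
    for element in nested:
        if isinstance(element, list):
            yield from flatten(element)
        else:
            yield element

loop_result = list(with_loop("abc"))
yield_from_result = list(with_yield_from("abc"))
flat = list(flatten([1, [2, [3, 4]], 5]))
```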

Coroutines:

yield forms an expression that allows data to be sent into the generator (see footnote 3).

Here is an example; take note of the received variable, which will point to the data that is sent to the generator:

def bank_account(deposited, interest_rate):
    while True:
        calculated_interest = interest_rate * deposited 
        received = yield calculated_interest
        if received:
            deposited += received


>>> my_account = bank_account(1000, .05)

First, we must queue up the generator with the builtin function, next. It will call the appropriate next or __next__ method, depending on the version of Python you are using:

>>> first_year_interest = next(my_account)
>>> first_year_interest
50.0

And now we can send data into the generator (sending None is the same as calling next):

>>> next_year_interest = my_account.send(first_year_interest + 1000)
>>> next_year_interest
102.5
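
To see where 102.5 comes from: the value passed to send becomes the result of the yield expression, so received is 1050.0 and deposited grows to 2050 before the next interest calculation. The following Python 3 sketch replays the session above; math.isclose is used only to avoid relying on exact floating-point equality.

```python
import math

def bank_account(deposited, interest_rate):
    while True:
        calculated_interest = interest_rate * deposited
        received = yield calculated_interest   # the argument of send() lands here
        if received:
            deposited += received

my_account = bank_account(1000, .05)
first_year_interest = next(my_account)          # 0.05 * 1000
# Deposit the interest plus another 1000, so the balance becomes 1000 + 1050.
next_year_interest = my_account.send(first_year_interest + 1000)

print(first_year_interest, next_year_interest)
print(math.isclose(next_year_interest, 0.05 * 2050))
```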

Cooperative Delegation to Sub-Coroutine with yield from

Now, recall that yield from is available in Python 3. This allows us to delegate coroutines to a subcoroutine:

def money_manager(expected_rate):
    under_management = yield     # must receive deposited value
    while True:
        try:
            additional_investment = yield expected_rate * under_management 
            if additional_investment:
                under_management += additional_investment
        except GeneratorExit:
            '''TODO: write function to send unclaimed funds to state'''
            raise  # re-raise so close() can finish; swallowing GeneratorExit here causes a RuntimeError
        finally:
            '''TODO: write function to mail tax info to client'''


def investment_account(deposited, manager):
    '''very simple model of an investment account that delegates to a manager'''
    next(manager) # must queue up manager
    manager.send(deposited)
    while True:
        try:
            yield from manager
        except GeneratorExit:
            return manager.close()

And now we can delegate functionality to a sub-generator and it can be used by a generator just as above:

>>> my_manager = money_manager(.06)
>>> my_account = investment_account(1000, my_manager)
>>> first_year_return = next(my_account)
>>> first_year_return
60.0
>>> next_year_return = my_account.send(first_year_return + 1000)
>>> next_year_return
123.6

You can read more about the precise semantics of yield from in PEP 380.

Other Methods: close and throw

The close method raises GeneratorExit at the point the function execution was frozen. This will also be called by __del__ so you can put any cleanup code where you handle the GeneratorExit:

>>> my_account.close()

You can also throw an exception which can be handled in the generator or propagated back to the user:

>>> import sys
>>> try:
...     raise ValueError
... except:
...     my_manager.throw(*sys.exc_info())
... 
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "<stdin>", line 2, in <module>
ValueError
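
The session above shows the propagating case. As a complement, here is a hedged Python 3 sketch of the other case, where the generator handles the thrown exception itself. The resettable_counter generator is a hypothetical example, not part of the account code above.

```python
def resettable_counter():
    """Hypothetical example: counts upward, restarts at 0 when ValueError is thrown in."""
    count = 0
    while True:
        try:
            yield count
            count += 1
        except ValueError:
            count = 0   # handled here, so the caller never sees the exception

counter = resettable_counter()
first = next(counter)
second = next(counter)
# throw() resumes the generator with the exception and returns the next yielded value.
after_throw = counter.throw(ValueError("reset"))
third = next(counter)

# An exception the generator does not handle propagates back to the caller.
try:
    counter.throw(KeyError("unhandled"))
    propagated = False
except KeyError:
    propagated = True
```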

Conclusion

I believe I have covered all aspects of the following question:

What does the yield keyword do in Python?

It turns out that yield does a lot. I’m sure I could add even more thorough examples to this. If you want more or have some constructive criticism, let me know by commenting below.


Appendix:

Critique of the Top/Accepted Answer

  • It is confused on what makes an iterable, just using a list as an example. See my references above, but in summary: an iterable has an __iter__ method returning an iterator. An iterator provides a .next (Python 2) or .__next__ (Python 3) method, which is implicitly called by for loops until it raises StopIteration, and once it does, it will continue to do so.
  • It then uses a generator expression to describe what a generator is. Since a generator is simply a convenient way to create an iterator, it only confuses the matter, and we still have not yet gotten to the yield part.
  • In Controlling a generator exhaustion he calls the .next method, when instead he should use the builtin function, next. It would be an appropriate layer of indirection, because his code does not work in Python 3.
  • Itertools? This was not relevant to what yield does at all.
  • No discussion of the methods that yield provides along with the new functionality yield from in Python 3. The top/accepted answer is a very incomplete answer.

Critique of answer suggesting yield in a generator expression or comprehension.

The grammar currently allows any expression in a list comprehension.

expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist) |
                     ('=' (yield_expr|testlist_star_expr))*)
...
yield_expr: 'yield' [yield_arg]
yield_arg: 'from' test | testlist

Since yield is an expression, it has been touted by some as interesting to use it in comprehensions or generator expression – in spite of citing no particularly good use-case.

The CPython core developers are discussing deprecating its allowance. Here’s a relevant post from the mailing list:

On 30 January 2017 at 19:05, Brett Cannon wrote:

On Sun, 29 Jan 2017 at 16:39 Craig Rodrigues wrote:

I’m OK with either approach. Leaving things the way they are in Python 3 is no good, IMHO.

My vote is it be a SyntaxError since you’re not getting what you expect from the syntax.

I’d agree that’s a sensible place for us to end up, as any code relying on the current behaviour is really too clever to be maintainable.

In terms of getting there, we’ll likely want:

  • SyntaxWarning or DeprecationWarning in 3.7
  • Py3k warning in 2.7.x
  • SyntaxError in 3.8

Cheers, Nick.

— Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

Further, there is an outstanding issue (10544) which seems to be pointing in the direction of this never being a good idea (PyPy, a Python implementation written in Python, is already raising syntax warnings.)

Bottom line, until the developers of CPython tell us otherwise: Don’t put yield in a generator expression or comprehension.

The return statement in a generator

In Python 2:

In a generator function, the return statement is not allowed to include an expression_list. In that context, a bare return indicates that the generator is done and will cause StopIteration to be raised.

An expression_list is basically any number of expressions separated by commas – essentially, in Python 2, you can stop the generator with return, but you can’t return a value.

In Python 3:

In a generator function, the return statement indicates that the generator is done and will cause StopIteration to be raised. The returned value (if any) is used as an argument to construct StopIteration and becomes the StopIteration.value attribute.
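
A short Python 3 sketch of the behavior quoted above (count_to and delegator are illustrative names): the returned value travels on StopIteration.value, and yield from evaluates to it.

```python
def count_to(n):
    for i in range(n):
        yield i
    return 'done'   # allowed in Python 3; becomes StopIteration.value

gen = count_to(2)
yielded = [next(gen), next(gen)]
try:
    next(gen)
    return_value = None
except StopIteration as stop:
    return_value = stop.value

def delegator():
    # yield from evaluates to the subgenerator's return value.
    result = yield from count_to(2)
    yield result

delegated = list(delegator())
```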

Footnotes

  1. The languages CLU, Sather, and Icon were referenced in the proposal to introduce the concept of generators to Python. The general idea is that a function can maintain internal state and yield intermediate data points on demand by the user. This promised to be superior in performance to other approaches, including Python threading, which isn’t even available on some systems.

  2. This means, for example, that xrange objects (range in Python 3) aren’t Iterators, even though they are iterable, because they can be reused. Like lists, their __iter__ methods return iterator objects.

  3. yield was originally introduced as a statement, meaning that it could only appear at the beginning of a line in a code block. Now yield creates a yield expression. https://docs.python.org/2/reference/simple_stmts.html#grammar-token-yield_stmt This change was proposed to allow a user to send data into the generator just as one might receive it. To send data, one must be able to assign it to something, and for that, a statement just won’t work.


Answer 5

yield is much like return: it hands back whatever value you give it, but the function that contains it becomes a generator. The difference is that the next time the generator is advanced, execution resumes immediately after the yield where it last stopped. Unlike return, the stack frame is not cleaned up when a yield occurs; control is transferred back to the caller, and the saved state is restored the next time the generator is advanced.

In the case of your code, the function get_child_candidates is acting like an iterator so that when you extend your list, it adds one element at a time to the new list.

list.extend calls an iterator until it’s exhausted. In the case of the code sample you posted, it would be much clearer to just return a tuple and append that to the list.
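
The code from the question is not reproduced here, so the sketch below uses a hypothetical stand-in named candidates to illustrate how list.extend drains a generator one value at a time (Python 3):

```python
def candidates(limit):
    """Hypothetical stand-in for get_child_candidates: yields instead of building a list."""
    for n in range(limit):
        if n % 2 == 0:
            yield n

result = [-1]
# extend() keeps asking the generator for values until it raises StopIteration.
result.extend(candidates(7))
print(result)
```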


Answer 6

There’s one extra thing to mention: a function that yields doesn’t actually have to terminate. I’ve written code like this:

def fib():
    last, cur = 0, 1
    while True: 
        yield cur
        last, cur = cur, last + cur

Then I can use it in other code like this:

for f in fib():
    if some_condition:
        break
    coolfuncs(f)

It really helps simplify some problems, and makes some things easier to work with.
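
As a runnable illustration in Python 3, the same fib generator can be bounded from the outside. islice and takewhile from itertools are used here as one possible way to stop consuming; the break shown above works just as well.

```python
from itertools import islice, takewhile

def fib():
    last, cur = 0, 1
    while True:          # never terminates on its own
        yield cur
        last, cur = cur, last + cur

# The consumer decides how much of the infinite series to take.
first_five = list(islice(fib(), 5))
below_fifty = list(takewhile(lambda n: n < 50, fib()))
```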


Answer 7

For those who prefer a minimal working example, meditate on this interactive Python session:

>>> def f():
...   yield 1
...   yield 2
...   yield 3
... 
>>> g = f()
>>> for i in g:
...   print(i)
... 
1
2
3
>>> for i in g:
...   print(i)
... 
>>> # Note that this time nothing was printed

Answer 8

TL;DR

Instead of this:

def square_list(n):
    the_list = []                         # Replace
    for x in range(n):
        y = x * x
        the_list.append(y)                # these
    return the_list                       # lines

do this:

def square_yield(n):
    for x in range(n):
        y = x * x
        yield y                           # with this one.

Whenever you find yourself building a list from scratch, yield each piece instead.

This was my first “aha” moment with yield.


yield is a sugary way to say

build a series of stuff

Same behavior:

>>> for square in square_list(4):
...     print(square)
...
0
1
4
9
>>> for square in square_yield(4):
...     print(square)
...
0
1
4
9

Different behavior:

Yield is single-pass: you can only iterate through once. When a function has a yield in it we call it a generator function. And an iterator is what it returns. Those terms are revealing. We lose the convenience of a container, but gain the power of a series that’s computed as needed, and arbitrarily long.

Yield is lazy, it puts off computation. A function with a yield in it doesn’t actually execute at all when you call it. It returns an iterator object that remembers where it left off. Each time you call next() on the iterator (this happens in a for-loop) execution inches forward to the next yield. return raises StopIteration and ends the series (this is the natural end of a for-loop).
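
One way to observe this laziness is to record when the body actually runs. The log list and the squares function below are illustrative only (Python 3):

```python
log = []

def squares(n):
    log.append("started")       # runs on the first next(), not on the call
    for x in range(n):
        yield x * x
    log.append("finished")      # runs just before StopIteration is raised

gen = squares(2)
log_after_call = list(log)      # still empty: the body has not started

first = next(gen)
log_after_first = list(log)     # the body ran up to the first yield

rest = list(gen)                # drains the generator
log_at_end = list(log)
```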

Yield is versatile. Data doesn’t have to be stored all together, it can be made available one at a time. It can be infinite.

>>> def squares_all_of_them():
...     x = 0
...     while True:
...         yield x * x
...         x += 1
...
>>> squares = squares_all_of_them()
>>> for _ in range(4):
...     print(next(squares))
...
0
1
4
9

If you need multiple passes and the series isn’t too long, just call list() on it:

>>> list(square_yield(4))
[0, 1, 4, 9]

Brilliant choice of the word yield because both meanings apply:

yield — produce or provide (as in agriculture)

…provide the next data in the series.

yield — give way or relinquish (as in political power)

…relinquish CPU execution until the iterator advances.


Answer 9

Yield gives you a generator.

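>>> def get_odd_numbers(i):
...     return list(range(1, i, 2))   # list() so that Python 3 also builds the whole list
...
>>> def yield_odd_numbers(i):
...     for x in range(1, i, 2):
...         yield x
...
>>> foo = get_odd_numbers(10)
>>> bar = yield_odd_numbers(10)
>>> foo
[1, 3, 5, 7, 9]
>>> bar
<generator object yield_odd_numbers at 0x1029c6f50>
>>> next(bar)                         # bar.next() in Python 2
1
>>> next(bar)
3
>>> next(bar)
5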

As you can see, in the first case foo holds the entire list in memory at once. It’s not a big deal for a list with 5 elements, but what if you want a list of 5 million? Not only is this a huge memory eater, it also costs a lot of time to build at the time that the function is called.

In the second case, bar just gives you a generator. A generator is an iterable, which means you can use it in a for loop and so on, but each value can only be accessed once. The values are also not all stored in memory at the same time; the generator object “remembers” where it was in the loop the last time you asked it for a value. This way, if you’re using an iterable to (say) count to 50 billion, you don’t have to produce all 50 billion numbers up front and store them just to count through them.

Again, this is a pretty contrived example, you probably would use itertools if you really wanted to count to 50 billion. :)

This is the most simple use case of generators. As you said, it can be used to write efficient permutations, using yield to push things up through the call stack instead of using some sort of stack variable. Generators can also be used for specialized tree traversal, and all manner of other things.
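
As a hedged sketch of the tree-traversal use mentioned above, a recursive generator can walk a binary tree without building an intermediate list. The Node class and in_order function are hypothetical names, and yield from requires Python 3.3 or later.

```python
class Node:
    """Hypothetical binary-tree node used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def in_order(node):
    """Yield values left to right, one at a time."""
    if node is None:
        return
    yield from in_order(node.left)
    yield node.value
    yield from in_order(node.right)

tree = Node(4, Node(2, Node(1), Node(3)), Node(5))
values = list(in_order(tree))
print(values)
```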


Answer 10

It’s returning a generator. I’m not particularly familiar with Python, but I believe it’s the same kind of thing as C#’s iterator blocks if you’re familiar with those.

The key idea is that the compiler/interpreter/whatever does some trickery so that as far as the caller is concerned, they can keep calling next() and it will keep returning values – as if the generator method was paused. Now obviously you can’t really “pause” a method, so the compiler builds a state machine for you to remember where you currently are and what the local variables etc look like. This is much easier than writing an iterator yourself.
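
To give a feel for what “writing an iterator yourself” involves, here is a minimal Python 3 sketch of a hand-written iterator next to an equivalent generator. The class keeps its position in an attribute, which is roughly the bookkeeping a generator does for you. The names are illustrative, and this is not how CPython implements generators internally.

```python
class OddNumbers:
    """Hand-written iterator: the 'paused' state lives in self._next_value."""
    def __init__(self, limit):
        self._next_value = 1
        self._limit = limit

    def __iter__(self):
        return self

    def __next__(self):
        if self._next_value >= self._limit:
            raise StopIteration
        value = self._next_value
        self._next_value += 2
        return value

def odd_numbers(limit):
    """The same behavior written as a generator function."""
    for value in range(1, limit, 2):
        yield value

from_class = list(OddNumbers(10))
from_generator = list(odd_numbers(10))
```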


Answer 11

There is one type of answer that I don’t feel has been given yet, among the many great answers that describe how to use generators. Here is the programming language theory answer:

The yield statement in Python returns a generator. A generator in Python is a function that returns continuations (and specifically a type of coroutine, but continuations represent the more general mechanism to understand what is going on).

Continuations in programming languages theory are a much more fundamental kind of computation, but they are not often used, because they are extremely hard to reason about and also very difficult to implement. But the idea of what a continuation is, is straightforward: it is the state of a computation that has not yet finished. In this state, the current values of variables, the operations that have yet to be performed, and so on, are saved. Then at some point later in the program the continuation can be invoked, such that the program’s variables are reset to that state and the operations that were saved are carried out.

Continuations, in this more general form, can be implemented in two ways. In the call/cc way, the program’s stack is literally saved and then when the continuation is invoked, the stack is restored.

In continuation passing style (CPS), continuations are just normal functions (only in languages where functions are first class) which the programmer explicitly manages and passes around to subroutines. In this style, program state is represented by closures (and the variables that happen to be encoded in them) rather than variables that reside somewhere on the stack. Functions that manage control flow accept continuation as arguments (in some variations of CPS, functions may accept multiple continuations) and manipulate control flow by invoking them by simply calling them and returning afterwards. A very simple example of continuation passing style is as follows:

def save_file(filename):
  def write_file_continuation():
    write_stuff_to_file(filename)

  check_if_file_exists_and_user_wants_to_overwrite(write_file_continuation)

In this (very simplistic) example, the programmer saves the operation of actually writing the file into a continuation (which can potentially be a very complex operation with many details to write out), and then passes that continuation (i.e, as a first-class closure) to another operator which does some more processing, and then calls it if necessary. (I use this design pattern a lot in actual GUI programming, either because it saves me lines of code or, more importantly, to manage control flow after GUI events trigger.)

The rest of this post will, without loss of generality, conceptualize continuations as CPS, because it is a hell of a lot easier to understand and read.


Now let’s talk about generators in Python. Generators are a specific subtype of continuation. Whereas continuations are able in general to save the state of a computation (i.e., the program’s call stack), generators are only able to save the state of iteration over an iterator. Although, this definition is slightly misleading for certain use cases of generators. For instance:

def f():
  while True:
    yield 4

This is clearly a reasonable iterable whose behavior is well defined: each time the generator is advanced, it returns 4 (and does so forever). But it probably isn’t the prototypical type of iterable that comes to mind when thinking of iterators (i.e., for x in collection: do_something(x)). This example illustrates the power of generators: if anything is an iterator, a generator can save the state of its iteration.

To reiterate: continuations can save the state of a program’s stack and generators can save the state of iteration. This means that continuations are a lot more powerful than generators, but also that generators are a lot, lot easier. They are easier for the language designer to implement, and they are easier for the programmer to use (if you have some time to burn, try to read and understand this page about continuations and call/cc).

But you could easily implement (and conceptualize) generators as a simple, specific case of continuation passing style:

Whenever yield is called, it tells the function to return a continuation. When the function is called again, it starts from wherever it left off. So, in pseudo-pseudocode (i.e., not pseudocode, but not code) the generator’s next method is basically as follows:

class Generator():
  def __init__(self,iterable,generatorfun):
    self.next_continuation = lambda:generatorfun(iterable)

  def next(self):
    value, next_continuation = self.next_continuation()
    self.next_continuation = next_continuation
    return value

where the yield keyword is actually syntactic sugar for the real generator function, basically something like:

def generatorfun(iterable):
  if len(iterable) == 0:
    raise StopIteration
  else:
    return (iterable[0], lambda:generatorfun(iterable[1:]))

Remember that this is just pseudocode and the actual implementation of generators in Python is more complex. But as an exercise to understand what is going on, try to use continuation passing style to implement generator objects without use of the yield keyword.
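
For readers who want to run the idea, here is one possible Python 3 adaptation of the sketch above. The only assumptions added are __iter__ and __next__, so that the object plugs into the iterator protocol. It illustrates the continuation-passing idea only and is not how Python implements generators.

```python
def generatorfun(items):
    # Return the current value plus a continuation that produces the rest.
    if len(items) == 0:
        raise StopIteration
    return items[0], lambda: generatorfun(items[1:])

class Generator:
    def __init__(self, items, generatorfun):
        self.next_continuation = lambda: generatorfun(items)

    def __iter__(self):
        return self

    def __next__(self):
        # Invoke the saved continuation, then store the one it hands back.
        value, self.next_continuation = self.next_continuation()
        return value

collected = list(Generator([1, 2, 3], generatorfun))
print(collected)
```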


Answer 12

Here is an example in plain language. I will provide a correspondence between high-level human concepts and low-level Python concepts.

I want to operate on a sequence of numbers, but I don’t want to bother myself with the creation of that sequence; I only want to focus on the operation I want to do. So, I do the following:

  • I call you and tell you that I want a sequence of numbers which is produced in a specific way, and I let you know what the algorithm is.
    This step corresponds to defining the generator function, i.e. the function containing a yield.
  • Sometime later, I tell you, “OK, get ready to tell me the sequence of numbers”.
    This step corresponds to calling the generator function which returns a generator object. Note that you don’t tell me any numbers yet; you just grab your paper and pencil.
  • I ask you, “tell me the next number”, and you tell me the first number; after that, you wait for me to ask you for the next number. It’s your job to remember where you were, what numbers you have already said, and what is the next number. I don’t care about the details.
    This step corresponds to calling next() on the generator object (.next() in Python 2).
  • … repeat previous step, until…
  • eventually, you might come to an end. You don’t tell me a number; you just shout, “hold your horses! I’m done! No more numbers!”
    This step corresponds to the generator object ending its job and raising a StopIteration exception. The generator function does not need to raise the exception; it is raised automatically when the function ends or issues a return.

This is what a generator does (a function that contains a yield): it starts executing, pauses whenever it reaches a yield, and when asked for the next value it continues from the point where it last stopped. It fits perfectly by design with the iterator protocol of Python, which describes how to sequentially request values.

The most famous user of the iterator protocol is the for command in Python. So, whenever you do a:

for item in sequence:

it doesn’t matter if sequence is a list, a string, a dictionary or a generator object like described above; the result is the same: you read items off a sequence one by one.

Note that defining a function which contains a yield keyword is not the only way to create a generator; it’s just the easiest way to create one.

For more accurate information, read about iterator types, the yield statement and generators in the Python documentation.
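
The conversation described above is roughly what a for loop does behind the scenes. The following Python 3 sketch spells out the steps by hand; countdown is an illustrative generator, and the while loop is an approximation of the for statement rather than its exact implementation.

```python
def countdown(n):
    while n > 0:
        yield n
        n -= 1

collected = []
iterator = iter(countdown(3))      # "get ready to tell me the numbers"
while True:
    try:
        item = next(iterator)      # "tell me the next number"
    except StopIteration:          # "I'm done! No more numbers!"
        break
    collected.append(item)

# The for statement performs the same steps implicitly.
via_for = [item for item in countdown(3)]
```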


Answer 13

While a lot of answers show why you’d use a yield to create a generator, there are more uses for yield. It’s quite easy to make a coroutine, which enables the passing of information between two blocks of code. I won’t repeat any of the fine examples that have already been given about using yield to create a generator.

To help understand what a yield does in the following code, you can use your finger to trace the cycle through any code that has a yield. Every time your finger hits the yield, you have to wait for a next or a send to be entered. When a next is called, you trace through the code until you hit the yield… the code on the right of the yield is evaluated and returned to the caller… then you wait. When next is called again, you perform another loop through the code. However, you’ll note that in a coroutine, yield can also be used with a send… which will send a value from the caller into the yielding function. If a send is given, then yield receives the value sent, and spits it out the left hand side… then the trace through the code progresses until you hit the yield again (returning the value at the end, as if next was called).

For example:

>>> def coroutine():
...     i = -1
...     while True:
...         i += 1
...         val = (yield i)
...         print("Received %s" % val)
...
>>> sequence = coroutine()
>>> next(sequence)   # sequence.next() in Python 2
0
>>> next(sequence)
Received None
1
>>> sequence.send('hello')
Received hello
2
>>> sequence.close()
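
The same trace can be written as a script in Python 3. In this sketch the received values are collected in a list (an assumption made here so that they can be inspected afterwards) instead of being printed.

```python
received = []

def coroutine():
    i = -1
    while True:
        i += 1
        val = yield i            # pauses here; resumes with the sent value (or None)
        received.append(val)

sequence = coroutine()
first = next(sequence)           # runs to the first yield
second = next(sequence)          # resumes with None, loops to the next yield
third = sequence.send('hello')   # resumes with 'hello', loops to the next yield
sequence.close()                 # raises GeneratorExit at the paused yield
```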

Answer 14

There is another yield use and meaning (since Python 3.3):

yield from <expr>

From PEP 380 — Syntax for Delegating to a Subgenerator:

A syntax is proposed for a generator to delegate part of its operations to another generator. This allows a section of code containing ‘yield’ to be factored out and placed in another generator. Additionally, the subgenerator is allowed to return with a value, and the value is made available to the delegating generator.

The new syntax also opens up some opportunities for optimisation when one generator re-yields values produced by another.

In addition, Python 3.5 introduced the async def and await syntax:

async def new_coroutine(data):
   ...
   await blocking_action()

to avoid coroutines being confused with a regular generator (today yield is used in both).
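
A runnable version of that snippet might look like the sketch below. blocking_action is a hypothetical awaitable standing in for real I/O, and asyncio.run (available since Python 3.7) is just one way to drive the coroutine.

```python
import asyncio

async def blocking_action():
    """Hypothetical awaitable standing in for real I/O."""
    await asyncio.sleep(0)       # hand control back to the event loop once
    return 42

async def new_coroutine(data):
    result = await blocking_action()
    return data + result

# Calling new_coroutine(1) only creates a coroutine object; the event loop runs it.
value = asyncio.run(new_coroutine(1))
print(value)
```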


Answer 15

These are all great answers, but they are a bit difficult for newcomers.

I assume you already know the return statement.

As an analogy, return and yield are twins. return means “return and stop”, whereas yield means “return, but continue”.

  1. Try to get a num_list with return:

def num_list(n):
    for i in range(n):
        return i

Run it:

In [5]: num_list(3)
Out[5]: 0

See, you get only a single number rather than a list of them. return never lets the loop carry on: it runs once and exits.

  2. Along comes yield

Replace return with yield:

In [10]: def num_list(n):
    ...:     for i in range(n):
    ...:         yield i
    ...:

In [11]: num_list(3)
Out[11]: <generator object num_list at 0x10327c990>

In [12]: list(num_list(3))
Out[12]: [0, 1, 2]

Now you get all the numbers.

Compared with return, which runs once and stops, yield runs as many times as you planned. You can interpret return as “return one of them” and yield as “return all of them, one at a time”. The object that yield produces is iterable.

  3. As one more step, we can rewrite the yield version with return:

In [15]: def num_list(n):
    ...:     result = []
    ...:     for i in range(n):
    ...:         result.append(i)
    ...:     return result

In [16]: num_list(3)
Out[16]: [0, 1, 2]

That is the core of yield.

The difference between the list that return produces and the object that yield produces is:

You will always get [0, 1, 2] from a list object, however often you read it, but you can retrieve the values from ‘the object yield outputs’ only once. That is why it has a different name, a generator object, as displayed in Out[11]: <generator object num_list at 0x10327c990>.
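The ‘only once’ point can be checked directly. This is a small sketch that reuses the num_list generator from above; the variable names are just for illustration.

```python
def num_list(n):
    for i in range(n):
        yield i

# A list can be read as often as you like.
as_list = [0, 1, 2]
first_pass_list = list(as_list)
second_pass_list = list(as_list)

# A generator object is used up after one complete pass.
gen = num_list(3)
first_pass_gen = list(gen)
second_pass_gen = list(gen)   # nothing left to produce

print(first_pass_list, second_pass_list)
print(first_pass_gen, second_pass_gen)
```

To go through the values again, call num_list(3) again to get a fresh generator object.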

In conclusion, as a metaphor to help grok it:

  • return and yield are twins
  • list and generator are twins

回答 16

以下是一些Python示例,这些示例说明如何实际实现生成器,就像Python没有为其提供语法糖一样:

作为Python生成器:

from itertools import islice

def fib_gen():
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

assert [1, 1, 2, 3, 5] == list(islice(fib_gen(), 5))

使用词法闭包而不是生成器

def ftake(fnext, last):
    return [fnext() for _ in xrange(last)]

def fib_gen2():
    #funky scope due to python2.x workaround
    #for python 3.x use nonlocal
    def _():
        _.a, _.b = _.b, _.a + _.b
        return _.a
    _.a, _.b = 0, 1
    return _

assert [1,1,2,3,5] == ftake(fib_gen2(), 5)

使用对象闭包而不是生成器(因为ClosuresAndObjectsAreEquivalent

class fib_gen3:
    def __init__(self):
        self.a, self.b = 1, 1

    def __call__(self):
        r = self.a
        self.a, self.b = self.b, self.a + self.b
        return r

assert [1,1,2,3,5] == ftake(fib_gen3(), 5)

Here are some Python examples of how to actually implement generators as if Python did not provide syntactic sugar for them:

As a Python generator:

from itertools import islice

def fib_gen():
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

assert [1, 1, 2, 3, 5] == list(islice(fib_gen(), 5))

Using lexical closures instead of generators

def ftake(fnext, last):
    # range works on both Python 2 and 3 (xrange exists only on Python 2)
    return [fnext() for _ in range(last)]

def fib_gen2():
    #funky scope due to python2.x workaround
    #for python 3.x use nonlocal
    def _():
        _.a, _.b = _.b, _.a + _.b
        return _.a
    _.a, _.b = 0, 1
    return _

assert [1,1,2,3,5] == ftake(fib_gen2(), 5)
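The comment in fib_gen2 mentions nonlocal for Python 3. A minimal sketch of what that variant could look like follows; the names fib_gen_nonlocal, advance and take are made up for this illustration and are not part of the answer above.

```python
def fib_gen_nonlocal():
    # State lives in the enclosing scope instead of on function attributes.
    a, b = 0, 1

    def advance():
        nonlocal a, b          # Python 3 only: rebind the enclosing variables
        a, b = b, a + b
        return a

    return advance


def take(fnext, count):
    # Call the closure count times and collect the results.
    return [fnext() for _ in range(count)]


first_five = take(fib_gen_nonlocal(), 5)
print(first_five)
```

Each call to fib_gen_nonlocal() creates its own a and b, so two closures advance independently, just like two generator objects would.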

Using object closures instead of generators (because ClosuresAndObjectsAreEquivalent)

class fib_gen3:
    def __init__(self):
        self.a, self.b = 1, 1

    def __call__(self):
        r = self.a
        self.a, self.b = self.b, self.a + self.b
        return r

assert [1,1,2,3,5] == ftake(fib_gen3(), 5)

回答 17

我打算发布“阅读Beazley的“ Python:基本参考”的第19页,以快速了解生成器”,但是已经有许多其他人发布了不错的描述。

另外,请注意,它们yield可以在协程中用作生成函数的双重功能。尽管它与您的代码段用法不同,(yield)但是可以用作函数中的表达式。当调用者使用该send()方法向该方法发送值时,协程将执行直到(yield)遇到下一条语句。

生成器和协程是设置数据流类型应用程序的一种很酷的方法。我认为有必要了解该yield语句在函数中的其他用法。

I was going to post “read page 19 of Beazley’s ‘Python: Essential Reference’ for a quick description of generators”, but so many others have posted good descriptions already.

Also, note that yield can be used in coroutines as the dual of its use in generator functions. Although it isn’t the same use as in your code snippet, (yield) can be used as an expression in a function. When a caller sends a value to the coroutine using the send() method, the coroutine resumes with that value as the result of the (yield) expression and executes until the next (yield) is encountered.

Generators and coroutines are a cool way to set up data-flow type applications. I thought it would be worthwhile knowing about the other use of the yield statement in functions.
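To make the (yield)-as-expression idea concrete, here is a small sketch of a coroutine that keeps a running average. The function name and the numbers are invented for the example; the only library behaviour relied on is the generator send() method.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        # Pause here; the argument of the caller's send() becomes `value`.
        value = yield average
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)                       # prime: run up to the first yield
results = [averager.send(v) for v in (10, 20, 60)]
print(results)
```

The first next() call is needed because a value can only be sent to a coroutine that is already paused at a yield.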


回答 18

从编程的角度来看,迭代器被实现为thunk

为了将迭代器,生成器和线程池实现为并发执行等,作为重击(也称为匿名函数),人们使用发送到具有分派器的闭包对象的消息,然后分派器对“消息”做出响应。

http://en.wikipedia.org/wiki/Message_passing

next ”是发送给闭包的消息,由“ iter ”创建 ”调用。

有很多方法可以实现此计算。我使用了变异,但是通过返回当前值和下一个生成器,很容易做到无变异。

这是一个使用R6RS结构的演示,但是其语义与Python完全相同。它是相同的计算模型,只需要更改语法就可以用Python重写它。

Welcome to Racket v6.5.0.3.

-> (define gen
     (lambda (l)
       (define yield
         (lambda ()
           (if (null? l)
               'END
               (let ((v (car l)))
                 (set! l (cdr l))
                 v))))
       (lambda(m)
         (case m
           ('yield (yield))
           ('init  (lambda (data)
                     (set! l data)
                     'OK))))))
-> (define stream (gen '(1 2 3)))
-> (stream 'yield)
1
-> (stream 'yield)
2
-> (stream 'yield)
3
-> (stream 'yield)
'END
-> ((stream 'init) '(a b))
'OK
-> (stream 'yield)
'a
-> (stream 'yield)
'b
-> (stream 'yield)
'END
-> (stream 'yield)
'END
->

From a programming viewpoint, the iterators are implemented as thunks.

To implement iterators, generators, and thread pools for concurrent execution, etc. as thunks (also called anonymous functions), one uses messages sent to a closure object, which has a dispatcher, and the dispatcher answers to “messages”.

http://en.wikipedia.org/wiki/Message_passing

“next” is a message sent to a closure that was created by the “iter” call.

There are lots of ways to implement this computation. I used mutation, but it is easy to do it without mutation, by returning the current value and the next yielder.

Here is a demonstration which uses the structure of R6RS, but the semantics is absolutely identical to Python’s. It’s the same model of computation, and only a change in syntax is required to rewrite it in Python.

Welcome to Racket v6.5.0.3.

-> (define gen
     (lambda (l)
       (define yield
         (lambda ()
           (if (null? l)
               'END
               (let ((v (car l)))
                 (set! l (cdr l))
                 v))))
       (lambda(m)
         (case m
           ('yield (yield))
           ('init  (lambda (data)
                     (set! l data)
                     'OK))))))
-> (define stream (gen '(1 2 3)))
-> (stream 'yield)
1
-> (stream 'yield)
2
-> (stream 'yield)
3
-> (stream 'yield)
'END
-> ((stream 'init) '(a b))
'OK
-> (stream 'yield)
'a
-> (stream 'yield)
'b
-> (stream 'yield)
'END
-> (stream 'yield)
'END
->
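For readers who do not read Scheme, here is one possible Python rendering of the same dispatcher closure. It is only a sketch: the names gen, do_yield, init and dispatch, and the use of plain strings as messages, are choices made for this illustration.

```python
def gen(items):
    state = list(items)

    def do_yield():
        nonlocal state
        if not state:
            return 'END'
        # Take the head and keep the tail, like car/cdr with set!.
        value, state = state[0], state[1:]
        return value

    def init(data):
        nonlocal state
        state = list(data)
        return 'OK'

    def dispatch(message):
        # The closure answers "messages" by dispatching on their name.
        if message == 'yield':
            return do_yield()
        if message == 'init':
            return init
        raise ValueError('unknown message: {!r}'.format(message))

    return dispatch


stream = gen([1, 2, 3])
first_run = [stream('yield') for _ in range(4)]
reply = stream('init')(['a', 'b'])
second_run = [stream('yield') for _ in range(3)]
print(first_run, reply, second_run)
```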

回答 19

这是一个简单的示例:

def isPrimeNumber(n):
    print "isPrimeNumber({}) call".format(n)
    if n==1:
        return False
    for x in range(2,n):
        if n % x == 0:
            return False
    return True

def primes (n=1):
    while(True):
        print "loop step ---------------- {}".format(n)
        if isPrimeNumber(n): yield n
        n += 1

for n in primes():
    if n> 10:break
    print "wiriting result {}".format(n)

输出:

loop step ---------------- 1
isPrimeNumber(1) call
loop step ---------------- 2
isPrimeNumber(2) call
loop step ---------------- 3
isPrimeNumber(3) call
wiriting result 3
loop step ---------------- 4
isPrimeNumber(4) call
loop step ---------------- 5
isPrimeNumber(5) call
wiriting result 5
loop step ---------------- 6
isPrimeNumber(6) call
loop step ---------------- 7
isPrimeNumber(7) call
wiriting result 7
loop step ---------------- 8
isPrimeNumber(8) call
loop step ---------------- 9
isPrimeNumber(9) call
loop step ---------------- 10
isPrimeNumber(10) call
loop step ---------------- 11
isPrimeNumber(11) call

我不是Python开发人员,但在我看来 yield保持着程序流程的位置,并且下一个循环从“ yield”位置开始。似乎它正在那个位置等待,就在那之前,在外面返回一个值,下一次继续工作。

这似乎是一种有趣而又不错的能力:D

Here is a simple example:

def isPrimeNumber(n):
    print("isPrimeNumber({}) call".format(n))
    if n == 1:
        return False
    for x in range(2, n):
        if n % x == 0:
            return False
    return True

def primes(n=1):
    while True:
        print("loop step ---------------- {}".format(n))
        if isPrimeNumber(n):
            yield n
        n += 1

for n in primes():
    if n > 10:
        break
    print("writing result {}".format(n))

Output:

loop step ---------------- 1
isPrimeNumber(1) call
loop step ---------------- 2
isPrimeNumber(2) call
writing result 2
loop step ---------------- 3
isPrimeNumber(3) call
writing result 3
loop step ---------------- 4
isPrimeNumber(4) call
loop step ---------------- 5
isPrimeNumber(5) call
writing result 5
loop step ---------------- 6
isPrimeNumber(6) call
loop step ---------------- 7
isPrimeNumber(7) call
writing result 7
loop step ---------------- 8
isPrimeNumber(8) call
loop step ---------------- 9
isPrimeNumber(9) call
loop step ---------------- 10
isPrimeNumber(10) call
loop step ---------------- 11
isPrimeNumber(11) call

I am not a Python developer, but it looks to me as if yield holds the position of the program flow, and the next iteration of the loop starts from the yield position. It seems to wait at that position after handing a value to the outside, and continues working the next time it is asked.

It seems to be an interesting and nice ability :D


回答 20

这是做什么事情的心理yield印象。

我喜欢将线程视为具有堆栈(即使未以这种方式实现)。

调用普通函数时,它将其局部变量放在堆栈上,进行一些计算,然后清除堆栈并返回。再也看不到其局部变量的值。

对于一个yield函数,当其代码开始运行时(即,在调用该函数之后,返回生成器对象,next()然后调用该方法的生成器对象),它类似地将其局部变量放入堆栈中并进行一段时间的计算。但是,当它命中该yield语句时,在清除堆栈的一部分并返回之前,它会对其局部变量进行快照,并将其存储在生成器对象中。它还在代码中写下了当前位置(即特定的yield语句)。

因此,这是生成器挂起的一种冻结函数。

next()随后被调用时,它检索功能的物品入堆栈,重新蓬勃生机。该函数从中断处继续进行计算,而忽略了它刚刚在冷库中度过了一个永恒的事实。

比较以下示例:

def normalFunction():
    return
    if False:
        pass

def yielderFunction():
    return
    if False:
        yield 12

当我们调用第二个函数时,它的行为与第一个函数非常不同。该yield语句可能无法到达,但是如果它存在于任何地方,它将改变我们正在处理的内容的性质。

>>> yielderFunction()
<generator object yielderFunction at 0x07742D28>

调用yielderFunction()不会运行其代码,而是使代码生成器。(yielder为便于阅读,以这样的名称命名可能是个好主意。)

>>> gen = yielderFunction()
>>> dir(gen)
['__class__',
 ...
 '__iter__',    #Returns gen itself, to make it work uniformly with containers
 ...            #when given to a for loop. (Containers return an iterator instead.)
 'close',
 'gi_code',
 'gi_frame',
 'gi_running',
 'next',        #The method that runs the function's body.
 'send',
 'throw']

gi_codegi_frame字段是冻结状态的存储位置。用探索它们dir(..),我们可以确认我们上面的心理模型是可信的。

Here is a mental image of what yield does.

I like to think of a thread as having a stack (even when it’s not implemented that way).

When a normal function is called, it puts its local variables on the stack, does some computation, then clears the stack and returns. The values of its local variables are never seen again.

With a yield function, when its code begins to run (i.e. after the function is called, returning a generator object, whose next() method is then invoked), it similarly puts its local variables onto the stack and computes for a while. But then, when it hits the yield statement, before clearing its part of the stack and returning, it takes a snapshot of its local variables and stores them in the generator object. It also writes down the place where it’s currently up to in its code (i.e. the particular yield statement).

So it’s a kind of frozen function that the generator is hanging onto.

When next() is called subsequently, it retrieves the function’s belongings onto the stack and re-animates it. The function continues to compute from where it left off, oblivious to the fact that it had just spent an eternity in cold storage.

Compare the following examples:

def normalFunction():
    return
    if False:
        pass

def yielderFunction():
    return
    if False:
        yield 12

When we call the second function, it behaves very differently to the first. The yield statement might be unreachable, but if it’s present anywhere, it changes the nature of what we’re dealing with.

>>> yielderFunction()
<generator object yielderFunction at 0x07742D28>

Calling yielderFunction() doesn’t run its code, but makes a generator out of the code. (Maybe it’s a good idea to name such things with the yielder prefix for readability.)

>>> gen = yielderFunction()
>>> dir(gen)
['__class__',
 ...
 '__iter__',    #Returns gen itself, to make it work uniformly with containers
 ...            #when given to a for loop. (Containers return an iterator instead.)
 'close',
 'gi_code',
 'gi_frame',
 'gi_running',
 'next',        #The method that runs the function's body (__next__ in Python 3).
 'send',
 'throw']

The gi_code and gi_frame fields are where the frozen state is stored. Exploring them with dir(..), we can confirm that our mental model above is credible.
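As a rough check of the ‘frozen function’ picture, the paused frame can be inspected while the generator is suspended. This sketch assumes CPython on Python 3; the counter function is made up for the illustration.

```python
def counter():
    total = 0
    for step in range(3):
        total += step
        yield total


gen = counter()
next(gen)
next(gen)

# While suspended, the generator still holds its frame with the locals.
saved_locals = dict(gen.gi_frame.f_locals)
paused_in = gen.gi_code.co_name

remaining = list(gen)                 # run the generator to completion
frame_after_finish = gen.gi_frame     # the frame is released once it finishes

print(saved_locals, paused_in, remaining, frame_after_finish)
```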


回答 21

就像每个答案所建议的那样,yield用于创建序列生成器。它用于动态生成一些序列。例如,在网络上逐行读取文件时,可以使用以下yield功能:

def getNextLines():
   while con.isOpen():
       yield con.read()

您可以在代码中使用它,如下所示:

for line in getNextLines():
    doSomeThing(line)

执行控制转移陷阱

执行foryield时,执行控制将从getNextLines()转移到循环中。因此,每次调用getNextLines()时,都会从上次暂停的位置开始执行。

因此,简而言之,具有以下代码的函数

def simpleYield():
    yield "first time"
    yield "second time"
    yield "third time"
    yield "Now some useful value {}".format(12)

for i in simpleYield():
    print i

将打印

"first time"
"second time"
"third time"
"Now some useful value 12"

As every answer suggests, yield is used for creating a sequence generator, that is, for generating a sequence dynamically. For example, while reading a file line by line over a network, you can use yield as follows:

def getNextLines():
   while con.isOpen():
       yield con.read()

You can use it in your code as follows:

for line in getNextLines():
    doSomeThing(line)
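The snippet above relies on a con object that is not defined in the answer. A self-contained sketch of the same idea follows, with io.StringIO standing in for the network connection; get_next_lines and fake_connection are hypothetical names used only here.

```python
import io


def get_next_lines(connection):
    # Hand out one line at a time instead of reading everything up front.
    while True:
        line = connection.readline()
        if not line:          # an empty string means end of stream
            return
        yield line.rstrip("\n")


# Assumption: a StringIO object plays the role of the open connection.
fake_connection = io.StringIO("alpha\nbeta\ngamma\n")

received = []
for line in get_next_lines(fake_connection):
    received.append(line.upper())

print(received)
```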

Execution Control Transfer gotcha

Execution control is transferred from getNextLines() back to the for loop each time yield is executed. Then, every time the loop asks the generator for its next value, execution resumes from the point where it was paused last time.

Thus in short, a function with the following code

def simpleYield():
    yield "first time"
    yield "second time"
    yield "third time"
    yield "Now some useful value {}".format(12)

for i in simpleYield():
    print(i)

will print

first time
second time
third time
Now some useful value 12

回答 22

一个简单的例子来了解它是什么: yield

def f123():
    for _ in range(4):
        yield 1
        yield 2


for i in f123():
    print (i)

输出为:

1 2 1 2 1 2 1 2

An easy example to understand what yield is:

def f123():
    for _ in range(4):
        yield 1
        yield 2


for i in f123():
    print (i)

The output is (each value is printed on its own line; they are joined here for brevity):

1 2 1 2 1 2 1 2

回答 23

(我下面的回答仅从使用Python生成器的角度讲,而不是生成器机制基础实现,它涉及堆栈和堆操作的一些技巧。)

在python函数中yield使用when 代替a return时,该函数变成了一个特殊的名称generator function。该函数将返回一个generator类型的对象。yield关键字是一个标志,通知Python编译器将特殊对待这样的功能。普通函数将在返回一些值后终止。但是在编译器的帮助下,可以将 generator函数视为可恢复的。也就是说,将恢复执行上下文,并且将从上次运行继续执行。在您显式调用return之前,它将引发StopIteration异常(这也是迭代器协议的一部分),或到达函数的结尾。我发现了很多关于引用的generator,但是这一个从中functional programming perspective最容易消化。

(现在,我想根据我自己的理解来讨论其背后的原理generatoriterator基础。我希望这可以帮助您掌握迭代器和生成器的基本动机。这种概念也出现在其他语言中,例如C#。)

据我了解,当我们要处理一堆数据时,通常先将数据存储在某个地方,然后再逐一处理。但是这种幼稚的方法是有问题的。如果数据量巨大,则预先存储它们是很昂贵的。因此data,为什么不直接存储自身,为什么不metadata间接存储某种形式,即the logic how the data is computed

有两种包装此类元数据的方法。

  1. 面向对象的方法,我们包装了元数据as a class。这就是所谓的iterator实现迭代器协议的人(即__next__()__iter__()方法)。这也是常见的迭代器设计模式
  2. 在功能方法上,我们包装了元数据as a function。这就是所谓的generator function。但是在后台,返回的generator object静态IS-A迭代器仍然存在,因为它也实现了迭代器协议。

无论哪种方式,都会创建一个迭代器,即某个可以为您提供所需数据的对象。OO方法可能有点复杂。无论如何,要使用哪一个取决于您。

(My answer below speaks only from the perspective of using Python generators, not the underlying implementation of the generator mechanism, which involves some tricks of stack and heap manipulation.)

When yield is used instead of return in a Python function, that function is turned into something special called a generator function. That function will return an object of generator type. The yield keyword is a flag that tells the Python compiler to treat such a function specially. A normal function terminates once it returns a value. But with the help of the compiler, a generator function can be thought of as resumable: the execution context is restored and execution continues from where the last run stopped. This goes on until you explicitly execute return or reach the end of the function, either of which raises a StopIteration exception (which is also part of the iterator protocol). I found a lot of references about generators, but this one from the functional programming perspective is the most digestible.

(Now I want to talk about the rationale behind generators and iterators, based on my own understanding. I hope this helps you grasp the essential motivation for both. The concept shows up in other languages as well, such as C#.)

As I understand it, when we want to process a bunch of data, we usually first store the data somewhere and then process the items one by one. But this naive approach is problematic. If the data volume is huge, it’s expensive to store it all beforehand. So instead of storing the data itself directly, why not store some kind of metadata indirectly, i.e. the logic of how the data is computed?

There are two approaches to wrapping such metadata.

  1. The OO approach: we wrap the metadata as a class. This is the so-called iterator, which implements the iterator protocol (i.e. the __next__() and __iter__() methods). This is also the commonly seen iterator design pattern.
  2. The functional approach: we wrap the metadata as a function. This is the so-called generator function. But under the hood, the returned generator object still IS-A iterator because it also implements the iterator protocol.

Either way, an iterator is created, i.e. some object that can give you the data you want. The OO approach may be a bit complex. Anyway, which one to use is up to you.
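To compare the two approaches side by side, here is a sketch of the same countdown written once as an iterator class and once as a generator function. The names Countdown and countdown are invented for the illustration; Python 3 is assumed for __next__.

```python
class Countdown:
    """OO approach: the iteration logic is wrapped in a class."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Functional approach: the same logic as a generator function."""
    while start > 0:
        yield start
        start -= 1


from_class = list(Countdown(3))
from_generator = list(countdown(3))

# The generator object implements the iterator protocol as well.
gen = countdown(3)
same_protocol = iter(gen) is gen and hasattr(gen, "__next__")

print(from_class, from_generator, same_protocol)
```

The generator version needs no explicit state attribute or StopIteration; Python supplies both from the position of the paused function.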


回答 24

总之,该yield语句将您的函数转换为一个工厂,该工厂产生一个称为a的特殊对象,该对象generator环绕原始函数的主体。当generator被重复,直到它到达下一个执行的功能yield后停止执行,计算结果为传递给值yield。它将在每次迭代中重复此过程,直到执行路径退出函数为止。例如,

def simple_generator():
    yield 'one'
    yield 'two'
    yield 'three'

for i in simple_generator():
    print i

简单地输出

one
two
three

动力来自将生成器与计算序列的循环配合使用,生成器每次执行循环都会停止,以“产生”下一个计算结果,这样就可以即时计算列表,而好处是可以存储保存用于特别大的计算

假设您想创建自己的range函数来产生可迭代的数字范围,则可以这样做,

def myRangeNaive(i):
    n = 0
    range = []
    while n < i:
        range.append(n)
        n = n + 1
    return range

像这样使用

for i in myRangeNaive(10):
    print i

但这是低效的,因为

  • 您创建只使用一次的数组(这会浪费内存)
  • 这段代码实际上在该数组上循环了两次!:(

幸运的是,Guido和他的团队足够慷慨地开发生成器,因此我们可以做到这一点。

def myRangeSmart(i):
    n = 0
    while n < i:
       yield n
       n = n + 1
    return

for i in myRangeSmart(10):
    print i

现在,每次迭代时,生成器上的一个称为next()函数的函数都会执行该函数,直到达到“ yield”语句为止,该语句在该语句中停止并“屈服”值或到达函数的末尾。在这种情况下,在第一次调用时,next()执行到yield语句并产生yield’n’,在下一次调用时,它将执行递增语句,跳回到’while’,对其求值,如果为true,它将停止并再次产生yield’n’,它将继续以这种方式,直到while条件返回false且生成器跳到函数的末尾。

In summary, the yield statement transforms your function into a factory that produces a special object called a generator which wraps around the body of your original function. When the generator is iterated, it executes your function until it reaches the next yield then suspends execution and evaluates to the value passed to yield. It repeats this process on each iteration until the path of execution exits the function. For instance,

def simple_generator():
    yield 'one'
    yield 'two'
    yield 'three'

for i in simple_generator():
    print(i)

simply outputs

one
two
three

The power comes from using the generator with a loop that calculates a sequence. The generator executes the loop, stopping each time to ‘yield’ the next result of the calculation. In this way it calculates a list on the fly, the benefit being the memory saved for especially large calculations.

Say you wanted to create your own range function that produces an iterable range of numbers. You could do it like so:

def myRangeNaive(i):
    n = 0
    result = []  # named result so the built-in range is not shadowed
    while n < i:
        result.append(n)
        n = n + 1
    return result

and use it like this:

for i in myRangeNaive(10):
    print(i)

But this is inefficient because

  • You create an array that you only use once (this wastes memory)
  • This code actually loops over that array twice! :(

Luckily Guido and his team were generous enough to develop generators, so we could just do this:

def myRangeSmart(i):
    n = 0
    while n < i:
        yield n
        n = n + 1

for i in myRangeSmart(10):
    print(i)

Now, upon each iteration, a function on the generator called next() executes the function body until it either reaches a ‘yield’ statement, where it stops and ‘yields’ the value, or reaches the end of the function. In this case, on the first call, next() executes up to the yield statement and yields ‘n’. On the next call it executes the increment statement, jumps back to the ‘while’, and evaluates the condition; if it is true, it stops and yields ‘n’ again. It continues that way until the while condition is false and the generator falls through to the end of the function.
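What the for loop does behind the scenes can be reproduced by hand. This is a sketch using the built-in next() on a generator like the one above (renamed my_range_smart here); reaching the end of the function shows up as a StopIteration exception, which the for loop normally swallows for you.

```python
def my_range_smart(limit):
    n = 0
    while n < limit:
        yield n
        n = n + 1


gen = my_range_smart(2)
first = next(gen)     # runs up to the first yield
second = next(gen)    # increments n, re-checks the while, yields again

try:
    next(gen)         # the while condition is now false; the function ends
    finished = False
except StopIteration:
    finished = True

print(first, second, finished)
```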


回答 25

Yield是一个对象

return函数中的A 将返回单个值。

如果您希望函数返回大量值,请使用yield

更重要的yield是,是一个障碍

就像CUDA语言中的barrier一样,它在完成之前不会转移控制权。

也就是说,它将从头开始运行函数中的代码,直到命中为止yield。然后,它将返回循环的第一个值。

然后,其他所有调用将再次运行您在函数中编写的循环,返回下一个值,直到没有任何值可返回为止。

yield produces a generator object

A return in a function will return a single value.

If you want a function to return a huge set of values, use yield.

More importantly, yield is a barrier.

Like a barrier in CUDA, it will not transfer control until it gets completed.

That is, it will run the code in your function from the beginning until it hits yield. Then, it’ll return the first value of the loop.

Then, every other call will run the loop you have written in the function one more time, returning the next value until there isn’t any value to return.


回答 26

许多人使用return而不是yield,但是在某些情况下yield可以更高效,更轻松地工作。

这是yield绝对适合的示例:

返回(函数中)

import random

def return_dates():
    dates = [] # With 'return' you need to create a list then return it
    for i in range(5):
        date = random.choice(["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th"])
        dates.append(date)
    return dates

Yield(以功能计)

def yield_dates():
    for i in range(5):
        date = random.choice(["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th"])
        yield date # 'yield' makes a generator automatically which works
                   # in a similar way. This is much more efficient.

通话功能

dates_list = return_dates()
print(dates_list)
for i in dates_list:
    print(i)

dates_generator = yield_dates()
print(dates_generator)
for i in dates_generator:
    print(i)

这两个函数执行相同的操作,但是yield使用三行而不是五行,并且少担心一个变量。

这是代码的结果:

输出量

如您所见,两个函数都做同样的事情。唯一的区别是return_dates()提供列表和yield_dates()生成器。

现实生活中的例子可能是像逐行读取文件,或者只是想生成一个生成器。

Many people use return rather than yield, but in some cases yield can be more efficient and easier to work with.

Here is an example for which yield is definitely the best fit:

return (in function)

import random

def return_dates():
    dates = [] # With 'return' you need to create a list then return it
    for i in range(5):
        date = random.choice(["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th"])
        dates.append(date)
    return dates

yield (in function)

def yield_dates():
    for i in range(5):
        date = random.choice(["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th"])
        yield date # 'yield' makes a generator automatically which works
                   # in a similar way. This is much more efficient.

Calling functions

dates_list = return_dates()
print(dates_list)
for i in dates_list:
    print(i)

dates_generator = yield_dates()
print(dates_generator)
for i in dates_generator:
    print(i)

Both functions do the same thing, but yield uses three lines instead of five and has one less variable to worry about.

This is the result from the code:

[Image: output of the code above]

As you can see both functions do the same thing. The only difference is return_dates() gives a list and yield_dates() gives a generator.

A real life example would be something like reading a file line by line or if you just want to make a generator.


回答 27

yield就像函数的返回元素一样。不同之处在于,yield元素将功能转换为生成器。生成器的行为就像一个函数,直到“屈服”为止。生成器停止运行,直到下一次调用为止,并从与启动完全相同的点继续运行。您可以通过调用来获得所有“屈服”值的序列list(generator())

yield is like a return statement for a function. The difference is that yield turns the function into a generator. A generator behaves just like a function until something is ‘yielded’. The generator then stops until it is next called, and continues from exactly the point where it stopped. You can get a sequence of all the ‘yielded’ values in one go by calling list(generator()).


回答 28

yield关键字简单地收集返回结果。想想yieldreturn +=

The yield keyword simply collects returning results. Think of yield like return +=


回答 29

这是一种yield基于简单的方法来计算斐波那契数列,解释如下:

def fib(limit=50):
    a, b = 0, 1
    for i in range(limit):
       yield b
       a, b = b, a+b

当您将其输入到REPL中并尝试调用它时,您将得到一个神秘的结果:

>>> fib()
<generator object fib at 0x7fa38394e3b8>

这是因为存在yield向您发送信号的Python,您想要创建一个生成器,即一个按需生成值的对象。

那么,如何生成这些值?这可以通过使用内置函数直接完成,也可以next通过将其提供给使用值的构造间接完成。

使用内置next()函数,您可以直接调用.next/ __next__,强制生成器生成一个值:

>>> g = fib()
>>> next(g)
1
>>> next(g)
1
>>> next(g)
2
>>> next(g)
3
>>> next(g)
5

间接地,如果您提供fibfor循环,list初始化程序,tuple初始化程序或其他任何期望对象生成/产生值的对象,则将“消耗”生成器,直到无法再生成任何值(并且返回) :

results = []
for i in fib(30):       # consumes fib
    results.append(i) 
# can also be accomplished with
results = list(fib(30)) # consumes fib

同样,使用tuple初始化程序:

>>> tuple(fib(5))       # consumes fib
(1, 1, 2, 3, 5)

生成器在延迟方面与功能有所不同。它通过保持其本地状态并允许您在需要时恢复来实现此目的。

首次调用fib时:

f = fib()

Python编译函数,遇到yield关键字,然后简单地将生成器对象返回给您。看起来不是很有帮助。

然后,当您请求它直接或间接生成第一个值时,它将执行找到的所有语句,直到遇到a为止yield,然后返回您提供给它的值yield并暂停。为了更好地说明这一点,让我们使用一些print调用(print "text"在Python 2上用if 代替):

def yielder(value):
    """ This is an infinite generator. Only use next on it """ 
    while 1:
        print("I'm going to generate the value for you")
        print("Then I'll pause for a while")
        yield value
        print("Let's go through it again.")

现在,输入REPL:

>>> gen = yielder("Hello, yield!")

您现在有了一个生成器对象,等待一个命令来生成一个值。使用next并查看打印出的内容:

>>> next(gen) # runs until it finds a yield
I'm going to generate the value for you
Then I'll pause for a while
'Hello, yield!'

未报价的结果是所打印的内容。引用的结果是从返回的结果yieldnext现在再次调用:

>>> next(gen) # continues from yield and runs again
Let's go through it again.
I'm going to generate the value for you
Then I'll pause for a while
'Hello, yield!'

生成器会记住它在此处暂停yield value并从那里继续。打印下一条消息yield,并再次执行搜索以使其暂停的语句(由于while循环)。

Here’s a simple yield-based approach to computing the Fibonacci series, explained:

def fib(limit=50):
    a, b = 0, 1
    for i in range(limit):
       yield b
       a, b = b, a+b

When you enter this into your REPL and then try and call it, you’ll get a mystifying result:

>>> fib()
<generator object fib at 0x7fa38394e3b8>

This is because the presence of yield signaled to Python that you want to create a generator, that is, an object that generates values on demand.

So, how do you generate these values? This can either be done directly by using the built-in function next, or, indirectly by feeding it to a construct that consumes values.

Using the built-in next() function, you directly invoke the generator’s .next method (Python 2) or __next__ method (Python 3), forcing the generator to produce a value:

>>> g = fib()
>>> next(g)
1
>>> next(g)
1
>>> next(g)
2
>>> next(g)
3
>>> next(g)
5

Indirectly, if you provide fib to a for loop, a list initializer, a tuple initializer, or anything else that expects an object that generates/produces values, you’ll “consume” the generator until no more values can be produced by it (and it returns):

results = []
for i in fib(30):       # consumes fib
    results.append(i) 
# can also be accomplished with
results = list(fib(30)) # consumes fib

Similarly, with a tuple initializer:

>>> tuple(fib(5))       # consumes fib
(1, 1, 2, 3, 5)

A generator differs from a function in the sense that it is lazy. It accomplishes this by maintaining its local state and allowing you to resume whenever you need to.

When you first invoke fib by calling it:

f = fib()

Python already noticed the yield keyword when it compiled the function, so the call simply returns a generator object to you without running any of the body. Not very helpful, it seems.

When you then request that it generate the first value, directly or indirectly, it executes all the statements it finds until it encounters a yield; it then yields back the value you supplied to yield and pauses. For an example that better demonstrates this, let’s use some print calls (replace with print "text" if on Python 2):

def yielder(value):
    """ This is an infinite generator. Only use next on it """ 
    while 1:
        print("I'm going to generate the value for you")
        print("Then I'll pause for a while")
        yield value
        print("Let's go through it again.")

Now, enter in the REPL:

>>> gen = yielder("Hello, yield!")

You now have a generator object waiting for a command to generate a value. Use next and see what gets printed:

>>> next(gen) # runs until it finds a yield
I'm going to generate the value for you
Then I'll pause for a while
'Hello, yield!'

The unquoted results are what’s printed. The quoted result is the value given to yield, which next returns to you. Call next again now:

>>> next(gen) # continues from yield and runs again
Let's go through it again.
I'm going to generate the value for you
Then I'll pause for a while
'Hello, yield!'

The generator remembers it was paused at yield value and resumes from there. The next message is printed, and the search for a yield statement to pause at is performed again (due to the while loop).