Understanding generators in Python

Question: Understanding generators in Python

I am reading the Python Cookbook at the moment and am currently looking at generators. I’m finding it hard to get my head round them.

As I come from a Java background, is there a Java equivalent? The book was speaking about ‘Producer / Consumer’, however when I hear that I think of threading.

What is a generator and why would you use it? Without quoting any books, obviously (unless you can find a decent, simplistic answer direct from a book). Perhaps with examples, if you’re feeling generous!


Answer 0

Note: this post assumes Python 3.x syntax.

A generator is simply a function which returns an object on which you can call next, such that for every call it returns some value, until it raises a StopIteration exception, signaling that all values have been generated. Such an object is called an iterator.

Normal functions return a single value using return, just like in Java. In Python, however, there is an alternative, called yield. Using yield anywhere in a function makes it a generator. Observe this code:

>>> def myGen(n):
...     yield n
...     yield n + 1
... 
>>> g = myGen(6)
>>> next(g)
6
>>> next(g)
7
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, myGen(n) is a function which yields n and n + 1. Every call to next yields a single value, until all values have been yielded. for loops call next in the background, thus:

>>> for n in myGen(6):
...     print(n)
... 
6
7
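To make "for loops call next in the background" concrete, here is a rough sketch of what a for loop does with a generator. The helper name manual_for is made up for illustration, and my_gen simply repeats the myGen example; the real loop is implemented inside the interpreter, so treat this as an approximation rather than the actual mechanism.

```python
def my_gen(n):
    """Same two-value generator as myGen above."""
    yield n
    yield n + 1


def manual_for(iterable):
    """Roughly what `for x in iterable:` does; collects values instead of printing."""
    collected = []
    iterator = iter(iterable)      # a generator object is its own iterator
    while True:
        try:
            value = next(iterator)
        except StopIteration:      # raised once the generator has finished
            break
        collected.append(value)
    return collected


print(manual_for(my_gen(6)))       # [6, 7]
```

The for loop swallows the StopIteration for you, which is why you never see the traceback when looping.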

Likewise there are generator expressions, which provide a means to succinctly describe certain common types of generators:

>>> g = (n for n in range(3, 5))
>>> next(g)
3
>>> next(g)
4
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Note that generator expressions are much like list comprehensions:

>>> lc = [n for n in range(3, 5)]
>>> lc
[3, 4]

Observe that a generator object is generated once, but its code is not run all at once. Only calls to next actually execute (part of) the code. Execution of the code in a generator stops once a yield statement has been reached, upon which it returns a value. The next call to next then causes execution to continue in the state in which the generator was left after the last yield. This is a fundamental difference with regular functions: those always start execution at the “top” and discard their state upon returning a value.
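A small sketch can make this pause-and-resume behaviour visible. The generator name traced and the log list are illustrative and not part of the original answer; the log records which parts of the body have run at each point.

```python
log = []


def traced():
    log.append("body started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("body finished")


g = traced()                       # creates the generator object; no body code runs yet
snapshot_after_create = list(log)

first = next(g)                    # runs the body up to the first yield
snapshot_after_first = list(log)

second = next(g)                   # resumes right after the first yield
rest = list(g)                     # runs the remainder; nothing more is yielded

print(snapshot_after_create)       # []
print(snapshot_after_first)        # ['body started']
print(log)
```

Note that creating the generator leaves the log empty: the body only starts on the first call to next.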

There are more things to be said about this subject. It is e.g. possible to send data back into a generator (reference). But that is something I suggest you do not look into until you understand the basic concept of a generator.

Now you may ask: why use generators? There are a few good reasons:

  • Certain concepts can be described much more succinctly using generators.
  • Instead of creating a function which returns a list of values, one can write a generator which generates the values on the fly. This means that no list needs to be constructed, meaning that the resulting code is more memory efficient. In this way one can even describe data streams which would simply be too large to fit in memory.
  • Generators allow for a natural way to describe infinite streams. Consider for example the Fibonacci numbers:

    >>> def fib():
    ...     a, b = 0, 1
    ...     while True:
    ...         yield a
    ...         a, b = b, a + b
    ... 
    >>> import itertools
    >>> list(itertools.islice(fib(), 10))
    [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
    

    This code uses itertools.islice to take a finite number of elements from an infinite stream. You are advised to have a good look at the functions in the itertools module, as they are essential tools for writing advanced generators with great ease.
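As a further sketch of the itertools point, the same fib generator can be combined with itertools.takewhile and a generator expression. The variable names below are made up for illustration; only itertools.islice appears in the original answer.

```python
import itertools


def fib():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Take Fibonacci numbers only while they stay below 100.
below_100 = list(itertools.takewhile(lambda x: x < 100, fib()))

# Lazily keep the even ones, then take the first five of those.
evens = (x for x in fib() if x % 2 == 0)
first_five_even = list(itertools.islice(evens, 5))

print(below_100)
print(first_five_even)
```

Neither pipeline ever builds the infinite sequence; each stage pulls one value at a time from the stage before it.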


  About Python 2: in the above examples next is a built-in function which calls the method __next__ on the given object. In Python 2 that method is named next instead, so code for Python <= 2.5 calls o.next() rather than next(o). The built-in next() was added in Python 2.6, where it calls .next for you, so from 2.6 onward you need not use the following:

>>> g = (n for n in range(3, 5))
>>> g.next()
3

Answer 1

A generator is effectively a function that returns (data) before it is finished, but it pauses at that point, and you can resume the function at that point.

>>> def myGenerator():
...     yield 'These'
...     yield 'words'
...     yield 'come'
...     yield 'one'
...     yield 'at'
...     yield 'a'
...     yield 'time'

>>> myGeneratorInstance = myGenerator()
>>> next(myGeneratorInstance)
'These'
>>> next(myGeneratorInstance)
'words'

and so on. The (or one) benefit of generators is that because they deal with data one piece at a time, you can deal with large amounts of data; with lists, excessive memory requirements could become a problem. Generators, just like lists, are iterable, so they can be used in the same ways:

>>> for word in myGenerator():  # a fresh generator; the instance above is already partly consumed
...     print(word)
...
These
words
come
one
at
a
time

Note that generators provide another way to deal with infinity, for example

>>> from time import gmtime, strftime
>>> def myGen():
...     while True:
...         yield strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime())    
>>> myGeneratorInstance = myGen()
>>> next(myGeneratorInstance)
'Thu, 28 Jun 2001 14:17:15 +0000'
>>> next(myGeneratorInstance)
'Thu, 28 Jun 2001 14:18:02 +0000'

The generator encapsulates an infinite loop, but this isn’t a problem because you only get one answer each time you ask for it.


Answer 2

First of all, the term generator originally was somewhat ill-defined in Python, leading to lots of confusion. You probably mean iterators and iterables (see here). Then in Python there are also generator functions (which return a generator object), generator objects (which are iterators) and generator expressions (which are evaluated to a generator object).

According to the glossary entry for generator it seems that the official terminology is now that generator is short for “generator function”. In the past the documentation defined the terms inconsistently, but fortunately this has been fixed.

It might still be a good idea to be precise and avoid the term “generator” without further specification.
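The distinction between these terms can be checked from Python itself. The following is a minimal sketch; the names gen_func, gen_obj and gen_expr are made up for illustration, while inspect.isgeneratorfunction, types.GeneratorType and collections.abc.Iterator are standard-library names.

```python
import inspect
import types
from collections.abc import Iterator


def gen_func():                      # a generator function
    yield 1


gen_obj = gen_func()                 # calling it returns a generator object
gen_expr = (x for x in range(3))     # a generator expression evaluates to one too

print(inspect.isgeneratorfunction(gen_func))        # True
print(isinstance(gen_func, types.GeneratorType))    # False: the function is not the object
print(isinstance(gen_obj, types.GeneratorType))     # True
print(isinstance(gen_expr, types.GeneratorType))    # True
print(isinstance(gen_obj, Iterator))                # True: generator objects are iterators
print(iter(gen_obj) is gen_obj)                     # True: they are their own iterator
```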


Answer 3

Generators could be thought of as shorthand for creating an iterator. They behave like a Java Iterator. Example:

>>> g = (x for x in range(10))
>>> g
<generator object <genexpr> at 0x7fac1c1e6aa0>
>>> next(g)
0
>>> next(g)
1
>>> next(g)
2
>>> list(g)   # force iterating the rest
[3, 4, 5, 6, 7, 8, 9]
>>> next(g)   # iterator is at the end; calling next again will raise
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Hope this helps/is what you are looking for.

Update:

As many other answers are showing, there are different ways to create a generator. You can use the parentheses syntax as in my example above, or you can use yield. Another interesting feature is that generators can be “infinite” — iterators that don’t stop:

>>> def infinite_gen():
...     n = 0
...     while True:
...         yield n
...         n = n + 1
... 
>>> g = infinite_gen()
>>> next(g)
0
>>> next(g)
1
>>> next(g)
2
>>> next(g)
3
...
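To illustrate the "shorthand for creating an iterator" idea, here is a sketch comparing a hand-written iterator class, which is close to what implementing Java's Iterator interface looks like, with the equivalent generator function. Both CountUpTo and count_up_to are hypothetical names invented for this comparison.

```python
class CountUpTo:
    """Hand-written iterator: explicit state plus __iter__ and __next__."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


def count_up_to(limit):
    """Generator version: same behaviour, with the state kept in local variables."""
    current = 0
    while current < limit:
        yield current
        current += 1


print(list(CountUpTo(5)))      # [0, 1, 2, 3, 4]
print(list(count_up_to(5)))    # [0, 1, 2, 3, 4]
```

The generator version leaves the bookkeeping (remembering where you were, raising StopIteration at the end) to Python.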

Answer 4

There is no Java equivalent.

Here is a bit of a contrived example:

#!/usr/bin/env python3
def mygen(n):
    x = 0
    while x < n:
        x = x + 1
        if x % 3 == 0:
            yield x  # hand a multiple of 3 back to the caller and pause here

for a in mygen(100):
    print(a)

There is a loop in the generator that counts x from 1 to n, and whenever x is a multiple of 3, the generator yields it.

During each iteration of the for loop the generator is executed. If it is the first time the generator executes, it starts at the beginning, otherwise it continues from the previous time it yielded.


Answer 5

I like to describe generators, to those with a decent background in programming languages and computing, in terms of stack frames.

In many languages, there is a stack on top of which is the current stack “frame”. The stack frame includes space allocated for variables local to the function including the arguments passed in to that function.

When you call a function, the current point of execution (the “program counter” or equivalent) is pushed onto the stack, and a new stack frame is created. Execution then transfers to the beginning of the function being called.

With regular functions, at some point the function returns a value, and the stack is “popped”. The function’s stack frame is discarded and execution resumes at the previous location.

When a function is a generator, it can return a value without the stack frame being discarded, using the yield statement. The values of local variables and the program counter within the function are preserved. This allows the generator to be resumed at a later time, with execution continuing from the yield statement, and it can execute more code and return another value.
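The preserved frame can be observed with the standard inspect module. This is a minimal sketch under the assumption of Python 3.3 or later (where inspect.getgeneratorstate and inspect.getgeneratorlocals exist); the generator running_total is a made-up example.

```python
import inspect


def running_total():
    total = 0
    for step in (10, 20, 30):
        total += step
        yield total


g = running_total()
state_before = inspect.getgeneratorstate(g)         # created, body not started

first = next(g)                                     # run to the first yield
state_suspended = inspect.getgeneratorstate(g)      # paused, frame kept alive
locals_after_first = dict(inspect.getgeneratorlocals(g))

remaining = list(g)                                 # resume until the body ends
state_after = inspect.getgeneratorstate(g)          # frame has been discarded

print(state_before, state_suspended, state_after)
print(locals_after_first)
```

While the generator is suspended, its local variables (here total and step) are still there; they only go away once the generator finishes.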

Before Python 2.5 this was all generators did. Python 2.5 added the ability to pass values back in to the generator as well. In doing so, the passed-in value is available as an expression resulting from the yield statement which had temporarily returned control (and a value) from the generator.
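As a sketch of passing values back in, here is a running-average generator that receives numbers through send. The name averager is illustrative and not from the original answer; the send method itself is the standard generator method added in Python 2.5.

```python
def averager():
    """Yield the running average of every value sent in so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average      # pauses here; resumes with the sent-in value
        total += value
        count += 1
        average = total / count


avg = averager()
next(avg)                          # prime it: run to the first yield (which yields None)
a1 = avg.send(10)
a2 = avg.send(20)
a3 = avg.send(30)

print(a1, a2, a3)                  # 10.0 15.0 20.0
```

The generator has to be advanced to its first yield before anything can be sent, which is what the initial next call does.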

The key advantage to generators is that the “state” of the function is preserved, unlike with regular functions where each time the stack frame is discarded, you lose all that “state”. A secondary advantage is that some of the function call overhead (creating and deleting stack frames) is avoided, though this is usually a minor advantage.


Answer 6

The only thing I can add to Stephan202’s answer is a recommendation that you take a look at David Beazley’s PyCon ’08 presentation “Generator Tricks for Systems Programmers,” which is the best single explanation of the how and why of generators that I’ve seen anywhere. This is the thing that took me from “Python looks kind of fun” to “This is what I’ve been looking for.” It’s at http://www.dabeaz.com/generators/.


Answer 7

It helps to make a clear distinction between the function foo, and the generator foo(n):

def foo(n):
    yield n
    yield n+1

foo is a function. foo(6) is a generator object.

The typical way to use a generator object is in a loop:

for n in foo(6):
    print(n)

The loop prints

# 6
# 7

Think of a generator as a resumable function.

yield behaves like return in the sense that values that are yielded get “returned” by the generator. Unlike return, however, the next time the generator gets asked for a value, the generator’s function, foo, resumes where it left off — after the last yield statement — and continues to run until it hits another yield statement.

Behind the scenes, when you call bar=foo(6), the generator object bar is created with a __next__ method (named next in Python 2).

You can call it yourself to retrieve values yielded from foo:

next(bar)    # Works in Python 2.6+ and Python 3.x
bar.next()   # Works in Python 2 only; removed in Python 3. Use next() if possible.

When foo ends (and there are no more yielded values), calling next(bar) raises a StopIteration exception.
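The end-of-generator behaviour can be handled in two ways, sketched below: catch StopIteration, or give next a default value to return instead of raising. The variable names exhausted and fallback are made up for illustration; the two-argument form of next is part of the built-in.

```python
def foo(n):
    yield n
    yield n + 1


bar = foo(6)
values = [next(bar), next(bar)]            # consumes both yielded values

try:
    next(bar)
    exhausted = False
except StopIteration:                      # foo has ended
    exhausted = True

fallback = next(bar, "no more values")     # a default suppresses the exception

print(values, exhausted, fallback)
```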


Answer 8

This post will use Fibonacci numbers as a tool to build up to explaining the usefulness of Python generators.

This post will feature both C++ and Python code.

Fibonacci numbers are defined as the sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ….

Or in general:

F0 = 0
F1 = 1
Fn = Fn-1 + Fn-2

This can be transferred into a C++ function extremely easily:

size_t Fib(size_t n)
{
    //Fib(0) = 0
    if(n == 0)
        return 0;

    //Fib(1) = 1
    if(n == 1)
        return 1;

    //Fib(N) = Fib(N-2) + Fib(N-1)
    return Fib(n-2) + Fib(n-1);
}

But if you want to print the first six Fibonacci numbers, you will be recalculating a lot of the values with the above function.

For example: Fib(3) = Fib(2) + Fib(1), but Fib(2) also recalculates Fib(1). The higher the value you want to calculate, the worse off you will be.

So one may be tempted to rewrite the above by keeping track of the state in main.

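#include <stddef.h>   // size_t
#include <iostream>

// Not supported for the first two elements of Fib
size_t GetNextFib(size_t &pp, size_t &p)
{
    size_t result = pp + p;  // size_t rather than int, to match the return type
    pp = p;
    p = result;
    return result;
}

int main(int argc, char *argv[])
{
    size_t pp = 0;
    size_t p = 1;
    std::cout << "0 " << "1 ";
    // Four more values after the hard-coded 0 and 1 give the first six numbers
    for(size_t i = 0; i < 4; ++i)
    {
        size_t fibI = GetNextFib(pp, p);
        std::cout << fibI << " ";
    }
    return 0;
}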

But this is very ugly, and it complicates our logic in main. It would be better to not have to worry about state in our main function.

We could return a vector of values and use an iterator to iterate over that set of values, but this requires a lot of memory all at once for a large number of return values.

So back to our old approach, what happens if we wanted to do something else besides print the numbers? We’d have to copy and paste the whole block of code in main and change the output statements to whatever else we wanted to do. And if you copy and paste code, then you should be shot. You don’t want to get shot, do you?

To solve these problems, and to avoid getting shot, we may rewrite this block of code using a callback function. Every time a new Fibonacci number is encountered, we would call the callback function.

#include <stddef.h>   // size_t
#include <iostream>

// Calls FoundNewFibCallback once for each of the first max Fibonacci numbers
void GetFibNumbers(size_t max, void(*FoundNewFibCallback)(size_t))
{
    if(max-- == 0) return;
    FoundNewFibCallback(0);
    if(max-- == 0) return;
    FoundNewFibCallback(1);

    size_t pp = 0;
    size_t p = 1;
    for(;;)
    {
        if(max-- == 0) return;
        size_t result = pp + p;  // size_t rather than int, to match the callback's parameter
        pp = p;
        p = result;
        FoundNewFibCallback(result);
    }
}

void foundNewFib(size_t fibI)
{
    std::cout << fibI << " ";
}

int main(int argc, char *argv[])
{
    GetFibNumbers(6, foundNewFib);
    return 0;
}

This is clearly an improvement: your logic in main is not as cluttered, and you can do anything you want with the Fibonacci numbers by simply defining new callbacks.

But this is still not perfect. What if you wanted to only get the first two Fibonacci numbers, and then do something, then get some more, then do something else?

Well, we could go on like we have been, and we could start adding state again into main, allowing GetFibNumbers to start from an arbitrary point. But this will further bloat our code, and it already looks too big for a simple task like printing Fibonacci numbers.

We could implement a producer and consumer model via a couple of threads. But this complicates the code even more.

Instead let’s talk about generators.

Python has a very nice language feature, called generators, that solves problems like these.

A generator allows you to execute a function, stop at an arbitrary point, and then continue again where you left off, returning a value each time.

Consider the following code that uses a generator:

def fib():
    pp, p = 0, 1
    while True:
        yield pp
        pp, p = p, pp+p

g = fib()
for i in range(6):
    print(next(g), end=" ")  # print each value so the output below actually appears

Which gives us the results:

0 1 1 2 3 5

The yield statement is used in conjunction with Python generators. It saves the state of the function and returns the yielded value. The next time you call the next() function on the generator, it will continue where the yield left off.

This is far cleaner than the callback function code. We have cleaner code, smaller code, and, not to mention, much more functional code (Python allows arbitrarily large integers).
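The earlier wish, to get the first two Fibonacci numbers, do something, and then get some more, falls out naturally with a generator. The sketch below assumes Python 3 and uses itertools.islice, which does not appear in the original post; the variable names are illustrative.

```python
from itertools import islice


def fib():
    pp, p = 0, 1
    while True:
        yield pp
        pp, p = p, pp + p


g = fib()
first_two = list(islice(g, 2))         # take two values
# ... do something else here; g simply stays paused ...
next_four = list(islice(g, 4))         # carry on from where it stopped

# Skip the first 100 values and take the next one, i.e. F(100).
# It no longer fits in 64 bits, which Python integers handle transparently.
big = next(islice(fib(), 100, None))

print(first_two, next_four)
print(big)
```

No state leaks into the calling code: the generator object g is the only thing that has to be kept around between the two steps.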


Answer 9

I believe the first appearance of iterators and generators was in the Icon programming language, about 20 years ago.

You may enjoy the Icon overview, which lets you wrap your head around them without concentrating on the syntax (since Icon is a language you probably don’t know, and Griswold was explaining the benefits of his language to people coming from other languages).

After reading just a few paragraphs there, the utility of generators and iterators might become more apparent.


Answer 10

Experience with list comprehensions has shown their widespread utility throughout Python. However, many of the use cases do not need to have a full list created in memory. Instead, they only need to iterate over the elements one at a time.

For instance, the following summation code will build a full list of squares in memory, iterate over those values, and, when the reference is no longer needed, delete the list:

sum([x*x for x in range(10)])

Memory is conserved by using a generator expression instead:

sum(x*x for x in range(10))
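The memory difference can be glimpsed with sys.getsizeof. This is a rough sketch: getsizeof reports only the container's own footprint (for the list, its array of references, not the integers themselves), so the numbers understate the list's real cost, and exact sizes vary by Python version. The variable names are made up for illustration.

```python
import sys

n = 100_000
squares_list = [x * x for x in range(n)]    # all n results exist at once
squares_gen = (x * x for x in range(n))     # nothing computed yet

list_size = sys.getsizeof(squares_list)     # grows with n
gen_size = sys.getsizeof(squares_gen)       # small, independent of n

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)           # values produced one at a time

print(list_size, gen_size)
print(total_from_list == total_from_gen)
```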

Similar benefits are conferred on constructors for container objects:

s = set(word for line in page for word in line.split())
d = dict((k, func(k)) for k in keylist)

Generator expressions are especially useful with functions like sum(), min(), and max() that reduce an iterable input to a single value:

max(len(line)  for line in file  if line.strip())


Answer 11

I put up this piece of code which explains 3 key concepts about generators:

def numbers():
    for i in range(10):
        yield i

gen = numbers()  # this line only returns a generator object; it does not run the code defined inside numbers

for i in gen:  # we iterate over the generator and the values are printed
    print(i)

# the generator is now empty

for i in gen:  # so this for block does not print anything
    print(i)
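If the values are needed more than once, there are two common options, sketched here with a shorter range for brevity: call the generator function again to get a fresh generator, or store the values in a list. The variable names are illustrative.

```python
def numbers():
    for i in range(3):
        yield i


gen = numbers()
first_pass = list(gen)            # consumes the generator
second_pass = list(gen)           # already exhausted, so this is empty

fresh_pass = list(numbers())      # a new call gives a new generator

stored = list(numbers())          # or keep the values if they fit in memory
twice = [x for x in stored] + [x for x in stored]

print(first_pass, second_pass, fresh_pass, twice)
```

Storing the values gives up the memory advantage, so it only makes sense when the sequence is small and finite.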