Python empty generator function

Question: Python empty generator function

In Python, one can easily define a generator function by putting the yield keyword in the function’s body, for example:

def gen():
    for i in range(100):
        yield i

How can I define a generator function that yields no values (generates 0 values)? The following code doesn’t work, since Python cannot know that it is supposed to be a generator and not a normal function:

def empty():
    pass

I could do something like

def empty():
    if False:
        yield None

But that would be very ugly. Is there any nice way to realize an empty iterator function?
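
To make the failure concrete, here is a minimal sketch (the names `not_a_generator` and `ugly_empty` are made up for illustration). It shows that a body without yield gives an ordinary function whose call returns None, while an unreachable yield is already enough to produce a generator function:

```python
import inspect

def not_a_generator():
    pass

def ugly_empty():
    if False:
        yield None

# Without a yield in the body, this is a plain function returning None.
result = not_a_generator()
is_gen_plain = inspect.isgeneratorfunction(not_a_generator)

# The unreachable yield still makes the compiler treat this as a generator.
is_gen_ugly = inspect.isgeneratorfunction(ugly_empty)
values = list(ugly_empty())

# Iterating the plain function's result fails, because None is not iterable.
error = None
try:
    list(not_a_generator())
except TypeError as exc:
    error = exc
```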


Answer 0

You can use return once in a generator; it stops iteration without yielding anything, and thus provides an explicit alternative to letting the function run out of scope. So use yield to turn the function into a generator, but precede it with return to terminate the generator before yielding anything.

>>> def f():
...     return
...     yield
... 
>>> list(f())
[]

I’m not sure it’s that much better than what you have; it just replaces a no-op if statement with a no-op yield statement. But it is more idiomatic. Note that using yield on its own doesn’t work:

>>> def f():
...     yield
... 
>>> list(f())
[None]
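
The two sessions above can also be checked side by side. This is a small sketch, with `yields_none` as an illustrative name, showing that the return-then-yield version is exhausted on the very first next() call while the bare yield produces a single None:

```python
import inspect

def empty():
    return
    yield  # never reached; its presence alone makes this a generator function

def yields_none():
    yield  # a bare yield produces None once

gen = empty()
is_generator = inspect.isgenerator(gen)

# The first next() call already raises StopIteration.
try:
    next(gen)
    exhausted_immediately = False
except StopIteration:
    exhausted_immediately = True

one_value = list(yields_none())
```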

Why not just use iter(())?

This question asks specifically about an empty generator function. For that reason, I take it to be a question about the internal consistency of Python’s syntax, rather than a question about the best way to create an empty iterator in general.

If the question is actually about the best way to create an empty iterator, then you might agree with Zectbumo about using iter(()) instead. However, it’s important to observe that iter(()) doesn’t return a function! It directly returns an empty iterator. Suppose you’re working with an API that expects a callable that returns an iterable each time it’s called, just like an ordinary generator function. You’ll have to do something like this:

def empty():
    return iter(())

(Credit should go to Unutbu for giving the first correct version of this answer.)
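
As a rough illustration of that situation, assume a hypothetical API named `consume_twice` that calls its argument whenever it needs a fresh iterable. Nothing here comes from a real library; it is only a sketch of why the wrapper function is needed:

```python
def consume_twice(factory):
    """Hypothetical API: calls factory() each time it needs a fresh iterable."""
    return [list(factory()) for _ in range(2)]

def empty():
    return iter(())

# The wrapper works, because it can be called again for each pass.
wrapped = consume_twice(empty)

# Passing the iterator itself fails, because an iterator is not callable.
try:
    consume_twice(iter(()))
except TypeError:
    direct_call_failed = True
else:
    direct_call_failed = False
```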

Now, you may find the above clearer, but I can imagine situations in which it would be less clear. Consider this example of a long list of (contrived) generator function definitions:

def zeros():
    while True:
        yield 0

def ones():
    while True:
        yield 1

...

At the end of that long list, I’d rather see something with a yield in it, like this:

def empty():
    return
    yield

or, in Python 3.3 and above (as suggested by DSM), this:

def empty():
    yield from ()

The presence of the yield keyword makes it clear at the briefest glance that this is just another generator function, exactly like all the others. It takes a bit more time to see that the iter(()) version is doing the same thing.

It’s a subtle difference, but I honestly think the yield-based functions are more readable and maintainable.
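
For a quick comparison of the three spellings discussed in this answer, the sketch below (function names chosen here for illustration) confirms that all of them produce nothing when iterated, but that only the yield-based ones are reported as generator functions and return objects with generator methods such as send:

```python
import inspect

def empty_return():
    return
    yield

def empty_yield_from():
    yield from ()

def empty_iter():
    return iter(())

variants = [empty_return, empty_yield_from, empty_iter]

# All three behave the same when simply iterated.
all_empty = all(list(f()) == [] for f in variants)

# Only the yield-based versions are true generator functions.
flags = {f.__name__: inspect.isgeneratorfunction(f) for f in variants}

# A tuple iterator has no send(); generator objects do.
has_send = {f.__name__: hasattr(f(), "send") for f in variants}
```

Whether that difference matters depends on the caller; code that only loops over the result will not notice it.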

See also this great answer from user3840170 that uses dis to show another reason why this approach is preferable: it emits the fewest instructions when compiled.


Answer 1

iter(())

You don’t require a generator. C’mon guys!


Answer 2

Python 3.3 (because I’m on a yield from kick, and because @senderle stole my first thought):

>>> def f():
...     yield from ()
... 
>>> list(f())
[]

But I have to admit, I’m having a hard time coming up with a use case for this for which iter([]) or (x)range(0) wouldn’t work equally well.
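
A short sketch of that point, using Python 3 (so range rather than xrange): all three alternatives give an empty result when consumed. One difference that may matter in some code is that range(0) is an iterable but not an iterator, so next() cannot be called on it directly.

```python
def f():
    yield from ()

candidates = [f(), iter([]), range(0)]
results = [list(c) for c in candidates]

# range(0) is iterable but not an iterator, so next() rejects it.
try:
    next(range(0))
except TypeError:
    range_is_iterator = False
else:
    range_is_iterator = True
```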


Answer 3

Another option is:

(_ for _ in ())
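
Note that this expression evaluates to a generator object, not a generator function. A minimal sketch follows; the lambda wrapper named `make_empty` is an assumption added here for the case where a callable is required:

```python
import types

empty_gen = (_ for _ in ())
is_generator = isinstance(empty_gen, types.GeneratorType)
contents = list(empty_gen)

# Assumption: if a callable is needed, the expression can be wrapped in a lambda.
make_empty = lambda: (_ for _ in ())
```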

Answer 4

Must it be a generator function? If not, how about

def f():
    return iter(())

Answer 5

The “standard” way to make an empty iterator appears to be iter([]). I suggested making [] the default argument to iter(); this was rejected with good arguments, see http://bugs.python.org/issue25215 – Jurjen
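
For reference, a small sketch of the current behaviour: calling iter() with no argument raises TypeError, so the empty container has to be passed explicitly.

```python
# iter() requires at least one argument.
try:
    iter()
except TypeError:
    needs_argument = True
else:
    needs_argument = False

# Passing an empty list explicitly gives an empty iterator.
empty_iterator = iter([])
remaining = list(empty_iterator)
```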


Answer 6

Like @senderle said, use this:

def empty():
    return
    yield

I’m writing this answer mostly to share another justification for it.

One reason for choosing this solution above the others is that it is optimal as far as the interpreter is concerned.

>>> import dis
>>> def empty_yield_from():
...     yield from ()
... 
>>> def empty_iter():
...     return iter(())
... 
>>> def empty_return():
...     return
...     yield
...
>>> def noop():
...     pass
...
>>> dis.dis(empty_yield_from)
  2           0 LOAD_CONST               1 (())
              2 GET_YIELD_FROM_ITER
              4 LOAD_CONST               0 (None)
              6 YIELD_FROM
              8 POP_TOP
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE
>>> dis.dis(empty_iter)
  2           0 LOAD_GLOBAL              0 (iter)
              2 LOAD_CONST               1 (())
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
>>> dis.dis(empty_return)
  2           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE
>>> dis.dis(noop)
  2           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE

As we can see, empty_return has exactly the same bytecode as a regular empty function; the rest perform a number of other operations that don’t change the behaviour anyway. The only difference between empty_return and noop is that the former has the generator flag set:

>>> dis.show_code(noop)
Name:              noop
Filename:          <stdin>
Argument count:    0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals:  0
Stack size:        1
Flags:             OPTIMIZED, NEWLOCALS, NOFREE
Constants:
   0: None
>>> dis.show_code(empty_return)
Name:              empty_return
Filename:          <stdin>
Argument count:    0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals:  0
Stack size:        1
Flags:             OPTIMIZED, NEWLOCALS, GENERATOR, NOFREE
Constants:
   0: None

Of course, the strength of this argument is very dependent on the particular implementation of Python in use; a sufficiently smart alternative interpreter may notice that the other operations amount to nothing useful and optimise them out. However, even if such optimisations are present, they require the interpreter to spend time performing them and to safeguard against optimisation assumptions being broken, like the iter identifier at global scope being rebound to something else (even though that would most likely indicate a bug if it actually happened). In the case of empty_return there is simply nothing to optimise, so even the relatively naïve CPython will not waste time on any spurious operations.
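
The generator flag can also be checked programmatically instead of reading the dis.show_code output. The sketch below tests the code object's co_flags against inspect.CO_GENERATOR. The exact bytecode listings shown above may differ on other CPython versions, so this flag check is likely the more portable observation:

```python
import inspect

def empty_return():
    return
    yield

def noop():
    pass

# CO_GENERATOR is the code-object flag that marks a generator function.
gen_flag_set = bool(empty_return.__code__.co_flags & inspect.CO_GENERATOR)
noop_flag_set = bool(noop.__code__.co_flags & inspect.CO_GENERATOR)
```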


Answer 7

generator = (item for item in [])