What do (lambda) function closures capture?

Question: What do (lambda) function closures capture?

Recently I started playing around with Python and I came across something peculiar in the way closures work. Consider the following code:

adders=[None, None, None, None]

for i in [0,1,2,3]:
   adders[i]=lambda a: i+a

print(adders[1](3))

It builds a simple array of functions, each of which takes a single input and returns that input plus a number. The functions are constructed in a for loop where the loop variable i runs from 0 to 3. For each of these numbers a lambda function is created which captures i and adds it to the function's input. The last line calls the second lambda function with 3 as a parameter. To my surprise the output was 6.

I expected a 4. My reasoning was: in Python everything is an object and thus every variable is essentially a pointer to it. When creating the lambda closures for i, I expected each one to store a pointer to the integer object currently pointed to by i. That means that when i is assigned a new integer object, it shouldn't affect the previously created closures. Sadly, inspecting the adders array within a debugger shows that it does. All lambda functions refer to the last value of i, 3, which results in adders[1](3) returning 6.

Which makes me wonder about the following:

  • What do the closures capture exactly?
  • What is the most elegant way to convince the lambda functions to capture the current value of i in a way that will not be affected when i changes its value?
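
For anyone reproducing this on Python 3, here is a minimal sketch of the same behaviour. The helper name build_adders is made up for illustration, and the list is built with append rather than pre-filled; otherwise it mirrors the loop above and shows that every function in the list gives the same result, not only adders[1].

```python
def build_adders():
    """Hypothetical helper: builds the four lambdas the same way the question's loop does."""
    adders = []
    for i in range(4):
        # Each lambda refers to the variable i, not to the value i has right now.
        adders.append(lambda a: i + a)
    return adders

# By the time any lambda is called, the loop has finished and i is 3.
results = [f(3) for f in build_adders()]
print(results)  # [6, 6, 6, 6]
```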

Answer 0

Your second question has been answered, but as for your first:

what does the closure capture exactly?

Scoping in Python is lexical, but a name is looked up when the function runs, not when it is defined (late binding). A closure remembers the variable itself (its name and scope), not the object it happens to point to at creation time. Since all the functions in your example are created in the same scope and use the same variable name, they all refer to the same variable.
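
To see that a closure holds on to the variable rather than the object, here is a small sketch. It assumes Python 3 (it needs nonlocal), and the names make_pair, get and set_value are invented for the example.

```python
def make_pair():
    """Hypothetical helper returning a getter and a setter over one shared variable."""
    value = 1

    def get():
        return value  # looked up each time get() runs

    def set_value(new):
        nonlocal value  # rebind the enclosing variable (Python 3 only)
        value = new

    return get, set_value

get, set_value = make_pair()
before = get()
set_value(99)
after = get()
print(before, after)  # 1 99
```

If get had captured the integer object 1 at definition time, the second call would still return 1.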

EDIT: Regarding your other question of how to overcome this, there are two ways that come to mind:

  1. The most concise, but not strictly equivalent way is the one recommended by Adrien Plisson. Create a lambda with an extra argument, and set the extra argument’s default value to the object you want preserved.

  2. A little more verbose but less hacky would be to create a new scope each time you create the lambda:

    >>> adders = [0,1,2,3]
    >>> for i in [0,1,2,3]:
    ...     adders[i] = (lambda b: lambda a: b + a)(i)
    ...     
    >>> adders[1](3)
    4
    >>> adders[2](3)
    5
    

    The scope here is created using a new function (a lambda, for brevity), which binds its argument; the value you want to bind is passed as that argument. In real code, though, you will most likely have an ordinary function instead of the lambda to create the new scope:

    def createAdder(x):
        return lambda y: y + x
    adders = [createAdder(i) for i in range(4)]
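
One way to read "not strictly equivalent" in option 1 is that the default-argument trick changes the function's signature. The sketch below is only an illustration of that reading, written for Python 3; create_adder, factory_adders and default_adders are names chosen for the example.

```python
def create_adder(x):
    """Factory version: x lives in its own scope for each call."""
    return lambda y: y + x

factory_adders = [create_adder(i) for i in range(4)]
default_adders = [lambda a, i=i: i + a for i in range(4)]

# Called with one argument, both versions agree.
same = [f(3) for f in factory_adders] == [g(3) for g in default_adders]

# The default-argument version exposes i as a real parameter that callers can override.
overridden = default_adders[1](3, 100)

# The factory version accepts exactly one argument.
try:
    factory_adders[1](3, 100)
    extra_arg_accepted = True
except TypeError:
    extra_arg_accepted = False

print(same, overridden, extra_arg_accepted)  # True 103 False
```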
    

Answer 1

You may force the capture of a variable's current value using an argument with a default value:

>>> adders = [None] * 4
>>> for i in [0,1,2,3]:
...    adders[i]=lambda a,i=i: i+a  # note the dummy parameter with a default value
...
>>> print( adders[1](3) )
4

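The idea is to declare a parameter (cleverly named i) and give it a default value taken from the variable you want to capture (the value of i). Default values are evaluated once, when the lambda is created, so later changes to i do not affect them.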
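
As a quick check of where the captured value ends up, the sketch below (Python 3, list built with append for brevity) inspects the standard __defaults__ attribute of each function.

```python
adders = []
for i in range(4):
    # i=i copies the current value of the loop variable into the default.
    adders.append(lambda a, i=i: i + a)

# Each function carries its own default tuple, fixed at creation time.
captured = [f.__defaults__ for f in adders]
result = adders[1](3)
print(captured)  # [(0,), (1,), (2,), (3,)]
print(result)    # 4
```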


Answer 2

For completeness, another answer to your second question: you could use partial from the functools module.

Importing add from operator, as Chris Lutz proposed, the example becomes:

from functools import partial
from operator import add   # add(a, b) -- Same as a + b.

adders = [0,1,2,3]
for i in [0,1,2,3]:
   # store callable object with first argument given as (current) i
   adders[i] = partial(add, i) 

print(adders[1](3))
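
Why this avoids the problem can be seen by looking inside the partial objects. A rough Python 3 sketch, using the documented func and args attributes and a list comprehension instead of the pre-filled list:

```python
from functools import partial
from operator import add

adders = [partial(add, i) for i in range(4)]

# A partial object records the wrapped function and the arguments
# that were passed when it was created, so i is stored by value.
frozen = [p.args for p in adders]
wraps_add = adders[1].func is add
result = adders[1](3)

print(frozen)     # [(0,), (1,), (2,), (3,)]
print(wraps_add)  # True
print(result)     # 4
```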

Answer 3

Consider the following code:

x = "foo"

def print_x():
    print(x)

x = "bar"

print_x() # Outputs "bar"

I think most people won’t find this confusing at all. It is the expected behaviour.

So, why do people think it would be different when it is done in a loop? I know I made that mistake myself, but I don't know why. Is it the loop? Or perhaps the lambda?

After all, the loop is just a shorter version of:

adders= [0,1,2,3]
i = 0
adders[i] = lambda a: i+a
i = 1
adders[i] = lambda a: i+a
i = 2
adders[i] = lambda a: i+a
i = 3
adders[i] = lambda a: i+a
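
The same point can be pushed one step further: since the lambdas only refer to the variable i, rebinding i after the loop changes what they return. This is a sketch for Python 3; wrapping it in a function called demo is my own choice to keep the example self-contained, and the behaviour is the same at module level.

```python
def demo():
    adders = [None] * 4
    for i in range(4):
        adders[i] = lambda a: i + a

    after_loop = adders[1](3)    # the loop is done, so i is 3 here

    i = 10                       # rebind the very variable the lambdas refer to
    after_rebind = adders[1](3)
    return after_loop, after_rebind

print(demo())  # (6, 13)
```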

Answer 4

In answer to your second question, the most elegant way to do this would be to use a function that takes two parameters instead of an array:

add = lambda a, b: a + b
add(1, 3)

However, using lambda here is a bit silly. Python gives us the operator module, which provides a functional interface to the basic operators. The lambda above has unnecessary overhead just to call the addition operator:

from operator import add
add(1, 3)

I understand that you're playing around, trying to explore the language, but I can't imagine a situation where I would use an array of functions and Python's scoping weirdness would get in the way.

If you wanted, you could write a small class that uses your array-indexing syntax:

class Adders(object):
    def __getitem__(self, item):
        return lambda a: a + item

adders = Adders()
adders[1](3)
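
The class sidesteps the original problem because item is a parameter of each __getitem__ call, so every returned lambda has its own binding. A small Python 3 sketch of that, where picked and results are names added for the example:

```python
class Adders:
    """Same idea as the class above, in Python 3 spelling."""

    def __getitem__(self, item):
        # item belongs to this particular call, so each lambda gets its own binding.
        return lambda a: a + item

adders = Adders()
picked = [adders[i] for i in range(4)]   # built in a loop, yet no shared variable
results = [f(3) for f in picked]
print(results)  # [3, 4, 5, 6]
```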

Answer 5

Here’s a new example that highlights the data structure and contents of a closure, to help clarify when the enclosing context is “saved.”

def make_funcs():
    i = 42
    my_str = "hi"

    f_one = lambda: i

    i += 1
    f_two = lambda: i+1

    f_three = lambda: my_str
    return f_one, f_two, f_three

f_1, f_2, f_3 = make_funcs()

What is in a closure?

>>> print f_1.func_closure, f_1.func_closure[0].cell_contents
(<cell at 0x106a99a28: int object at 0x7fbb20c11170>,) 43 

Notably, my_str is not in f_1's closure.

What's in f_2's closure?

>>> print f_2.func_closure, f_2.func_closure[0].cell_contents
(<cell at 0x106a99a28: int object at 0x7fbb20c11170>,) 43

Notice (from the memory addresses) that both closures contain the same objects. So, you can start to think of the lambda function as having a reference to the scope. However, my_str is not in the closure for f_1 or f_2, and i is not in the closure for f_3 (not shown), which suggests the closure objects themselves are distinct objects.

Are the closure objects themselves the same object?

>>> print f_1.func_closure is f_2.func_closure
False
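
The session above is Python 2. On Python 3 the attribute is spelled __closure__ (func_closure no longer exists), and __code__.co_freevars lists the names each function closes over. The following is a sketch of the same checks under that assumption; the variable names free_vars, shared_cell and same_tuple are mine.

```python
def make_funcs():
    i = 42
    my_str = "hi"
    f_one = lambda: i
    i += 1
    f_two = lambda: i + 1
    f_three = lambda: my_str
    return f_one, f_two, f_three

f_1, f_2, f_3 = make_funcs()

# Names each function closes over.
free_vars = (f_1.__code__.co_freevars,
             f_2.__code__.co_freevars,
             f_3.__code__.co_freevars)

# f_1 and f_2 share the single cell for i, but each has its own closure tuple.
shared_cell = f_1.__closure__[0] is f_2.__closure__[0]
same_tuple = f_1.__closure__ is f_2.__closure__

print(free_vars)                # (('i',), ('i',), ('my_str',))
print(shared_cell, same_tuple)  # True False
print(f_1(), f_2())             # 43 44
```

The shared cell is why f_1 sees 43 even though i was 42 when f_one was defined.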