为什么[]比list()快?

问题:为什么[]比list()快?

我最近比较了[]和的处理速度,并list()惊讶地发现它的[]运行速度是的三倍以上list()。我跑了相同的测试与{}dict(),结果几乎相同:[]{}两个花了大约0.128sec /百万次,而list()dict()把每个粗0.428sec /万次。

为什么是这样?不要[]{}(可能()'',太)立即传回了一些空的股票面值的副本,而其明确命名同行(list()dict()tuple()str())完全去创建一个对象,他们是否真的有元素?

我不知道这两种方法有何不同,但我很想找出答案。我在文档中或SO上都找不到答案,而寻找空括号却比我预期的要麻烦得多。

通过分别调用timeit.timeit("[]")timeit.timeit("list()"),和timeit.timeit("{}")timeit.timeit("dict()")来比较列表和字典,以获得计时结果。我正在运行Python 2.7.9。

我最近发现“ 为什么True慢于if? ”比较了if Trueto 的性能,if 1并且似乎触及了类似的文字对全局场景;也许也值得考虑。

I recently compared the processing speeds of [] and list() and was surprised to discover that [] runs more than three times faster than list(). I ran the same test with {} and dict() and the results were practically identical: [] and {} both took around 0.128sec / million cycles, while list() and dict() took roughly 0.428sec / million cycles each.

Why is this? Do [] and {} (and probably () and '', too) immediately pass back a copies of some empty stock literal while their explicitly-named counterparts (list(), dict(), tuple(), str()) fully go about creating an object, whether or not they actually have elements?

I have no idea how these two methods differ but I’d love to find out. I couldn’t find an answer in the docs or on SO, and searching for empty brackets turned out to be more problematic than I’d expected.

I got my timing results by calling timeit.timeit("[]") and timeit.timeit("list()"), and timeit.timeit("{}") and timeit.timeit("dict()"), to compare lists and dictionaries, respectively. I’m running Python 2.7.9.

I recently discovered “Why is if True slower than if 1?” that compares the performance of if True to if 1 and seems to touch on a similar literal-versus-global scenario; perhaps it’s worth considering as well.


回答 0

因为[]{}文字语法。Python可以创建字节码仅用于创建列表或字典对象:

>>> import dis
>>> dis.dis(compile('[]', '', 'eval'))
  1           0 BUILD_LIST               0
              3 RETURN_VALUE        
>>> dis.dis(compile('{}', '', 'eval'))
  1           0 BUILD_MAP                0
              3 RETURN_VALUE        

list()dict()是单独的对象。它们的名称需要解析,必须包含堆栈以推入参数,必须存储框架以供以后检索,并且必须进行调用。这都需要更多时间。

对于空的情况,这意味着您至少要有一个LOAD_NAME(必须在全局命名空间以及__builtin__模块中进行搜索),后跟一个CALL_FUNCTION必须保留当前帧的:

>>> dis.dis(compile('list()', '', 'eval'))
  1           0 LOAD_NAME                0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
>>> dis.dis(compile('dict()', '', 'eval'))
  1           0 LOAD_NAME                0 (dict)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        

您可以使用以下命令分别计时名称查找timeit

>>> import timeit
>>> timeit.timeit('list', number=10**7)
0.30749011039733887
>>> timeit.timeit('dict', number=10**7)
0.4215109348297119

时间差异可能是字典哈希冲突。从调用这些对象的时间中减去这些时间,然后将结果与使用文字的时间进行比较:

>>> timeit.timeit('[]', number=10**7)
0.30478692054748535
>>> timeit.timeit('{}', number=10**7)
0.31482696533203125
>>> timeit.timeit('list()', number=10**7)
0.9991960525512695
>>> timeit.timeit('dict()', number=10**7)
1.0200958251953125

因此,1.00 - 0.31 - 0.30 == 0.39每1000万次调用必须调用该对象花费了额外的几秒钟。

您可以通过将全局名称别名为本地名称来避免全局查找成本(使用timeit设置,绑定到名称的所有内容都是本地名称):

>>> timeit.timeit('_list', '_list = list', number=10**7)
0.1866450309753418
>>> timeit.timeit('_dict', '_dict = dict', number=10**7)
0.19016098976135254
>>> timeit.timeit('_list()', '_list = list', number=10**7)
0.841480016708374
>>> timeit.timeit('_dict()', '_dict = dict', number=10**7)
0.7233691215515137

但您永远无法克服这些CALL_FUNCTION成本。

Because [] and {} are literal syntax. Python can create bytecode just to create the list or dictionary objects:

>>> import dis
>>> dis.dis(compile('[]', '', 'eval'))
  1           0 BUILD_LIST               0
              3 RETURN_VALUE        
>>> dis.dis(compile('{}', '', 'eval'))
  1           0 BUILD_MAP                0
              3 RETURN_VALUE        

list() and dict() are separate objects. Their names need to be resolved, the stack has to be involved to push the arguments, the frame has to be stored to retrieve later, and a call has to be made. That all takes more time.

For the empty case, that means you have at the very least a LOAD_NAME (which has to search through the global namespace as well as the __builtin__ module) followed by a CALL_FUNCTION, which has to preserve the current frame:

>>> dis.dis(compile('list()', '', 'eval'))
  1           0 LOAD_NAME                0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
>>> dis.dis(compile('dict()', '', 'eval'))
  1           0 LOAD_NAME                0 (dict)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        

You can time the name lookup separately with timeit:

>>> import timeit
>>> timeit.timeit('list', number=10**7)
0.30749011039733887
>>> timeit.timeit('dict', number=10**7)
0.4215109348297119

The time discrepancy there is probably a dictionary hash collision. Subtract those times from the times for calling those objects, and compare the result against the times for using literals:

>>> timeit.timeit('[]', number=10**7)
0.30478692054748535
>>> timeit.timeit('{}', number=10**7)
0.31482696533203125
>>> timeit.timeit('list()', number=10**7)
0.9991960525512695
>>> timeit.timeit('dict()', number=10**7)
1.0200958251953125

So having to call the object takes an additional 1.00 - 0.31 - 0.30 == 0.39 seconds per 10 million calls.

You can avoid the global lookup cost by aliasing the global names as locals (using a timeit setup, everything you bind to a name is a local):

>>> timeit.timeit('_list', '_list = list', number=10**7)
0.1866450309753418
>>> timeit.timeit('_dict', '_dict = dict', number=10**7)
0.19016098976135254
>>> timeit.timeit('_list()', '_list = list', number=10**7)
0.841480016708374
>>> timeit.timeit('_dict()', '_dict = dict', number=10**7)
0.7233691215515137

but you never can overcome that CALL_FUNCTION cost.


回答 1

list()需要全局查找和函数调用,但需要[]编译为一条指令。看到:

Python 2.7.3
>>> import dis
>>> print dis.dis(lambda: list())
  1           0 LOAD_GLOBAL              0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
None
>>> print dis.dis(lambda: [])
  1           0 BUILD_LIST               0
              3 RETURN_VALUE        
None

list() requires a global lookup and a function call but [] compiles to a single instruction. See:

Python 2.7.3
>>> import dis
>>> print dis.dis(lambda: list())
  1           0 LOAD_GLOBAL              0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
None
>>> print dis.dis(lambda: [])
  1           0 BUILD_LIST               0
              3 RETURN_VALUE        
None

回答 2

因为list是一个功能转化说一个字符串列表对象,而[]用于创建一个列表蝙蝠。尝试一下(可能对您更有意义):

x = "wham bam"
a = list(x)
>>> a
["w", "h", "a", "m", ...]

y = ["wham bam"]
>>> y
["wham bam"]

为您提供包含您所输入内容的实际列表。

Because list is a function to convert say a string to a list object, while [] is used to create a list off the bat. Try this (might make more sense to you):

x = "wham bam"
a = list(x)
>>> a
["w", "h", "a", "m", ...]

While

y = ["wham bam"]
>>> y
["wham bam"]

Gives you a actual list containing whatever you put in it.


回答 3

至此,答案非常好,并完全涵盖了这个问题。对于那些感兴趣的人,我将进一步从字节码中删除。我正在使用CPython的最新仓库;在这方面,旧版本的行为类似,但可能会稍作更改。

这是每个BUILD_LIST针对for []CALL_FUNCTIONfor 的执行情况的细分list()


BUILD_LIST指令:

您应该只查看恐怖:

PyObject *list =  PyList_New(oparg);
if (list == NULL)
    goto error;
while (--oparg >= 0) {
    PyObject *item = POP();
    PyList_SET_ITEM(list, oparg, item);
}
PUSH(list);
DISPATCH();

我知道那令人费解。这是多么简单:

  • 使用创建新列表PyList_New(主要是为新的列表对象分配内存),以oparg信号指示堆栈上的参数数量。开门见山。
  • 检查是否没有问题if (list==NULL)
  • 使用PyList_SET_ITEM(宏)添加位于堆栈上的所有参数(在我们的示例中此参数未执行)。

难怪它很快!它是为创建新列表而定制的,仅此而已:-)

CALL_FUNCTION指令:

窥视代码处理时,这是您看到的第一件事CALL_FUNCTION

PyObject **sp, *res;
sp = stack_pointer;
res = call_function(&sp, oparg, NULL);
stack_pointer = sp;
PUSH(res);
if (res == NULL) {
    goto error;
}
DISPATCH();

看起来很无害吧?好吧,不是,不幸的call_function是,不是一个会立即调用该函数的直截了当的家伙,它不会。相反,它从堆栈中获取对象,获取堆栈中的所有参数,然后根据对象的类型进行切换。它是:

我们正在调用list类型,传入的参数call_functionPyList_Type。CPython现在必须调用一个泛型函数来处理名为的所有可调用对象_PyObject_FastCallKeywords,还有更多函数调用。

该函数再次检查某些函数类型(我不明白为什么),然后在为kwargs创建字典后,如果需要,继续调用_PyObject_FastCallDict

_PyObject_FastCallDict终于把我们带到某个地方!执行后甚至更多的检查抓住了tp_call从插槽type中的type我们在通过了,那就是它抓住type.tp_call。然后,它根据传入的参数来创建元组_PyStack_AsTuple,最后可以最终进行调用

tp_call,它将匹配type.__call__并最终创建列表对象。它调用与之__new__对应的列表PyType_GenericNew并为其分配内存PyType_GenericAlloc这实际上是它与追上的部分PyList_New,最后。所有以前的内容对于以通用方式处理对象都是必需的。

最后,使用任何可用参数type_call调用list.__init__并初始化列表,然后继续返回原来的方式。:-)

最后,记住 LOAD_NAME,这是另一个在这里做出贡献的家伙。


很容易看到,在处理我们的输入时,Python通常必须跳过圈以真正找到合适的C函数来完成工作。它不具有立即调用它的功能,因为它是动态的,有人可能会掩盖list并且男孩会做很多人做的事情),因此必须采取另一条路。

这是哪里 list()损失很多的地方:正在探索的Python需要做以找出它应该做什么。

另一方面,字面语法恰好意味着一回事。它无法更改,并且始终以预定的方式运行。

脚注:所有功能名称均可能从一个版本更改为另一个版本。关键点仍然存在,并且很可能在将来的任何版本中都存在,这是动态查找使事情变慢的原因。

The answers here are great, to the point and fully cover this question. I’ll drop a further step down from byte-code for those interested. I’m using the most recent repo of CPython; older versions behave similar in this regard but slight changes might be in place.

Here’s a break down of the execution for each of these, BUILD_LIST for [] and CALL_FUNCTION for list().


The BUILD_LIST instruction:

You should just view the horror:

PyObject *list =  PyList_New(oparg);
if (list == NULL)
    goto error;
while (--oparg >= 0) {
    PyObject *item = POP();
    PyList_SET_ITEM(list, oparg, item);
}
PUSH(list);
DISPATCH();

Terribly convoluted, I know. This is how simple it is:

  • Create a new list with PyList_New (this mainly allocates the memory for a new list object), oparg signalling the number of arguments on the stack. Straight to the point.
  • Check that nothing went wrong with if (list==NULL).
  • Add any arguments (in our case this isn’t executed) located on the stack with PyList_SET_ITEM (a macro).

No wonder it is fast! It’s custom-made for creating new lists, nothing else :-)

The CALL_FUNCTION instruction:

Here’s the first thing you see when you peek at the code handling CALL_FUNCTION:

PyObject **sp, *res;
sp = stack_pointer;
res = call_function(&sp, oparg, NULL);
stack_pointer = sp;
PUSH(res);
if (res == NULL) {
    goto error;
}
DISPATCH();

Looks pretty harmless, right? Well, no, unfortunately not, call_function is not a straightforward guy that will call the function immediately, it can’t. Instead, it grabs the object from the stack, grabs all arguments of the stack and then switches based on the type of the object; is it a:

We’re calling the list type, the argument passed in to call_function is PyList_Type. CPython now has to call a generic function to handle any callable objects named _PyObject_FastCallKeywords, yay more function calls.

This function again makes some checks for certain function types (which I cannot understand why) and then, after creating a dict for kwargs if required, goes on to call _PyObject_FastCallDict.

_PyObject_FastCallDict finally gets us somewhere! After performing even more checks it grabs the tp_call slot from the type of the type we’ve passed in, that is, it grabs type.tp_call. It then proceeds to create a tuple out of of the arguments passed in with _PyStack_AsTuple and, finally, a call can finally be made!

tp_call, which matches type.__call__ takes over and finally creates the list object. It calls the lists __new__ which corresponds to PyType_GenericNew and allocates memory for it with PyType_GenericAlloc: This is actually the part where it catches up with PyList_New, finally. All the previous are necessary to handle objects in a generic fashion.

In the end, type_call calls list.__init__ and initializes the list with any available arguments, then we go on a returning back the way we came. :-)

Finally, remmeber the LOAD_NAME, that’s another guy that contributes here.


It’s easy to see that, when dealing with our input, Python generally has to jump through hoops in order to actually find out the appropriate C function to do the job. It doesn’t have the curtesy of immediately calling it because it’s dynamic, someone might mask list (and boy do many people do) and another path must be taken.

This is where list() loses much: The exploring Python needs to do to find out what the heck it should do.

Literal syntax, on the other hand, means exactly one thing; it cannot be changed and always behaves in a pre-determined way.

Footnote: All function names are subject to change from one release to the other. The point still stands and most likely will stand in any future versions, it’s the dynamic look-up that slows things down.


回答 4

为什么[]要比list()

最大的原因是Python list()就像用户定义的函数一样对待,这意味着您可以通过别名别名来拦截它list并做一些不同的事情(例如使用您自己的子类列表或双端队列)。

它将立即使用创建新的内置列表实例[]

我的解释旨在为您提供直觉。

说明

[] 通常称为文字语法。

在语法中,这称为“列表显示”。从文档

列表显示是括在方括号中的一系列可能为空的表达式:

list_display ::=  "[" [starred_list | comprehension] "]"

列表显示将产生一个新的列表对象,其内容由表达式列表或理解列表指定。提供逗号分隔的表达式列表时,将按从左到右的顺序评估其元素,并将其按此顺序放入列表对象中。提供理解后,将根据理解产生的元素来构建列表。

简而言之,这意味着将list创建一个内置类型的对象。

不能回避这一点-这意味着Python可以尽快完成它。

另一方面,list()可以list使用内置列表构造函数拦截创建内置对象的过程。

例如,假设我们希望创建噪音较大的列表:

class List(list):
    def __init__(self, iterable=None):
        if iterable is None:
            super().__init__()
        else:
            super().__init__(iterable)
        print('List initialized.')

然后,我们可以list在模块级别的全局范围内截取该名称,然后在创建时list,实际上创建了子类型列表:

>>> list = List
>>> a_list = list()
List initialized.
>>> type(a_list)
<class '__main__.List'>

同样,我们可以将其从全局命名空间中删除

del list

并将其放在内置命名空间中:

import builtins
builtins.list = List

现在:

>>> list_0 = list()
List initialized.
>>> type(list_0)
<class '__main__.List'>

并注意列表显示无条件创建列表:

>>> list_1 = []
>>> type(list_1)
<class 'list'>

我们可能只是暂时执行此操作,所以请撤消更改-首先List从内置文件中删除新对象:

>>> del builtins.list
>>> builtins.list
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'builtins' has no attribute 'list'
>>> list()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'list' is not defined

哦,不,我们失去了原来的踪迹。

不用担心,我们仍然可以得到list-它是列表文字的类型:

>>> builtins.list = type([])
>>> list()
[]

所以…

为什么[]要比list()

如我们所见-我们可以覆盖list-但是我们不能截取文字类型的创建。使用时,list我们必须进行查找以查看是否存在任何内容。

然后,我们必须调用已查找的任何可调用对象。从语法上:

调用使用一系列可能为空的参数来调用可调用对象(例如,函数):

call                 ::=  primary "(" [argument_list [","] | comprehension] ")"

我们可以看到它对任何名称都具有相同的作用,而不仅仅是列表:

>>> import dis
>>> dis.dis('list()')
  1           0 LOAD_NAME                0 (list)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE
>>> dis.dis('doesnotexist()')
  1           0 LOAD_NAME                0 (doesnotexist)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE

因为[]在Python字节码级别没有函数调用:

>>> dis.dis('[]')
  1           0 BUILD_LIST               0
              2 RETURN_VALUE

它只是直接建立列表而无需在字节码级别进行任何查找或调用。

结论

我们已经证明了list可以使用范围规则用用户代码拦截,并且可以list()查找可调用对象然后调用它。

[]列表显示或文字显示则避免了名称查找和函数调用。

Why is [] faster than list()?

The biggest reason is that Python treats list() just like a user-defined function, which means you can intercept it by aliasing something else to list and do something different (like use your own subclassed list or perhaps a deque).

It immediately creates a new instance of a builtin list with [].

My explanation seeks to give you the intuition for this.

Explanation

[] is commonly known as literal syntax.

In the grammar, this is referred to as a “list display”. From the docs:

A list display is a possibly empty series of expressions enclosed in square brackets:

list_display ::=  "[" [starred_list | comprehension] "]"

A list display yields a new list object, the contents being specified by either a list of expressions or a comprehension. When a comma-separated list of expressions is supplied, its elements are evaluated from left to right and placed into the list object in that order. When a comprehension is supplied, the list is constructed from the elements resulting from the comprehension.

In short, this means that a builtin object of type list is created.

There is no circumventing this – which means Python can do it as quickly as it may.

On the other hand, list() can be intercepted from creating a builtin list using the builtin list constructor.

For example, say we want our lists to be created noisily:

class List(list):
    def __init__(self, iterable=None):
        if iterable is None:
            super().__init__()
        else:
            super().__init__(iterable)
        print('List initialized.')

We could then intercept the name list on the module level global scope, and then when we create a list, we actually create our subtyped list:

>>> list = List
>>> a_list = list()
List initialized.
>>> type(a_list)
<class '__main__.List'>

Similarly we could remove it from the global namespace

del list

and put it in the builtin namespace:

import builtins
builtins.list = List

And now:

>>> list_0 = list()
List initialized.
>>> type(list_0)
<class '__main__.List'>

And note that the list display creates a list unconditionally:

>>> list_1 = []
>>> type(list_1)
<class 'list'>

We probably only do this temporarily, so lets undo our changes – first remove the new List object from the builtins:

>>> del builtins.list
>>> builtins.list
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'builtins' has no attribute 'list'
>>> list()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'list' is not defined

Oh, no, we lost track of the original.

Not to worry, we can still get list – it’s the type of a list literal:

>>> builtins.list = type([])
>>> list()
[]

So…

Why is [] faster than list()?

As we’ve seen – we can overwrite list – but we can’t intercept the creation of the literal type. When we use list we have to do the lookups to see if anything is there.

Then we have to call whatever callable we have looked up. From the grammar:

A call calls a callable object (e.g., a function) with a possibly empty series of arguments:

call                 ::=  primary "(" [argument_list [","] | comprehension] ")"

We can see that it does the same thing for any name, not just list:

>>> import dis
>>> dis.dis('list()')
  1           0 LOAD_NAME                0 (list)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE
>>> dis.dis('doesnotexist()')
  1           0 LOAD_NAME                0 (doesnotexist)
              2 CALL_FUNCTION            0
              4 RETURN_VALUE

For [] there is no function call at the Python bytecode level:

>>> dis.dis('[]')
  1           0 BUILD_LIST               0
              2 RETURN_VALUE

It simply goes straight to building the list without any lookups or calls at the bytecode level.

Conclusion

We have demonstrated that list can be intercepted with user code using the scoping rules, and that list() looks for a callable and then calls it.

Whereas [] is a list display, or a literal, and thus avoids the name lookup and function call.