I’ve seen there are actually two (maybe more) ways to concatenate lists in Python:
One way is to use the extend() method:
a = [1, 2]
b = [2, 3]
b.extend(a)
The other is to use the plus (+) operator:
b += a
Now I wonder: which of those two options is the ‘Pythonic’ way to do list concatenation, and is there a difference between the two? (I’ve looked up the official Python tutorial but couldn’t find anything about this topic.)
The only difference on a bytecode level is that the .extend way involves a function call, which is slightly more expensive in Python than the INPLACE_ADD.
It’s really nothing you should worry about unless you’re performing this operation billions of times. Even then, the bottleneck would likely lie somewhere else.
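To see the bytecode difference for yourself, the standard dis module can list the instructions for each spelling. This is only a sketch: opcode names depend on the CPython version (as far as I know, INPLACE_ADD was folded into a generic BINARY_OP in 3.11, and the call opcode has been renamed more than once), so the helper opnames below, which is my own name and not part of dis, simply collects whatever names your interpreter emits.

```python
import dis

def opnames(source):
    """Return the opcode names CPython emits for a one-line statement.

    `opnames` is a hypothetical helper for this sketch, not a dis API.
    """
    code = compile(source, "<sketch>", "exec")
    return [instr.opname for instr in dis.get_instructions(code)]

extend_ops = opnames("b.extend(a)")
iadd_ops = opnames("b += a")

# The extend version should contain some CALL* opcode; the += version
# should contain INPLACE_ADD (older CPython) or BINARY_OP (newer CPython).
print("extend:", extend_ops)
print("+=    :", iadd_ops)
```

Compiling the statements never executes them, so a and b do not need to exist.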
Answer 1
You can’t use += on a non-local variable (one that is neither local to the function nor global), at least not without declaring it nonlocal in Python 3:
def main():
    l = [1, 2, 3]

    def foo():
        l.extend([4])

    def boo():
        l += [5]

    foo()
    print(l)
    boo()  # this will fail

main()
This is because, in the extend case, the compiler loads the variable l with a LOAD_DEREF instruction, whereas for += it uses LOAD_FAST, since the augmented assignment makes l local to boo. The result is *UnboundLocalError: local variable 'l' referenced before assignment*.
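A minimal sketch of the same situation in Python 3, with the failure caught and a nonlocal declaration added as one possible fix. The function names extend_it, broken_iadd and fixed_iadd are made up for this illustration.

```python
def main():
    l = [1, 2, 3]

    def extend_it():
        # Only reads the name l, so it resolves to the enclosing variable.
        l.extend([4])

    def broken_iadd():
        # The augmented assignment makes l a local of broken_iadd,
        # and that local has no value yet when += tries to read it.
        l += [5]

    def fixed_iadd():
        nonlocal l  # Python 3 only: rebind the enclosing variable instead
        l += [5]

    extend_it()
    try:
        broken_iadd()
        failed = False
    except UnboundLocalError:
        failed = True
    fixed_iadd()
    return l, failed

result, failed = main()
print(result, failed)
```

The exact wording of the UnboundLocalError message differs between Python versions, so the sketch only checks the exception type.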
Answer 2
You can chain function calls, but you can’t apply += directly to a function call:
class A:
    def __init__(self):
        self.listFoo = [1, 2]
        self.listBar = [3, 4]

    def get_list(self, which):
        if which == "Foo":
            return self.listFoo
        return self.listBar

a = A()
other_list = [5, 6]
a.get_list("Foo").extend(other_list)
a.get_list("Foo") += other_list  # SyntaxError: can't assign to function call
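If you still want to use += here, one workaround (a sketch, not the only option) is to bind the returned list to a name first. The class A is repeated so the snippet runs on its own; the name target is just a placeholder.

```python
class A:
    def __init__(self):
        self.listFoo = [1, 2]
        self.listBar = [3, 4]

    def get_list(self, which):
        if which == "Foo":
            return self.listFoo
        return self.listBar

a = A()
other_list = [5, 6]

# Bind the returned list to a name; += then mutates that same list object.
target = a.get_list("Foo")
target += other_list
print(a.listFoo)

# The direct form is rejected at compile time, before anything runs.
try:
    compile('a.get_list("Foo") += other_list', "<sketch>", "exec")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)
```

The text of the SyntaxError message varies between Python versions, which is why only the exception type is checked.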
I would say there is some difference when it comes to numpy. (I realize the question asks about concatenating two lists, not numpy arrays, but since this might trip up a beginner like me, I hope it helps someone who lands on this post.) For example:
import numpy as np
a = np.zeros((4,4,4))
b = []
b += a
This fails with the error:
ValueError: operands could not be broadcast together with shapes (0,) (4,4,4)
extend() works with any iterable*; += works with some but can get funky.
import numpy as np
l = [2, 3, 4]
t = (5, 6, 7)
l += t
l
[2, 3, 4, 5, 6, 7]
l = [2, 3, 4]
t = np.array((5, 6, 7))
l += t
l
array([ 7, 9, 11])
l = [2, 3, 4]
t = np.array((5, 6, 7))
l.extend(t)
l
[2, 3, 4, 5, 6, 7]
Python 3.6
*I’m pretty sure .extend() works with any iterable, but please comment if I’m wrong.
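The numpy behaviour can be imitated without numpy. As far as I understand CPython’s operator dispatch, a list has no numeric += of its own, so the right-hand operand’s __radd__ is consulted before the list falls back to extending itself. The class Hijacker below is a hypothetical stand-in for an array-like object, not anything from numpy.

```python
class Hijacker:
    """Hypothetical iterable that, like a numpy array, also defines __radd__."""

    def __init__(self, items):
        self.items = list(items)

    def __iter__(self):
        return iter(self.items)

    def __radd__(self, other):
        # Element-wise addition, loosely imitating what an array would do.
        return [x + y for x, y in zip(other, self.items)]

l = [2, 3, 4]
l.extend(Hijacker([5, 6, 7]))  # extend only iterates over its argument
extended = l

l = [2, 3, 4]
l += Hijacker([5, 6, 7])       # the right operand's __radd__ answers first
added = l

l = [2, 3, 4]
l += (x for x in (5, 6, 7))    # a plain generator has no __radd__, so this extends
from_generator = l

print(extended)
print(added)
print(from_generator)
```

So extend() treats every argument as a plain iterable, while += lets the right-hand operand decide what the operation means.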
Actually, there are differences among the three options: ADD, INPLACE_ADD and extend. The first is always slower, while the other two are roughly the same.
With this information, I would rather use extend, which is faster than ADD and seems to me more explicit about what you are doing than INPLACE_ADD.
Try the following code a few times (for Python 3):
import time

def test():
    x = list(range(10000000))
    y = list(range(10000000))
    z = list(range(10000000))
    # INPLACE_ADD
    t0 = time.process_time()
    z += x
    t_inplace_add = time.process_time() - t0
    # ADD
    t0 = time.process_time()
    w = x + y
    t_add = time.process_time() - t0
    # Extend
    t0 = time.process_time()
    x.extend(y)
    t_extend = time.process_time() - t0
    print('ADD {} s'.format(t_add))
    print('INPLACE_ADD {} s'.format(t_inplace_add))
    print('extend {} s'.format(t_extend))
    print()

for i in range(10):
    test()
ADD 0.3540440000000018 s
INPLACE_ADD 0.10896000000000328 s
extend 0.08370399999999734 s
ADD 0.2024550000000005 s
INPLACE_ADD 0.0972940000000051 s
extend 0.09610200000000191 s
ADD 0.1680199999999985 s
INPLACE_ADD 0.08162199999999586 s
extend 0.0815160000000077 s
ADD 0.16708400000000267 s
INPLACE_ADD 0.0797719999999913 s
extend 0.0801490000000058 s
ADD 0.1681250000000034 s
INPLACE_ADD 0.08324399999999343 s
extend 0.08062700000000689 s
ADD 0.1707760000000036 s
INPLACE_ADD 0.08071900000000198 s
extend 0.09226200000000517 s
ADD 0.1668420000000026 s
INPLACE_ADD 0.08047300000001201 s
extend 0.0848089999999928 s
ADD 0.16659500000000094 s
INPLACE_ADD 0.08019399999999166 s
extend 0.07981599999999389 s
ADD 0.1710910000000041 s
INPLACE_ADD 0.0783479999999912 s
extend 0.07987599999999873 s
ADD 0.16435900000000458 s
INPLACE_ADD 0.08131200000001115 s
extend 0.0818660000000051 s
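For a quicker check, here is a rough alternative sketch using the standard timeit module. The function names and the size N are my own choices. Each function rebuilds its lists on every call, so construction time is included and dilutes the differences; treat the output as indicative only.

```python
import timeit

N = 100_000  # far smaller than the 10,000,000 above so the sketch runs quickly

def with_add():
    x = list(range(N))
    y = list(range(N))
    return x + y          # builds a third list

def with_inplace_add():
    x = list(range(N))
    y = list(range(N))
    x += y                # grows x in place
    return x

def with_extend():
    x = list(range(N))
    y = list(range(N))
    x.extend(y)           # grows x in place
    return x

for func in (with_add, with_inplace_add, with_extend):
    # Take the best of several repeats to reduce noise from other processes.
    best = min(timeit.repeat(func, number=10, repeat=3))
    print('{:<18} {:.4f} s'.format(func.__name__, best))
```

All three functions return equal lists, so any timing gap comes from how the result is built rather than from what it contains.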
“I’ve looked up the official Python tutorial but couldn’t find anything about this topic”
This information happens to be buried in the Programming FAQ:
… for lists, __iadd__ [i.e. +=] is equivalent to calling extend on the list and returning the list. That’s why we say that for lists, += is a “shorthand” for list.extend
“Note that list concatenation by addition is a comparatively expensive operation since a new list must be created and the objects copied over. Using extend to append elements to an existing list, especially if you are building up a large list, is usually preferable.”
Thus,
everything = []
for chunk in list_of_lists:
    everything.extend(chunk)
is faster than the concatenative alternative:
everything = []
for chunk in list_of_lists:
    everything = everything + chunk
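A small runnable sketch of the two loops above, with made-up sample data, showing that they produce the same contents but that only the extend version keeps working on the original list object:

```python
list_of_lists = [[1, 2], [3], [4, 5, 6]]  # hypothetical sample data

everything = []
original = everything
for chunk in list_of_lists:
    everything.extend(chunk)             # mutates the one existing list
extend_result = everything
same_object_after_extend = everything is original

everything = []
original = everything
for chunk in list_of_lists:
    everything = everything + chunk      # allocates a new list on every pass
concat_result = everything
same_object_after_concat = everything is original

print(extend_result == concat_result)
print(same_object_after_extend, same_object_after_concat)
```

The repeated allocation and copying in the second loop is what makes it slower as the accumulated list grows.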