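# Why should big-data processing in Python always use NumPy arrays?

NumPy is a core module for scientific computing in Python. It provides a very efficient array object, along with tools for working with those arrays. A NumPy array consists of many values, all of the same type.

Python's core library provides the list. The list is one of the most common Python data types: it is resizable and can hold elements of different types, which makes it very convenient.

NumPy data structures do better in the following respects:

1. Memory size: NumPy data structures take up less memory.

2. Performance: NumPy's core is implemented in C, so it is faster than a list.

3. Operations: optimized methods for algebraic operations and the like are built in.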

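## 1. NumPy arrays use less memory

A Python list of integers takes roughly:

64 + 8 * len(lst) + len(lst) * 28 bytes

A NumPy array of 8-byte integers takes roughly:

96 + len(a) * 8 bytes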
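These estimates can be checked directly. The following is a minimal sketch, assuming a 64-bit CPython build and NumPy's `int64` dtype; the variable names and the size of 1000 are illustrative, and exact byte counts vary between Python and NumPy versions.

```python
import sys
import numpy as np

n = 1000  # illustrative size, not taken from the article
lst = list(range(n))
a = np.arange(n, dtype=np.int64)

# A list stores pointers; every integer is a separate Python object.
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

# An array stores the raw 8-byte values contiguously.
# nbytes counts only the data buffer, not the small fixed array header.
array_bytes = a.nbytes

print(list_bytes, array_bytes)
```

The list total counts both the pointer table and the integer objects it refers to, which is why it comes out several times larger than the array's data buffer.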

## 2. NumPy arrays are faster and come with built-in computation methods

```
import time
import numpy as np

size_of_vec = 1000

def pure_python_version():
    t1 = time.time()
    X = range(size_of_vec)
    Y = range(size_of_vec)
    Z = [X[i] + Y[i] for i in range(len(X))]
    return time.time() - t1

def numpy_version():
    t1 = time.time()
    X = np.arange(size_of_vec)
    Y = np.arange(size_of_vec)
    Z = X + Y
    return time.time() - t1

t1 = pure_python_version()
t2 = numpy_version()
print(t1, t2)
print("Numpy is in this example " + str(t1/t2) + " faster!")
```

```
0.00048732757568359375 0.0002491474151611328
Numpy is in this example 1.955980861244019 faster!
```

```
import numpy as np
from timeit import Timer

size_of_vec = 1000
X_list = range(size_of_vec)
Y_list = range(size_of_vec)
X = np.arange(size_of_vec)
Y = np.arange(size_of_vec)

def pure_python_version():
    Z = [X_list[i] + Y_list[i] for i in range(len(X_list))]

def numpy_version():
    Z = X + Y

timer_obj1 = Timer("pure_python_version()",
                   "from __main__ import pure_python_version")
timer_obj2 = Timer("numpy_version()",
                   "from __main__ import numpy_version")

print(timer_obj1.timeit(10))
print(timer_obj2.timeit(10))  # Runs faster!

print(timer_obj1.repeat(repeat=3, number=10))
print(timer_obj2.repeat(repeat=3, number=10))  # repeat to prove it!
```

```
0.0029753120616078377
0.00014940369874238968
[0.002683573868125677, 0.002754641231149435, 0.002803879790008068]
[6.536301225423813e-05, 2.9387418180704117e-05, 2.9171351343393326e-05]
```

​Python实用宝典 ( pythondict.com )

# Box: add dot-notation access to your dictionaries

```
test_dict = {"test": {"imdb stars": 6.7, "length": 104}}

print(test_dict["test"]["imdb stars"])
# 6.7
```

```
from box import Box

movie_box = Box({ "Robin Hood: Men in Tights": { "imdb stars": 6.7, "length": 104 } })

movie_box.Robin_Hood_Men_in_Tights.imdb_stars

# 6.7
```
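To give a feel for how attribute-style access on a dictionary can work, here is a minimal standard-library sketch. `DotDict` is a hypothetical class written for illustration; it is not how Box is implemented, and it omits Box's key conversion (such as turning spaces into underscores).

```python
class DotDict(dict):
    """Hypothetical minimal dict subclass with attribute-style access."""

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails,
        # so real dict methods such as keys() keep working.
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dicts on access so chained dots keep working.
        return DotDict(value) if isinstance(value, dict) else value


movie = DotDict({"test": {"imdb_stars": 6.7, "length": 104}})
print(movie.test.imdb_stars)
```

In this sketch a key that clashes with a dict method name, such as `keys`, can only be reached with bracket syntax.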

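## 1. Preparation

(Option 1) If you use Python for data analysis, you can install Anaconda directly (see "Anaconda, a great helper for Python data analysis and mining"); it comes with Python and pip built in.

(Option 2) In addition, the VSCode editor is recommended for writing small Python projects (see "VSCode, the best partner for Python programming: a detailed guide").

On Windows, open Cmd (Start > Run > CMD); on macOS, open Terminal (Command+Space, then type Terminal). Then run the following command to install the dependency: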

`pip install --upgrade python-box[all]`

## 2. Basic usage

```
from box import Box

my_box = Box(funny_movie='Hudson Hawk', best_movie='Kung Fu Panda')
my_box.funny_movie
# 'Hudson Hawk'
```

```
my_box = Box({"team": {"red": {"leader": "Sarge", "members": []}}})
print(my_box.team.red.leader)
# Sarge

my_box.team.blue = {"leader": "Church", "members": []}
print(repr(my_box.team.blue))
# <Box: {'leader': 'Church', 'members': []}>
```

```
my_box.team.red.members = [
    {"name": "Grif", "rank": "Minor Junior Private Negative First Class"},
    {"name": "Dick Simmons", "rank": "Captain"}
]

print(my_box.team.red.members[0].name)
# Grif
```

`my_box['keys']`

```
from box import Box

box_1 = Box(val={'important_key': 1})
box_2 = Box(val={'less_important_key': 2})

box_1.merge_update(box_2)

print(box_1)
# {'val': {'important_key': 1, 'less_important_key': 2}}
```

```
from box import Box

box_1 = Box(val={'important_key': 1})
box_2 = Box(val={'less_important_key': 2})

box_1.update(box_2)

print(box_1)
# {'val': {'less_important_key': 2}}
```
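The difference between the two results is that one merge is recursive and the other replaces the nested value wholesale. A minimal sketch of that idea on plain dictionaries follows; `deep_merge` is a hypothetical helper written for illustration and is not Box's implementation.

```python
def deep_merge(target, other):
    """Recursively merge `other` into `target` (plain dicts only)."""
    for key, value in other.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides hold a dict under this key: merge them.
            deep_merge(target[key], value)
        else:
            # Otherwise the new value simply replaces the old one.
            target[key] = value
    return target


merged = deep_merge({"val": {"important_key": 1}},
                    {"val": {"less_important_key": 2}})

replaced = {"val": {"important_key": 1}}
replaced.update({"val": {"less_important_key": 2}})  # shallow: replaces "val"

print(merged)
print(replaced)
```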

```
from box import Box

box_1 = Box(val={'important_key': 1})

print(box_1)
# {'val': {'important_key': 1}}
print(type(box_1))
# <class 'box.box.Box'>
print(type(box_1.to_dict()))
# <class 'dict'>
```

```
from box import BoxList

my_boxlist = BoxList({'item': x} for x in range(10))
#  <BoxList: [<Box: {'item': 0}>, <Box: {'item': 1}>, ...

my_boxlist[5].item
# 5

print(type(my_boxlist.to_list()))
# <class 'list'>
```

## 3. Import and export

A very handy feature of Box objects is that they can easily be exported to JSON/YAML/CSV/msgpack files:

```
from box import BoxList

my_boxlist = BoxList({'item': x} for x in range(10))
#  <BoxList: [<Box: {'item': 0}>, <Box: {'item': 1}>, ...

my_boxlist.to_json(filename="test.json")
# Creates a test.json file in the current folder
```

`new_box = Box.from_json(filename="films.json")`

* Not available for BoxList, only for Box. ** Not available for Box, only for BoxList.

https://github.com/cdgriffith/Box/wiki

# How to read a text file into a list or array with Python

## Question: How to read a text file into a list or array with Python

I am trying to read the lines of a text file into a list or array in python. I just need to be able to individually access any item in the list or array after it is created.

The text file is formatted as follows:

```
0,0,200,0,53,1,0,255,...,0.
```

Where the `...` is above, the actual text file has hundreds or thousands more items.

I’m using the following code to try to read the file into a list:

```
text_file = open("filename.dat", "r")
lines = text_file.readlines()
print(lines)
print(len(lines))
text_file.close()
```

The output I get is:

```
['0,0,200,0,53,1,0,255,...,0.']
1
```

Apparently it is reading the entire file into a list of just one item, rather than a list of individual items. What am I doing wrong?

## Answer 0

You will have to split your string into a list of values using `split()`

So,

```
lines = text_file.read().split(',')
```

EDIT: I didn’t realise there would be so much traction to this. Here’s a more idiomatic approach.

```
import csv

with open('filename.csv', 'r', newline='') as fd:
    reader = csv.reader(fd)
    for row in reader:
        print(row)  # do something with each row
```

## Answer 1

You can also use NumPy's `loadtxt`, for example:

```
from numpy import loadtxt

lines = loadtxt("filename.dat", delimiter=",")
```

## Answer 2

So you want to create a list of lists… We need to start with an empty list

```
list_of_lists = []
```

next, we read the file content, line by line

```
with open('data') as f:
    for line in f:
        inner_list = [elt.strip() for elt in line.split(',')]
        # in alternative, if you need to use the file content as numbers
        # inner_list = [int(elt.strip()) for elt in line.split(',')]
        list_of_lists.append(inner_list)
```

A common use case is that of columnar data, but our units of storage are the rows of the file, that we have read one by one, so you may want to transpose your list of lists. This can be done with the following idiom

```
by_cols = zip(*list_of_lists)
```

Another common use is to give a name to each column

```
col_names = ('apples sold', 'pears sold', 'apples revenue', 'pears revenue')
by_names = {}
for i, col_name in enumerate(col_names):
    by_names[col_name] = by_cols[i]
```

so that you can operate on homogeneous data items

```
mean_apple_prices = [money/fruits for money, fruits in
                     zip(by_names['apples revenue'], by_names['apples sold'])]
```

Most of what I’ve written can be speeded up using the `csv` module, from the standard library. Another third party module is `pandas`, that lets you automate most aspects of a typical data analysis (but has a number of dependencies).
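As a rough illustration of the `csv` route, the sketch below reads rows with `csv.reader`, transposes them, and names the columns. The in-memory data and its numbers are made up for the example; only the column names come from the answer.

```python
import csv
import io

# In-memory stand-in for a file; the numbers are invented for illustration.
data = io.StringIO("3,4,6.0,10.0\n5,2,10.0,5.0\n")

col_names = ("apples sold", "pears sold", "apples revenue", "pears revenue")

# csv.reader yields each row as a list of strings; convert them to floats.
rows = [[float(cell) for cell in row] for row in csv.reader(data)]

# zip(*rows) transposes rows into columns; pair each column with its name.
by_names = dict(zip(col_names, zip(*rows)))

mean_apple_prices = [money / fruits for money, fruits in
                     zip(by_names["apples revenue"], by_names["apples sold"])]
print(mean_apple_prices)
```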

Update: While in Python 2 `zip(*list_of_lists)` returns a different (transposed) list of lists, in Python 3 the situation has changed and `zip(*list_of_lists)` returns a zip object that is not subscriptable.

If you need indexed access you can use

```
by_cols = list(zip(*list_of_lists))
```

that gives you a list of lists in both versions of Python.

On the other hand, if you don’t need indexed access and what you want is just to build a dictionary indexed by column names, a zip object is just fine…

```
file = open('some_data.csv')
# get_names is not defined in this answer; it is assumed to parse the header line
names = get_names(next(file))
columns = zip(*((x.strip() for x in line.split(',')) for line in file))
d = {}
for name, column in zip(names, columns): d[name] = column
```

## Answer 3

This question is asking how to read the comma-separated value contents from a file into an iterable list:

`0,0,200,0,53,1,0,255,...,0.`

The easiest way to do this is with the `csv` module as follows:

```
import csv
with open('filename.dat', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
```

Now, you can easily iterate over `spamreader` like this (inside the `with` block, while the file is still open):

```
    for row in spamreader:
        print(', '.join(row))
```

See documentation for more examples.
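Putting the pieces together, here is a small self-contained sketch that reads a one-line comma-separated file into a flat list of numbers. The temporary file and its contents are assumptions made so the example can run on its own; they mirror the format shown in the question.

```python
import csv
import os
import tempfile

# Write a small sample file in the same format as the question's data.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False, newline="") as tmp:
    tmp.write("0,0,200,0,53,1,0,255,0.\n")
    path = tmp.name

with open(path, newline="") as csvfile:
    # Flatten all rows into one list and convert each field to a float.
    values = [float(field) for row in csv.reader(csvfile) for field in row]

os.remove(path)
print(len(values), values[2])
```

Each item can then be accessed individually, for example `values[2]`.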

# How to append multiple values to a list in Python

## Question: How to append multiple values to a list in Python

I am trying to figure out how to append multiple values to a list in Python. I know there are a few ways to do so, such as manually entering the values, putting the append operation in a `for` loop, or using the `append` and `extend` functions.

However, I wonder if there is a neater way to do so? Maybe a certain package or function?

## Answer 0

You can use the sequence method `list.extend` to extend the list by multiple values from any kind of iterable, be it another list or anything else that provides a sequence of values.

```
>>> lst = [1, 2]
>>> lst.append(3)
>>> lst.append(4)
>>> lst
[1, 2, 3, 4]

>>> lst.extend([5, 6, 7])
>>> lst.extend((8, 9, 10))
>>> lst
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

>>> lst.extend(range(11, 14))
>>> lst
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
```

So you can use `list.append()` to append a single value, and `list.extend()` to append multiple values.
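A common slip is passing a list to `append`, which nests it instead of adding its items. The short sketch below contrasts the two and shows two equivalent spellings of `extend`; the variable names are illustrative.

```python
lst = [1, 2]

lst.append([3, 4])      # append adds its argument as ONE element
nested = list(lst)      # snapshot of the nested result

lst.pop()               # undo the append
lst.extend([3, 4])      # extend adds each item of the iterable
lst += (5, 6)           # += on a list behaves like extend
lst[len(lst):] = "ab"   # slice assignment also accepts any iterable

print(nested)
print(lst)
```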

## Answer 1

Other than the `append` function, if by “multiple values” you mean another list, you can simply concatenate them like so.

```
>>> a = [1,2,3]
>>> b = [4,5,6]
>>> a + b
[1, 2, 3, 4, 5, 6]
```

## Answer 2

If you take a look at the official docs, you’ll see `extend` right below `append`. That’s what you’re looking for.

There’s also `itertools.chain` if you are more interested in efficient iteration than ending up with a fully populated data structure.
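A brief sketch of the `itertools.chain` approach, using made-up inputs: the items are consumed one at a time, and a combined list is built only if it is explicitly requested.

```python
from itertools import chain

a = [1, 2]
b = (3, 4)
c = range(5, 7)

# chain() yields items lazily; no combined list is built here.
total = sum(chain(a, b, c))

# Materialize only when a real list is needed.
combined = list(chain(a, b, c))
print(total, combined)
```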

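# What is the difference between lists enclosed by square brackets and parentheses in Python?

## Question: What is the difference between lists enclosed by square brackets and parentheses in Python?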

```
>>> x=[1,2]
>>> x[1]
2
>>> x=(1,2)
>>> x[1]
2
```

Are they both valid? Is one preferred for some reason?

## Answer 0

Square brackets are lists while parentheses are tuples.

A list is mutable, meaning you can change its contents:

```
>>> x = [1,2]
>>> x.append(3)
>>> x
[1, 2, 3]
```

while tuples are not:

```
>>> x = (1,2)
>>> x
(1, 2)
>>> x.append(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'
```

The other main difference is that a tuple is hashable, meaning that you can use it as a key to a dictionary, among other things. For example:

```
>>> x = (1,2)
>>> y = [1,2]
>>> z = {}
>>> z[x] = 3
>>> z
{(1, 2): 3}
>>> z[y] = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
```

Note that, as many people have pointed out, you can add tuples together. For example:

```
>>> x = (1,2)
>>> x += (3,)
>>> x
(1, 2, 3)
```

However, this does not mean tuples are mutable. In the example above, a new tuple is constructed by adding together the two tuples as arguments. The original tuple is not modified. To demonstrate this, consider the following:

```
>>> x = (1,2)
>>> y = x
>>> x += (3,)
>>> x
(1, 2, 3)
>>> y
(1, 2)
```

Whereas, if you were to construct this same example with a list, `y` would also be updated:

```
>>> x = [1, 2]
>>> y = x
>>> x += [3]
>>> x
[1, 2, 3]
>>> y
[1, 2, 3]
```

## Answer 1

One interesting difference:

```
lst = [1]
print(lst)              # prints [1]
print(type(lst))        # prints <class 'list'>

notATuple = (1)
print(notATuple)        # prints 1
print(type(notATuple))  # prints <class 'int'>
```

A comma must be included in a tuple even if it contains only a single value. e.g. `(1,)` instead of `(1)`.
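The rule can be checked in a few lines. This sketch uses illustrative variable names and shows that it is the comma, not the parentheses, that makes a tuple.

```python
not_a_tuple = (1)    # parentheses only group; this is the int 1
one_tuple = (1,)     # the comma makes the tuple
also_tuple = 1,      # parentheses are optional
empty_tuple = ()     # the empty tuple is the one case with no comma

print(type(not_a_tuple).__name__, type(one_tuple).__name__,
      type(also_tuple).__name__, len(empty_tuple))
```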

## Answer 2

They are not lists, they are a list and a tuple. You can read about tuples in the Python tutorial. While you can mutate lists, this is not possible with tuples.

```
In [1]: x = (1, 2)

In [2]: x[0] = 3
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/home/user/<ipython console> in <module>()

TypeError: 'tuple' object does not support item assignment
```

## Answer 3

Another way brackets and parentheses differ is that square brackets can describe a list comprehension, e.g. `[x for x in y]`

Whereas the corresponding parenthesized syntax specifies a generator expression, not a tuple: `(x for x in y)`

You can build a tuple from one using: `tuple(x for x in y)`
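The three forms behave differently at run time. The sketch below, with illustrative names, shows that the parenthesized form is lazy and can be consumed only once.

```python
y = [1, 2, 3]

as_list = [x * 2 for x in y]        # list comprehension: built immediately
as_gen = (x * 2 for x in y)         # generator expression: lazy, single pass
as_tuple = tuple(x * 2 for x in y)  # tuple built from a generator expression

first_pass = list(as_gen)
second_pass = list(as_gen)          # the generator is already exhausted
print(as_list, as_tuple, first_pass, second_pass)
```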

## Answer 4

The first is a list, the second is a tuple. Lists are mutable, tuples are not.

Take a look at the Data Structures section of the tutorial, and the Sequence Types section of the documentation.

## Answer 5

Comma-separated items enclosed by `(` and `)` are `tuple`s, those enclosed by `[` and `]` are `list`s.

# How to perform element-wise multiplication of two lists?

## Question: How to perform element-wise multiplication of two lists?

I want to perform an element wise multiplication, to multiply two lists together by value in Python, like we can do it in Matlab.

This is how I would do it in Matlab.

```
a = [1,2,3,4]
b = [2,3,4,5]
a .* b = [2, 6, 12, 20]
```

A list comprehension would give 16 list entries, for every combination `x * y` of `x` from `a` and `y` from `b`. Unsure of how to map this.

If anyone is interested why, I have a dataset, and want to multiply it by `numpy.linspace(1.0, 0.5, num=len(dataset))`.

## Answer 0

Use a list comprehension mixed with `zip()`:

```
[a*b for a,b in zip(lista,listb)]
```
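To see why `zip()` matters here, the sketch below compares it with the nested comprehension the question describes, using the question's sample lists. The result names are illustrative.

```python
a = [1, 2, 3, 4]
b = [2, 3, 4, 5]

# zip pairs items by position, so the result has len(a) entries.
elementwise = [x * y for x, y in zip(a, b)]

# Two `for` clauses iterate over every combination: 4 * 4 = 16 entries.
all_pairs = [x * y for x in a for y in b]

print(elementwise, len(all_pairs))
```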

## Answer 1

Since you’re already using `numpy`, it makes sense to store your data in a `numpy` array rather than a list. Once you do this, you get things like element-wise products for free:

```
In [1]: import numpy as np

In [2]: a = np.array([1,2,3,4])

In [3]: b = np.array([2,3,4,5])

In [4]: a * b
Out[4]: array([ 2,  6, 12, 20])
```

## Answer 2

Use `np.multiply(a, b)`:

```
import numpy as np
a = [1,2,3,4]
b = [2,3,4,5]
np.multiply(a,b)
```

## Answer 3

You can try multiplying each element in a loop. The shorthand for doing that is

```
ab = [a[i]*b[i] for i in range(len(a))]
```

## Answer 4

`-1` … requires import
`+1` … is very readable

```
import operator
a = [1,2,3,4]
b = [10,11,12,13]

list(map(operator.mul, a, b))
```

outputs [10, 22, 36, 52]

## Answer 5

A fairly intuitive way of doing this:

```
a = [1,2,3,4]
b = [2,3,4,5]
ab = []                        # Create empty list
for i in range(0, len(a)):
    ab.append(a[i]*b[i])       # Adds each element to the list
```

## Answer 6

You can multiply using `map` with a `lambda`:

```
foo=[1,2,3,4]
bar=[1,2,5,55]
l=list(map(lambda x,y:x*y,foo,bar))  # list() is needed in Python 3, where map returns an iterator
```

## Answer 7

For large lists, we can do it lazily with an iterator. In Python 3 the built-in `map` is already lazy (in Python 2 this was `itertools.imap`):

```
import operator

product_iter_object = map(operator.mul, [1,2,3,4], [2,3,4,5])
```

`next(product_iter_object)` gives each element of the output in turn.

The output has the length of the shorter of the two input lists.

## Answer 8

Create an array of ones, multiply it by each list in turn, then convert the array to a list:

```
import numpy as np

a = [1,2,3,4]
b = [2,3,4,5]

c = (np.ones(len(a))*a*b).tolist()
print(c)  # [2.0, 6.0, 12.0, 20.0]
```

## Answer 9

gahooa’s answer is correct for the question as phrased in the heading, but if the lists are already in NumPy format or longer than ten items, it will be MUCH faster (3 orders of magnitude), as well as more readable, to do simple NumPy multiplication as suggested by NPE. I get these timings:

```
0.0049ms -> N = 4, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0075ms -> N = 4, a = [i for i in range(N)], c = a * b
0.0167ms -> N = 4, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0013ms -> N = 4, a = np.arange(N), c = a * b
0.0171ms -> N = 40, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0095ms -> N = 40, a = [i for i in range(N)], c = a * b
0.1077ms -> N = 40, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0013ms -> N = 40, a = np.arange(N), c = a * b
0.1485ms -> N = 400, a = [i for i in range(N)], c = [a*b for a,b in zip(a, b)]
0.0397ms -> N = 400, a = [i for i in range(N)], c = a * b
1.0348ms -> N = 400, a = np.arange(N), c = [a*b for a,b in zip(a, b)]
0.0020ms -> N = 400, a = np.arange(N), c = a * b
```

i.e. from the following test program.

```
import timeit

init = ['''
import numpy as np
N = {}
a = {}
b = np.linspace(0.0, 0.5, len(a))
'''.format(i, j) for i in [4, 40, 400]
        for j in ['[i for i in range(N)]', 'np.arange(N)']]

func = ['''c = [a*b for a,b in zip(a, b)]''',
        '''c = a * b''']

for i in init:
    for f in func:
        lines = i.split('\n')
        print('{:6.4f}ms -> {}, {}, {}'.format(
            timeit.timeit(f, setup=i, number=1000), lines[2], lines[3], f))
```

## Answer 10

You can use `enumerate`:

```
a = [1, 2, 3, 4]
b = [2, 3, 4, 5]

ab = [val * b[i] for i, val in enumerate(a)]
```

## Answer 11

The `map` function can be very useful here. Using `map` we can apply any function to each element of an iterable.

Python 3.x

```
>>> def my_mul(x,y):
...     return x*y
...
>>> a = [1,2,3,4]
>>> b = [2,3,4,5]
>>>
>>> list(map(my_mul,a,b))
[2, 6, 12, 20]
>>>
```

Of course:

```
map(f, iterable)
```

is equivalent to

```
[f(x) for x in iterable]
```

So we can get our solution via:

```
>>> [my_mul(x,y) for x, y in zip(a,b)]
[2, 6, 12, 20]
>>>
```

In Python 2.x `map()` means: apply a function to each element of an iterable and construct a new list. In Python 3.x, `map` constructs an iterator instead of a list.

Instead of `my_mul` we could use the `mul` operator.

Python 2.7

```
>>> from operator import mul # import mul operator
>>> a = [1,2,3,4]
>>> b = [2,3,4,5]
>>> map(mul,a,b)
[2, 6, 12, 20]
>>>
```

Python 3.5+

```
>>> from operator import mul
>>> a = [1,2,3,4]
>>> b = [2,3,4,5]
>>> [*map(mul,a,b)]
[2, 6, 12, 20]
>>>
```

Please note that since `map()` constructs an iterator, we use the `*` iterable unpacking operator to get a list. The unpacking approach is a bit faster than the `list` constructor:

```
>>> list(map(mul,a,b))
[2, 6, 12, 20]
>>>
```

## Answer 12

To maintain the list type, and do it in one line (after importing numpy as np, of course):

```
list(np.array([1,2,3,4]) * np.array([2,3,4,5]))
```

or

```
list(np.array(a) * np.array(b))
```

## Answer 13

You can use this for lists of the same length. Note that it returns the sum of the element-wise products (the dot product), not a list:

```
def lstsum(a, b):
    c = 0
    pos = 0
    for element in a:
        c += element*b[pos]
        pos += 1
    return c
```

# What causes [*a] to overallocate?

## Question: What causes [*a] to overallocate?

Apparently `list(a)` doesn’t overallocate, `[x for x in a]` overallocates at some points, and `[*a]` overallocates all the time? Here are sizes n from 0 to 12 and the resulting sizes in bytes for the three methods:

```
0 56 56 56
1 64 88 88
2 72 88 96
3 80 88 104
4 88 88 112
5 96 120 120
6 104 120 128
7 112 120 136
8 120 120 152
9 128 184 184
10 136 184 192
11 144 184 200
12 152 184 208
```

Computed like this, reproducible at repl.it, using Python 3.8:

```
from sys import getsizeof

for n in range(13):
    a = [None] * n
    print(n, getsizeof(list(a)),
             getsizeof([x for x in a]),
             getsizeof([*a]))
```

So: How does this work? How does `[*a]` overallocate? Actually, what mechanism does it use to create the result list from the given input? Does it use an iterator over `a` and use something like `list.append`? Where is the source code?

(Colab with data and code that produced the images.)

[Figures: zooming in to smaller n; zooming out to larger n]

## Answer 0

`[*a]` is internally doing the C equivalent of:

1. Make a new, empty `list`
2. Call `newlist.extend(a)`
3. Returns `list`.

So if you expand your test to:

```
from sys import getsizeof

for n in range(13):
    a = [None] * n
    l = []
    l.extend(a)
    print(n, getsizeof(list(a)),
             getsizeof([x for x in a]),
             getsizeof([*a]),
             getsizeof(l))
```

you’ll see the results for `getsizeof([*a])` and `l = []; l.extend(a); getsizeof(l)` are the same.

This is usually the right thing to do; when `extend`ing you’re usually expecting to add more later, and similarly for generalized unpacking, it’s assumed that multiple things will be added one after the other. `[*a]` is not the normal case; Python assumes there are multiple items or iterables being added to the `list` (`[*a, b, c, *d]`), so overallocation saves work in the common case.

By contrast, a `list` constructed from a single, presized iterable (with `list()`) may not grow or shrink during use, and overallocating is premature until proven otherwise; Python recently fixed a bug that made the constructor overallocate even for inputs with known size.

As for `list` comprehensions, they’re effectively equivalent to repeated `append`s, so you’re seeing the final result of the normal overallocation growth pattern when adding an element at a time.
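The append growth pattern can be observed directly. This is a small sketch assuming CPython (where `sys.getsizeof` works on lists); the loop length of 32 is arbitrary, and the exact sizes and resize points depend on the interpreter version.

```python
from sys import getsizeof

lst = []
sizes = []
for i in range(32):
    lst.append(i)
    sizes.append(getsizeof(lst))

# The reported size changes only when the list runs out of spare slots.
# Each entry is the list length at which a reallocation was observed.
resize_points = [i + 1 for i in range(1, len(sizes)) if sizes[i] != sizes[i - 1]]
print(sizes)
print(resize_points)
```

Most appends leave the size unchanged, which is the spare capacity from overallocation being used up.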

To be clear, none of this is a language guarantee. It’s just how CPython implements it. The Python language spec is generally unconcerned with specific growth patterns in `list` (aside from guaranteeing amortized `O(1)` `append`s and `pop`s from the end). As noted in the comments, the specific implementation changes again in 3.9; while it won’t affect `[*a]`, it could affect other cases where what used to be “build a temporary `tuple` of individual items and then `extend` with the `tuple`” now becomes multiple applications of `LIST_APPEND`, which can change when the overallocation occurs and what numbers go into the calculation.

## Answer 1

Full picture of what happens, building on the other answers and comments (especially ShadowRanger’s answer, which also explains why it’s done like that).

Disassembling shows that `BUILD_LIST_UNPACK` gets used:

```
>>> import dis
>>> dis.dis('[*a]')
              2 BUILD_LIST_UNPACK        1
              4 RETURN_VALUE
```

That’s handled in `ceval.c`, which builds an empty list and extends it (with `a`):

```
        case TARGET(BUILD_LIST_UNPACK): {
            ...
            PyObject *sum = PyList_New(0);
            ...
            none_val = _PyList_Extend((PyListObject *)sum, PEEK(i));
```

`_PyList_Extend` uses `list_extend`:

```
_PyList_Extend(PyListObject *self, PyObject *iterable)
{
    return list_extend(self, iterable);
}
```

```
list_extend(PyListObject *self, PyObject *iterable)
    ...
    n = PySequence_Fast_GET_SIZE(iterable);
    ...
    m = Py_SIZE(self);
    ...
    if (list_resize(self, m + n) < 0) {
```

And that overallocates as follows:

```
list_resize(PyListObject *self, Py_ssize_t newsize)
{
    ...
    new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
```

Let’s check that. Compute the expected number of spots with the formula above, and compute the expected byte size by multiplying it with 8 (as I’m using 64-bit Python here) and adding an empty list’s byte size (i.e., a list object’s constant overhead):

```
from sys import getsizeof
for n in range(13):
    a = [None] * n
    expected_spots = n + (n >> 3) + (3 if n < 9 else 6)
    expected_bytesize = getsizeof([]) + expected_spots * 8
    real_bytesize = getsizeof([*a])
    print(n,
          expected_bytesize,
          real_bytesize,
          real_bytesize == expected_bytesize)
```

Output:

```
0 80 56 False
1 88 88 True
2 96 96 True
3 104 104 True
4 112 112 True
5 120 120 True
6 128 128 True
7 136 136 True
8 152 152 True
9 184 184 True
10 192 192 True
11 200 200 True
12 208 208 True
```

Matches except for `n = 0`, which `list_extend` actually shortcuts, so actually that matches, too:

```
        if (n == 0) {
            ...
            Py_RETURN_NONE;
        }
        ...
        if (list_resize(self, m + n) < 0) {
```

## Answer 2

These are going to be implementation details of the CPython interpreter, and so may not be consistent across other interpreters.

That said, you can see where the comprehension and `list(a)` behaviors come in here:

https://github.com/python/cpython/blob/master/Objects/listobject.c#L36

Specifically for the comprehension:

```
 * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
...

new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
```

Just below those lines, there is `list_preallocate_exact` which is used when calling `list(a)`.

# Why are trailing commas allowed in a list?

## Question: Why are trailing commas allowed in a list?

I am curious why in Python a trailing comma in a list is valid syntax, and it seems that Python simply ignores it:

```
>>> ['a','b',]
['a', 'b']
```

It makes sense when it’s a tuple, since `('a')` and `('a',)` are two different things, but in lists?

## Answer 0

The main advantages are that it makes multi-line lists easier to edit and that it reduces clutter in diffs.

Changing:

``````s = ['manny',
'mo',
'jack',
]
``````

to:

``````s = ['manny',
'mo',
'jack',
'roger',
]
``````

involves only a one-line change in the diff:

``````  s = ['manny',
'mo',
'jack',
+      'roger',
]
``````

This beats the more confusing multi-line diff when the trailing comma was omitted:

``````  s = ['manny',
'mo',
-      'jack'
+      'jack',
+      'roger'
]
``````

The latter diff makes it harder to see that only one line was added and that the other line didn’t change content.

It also reduces the risk of doing this:

``````s = ['manny',
'mo',
'jack'
'roger'  # Added this line, but forgot to add a comma on the previous line
]
``````

and triggering implicit string literal concatenation, producing `s = ['manny', 'mo', 'jackroger']` instead of the intended result.
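
A quick way to see the concatenation pitfall described above (a small sketch; the variable names `with_comma` and `missing_comma` are made up for illustration):

```python
with_comma = ['manny',
              'mo',
              'jack',
              'roger',
              ]

# The comma after 'jack' is missing, so the two adjacent string
# literals are joined into a single element.
missing_comma = ['manny',
                 'mo',
                 'jack'
                 'roger',
                 ]

print(len(with_comma), with_comma[-1])        # 4 roger
print(len(missing_comma), missing_comma[-1])  # 3 jackroger
```

No error is raised in the second case, which is why the mistake is easy to miss.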

## 回答 1

It’s a common syntactical convention to allow trailing commas in an array. Languages like C and Java allow it, and Python seems to have adopted this convention for its list data structure. It’s particularly useful when generating code for populating a list: just generate a sequence of elements and commas, with no need to treat the last one as a special case that shouldn’t have a comma at the end.
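
The code-generation point can be sketched like this. `render_list` is a hypothetical helper written for this illustration; it uses the same template for every element and relies on the trailing comma being legal:

```python
import ast


def render_list(items):
    """Emit Python source for a list, one element per line, each followed by a comma."""
    lines = ["["]
    for item in items:
        lines.append(f"    {item!r},")  # same template for every element, including the last
    lines.append("]")
    return "\n".join(lines)


source = render_list(["manny", "mo", "jack"])
print(source)

# The trailing comma is valid syntax, so the generated text parses back unchanged.
parsed = ast.literal_eval(source)
print(parsed)  # ['manny', 'mo', 'jack']
```

Without trailing-comma support the generator would need a special case for the final element (or a `join`), which is the extra work the answer refers to.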

## 回答 2

It helps to eliminate a certain kind of bug. It’s sometimes clearer to write lists on multiple lines, but in later maintenance you may want to rearrange the items.

``````l1 = [
1,
2,
3,
4,
5
]

# Now you want to rearrange

l1 = [
1,
2,
3,
5
4,
]

# Now you have an error
``````

But if you allow trailing commas, and use them, you can easily rearrange the lines without introducing an error.

## 回答 3

A tuple is different because in `('a')` the parentheses only group the expression (as they do for operator precedence), so the result is the string `'a'`, whereas `('a',)` is a tuple of length 1.

The one-element tuple in your example could also have been written as `tuple('a')`.
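
A short check of the distinction, as a sketch (the names `grouped` and `single` are only illustrative):

```python
grouped = ('a')   # parentheses only group; this is the string 'a'
single = ('a',)   # the comma is what makes a one-element tuple

print(type(grouped).__name__)  # str
print(type(single).__name__)   # tuple
print(single == tuple('a'))    # True
```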

## 回答 4

The main reason is to make diffs less complicated. For example, suppose you have a list:

``````list = [
'a',
'b',
'c'
]
``````

and you want to add another element to it. Then you will end up doing this:

``````list = [
'a',
'b',
'c',
'd'
]
``````

Thus, the diff will show that two lines have changed: a ‘,’ added on the line with ‘c’, and ‘d’ added as a new last line.

So Python allows a trailing ‘,’ after the last element of a list, to prevent the extra diff noise, which can cause confusion.
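
To make the diff argument concrete, here is a small sketch using the standard `difflib` module. The helper `changed_lines` and the sample strings are assumptions made for this illustration, not part of the answer:

```python
import difflib


def changed_lines(before, after):
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
    return [line for line in diff
            if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]


no_comma_before = "items = [\n    'a',\n    'b',\n    'c'\n]"
no_comma_after = "items = [\n    'a',\n    'b',\n    'c',\n    'd'\n]"
comma_before = "items = [\n    'a',\n    'b',\n    'c',\n]"
comma_after = "items = [\n    'a',\n    'b',\n    'c',\n    'd',\n]"

# Without a trailing comma the 'c' line is rewritten as well.
print(len(changed_lines(no_comma_before, no_comma_after)))  # 3
# With a trailing comma only the new line shows up.
print(changed_lines(comma_before, comma_after))             # ["+    'd',"]
```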

# 将两个LISTS的值之和添加到新的LIST中

## 问题：将两个LISTS的值之和添加到新的LIST中

I have the following two lists:

``````first = [1,2,3,4,5]
second = [6,7,8,9,10]
``````

Now I want to add the items from both of these lists into a new list.

The output should be:

``````third = [7,9,11,13,15]
``````

## 回答 0

The `zip` function is useful here, used with a list comprehension.

``````[x + y for x, y in zip(first, second)]
``````

If you have a list of lists (instead of just two lists):

``````lists_of_lists = [[1, 2, 3], [4, 5, 6]]
[sum(x) for x in zip(*lists_of_lists)]
# -> [5, 7, 9]
``````

## 回答 1

From the docs:

``````import operator
``````
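
The snippet above stops at the import. One plausible way to finish the idea, assuming the answer meant `operator.add` applied pairwise with `map`, is sketched here:

```python
import operator

first = [1, 2, 3, 4, 5]
second = [6, 7, 8, 9, 10]

# map() with two iterables feeds one element from each list to operator.add
third = list(map(operator.add, first, second))
print(third)  # [7, 9, 11, 13, 15]
```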

## 回答 2

Assuming both lists `a` and `b` have same length, you do not need zip, numpy or anything else.

Python 2.x and 3.x:

``````[a[i]+b[i] for i in range(len(a))]
``````

## 回答 3

The default behavior in NumPy is to add componentwise:

``````import numpy as np
``````

which outputs

``````array([7,9,11,13,15])
``````
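
Only the import survives in the snippet above. A minimal sketch of componentwise addition, assuming `np.add` (or equivalently `+` on arrays) is what was intended:

```python
import numpy as np

first = [1, 2, 3, 4, 5]
second = [6, 7, 8, 9, 10]

# np.add converts both lists to arrays and adds them element by element
third = np.add(first, second)
print(third.tolist())  # [7, 9, 11, 13, 15]
```

The result is a NumPy array; `tolist()` converts it back to a plain Python list if that is what the caller needs.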

## 回答 4

This extends itself to any number of lists (on Python 3 the built-in `zip` is already lazy; `itertools.izip` exists only on Python 2):

``````[sum(sublist) for sublist in zip(*myListOfLists)]
``````

In your case, `myListOfLists` would be `[first, second]`

## 回答 5

Try the following code:

``````first = [1, 2, 3, 4]
second = [2, 3, 4, 5]
# On Python 3, map() is lazy, so wrap it in list() to get the sums
third = list(map(sum, zip(first, second)))  # [3, 5, 7, 9]
``````

## 回答 6

The easy and fast way to do this is:

``````three = [sum(i) for i in zip(first,second)] # [7,9,11,13,15]
``````

Alternatively, you can use numpy sum:

``````from numpy import sum
three = sum([first,second], axis=0) # array([7,9,11,13,15])
``````

## 回答 7

``````first = [1, 2, 3, 4, 5]
second = [6, 7, 8, 9, 10]
# map() with two iterables passes one element from each to the lambda
three = list(map(lambda x, y: x + y, first, second))
print(three)
# Output: [7, 9, 11, 13, 15]
``````

## 回答 8

A one-liner solution:

``````list(map(lambda x,y: x+y, a,b))
``````

## 回答 9

My answer repeats Thiru’s, who answered on Mar 17 at 9:25.

His was simpler and quicker; here are his solutions:

The easy way and fast way to do this is:

`````` three = [sum(i) for i in zip(first,second)] # [7,9,11,13,15]
``````

Alternatively, you can use numpy sum:

`````` from numpy import sum
three = sum([first,second], axis=0) # array([7,9,11,13,15])
``````

You need numpy!

NumPy arrays support vector-style operations:

``````import numpy as np
a = [1,2,3,4,5]
b = [6,7,8,9,10]
# Arrays add element by element; tolist() converts back to plain Python ints
c = (np.array(a) + np.array(b)).tolist()
print(c)
# [7, 9, 11, 13, 15]
``````

## 回答 10

If you have an unknown number of lists of the same length, you can use the below function.

Here `*args` accepts a variable number of list arguments (only as many elements as the shortest list has are summed). The `*` is used again in `zip(*args)` to unpack the collected lists into separate arguments.

``````def sum_lists(*args):
    return list(map(sum, zip(*args)))

a = [1,2,3]
b = [1,2,3]

sum_lists(a,b)
``````

Output:

``````[2, 4, 6]
``````

Or with 3 lists

``````sum_lists([5,5,5,5,5], [10,10,10,10,10], [4,4,4,4,4])
``````

Output:

``````[19, 19, 19, 19, 19]
``````

## 回答 11

You can use `zip()`, which pairs up the elements of the two lists, and then `map()`, which applies a function to each element of an iterable:

``````>>> a = [1,2,3,4,5]
>>> b = [6,7,8,9,10]
>>> list(zip(a, b))
[(1, 6), (2, 7), (3, 8), (4, 9), (5, 10)]
>>> list(map(lambda x: x[0] + x[1], zip(a, b)))
[7, 9, 11, 13, 15]
``````

## 回答 12

Here is another way to do it. We make use of Python’s special `__add__` method:

``````class SumList(object):
    def __init__(self, this_list):
        self.mylist = this_list

    def __add__(self, other):
        # Pair up the elements of both lists and add each pair
        new_list = []
        zipped_list = zip(self.mylist, other.mylist)
        for item in zipped_list:
            new_list.append(item[0] + item[1])
        return SumList(new_list)

    def __repr__(self):
        return str(self.mylist)

list1 = SumList([1,2,3,4,5])
list2 = SumList([10,20,30,40,50])
sum_list1_list2 = list1 + list2
print(sum_list1_list2)
``````

Output

``````[11, 22, 33, 44, 55]
``````

## 回答 13

If you also want to keep the leftover values from the longer list, you can use this (it works in Python 3.5):

``````def addVectors(v1, v2):
    total = [x + y for x, y in zip(v1, v2)]
    # Append the unpaired tail of whichever list is longer
    if len(v1) < len(v2):
        total += v2[len(v1):]
    else:
        total += v1[len(v2):]

    return total

#for testing
if __name__=='__main__':
    a = [1, 2]
    b = [1, 2, 3, 4]
    print(a)
    print(b)
    print(addVectors(a, b))  # [2, 4, 3, 4]
``````

## 回答 14

``````first = [1,2,3,4,5]
second = [6,7,8,9,10]
#one way
third = [x + y for x, y in zip(first, second)]
print("third" , third)
#otherway
fourth = []
for i,j in zip(first,second):
    fourth.append(i + j)
print("fourth" , fourth )
#third [7, 9, 11, 13, 15]
#fourth [7, 9, 11, 13, 15]
``````

## 回答 15

Here is another way to do it. It works fine for me.

``````N=int(input())  # number of elements in each list
num1 = list(map(int, input().split()))
num2 = list(map(int, input().split()))
sum=[]

for i in range(0,N):
    sum.append(num1[i]+num2[i])

for element in sum:
    print(element, end=" ")

print("")
``````

## 回答 16

``````j = min(len(l1), len(l2))
l3 = [l1[i]+l2[i] for i in range(j)]
``````

## 回答 17

Perhaps the simplest approach:

``````first = [1,2,3,4,5]
second = [6,7,8,9,10]
three=[]

# Walk the indices and add the matching elements
for i in range(len(first)):
    three.append(first[i]+second[i])

print(three)
``````

## 回答 18

If you treat your lists as NumPy arrays, you can simply add them:

``````import numpy as np

first = [1, 2, 3, 4, 5]
second = [6, 7, 8, 9, 10]
third = np.array(first) + np.array(second)

print(third)
# [ 7  9 11 13 15]
``````

## 回答 19

If your lists have different lengths, you can try something like this (using `zip_longest`):

``````from itertools import zip_longest  # izip_longest for python2.x

l1 = [1, 2, 3]
l2 = [4, 5, 6, 7]

>>> list(map(sum, zip_longest(l1, l2, fillvalue=0)))
[5, 7, 9, 7]
``````

## 回答 20

You can use this method, but it only works if both lists are the same size (and not empty):

``````first = [1, 2, 3, 4, 5]
second = [6, 7, 8, 9, 10]
third = []

a = len(first)
b = 0
while True:
    x = first[b]
    y = second[b]
    ans = x + y
    third.append(ans)
    b = b + 1
    if b == a:
        break

print(third)
``````

# Python在一个列表中查找不在另一个列表中的元素[重复]

## 问题：Python在一个列表中查找不在另一个列表中的元素[重复]

I need to compare two lists in order to create a new list of specific elements found in one list but not in the other. For example:

``````main_list=[]
list_1=["a", "b", "c", "d", "e"]
list_2=["a", "f", "c", "m"]
``````

I want to loop through list_1 and append to main_list all the elements from list_2 that are not found in list_1.

The result should be:

``````main_list=["f", "m"]
``````

How can I do it with python?

## 回答 0

TL;DR:
SOLUTION (1)

``````import numpy as np
main_list = np.setdiff1d(list_2,list_1)
# yields the elements in `list_2` that are NOT in `list_1`
``````

SOLUTION (2) You want a sorted list

``````import numpy as np

def setdiff_sorted(array1,array2,assume_unique=False):
    ans = np.setdiff1d(array1,array2,assume_unique).tolist()
    if assume_unique:
        return sorted(ans)
    return ans
main_list = setdiff_sorted(list_2,list_1)
``````

EXPLANATIONS:
(1) You can use NumPy’s `setdiff1d(array1, array2, assume_unique=False)`.

`assume_unique` asks the user IF the arrays ARE ALREADY UNIQUE.
If `False`, then the unique elements are determined first.
If `True`, the function will assume that the elements are already unique AND function will skip determining the unique elements.

This yields the unique values in `array1` that are not in `array2`. `assume_unique` is `False` by default.

If you are concerned with the unique elements (based on the response of Chinny84), then simply use (where `assume_unique=False` => the default value):

``````import numpy as np
list_1 = ["a", "b", "c", "d", "e"]
list_2 = ["a", "f", "c", "m"]
main_list = np.setdiff1d(list_2,list_1)
# yields the elements in `list_2` that are NOT in `list_1`
``````

(2) For those who want answers to be sorted, I’ve made a custom function:

``````import numpy as np
def setdiff_sorted(array1,array2,assume_unique=False):
    ans = np.setdiff1d(array1,array2,assume_unique).tolist()
    # With assume_unique=True NumPy skips its own sort, so sort here
    if assume_unique:
        return sorted(ans)
    return ans
``````

``````main_list = setdiff_sorted(list_2,list_1)
``````

SIDE NOTES:
(a) Solution 2 (custom function `setdiff_sorted`) returns a list (compared to an array in solution 1).

(b) If you aren’t sure whether the elements are unique, just use the default setting of NumPy’s `setdiff1d` in both solutions 1 and 2. What can be an example of a complication? See note (c).

(c) Things will be different if either of the two lists is not unique.
Say `list_2` is not unique: `list_2 = ["a", "f", "c", "m", "m"]`. Keep `list_1` as is: `list_1 = ["a", "b", "c", "d", "e"]`
Keeping the default value of `assume_unique` yields `["f", "m"]` (in both solutions). However, if you set `assume_unique=True`, both solutions give `["f", "m", "m"]`. Why? Because the user assumed that the elements are unique, so the duplicates are never removed. Hence, it is better to keep `assume_unique` at its default value. Note that both answers are sorted.
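
For readers without NumPy, the default behaviour described in note (c), unique values in sorted order, can be approximated with the standard library alone. This is a sketch of an equivalent, not what `setdiff1d` does internally:

```python
list_1 = ["a", "b", "c", "d", "e"]
list_2 = ["a", "f", "c", "m", "m"]  # "m" appears twice

# Set difference drops duplicates; sorted() gives a deterministic order
main_list = sorted(set(list_2) - set(list_1))
print(main_list)  # ['f', 'm']
```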

## 回答 1

You can use sets:

``````main_list = list(set(list_2) - set(list_1))
``````

Output (sets are unordered, so the element order may differ):

``````>>> list_1=["a", "b", "c", "d", "e"]
>>> list_2=["a", "f", "c", "m"]
>>> set(list_2) - set(list_1)
{'m', 'f'}
>>> list(set(list_2) - set(list_1))
['m', 'f']
``````

Per @JonClements’ comment, here is a tidier version:

``````>>> list_1=["a", "b", "c", "d", "e"]
>>> list_2=["a", "f", "c", "m"]
>>> list(set(list_2).difference(list_1))
['m', 'f']
``````

## 回答 2

Not sure why the above explanations are so complicated when you have native methods available:

``````main_list = list(set(list_2)-set(list_1))
``````

## 回答 3

Use a list comprehension like this:

``````main_list = [item for item in list_2 if item not in list_1]
``````

Output:

``````>>> list_1 = ["a", "b", "c", "d", "e"]
>>> list_2 = ["a", "f", "c", "m"]
>>>
>>> main_list = [item for item in list_2 if item not in list_1]
>>> main_list
['f', 'm']
``````

Edit:

As mentioned in the comments below, with large lists, the above is not the ideal solution. When that’s the case, a better option would be converting `list_1` to a `set` first:

``````set_1 = set(list_1)  # this reduces the lookup time from O(n) to O(1)
main_list = [item for item in list_2 if item not in set_1]
``````
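
One property of the comprehension worth seeing in action: unlike a plain set difference, it keeps the order of `list_2` and any repeated elements. A small sketch with made-up data (the `dict.fromkeys` step is an optional extra, not part of the answer):

```python
list_1 = ["a", "b", "c", "d", "e"]
list_2 = ["m", "a", "f", "c", "m"]

set_1 = set(list_1)
kept = [item for item in list_2 if item not in set_1]
print(kept)  # ['m', 'f', 'm']

# dict.fromkeys keeps the first occurrence of each key, in order
deduplicated = list(dict.fromkeys(kept))
print(deduplicated)  # ['m', 'f']
```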

## 回答 4

If you want a one-liner solution (ignoring imports) that only requires `O(max(n, m))` work for inputs of length `n` and `m`, not `O(n * m)` work, you can do so with the `itertools` module:

``````from itertools import filterfalse

main_list = list(filterfalse(set(list_1).__contains__, list_2))
``````

This takes advantage of the functional functions taking a callback function on construction, allowing it to create the callback once and reuse it for every element without needing to store it somewhere (because `filterfalse` stores it internally); list comprehensions and generator expressions can do this, but it’s ugly.†

That gets the same results in a single line as:

``````main_list = [x for x in list_2 if x not in list_1]
``````

with the speed of:

``````set_1 = set(list_1)
main_list = [x for x in list_2 if x not in set_1]
``````

Of course, if the comparisons are intended to be positional, so:

``````list_1 = [1, 2, 3]
list_2 = [2, 3, 4]
``````

should produce:

``````main_list = [2, 3, 4]
``````

(because no value in `list_2` has a match at the same index in `list_1`), you should definitely go with Patrick’s answer, which involves no temporary `list`s or `set`s (even with `set`s being roughly `O(1)`, they have a higher “constant” factor per check than simple equality checks) and involves `O(min(n, m))` work, less than any other answer, and if your problem is position sensitive, is the only correct solution when matching elements appear at mismatched offsets.

†: The way to do the same thing with a list comprehension as a one-liner would be to abuse nested looping to create and cache value(s) in the “outermost” loop, e.g.:

``````main_list = [x for set_1 in (set(list_1),) for x in list_2 if x not in set_1]
``````

which also gives a minor performance benefit on Python 3 (because now `set_1` is locally scoped in the comprehension code, rather than looked up from nested scope for each check; on Python 2 that doesn’t matter, because Python 2 doesn’t use closures for list comprehensions; they operate in the same scope they’re used in).
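
The three variants discussed in this answer can be checked against each other. This is a sketch with arbitrary sample data, intended only to show that they agree on the result; it does not measure speed:

```python
from itertools import filterfalse

list_1 = list(range(0, 1000, 2))
list_2 = list(range(0, 1000, 3))

# One-liner with filterfalse: the set is built once and reused for every element
via_filterfalse = list(filterfalse(set(list_1).__contains__, list_2))

# Plain comprehension: same result, but each lookup scans list_1
via_list_lookup = [x for x in list_2 if x not in list_1]

# Nested-loop trick: set_1 is bound once in the outer "loop"
via_nested = [x for set_1 in (set(list_1),) for x in list_2 if x not in set_1]

print(via_filterfalse == via_list_lookup == via_nested)  # True
print(via_filterfalse[:5])  # [3, 9, 15, 21, 27]
```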

## 回答 5

``````main_list=[]
list_1=["a", "b", "c", "d", "e"]
list_2=["a", "f", "c", "m"]

# Keep every element of list_2 that does not occur in list_1
for i in list_2:
    if i not in list_1:
        main_list.append(i)

print(main_list)
``````

output:

``````['f', 'm']
``````

## 回答 6

I would `zip` the lists together to compare them element by element.

``````main_list = [b for a, b in zip(list_1, list_2) if a != b]
``````
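
Note that a positional comparison answers a different question from a membership test. A sketch with sample values that make the difference visible:

```python
list_1 = [1, 2, 3]
list_2 = [2, 3, 4]

# Positional: compare the elements that share an index
positional = [b for a, b in zip(list_1, list_2) if a != b]

# Membership: ignore positions, keep what never occurs in list_1
lookup = set(list_1)
by_membership = [b for b in list_2 if b not in lookup]

print(positional)     # [2, 3, 4]
print(by_membership)  # [4]
```

For the lists in the question, the positional version would return `['f', 'm']` only because the shared elements happen to sit at the same indices.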

## 回答 7

I used two methods and found one more useful than the other. Here is my answer:

My input data:

``````crkmod_mpp = ['M13','M18','M19','M24']
testmod_mpp = ['M13','M14','M15','M16','M17','M18','M19','M20','M21','M22','M23','M24']
``````

Method 1: `np.setdiff1d`. I prefer this approach because the result comes back in a predictable order (`np.setdiff1d` returns the sorted unique values, which for this input matches the original order):

``````import numpy as np
test = np.setdiff1d(testmod_mpp, crkmod_mpp).tolist()
print(test)
# ['M14', 'M15', 'M16', 'M17', 'M20', 'M21', 'M22', 'M23']
``````

Method 2: the set difference gives the same elements as Method 1, but in arbitrary order:

``````test = list(set(testmod_mpp).difference(set(crkmod_mpp)))
print(test)
# the same eight elements as above, in no particular order
``````

Method 1, `np.setdiff1d`, meets my requirements perfectly. This answer is for information.

## 回答 8

If the number of occurrences should be taken into account, you probably need to use something like `collections.Counter`:

``````list_1=["a", "b", "c", "d", "e"]
list_2=["a", "f", "c", "m"]
from collections import Counter
cnt1 = Counter(list_1)
cnt2 = Counter(list_2)
final = [key for key, counts in cnt2.items() if cnt1.get(key, 0) != counts]

>>> final
['f', 'm']
``````

As promised, this can also handle a differing number of occurrences as a “difference”:

``````list_1=["a", "b", "c", "d", "e", 'a']
cnt1 = Counter(list_1)
cnt2 = Counter(list_2)
final = [key for key, counts in cnt2.items() if cnt1.get(key, 0) != counts]

>>> final
['a', 'f', 'm']
``````
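
A related but different technique, shown here only as a sketch: subtracting one `Counter` from another gives a multiset difference, so an element that appears more often in `list_2` than in `list_1` is kept as many times as the surplus. This is not the comparison used in the answer above, which reports any key whose counts differ.

```python
from collections import Counter

list_1 = ["a", "b", "c", "d", "e"]
list_2 = ["a", "f", "c", "m", "m"]

# Subtracting Counters keeps only the positive counts
leftover = Counter(list_2) - Counter(list_1)
print(sorted(leftover.elements()))  # ['f', 'm', 'm']
```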

From `ser1`, remove the items present in `ser2`.

Input: `ser1 = pd.Series([1, 2, 3, 4, 5])` and `ser2 = pd.Series([4, 5, 6, 7, 8])`

Solution: `ser1[~ser1.isin(ser2)]`