Question: Finding differences between elements of a list
Given a list of numbers, how does one find the difference between every i-th element and its (i+1)-th?
Is it better to use a lambda expression or maybe a list comprehension?
For example:
Given a list t=[1,3,6,...], the goal is to find a list v=[2,3,...] because 3-1=2, 6-3=3, etc.
Answer 0
>>> t
[1, 3, 6]
>>> [j-i for i, j in zip(t[:-1], t[1:])] # or use itertools.izip in py2k
[2, 3]
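On Python 3.10 and later, the same pairing can probably be written with itertools.pairwise, which avoids building the two slices. This is only a sketch: the fallback helper below is an assumption for older interpreters (it is not part of the answer above) and is built on itertools.tee.

```python
from itertools import tee

try:
    from itertools import pairwise  # available in Python 3.10+
except ImportError:
    def pairwise(iterable):
        # Fallback for older Pythons: yields (s0, s1), (s1, s2), ...
        a, b = tee(iterable)
        next(b, None)  # advance the second copy by one element
        return zip(a, b)

t = [1, 3, 6]
v = [j - i for i, j in pairwise(t)]
print(v)  # [2, 3]
```

Unlike the slicing version, this also accepts any iterable, not only sequences that support slicing.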
Answer 1
The other answers are correct but if you’re doing numerical work, you might want to consider numpy. Using numpy, the answer is:
v = numpy.diff(t)
Answer 2
If you don't want to use numpy or zip, you can use the following solution:
>>> t = [1, 3, 6]
>>> v = [t[i+1]-t[i] for i in range(len(t)-1)]
>>> v
[2, 3]
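One detail worth checking with the index-based version is how it behaves on very short inputs. The sketch below wraps the same comprehension in a function (the name index_differences is only illustrative) and shows that lists with zero or one element simply produce an empty result.

```python
def index_differences(t):
    # range(len(t) - 1) is empty when len(t) is 0 or 1,
    # so no out-of-range index is ever used
    return [t[i + 1] - t[i] for i in range(len(t) - 1)]

print(index_differences([1, 3, 6]))  # [2, 3]
print(index_differences([5]))        # []
print(index_differences([]))         # []
```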
Answer 3
You can use itertools.tee and zip to efficiently build the result:
from itertools import tee
# python2 only:
#from itertools import izip as zip
def differences(seq):
    iterable, copied = tee(seq)
    next(copied, None)  # advance the second copy by one; tolerate empty input
    for x, y in zip(iterable, copied):
        yield y - x
Or use itertools.islice instead:
from itertools import islice
def differences(seq):
    # seq must be re-iterable (e.g. a list), not a one-shot iterator
    nexts = islice(seq, 1, None)
    for x, y in zip(seq, nexts):
        yield y - x
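A caveat with the islice variant: it iterates over seq twice, so it appears to be safe only for re-iterable inputs such as lists. The sketch below reuses the names differences_islice and differences_tee from the benchmarks in this answer and feeds both a one-shot iterator; the tee variant stays correct, while the islice variant silently skips elements.

```python
from itertools import islice, tee

def differences_islice(seq):
    nexts = islice(seq, 1, None)
    for x, y in zip(seq, nexts):
        yield y - x

def differences_tee(seq):
    iterable, copied = tee(seq)
    next(copied, None)
    for x, y in zip(iterable, copied):
        yield y - x

data = [1, 3, 6, 10]

# With a list, both variants agree.
print(list(differences_islice(data)))      # [2, 3, 4]
print(list(differences_tee(data)))         # [2, 3, 4]

# With a one-shot iterator, zip and islice pull from the same stream,
# so the islice variant no longer yields the true differences.
print(list(differences_tee(iter(data))))   # [2, 3, 4]
broken = list(differences_islice(iter(data)))
print(broken == [2, 3, 4])                 # False
```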
You can also avoid using the itertools module:
def differences(seq):
    iterable = iter(seq)
    try:
        prev = next(iterable)
    except StopIteration:
        return  # empty input: there are no differences to yield
    for element in iterable:
        yield element - prev
        prev = element
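All these solutions work in constant space if you don't need to store all the results. The tee and plain-iterator versions also support infinite iterables; the islice version needs a re-iterable input such as a list.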
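To illustrate the point about infinite iterables, here is a small sketch that runs a plain-iterator generator over an endless stream of squares and takes only the first few results with itertools.islice. The input stream and the variable names are illustrative assumptions, not part of the original benchmarks.

```python
from itertools import count, islice

def differences(iterable):
    it = iter(iterable)
    try:
        prev = next(it)
    except StopIteration:
        return  # empty input
    for element in it:
        yield element - prev
        prev = element

squares = (n * n for n in count())  # 0, 1, 4, 9, 16, ... never ends
first_five = list(islice(differences(squares), 5))
print(first_five)  # [1, 3, 5, 7, 9]
```

Only one previous element is kept at any time, which is why memory use stays constant regardless of how long the input runs.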
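Here are some micro-benchmarks of the solutions (the three functions above are named differences_tee, differences_islice and differences_no_it here):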
In [12]: L = range(10**6)
In [13]: from collections import deque
In [15]: %timeit deque(differences_tee(L), maxlen=0)
10 loops, best of 3: 122 ms per loop
In [16]: %timeit deque(differences_islice(L), maxlen=0)
10 loops, best of 3: 127 ms per loop
In [17]: %timeit deque(differences_no_it(L), maxlen=0)
10 loops, best of 3: 89.9 ms per loop
And the other proposed solutions:
In [18]: %timeit [x[1] - x[0] for x in zip(L[1:], L)]
10 loops, best of 3: 163 ms per loop
In [19]: %timeit [L[i+1]-L[i] for i in range(len(L)-1)]
1 loops, best of 3: 395 ms per loop
In [20]: import numpy as np
In [21]: %timeit np.diff(L)
1 loops, best of 3: 479 ms per loop
In [35]: %%timeit
    ...: res = []
    ...: for i in range(len(L) - 1):
    ...:     res.append(L[i+1] - L[i])
    ...:
1 loops, best of 3: 234 ms per loop
Note that:
- zip(L[1:], L) is equivalent to zip(L[1:], L[:-1]) since zip already terminates on the shortest input, but it avoids a whole copy of L.
- Accessing the single elements by index is very slow because every index access is a method call in Python.
- numpy.diff is slow because it first has to convert the list to an ndarray. Obviously, if you start with an ndarray it will be much faster:
In [22]: arr = np.array(L)
In [23]: %timeit np.diff(arr)
100 loops, best of 3: 3.02 ms per loop
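The first note can be checked directly. The sketch below uses a small illustrative list and confirms that dropping the [:-1] slice does not change the pairs that zip produces; note that each pair is ordered (next, current).

```python
L = [1, 3, 6, 10]

with_copy = list(zip(L[1:], L[:-1]))  # two slice copies
without_copy = list(zip(L[1:], L))    # one slice copy; zip stops at the shorter input

print(without_copy)  # [(3, 1), (6, 3), (10, 6)]
print(with_copy == without_copy)  # True

# Each pair is (next, current), so subtract in that order.
v = [nxt - cur for nxt, cur in without_copy]
print(v)  # [2, 3, 4]
```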
Answer 4
Using the := walrus operator available in Python 3.8+:
>>> t = [1, 3, 6]
>>> prev = t[0]; [-prev + (prev := x) for x in t[1:]]
[2, 3]
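The one-liner relies on left-to-right evaluation: -prev is computed with the old value before (prev := x) rebinds prev. The sketch below wraps it in a function (walrus_differences is a hypothetical name) and adds a guard for empty input, which the original expression would not handle because of t[0]. It requires Python 3.8 or later.

```python
def walrus_differences(t):
    if not t:
        return []  # t[0] would raise IndexError on an empty list
    prev = t[0]
    # -prev uses the old value; (prev := x) then rebinds prev and evaluates to x
    return [-prev + (prev := x) for x in t[1:]]

print(walrus_differences([1, 3, 6]))  # [2, 3]
print(walrus_differences([]))         # []
```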
Answer 5
I would suggest using
import numpy as np
v = np.diff(t)
This is simple and easy to read.
But if you want v to have the same length as t, then pad the input first:
v = np.diff([t[0]] + t) # for python 3.x; the result starts with 0
or
v = np.diff(t + [t[-1]]) # the result ends with 0
FYI: this will only work for lists.
For numpy arrays, use
v = np.diff(np.append(t[0], t))
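The padding idea does not depend on NumPy. As a rough pure-Python sketch (assuming t is a non-empty list; the variable names are illustrative), repeating the first or the last element before pairing gives a result with the same length as the input:

```python
t = [1, 3, 6]

# Leading pad: repeat the first element, so the first difference is 0.
v_leading = [b - a for a, b in zip([t[0]] + t, t)]
print(v_leading)   # [0, 2, 3]

# Trailing pad: repeat the last element, so the last difference is 0.
v_trailing = [b - a for a, b in zip(t, t[1:] + [t[-1]])]
print(v_trailing)  # [2, 3, 0]
```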
Answer 6
A functional approach (Python 3):
>>> import operator
>>> a = [1,3,5,7,11,13,17,21]
>>> list(map(operator.sub, a[1:], a[:-1]))
[2, 2, 2, 4, 2, 4, 4]
Using a generator:
>>> import operator, itertools
>>> g1,g2 = itertools.tee((x*x for x in range(5)),2)
>>> list(map(operator.sub, itertools.islice(g1,1,None), g2))
[1, 3, 5, 7]
Using indices:
>>> [a[i+1]-a[i] for i in range(len(a)-1)]
[2, 2, 2, 4, 2, 4, 4]
Answer 7
OK. I think I found the proper solution:
v = [x[1]-x[0] for x in zip(t[:-1],t[1:])]
Answer 8
Solution with periodic boundaries
Sometimes with numerical integration you will want to difference a list with periodic boundary conditions (so the first element's difference is taken against the last element). In this case the numpy.roll function is helpful (with v a numpy array):
v-np.roll(v,1)
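For readers without NumPy, the same wrap-around behaviour can be sketched with Python's negative indexing. This is an illustrative equivalent for plain lists (the function name periodic_differences is hypothetical), not the roll-based code itself.

```python
def periodic_differences(v):
    # For i == 0, v[i - 1] is v[-1], the last element,
    # which provides the periodic wrap-around.
    return [v[i] - v[i - 1] for i in range(len(v))]

print(periodic_differences([1, 3, 6]))  # [-5, 2, 3]
```

The first entry is 1 - 6 = -5, the difference between the first element and the last; the remaining entries are the ordinary consecutive differences.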
Solutions with zero prepended
Another numpy solution (just for completeness) is to use
numpy.ediff1d(v)
This works like numpy.diff, but only on a vector (it flattens the input array). It offers the ability to prepend or append numbers to the resulting vector. This is useful when handling accumulated fields, which is often the case for fluxes in meteorological variables (e.g. rain, latent heat, etc.), as you want a resulting list of the same length as the input variable, with the first entry untouched.
Then you would write
np.ediff1d(v,to_begin=v[0])
Of course, you can also do this with np.diff; in this case, though, you need to prepend zero to the series with the prepend keyword:
np.diff(v,prepend=0.0)
The roll, to_begin and prepend variants above all return a vector that is the same length as the input.
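The accumulated-field use case can be sketched without NumPy as well. The example below uses made-up rain totals and a hypothetical helper, increments_keep_first, that keeps the first entry untouched; running itertools.accumulate over the result recovers the original totals, which is a quick sanity check for this kind of de-accumulation.

```python
from itertools import accumulate

def increments_keep_first(totals):
    # First entry is kept as-is; the rest are consecutive differences.
    if not totals:
        return []
    return [totals[0]] + [b - a for a, b in zip(totals, totals[1:])]

rain_totals = [2, 5, 5, 9]  # illustrative accumulated values
rain_steps = increments_keep_first(rain_totals)
print(rain_steps)  # [2, 3, 0, 4]

# Re-accumulating the increments gives back the original series.
print(list(accumulate(rain_steps)) == rain_totals)  # True
```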
Answer 9
My way
>>> v = [1,2,3,4,5]
>>> [v[i] - v[i-1] for i, value in enumerate(v[1:], 1)]
[1, 1, 1, 1]