问题:NumPy数组初始化(使用相同的值填充)

我需要创建一个长度为NumPy的数组n,其中每个元素为v

还有什么比:

a = empty(n)
for i in range(n):
    a[i] = v

我知道zeros并且ones可以在v = 0,1下使用。我可以使用v * ones(n),但是vis 上将不起作用None,而且速度会慢很多。

I need to create a NumPy array of length n, each element of which is v.

Is there anything better than:

a = empty(n)
for i in range(n):
    a[i] = v

I know zeros and ones would work for v = 0, 1. I could use v * ones(n), but it won’t work when v is None, and also would be much slower.


回答 0

NumPy的1.8引入,这是比更直接的方法empty(),接着fill()用于创建填充有一定值的数组:

>>> np.full((3, 5), 7)
array([[ 7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.]])

>>> np.full((3, 5), 7, dtype=int)
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

可以说这是创建一个填充有某些值的数组方法,因为它明确描述了要实现的目标(并且从原理上讲,它可以执行非常具体的任务,因此非常高效)。

NumPy 1.8 introduced , which is a more direct method than empty() followed by fill() for creating an array filled with a certain value:

>>> np.full((3, 5), 7)
array([[ 7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.]])

>>> np.full((3, 5), 7, dtype=int)
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

This is arguably the way of creating an array filled with certain values, because it explicitly describes what is being achieved (and it can in principle be very efficient since it performs a very specific task).


回答 1

已为Numpy 1.7.0更新:(@ Rolf Bartstra的提示)。

a=np.empty(n); a.fill(5) 最快。

以降序排列:

%timeit a=np.empty(1e4); a.fill(5)
100000 loops, best of 3: 5.85 us per loop

%timeit a=np.empty(1e4); a[:]=5 
100000 loops, best of 3: 7.15 us per loop

%timeit a=np.ones(1e4)*5
10000 loops, best of 3: 22.9 us per loop

%timeit a=np.repeat(5,(1e4))
10000 loops, best of 3: 81.7 us per loop

%timeit a=np.tile(5,[1e4])
10000 loops, best of 3: 82.9 us per loop

Updated for Numpy 1.7.0:(Hat-tip to @Rolf Bartstra.)

a=np.empty(n); a.fill(5) is fastest.

In descending speed order:

%timeit a=np.empty(1e4); a.fill(5)
100000 loops, best of 3: 5.85 us per loop

%timeit a=np.empty(1e4); a[:]=5 
100000 loops, best of 3: 7.15 us per loop

%timeit a=np.ones(1e4)*5
10000 loops, best of 3: 22.9 us per loop

%timeit a=np.repeat(5,(1e4))
10000 loops, best of 3: 81.7 us per loop

%timeit a=np.tile(5,[1e4])
10000 loops, best of 3: 82.9 us per loop

回答 2

我相信这fill是最快的方法。

a = np.empty(10)
a.fill(7)

您还应该始终避免像在示例中那样进行迭代。一个简单的a[:] = v函数将使用numpy 广播来完成您的迭代操作。

I believe fill is the fastest way to do this.

a = np.empty(10)
a.fill(7)

You should also always avoid iterating like you are doing in your example. A simple a[:] = v will accomplish what your iteration does using numpy broadcasting.


回答 3

显然,不仅绝对速度而且速度顺序(如user1579844所报告)均取决于机器。这是我发现的:

a=np.empty(1e4); a.fill(5) 最快

以降序排列:

timeit a=np.empty(1e4); a.fill(5) 
# 100000 loops, best of 3: 10.2 us per loop
timeit a=np.empty(1e4); a[:]=5
# 100000 loops, best of 3: 16.9 us per loop
timeit a=np.ones(1e4)*5
# 100000 loops, best of 3: 32.2 us per loop
timeit a=np.tile(5,[1e4])
# 10000 loops, best of 3: 90.9 us per loop
timeit a=np.repeat(5,(1e4))
# 10000 loops, best of 3: 98.3 us per loop
timeit a=np.array([5]*int(1e4))
# 1000 loops, best of 3: 1.69 ms per loop (slowest BY FAR!)

因此,请尝试找出并使用平台上最快的功能。

Apparently, not only the absolute speeds but also the speed order (as reported by user1579844) are machine dependent; here’s what I found:

a=np.empty(1e4); a.fill(5) is fastest;

In descending speed order:

timeit a=np.empty(1e4); a.fill(5) 
# 100000 loops, best of 3: 10.2 us per loop
timeit a=np.empty(1e4); a[:]=5
# 100000 loops, best of 3: 16.9 us per loop
timeit a=np.ones(1e4)*5
# 100000 loops, best of 3: 32.2 us per loop
timeit a=np.tile(5,[1e4])
# 10000 loops, best of 3: 90.9 us per loop
timeit a=np.repeat(5,(1e4))
# 10000 loops, best of 3: 98.3 us per loop
timeit a=np.array([5]*int(1e4))
# 1000 loops, best of 3: 1.69 ms per loop (slowest BY FAR!)

So, try and find out, and use what’s fastest on your platform.


回答 4

我有

numpy.array(n * [value])

请记住,但是显然,这比所有其他建议都足够慢n

这是与perfplot(我的一个宠物项目)的完整比较。

在此处输入图片说明

这两种empty选择仍然是最快的(使用NumPy 1.12.1)。full赶上大型阵列。


生成绘图的代码:

import numpy as np
import perfplot


def empty_fill(n):
    a = np.empty(n)
    a.fill(3.14)
    return a


def empty_colon(n):
    a = np.empty(n)
    a[:] = 3.14
    return a


def ones_times(n):
    return 3.14 * np.ones(n)


def repeat(n):
    return np.repeat(3.14, (n))


def tile(n):
    return np.repeat(3.14, [n])


def full(n):
    return np.full((n), 3.14)


def list_to_array(n):
    return np.array(n * [3.14])


perfplot.show(
    setup=lambda n: n,
    kernels=[empty_fill, empty_colon, ones_times, repeat, tile, full, list_to_array],
    n_range=[2 ** k for k in range(27)],
    xlabel="len(a)",
    logx=True,
    logy=True,
)

I had

numpy.array(n * [value])

in mind, but apparently that is slower than all other suggestions for large enough n.

Here is full comparison with perfplot (a pet project of mine).

enter image description here

The two empty alternatives are still the fastest (with NumPy 1.12.1). full catches up for large arrays.


Code to generate the plot:

import numpy as np
import perfplot


def empty_fill(n):
    a = np.empty(n)
    a.fill(3.14)
    return a


def empty_colon(n):
    a = np.empty(n)
    a[:] = 3.14
    return a


def ones_times(n):
    return 3.14 * np.ones(n)


def repeat(n):
    return np.repeat(3.14, (n))


def tile(n):
    return np.repeat(3.14, [n])


def full(n):
    return np.full((n), 3.14)


def list_to_array(n):
    return np.array(n * [3.14])


perfplot.show(
    setup=lambda n: n,
    kernels=[empty_fill, empty_colon, ones_times, repeat, tile, full, list_to_array],
    n_range=[2 ** k for k in range(27)],
    xlabel="len(a)",
    logx=True,
    logy=True,
)

回答 5

您可以使用numpy.tile,例如:

v = 7
rows = 3
cols = 5
a = numpy.tile(v, (rows,cols))
a
Out[1]: 
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

尽管tile是为了“平铺”一个数组(而不是这种情况下的标量),但它可以完成工作,创建任何大小和尺寸的预填充数组。

You can use numpy.tile, e.g. :

v = 7
rows = 3
cols = 5
a = numpy.tile(v, (rows,cols))
a
Out[1]: 
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

Although tile is meant to ’tile’ an array (instead of a scalar, as in this case), it will do the job, creating pre-filled arrays of any size and dimension.


回答 6

没有numpy的

>>>[2]*3
[2, 2, 2]

without numpy

>>>[2]*3
[2, 2, 2]

声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。