Initializing a numpy array

Question: Initializing a numpy array

Is there a way to initialize a numpy array of a given shape and then add to it? I will explain what I need with a list example. If I want to create a list of objects generated in a loop, I can do:

a = []
for i in range(5):
    a.append(i)

I want to do something similar with a numpy array. I know about vstack, concatenate etc. However, it seems these require two numpy arrays as inputs. What I need is:

big_array # Initially empty. This is where I don't know what to specify
for i in range(5):
    array i of shape = (2,4) created.
    add to big_array

The big_array should have a shape (10,4). How to do this?


EDIT:

I want to add the following clarification. I am aware that I can define big_array = numpy.zeros((10,4)) and then fill it up. However, this requires specifying the size of big_array in advance. I know the size in this case, but what if I do not? When we use the .append function for extending the list in python, we don’t need to know its final size in advance. I am wondering if something similar exists for creating a bigger array from smaller arrays, starting with an empty array.


Answer 0

numpy.zeros

Return a new array of given shape and type, filled with zeros.

or

numpy.ones

Return a new array of given shape and type, filled with ones.

or

numpy.empty

Return a new array of given shape and type, without initializing entries.


However, the approach of building up an array by appending elements one at a time, as you would with a list, is rarely used in numpy because it is less efficient: numpy arrays are much closer to the underlying C arrays, so they cannot grow in place. Instead, you should preallocate the array to the size you need and then fill in the rows. You can use numpy.append if you must, though.
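
To illustrate why appending is costly, here is a minimal sketch (the variable names are only for illustration) showing that numpy.append leaves the original array untouched and returns a freshly allocated copy:

```python
import numpy as np

a = np.zeros(3)
b = np.append(a, 1.0)  # builds a new array; `a` is not modified

print(a.shape, b.shape)        # (3,) (4,)
print(np.shares_memory(a, b))  # False: the data was copied
```

Since every call copies all existing elements, calling it inside a loop does more and more work as the array grows.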


Answer 1

The way I usually do that is by creating a regular list, appending my items to it, and finally converting the list to a numpy array, as follows:

import numpy as np
big_array = [] #  empty regular list
for i in range(5):
    arr = i*np.ones((2,4)) # for instance
    big_array.append(arr)
big_np_array = np.array(big_array)  # transformed to a numpy array

Of course, your final object takes twice the space in memory at the creation step, but appending to a Python list is very fast, and so is creation using np.array().
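
One detail worth checking is the resulting shape. In the sketch below (the names `blocks`, `stacked` and `joined` are just illustrative), np.array stacks the collected blocks along a new leading axis, whereas np.concatenate joins them along the existing row axis and gives the (10, 4) shape asked for in the question:

```python
import numpy as np

blocks = []  # regular Python list
for i in range(5):
    blocks.append(i * np.ones((2, 4)))  # stand-in for the real (2, 4) block

stacked = np.array(blocks)               # adds a new leading axis
joined = np.concatenate(blocks, axis=0)  # joins along the existing rows

print(stacked.shape)  # (5, 2, 4)
print(joined.shape)   # (10, 4)
```

Here np.vstack(blocks) or stacked.reshape(10, 4) would give the same (10, 4) result.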


Answer 2

Introduced in numpy 1.8:

numpy.full

Return a new array of given shape and type, filled with fill_value.

Examples:

>>> import numpy as np
>>> np.full((2, 2), np.inf)
array([[ inf,  inf],
       [ inf,  inf]])
>>> np.full((2, 2), 10)
array([[10, 10],
       [10, 10]])

Answer 3

The array analogue of Python's

a = []
for i in range(5):
    a.append(i)

is:

import numpy as np

a = np.empty(0)  # empty 1-D float array
for i in range(5):
    a = np.append(a, i)
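
The same pattern can be extended to the (2, 4) blocks from the question. This is only a sketch under the assumption that every block has 4 columns; the `block` contents are placeholders. Starting from an array with zero rows and passing axis=0 appends each block as new rows:

```python
import numpy as np

big_array = np.empty((0, 4))  # zero rows, four columns
for i in range(5):
    block = np.full((2, 4), float(i))  # placeholder for the real (2, 4) block
    big_array = np.append(big_array, block, axis=0)  # returns a new, larger array

print(big_array.shape)  # (10, 4)
```

Each call copies the whole array, so this is convenient for small cases but gets slow as the array grows.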

Answer 4

numpy.fromiter() is what you are looking for:

big_array = numpy.fromiter(range(5), dtype="int")

It also works with generator expressions, e.g.:

big_array = numpy.fromiter((i*(i+1)//2 for i in range(5)), dtype="int")

If you know the length of the array in advance, you can specify it with an optional ‘count’ argument.
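
As a small sketch of the count argument (the generator `triangular` is a made-up example, not part of numpy), passing the known length lets numpy allocate the output once instead of growing it while it consumes the iterator:

```python
import numpy as np

def triangular(n):
    """Yield the first n triangular numbers (hypothetical data source)."""
    for i in range(n):
        yield i * (i + 1) // 2

arr = np.fromiter(triangular(5), dtype=np.int64, count=5)
print(arr)  # [ 0  1  3  6 10]
```

Note that this builds a 1-D array from scalar items, so it fits the list example in the question better than the (2, 4) block case.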


Answer 5

You do want to avoid explicit loops as much as possible when doing array computing, as that reduces the speed gain from that form of computing. There are multiple ways to initialize a numpy array. If you want it filled with zeros, do as katrielalex said:

big_array = numpy.zeros((10,4))

EDIT: What sort of sequence are you making? You should check out the different numpy functions that create arrays, like numpy.linspace(start, stop, num) (equally spaced numbers) or numpy.arange(start, stop, step). Where possible, these functions will create arrays substantially faster than doing the same work in explicit loops.
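
For instance, a brief sketch of the two functions mentioned above, with arbitrary example values:

```python
import numpy as np

# linspace takes the number of points; the stop value is included by default
evenly_spaced = np.linspace(0.0, 1.0, 5)
print(evenly_spaced)  # [0.   0.25 0.5  0.75 1.  ]

# arange takes a step size; the stop value is excluded
stepped = np.arange(0, 10, 2)
print(stepped)  # [0 2 4 6 8]
```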


Answer 6

For your first array example, use:

a = numpy.arange(5)

To initialize big_array, use

big_array = numpy.zeros((10,4))

This assumes you want to initialize with zeros, which is pretty typical, but there are many other ways to initialize an array in numpy.

Edit: If you don’t know the size of big_array in advance, it’s generally best to first build a Python list using append, and when you have everything collected in the list, convert this list to a numpy array using numpy.array(mylist). The reason for this is that lists are meant to grow very efficiently and quickly, whereas numpy.concatenate would be very inefficient since numpy arrays don’t change size easily. But once everything is collected in a list, and you know the final array size, a numpy array can be efficiently constructed.
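
A minimal sketch of that workflow, assuming a made-up stopping rule so that the number of rows is not known before the loop runs:

```python
import numpy as np

mylist = []
value = 1
while value < 100:  # hypothetical stopping rule; row count unknown up front
    mylist.append([value, value ** 2])
    value *= 3

result = np.array(mylist)  # single conversion once everything is collected
print(result.shape)  # (5, 2)
```

The array is allocated only once, at the end, when its final size is known.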


Answer 7

To initialize a numpy array with a specific matrix:

import numpy as np

mat = np.array([[1, 1, 0, 0, 0],
                [0, 1, 0, 0, 1],
                [1, 0, 0, 1, 1],
                [0, 0, 0, 0, 0],
                [1, 0, 1, 0, 1]])

print(mat.shape)
print(mat)

Output:

(5, 5)
[[1 1 0 0 0]
 [0 1 0 0 1]
 [1 0 0 1 1]
 [0 0 0 0 0]
 [1 0 1 0 1]]

Answer 8

Whenever you are in the following situation:

a = []
for i in range(5):
    a.append(i)

and you want something similar in numpy, several previous answers have pointed out ways to do it, but as @katrielalex pointed out, these methods are not efficient. The efficient way to do this is to build one long list and then reshape it once it is complete. For example, suppose I am reading lines from a file, each row holds a list of numbers, and I want to build a numpy array of shape (number of lines read, length of the vector in each row). Here is how I would do it more efficiently:

import numpy as np

long_list = []
counter = 0
with open('filename', 'r') as f:
    for row in f:
        row_list = row.split()
        long_list.extend(row_list)
        counter += 1
# now we have a long list and we are ready to reshape
# (assumes every row has the same number of entries and that they are numeric)
result = np.array(long_list, dtype=float).reshape(counter, len(row_list))  # desired numpy array

Answer 9

I realize that this is a bit late, but I did not notice any of the other answers mentioning indexing into the empty array:

import numpy

big_array = numpy.empty((10, 4))  # the shape must be passed as a single tuple
for i in range(5):
    array_i = numpy.random.random((2, 4))  # the size is also a tuple
    big_array[2 * i:2 * (i + 1), :] = array_i

This way, you preallocate the entire result array with numpy.empty and fill in the rows as you go using indexed assignment.

It is perfectly safe to preallocate with empty instead of zeros in the example you gave since you are guaranteeing that the entire array will be filled with the chunks you generate.


Answer 10

I’d suggest defining the shape first, then iterating over the array to insert values.

import numpy as np

big_array = np.zeros(shape=(6, 2))
for it in range(6):
    big_array[it] = (it, it)  # for example

>>> big_array

array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.],
       [ 5.,  5.]])

Answer 11

Maybe something like this will fit your needs:

import numpy as np

N = 5
res = []

for i in range(N):
    res.append(np.cumsum(np.ones(shape=(2,4))))

res = np.array(res).reshape((10, 4))
print(res)

This produces the following output:

[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]]