How do I create an empty array/matrix in NumPy?

Question: How do I create an empty array/matrix in NumPy?

I can’t figure out how to use an array or matrix in the way that I would normally use a list. I want to create an empty array (or matrix) and then add one column (or row) to it at a time.

At the moment the only way I can find to do this is like:

mat = None
for col in columns:
    if mat is None:
        mat = col
    else:
        mat = hstack((mat, col))

Whereas if it were a list, I’d do something like this:

list = []
for item in data:
    list.append(item)

Is there a way to use that kind of notation for NumPy arrays or matrices?


Answer 0

You have the wrong mental model for using NumPy efficiently. NumPy arrays are stored in contiguous blocks of memory. If you want to add rows or columns to an existing array, the entire array needs to be copied to a new block of memory, creating gaps for the new elements to be stored. This is very inefficient if done repeatedly to build an array.

In the case of adding rows, your best bet is to create an array that is as big as your data set will eventually be, and then assign data to it row-by-row:

>>> import numpy
>>> a = numpy.zeros(shape=(5,2))
>>> a
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
>>> a[0] = [1,2]
>>> a[1] = [2,3]
>>> a
array([[ 1.,  2.],
       [ 2.,  3.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
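
As a minimal sketch of that preallocate-then-fill pattern in a loop, assuming the number of rows is known up front (the `rows` list is made-up sample data, not something from the question):

```python
import numpy as np

# Hypothetical sample data: five rows of two values each.
rows = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]

# Allocate the full array once, then assign each row in place.
a = np.zeros(shape=(len(rows), 2))
for i, row in enumerate(rows):
    a[i] = row

print(a.shape)  # (5, 2)
```

Each assignment writes into memory that already exists, so nothing is copied as the array fills up.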

Answer 1

A NumPy array is a very different data structure from a list and is designed to be used in different ways. Your use of hstack is potentially very inefficient: every time you call it, all the data in the existing array is copied into a new one. (The append function has the same issue.) If you want to build up your matrix one column at a time, you are probably best off keeping it in a list until it is finished, and only then converting it into an array.

e.g.


import numpy

mylist = []
for item in data:
    mylist.append(item)
mat = numpy.array(mylist)

item can be a list, an array or any iterable, as long as each item has the same number of elements.
In this particular case (data is some iterable holding the matrix columns) you can simply use


mat = numpy.array(data)

(Also note that using list as a variable name is probably not good practice since it masks the built-in type by that name, which can lead to bugs.)
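
One detail worth checking when you do this: numpy.array treats each item as a row. If the items are meant to be columns, as in the question, a sketch like the following transposes the result or uses numpy.column_stack instead. The `columns` data here is a made-up example, not from the question.

```python
import numpy as np

# Hypothetical sample data: three columns, each with two entries.
columns = [[1, 2], [3, 4], [5, 6]]

collected = []
for col in columns:
    collected.append(col)

as_rows = np.array(collected)         # each item becomes a row: shape (3, 2)
as_cols = np.column_stack(collected)  # each item becomes a column: shape (2, 3)

print(as_rows.shape, as_cols.shape)
print(np.array_equal(as_cols, as_rows.T))
```

Either way, the array is built in a single step after the loop, so the data is copied once rather than on every iteration.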

EDIT:

If for some reason you really do want to create an empty array, you can just use numpy.array([]), but this is rarely useful!


Answer 2

To create an empty multidimensional array in NumPy (e.g. a 2D m*n array to store your matrix) when you don’t know in advance how many rows m you will append, and you don’t mind the computational cost Stephen Simmons mentioned (namely rebuilding the array at each append), you can set the dimension you want to append along to 0: X = np.empty(shape=[0, n]).

You can then use it like this, for example (here n = 2, and the loops append m = 10 rows in total, a count we assume we didn’t know when creating the empty matrix):

import numpy as np

n = 2
X = np.empty(shape=[0, n])

for i in range(5):
    for j in range(2):
        X = np.append(X, [[i, j]], axis=0)

print(X)

which will give you:

[[ 0.  0.]
 [ 0.  1.]
 [ 1.  0.]
 [ 1.  1.]
 [ 2.  0.]
 [ 2.  1.]
 [ 3.  0.]
 [ 3.  1.]
 [ 4.  0.]
 [ 4.  1.]]
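
The `axis=0` argument matters here. As a small sketch of the difference (the variable names `with_axis` and `without_axis` are only for illustration), np.append flattens both inputs when no axis is given:

```python
import numpy as np

n = 2
X = np.empty(shape=(0, n))

with_axis = np.append(X, [[1, 2]], axis=0)  # stays 2D, one row added
without_axis = np.append(X, [[1, 2]])       # both inputs flattened to 1D

print(with_axis.shape)     # (1, 2)
print(without_axis.shape)  # (2,)
```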

Answer 3

I looked into this a lot because I needed to use a numpy.array as a set in one of my school projects, and I needed it to be initialized empty. I didn’t find any relevant answer here on Stack Overflow, so I started experimenting.

# Initialize your variable as an empty list first
In [32]: x=[]
# and now cast it as a numpy ndarray
In [33]: x=np.array(x)

The result will be:

In [34]: x
Out[34]: array([], dtype=float64)

Therefore you can directly initialize an np array as follows:

In [36]: x= np.array([], dtype=np.float64)

I hope this helps.


Answer 4

You can use the append function. For rows:

>>> import numpy as np
>>> a = np.array([[10, 20, 30]])  # 2D, so rows can be appended along axis 0
>>> a = np.append(a, [[1, 2, 3]], axis=0)
>>> a
array([[10, 20, 30],
       [ 1,  2,  3]])

For columns:

>>> np.append(a, [[15], [15]], axis=1)
array([[10, 20, 30, 15],
       [ 1,  2,  3, 15]])

EDIT
Of course, as mentioned in other answers, unless you’re doing some processing (e.g. inversion) on the matrix/array EVERY time you append something to it, I would just create a list, append to it, and then convert it to an array.


Answer 5

If you absolutely don’t know the final size of the array, you can grow the array step by step like this:

import numpy

my_arr = numpy.zeros((0, 5))
for i in range(3):
    my_arr = numpy.concatenate((my_arr, numpy.ones((1, 5))))
print(my_arr)

[[ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]
 [ 1.  1.  1.  1.  1.]]
  • Notice the 0 in the first line.
  • numpy.append is another option. It calls numpy.concatenate.
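
A variation on this, not the approach shown above: since numpy.concatenate accepts a whole sequence of arrays, the pieces can be collected in a list and joined in one call. The `chunks` list below is hypothetical sample data with differing row counts.

```python
import numpy as np

# Hypothetical chunks arriving with different numbers of rows.
chunks = [np.ones((1, 5)), np.ones((2, 5)), np.ones((3, 5))]

# A single concatenate call copies each chunk once.
my_arr = np.concatenate(chunks, axis=0)

print(my_arr.shape)  # (6, 5)
```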

Answer 6

You can apply the same list-based approach to build any kind of array, such as zeros (note that the result here is a plain Python list):

a = range(5)
a = [i*0 for i in a]
print(a)
[0, 0, 0, 0, 0]

Answer 7

Here is a workaround that makes NumPy arrays behave more like lists:

import numpy as np

np_arr = np.array([])
np_arr = np.append(np_arr, 2)
np_arr = np.append(np_arr, 24)
print(np_arr)

Output: [ 2. 24.]
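
The appended integers come back as floats because np.array([]) defaults to float64. A small sketch of how to keep integers, assuming that is what you want (the names `as_float` and `as_int` are illustrative only):

```python
import numpy as np

as_float = np.append(np.array([]), 2)           # empty array defaults to float64
as_int = np.append(np.array([], dtype=int), 2)  # explicit integer dtype is kept

print(as_float.dtype)     # float64
print(as_int.dtype.kind)  # i (a signed integer type)
```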


Answer 8

Depending on what you are using this for, you may need to specify the data type (see ‘dtype’).

For example, to create a 2D array of 8-bit values (suitable for use as a monochrome image):

myarray = numpy.empty(shape=(H,W),dtype='u1')

For an RGB image, include the number of color channels in the shape: shape=(H,W,3)

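You may also want to consider zero-initializing with numpy.zeros instead of using numpy.empty, because numpy.empty leaves the array contents uninitialized (arbitrary values) until you assign them. See the notes in the numpy.empty documentation.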
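
A minimal sketch of both calls side by side, with hypothetical values for H and W:

```python
import numpy as np

H, W = 4, 6  # hypothetical image height and width

uninitialized = np.empty(shape=(H, W), dtype='u1')  # contents are arbitrary
blank = np.zeros(shape=(H, W, 3), dtype='u1')       # RGB image, all zeros

print(uninitialized.dtype, uninitialized.shape)  # uint8 (4, 6)
print(blank.shape, int(blank.max()))             # (4, 6, 3) 0
```

The values in `uninitialized` should not be read before they are written, which is why the example only inspects its dtype and shape.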


Answer 9

I think you want to do most of the work with lists and then use the result as a matrix. Maybe something like this:

import numpy as np

ur_list = []
for col in columns:
    ur_list.append(list(col))

mat = np.matrix(ur_list)

Answer 10

I think you can create an empty NumPy array like this:

>>> import numpy as np
>>> empty_array= np.zeros(0)
>>> empty_array
array([], dtype=float64)
>>> empty_array.shape
(0,)

This form is useful when you want to append NumPy arrays to it in a loop.


Answer 11

There are two ways to create an empty NumPy array without defining its shape:

1.

arr = np.array([])

This is preferred, because it makes clear that you will be using it as a NumPy array.

2.

arr = []
# and then use it with NumPy: append to it, etc.

NumPy functions convert the list to an np.ndarray when you pass it in, without adding an extra [] dimension.