Tag archive: matrix

How does multiplication differ for NumPy matrix vs. array classes?

Question: How does multiplication differ for NumPy matrix vs. array classes?

The numpy docs recommend using array instead of matrix for working with matrices. However, unlike Octave (which I was using until recently), * doesn’t perform matrix multiplication; you need to use the function matrixmultiply(). I feel this makes the code very unreadable.

Does anybody share my view, and has anyone found a solution?


Answer 0

The main reason to avoid using the matrix class is that a) it’s inherently 2-dimensional, and b) there’s additional overhead compared to a “normal” numpy array. If all you’re doing is linear algebra, then by all means, feel free to use the matrix class… Personally I find it more trouble than it’s worth, though.

For arrays (prior to Python 3.5), use dot instead of matrixmultiply.

E.g.

import numpy as np
x = np.arange(9).reshape((3,3))
y = np.arange(3)

print(np.dot(x, y))

Or in newer versions of numpy, simply use x.dot(y)

Personally, I find it much more readable than the * operator implying matrix multiplication…

For arrays in Python 3.5 and later, use x @ y.
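
As a quick check of the three spellings mentioned above, the following sketch (assuming Python 3.5+ and a reasonably recent NumPy) shows that np.dot, the .dot method, and the @ operator agree for a 2-D array times a 1-D array:

```python
import numpy as np

x = np.arange(9).reshape((3, 3))
y = np.arange(3)

r_func = np.dot(x, y)      # function form
r_method = x.dot(y)        # method form
r_op = x @ y               # operator form (PEP 465, Python 3.5+)

print(r_func, r_method, r_op)
```

All three produce the same 1-D result, so the choice between them is mostly a matter of readability.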


Answer 1

The key things to know for operations on NumPy arrays versus operations on NumPy matrices are:

  • NumPy matrix is a subclass of NumPy array

  • NumPy array operations are element-wise (once broadcasting is accounted for)

  • NumPy matrix operations follow the ordinary rules of linear algebra

some code snippets to illustrate:

>>> from numpy import linalg as LA
>>> import numpy as NP

>>> a1 = NP.matrix("4 3 5; 6 7 8; 1 3 13; 7 21 9")
>>> a1
matrix([[ 4,  3,  5],
        [ 6,  7,  8],
        [ 1,  3, 13],
        [ 7, 21,  9]])

>>> a2 = NP.matrix("7 8 15; 5 3 11; 7 4 9; 6 15 4")
>>> a2
matrix([[ 7,  8, 15],
        [ 5,  3, 11],
        [ 7,  4,  9],
        [ 6, 15,  4]])

>>> a1.shape
(4, 3)

>>> a2.shape
(4, 3)

>>> a2t = a2.T
>>> a2t.shape
(3, 4)

>>> a1 * a2t         # same as NP.dot(a1, a2t) 
matrix([[127,  84,  85,  89],
        [218, 139, 142, 173],
        [226, 157, 136, 103],
        [352, 197, 214, 393]])

But this operation fails if these two NumPy matrices are converted to arrays:

>>> a1 = NP.array(a1)
>>> a2t = NP.array(a2t)

>>> a1 * a2t
Traceback (most recent call last):
   File "<pyshell#277>", line 1, in <module>
   a1 * a2t
   ValueError: operands could not be broadcast together with shapes (4,3) (3,4) 

Using the NP.dot syntax works with arrays, though; this operation works like matrix multiplication:

>>> NP.dot(a1, a2t)
array([[127,  84,  85,  89],
       [218, 139, 142, 173],
       [226, 157, 136, 103],
       [352, 197, 214, 393]])

So do you ever need a NumPy matrix? That is, will a NumPy array suffice for linear algebra computation (provided you know the correct syntax, i.e., NP.dot)?

The rule seems to be that if the arguments (arrays) have shapes (m x n) compatible with the given linear algebra operation, then you are OK; otherwise, NumPy throws.

The only exception I have come across (there are likely others) is calculating the matrix inverse.

Below are snippets in which I have called a pure linear algebra operation (in fact, from NumPy’s linear algebra module) and passed in a NumPy array.

determinant of an array:

>>> m = NP.random.randint(0, 10, 16).reshape(4, 4)
>>> m
array([[6, 2, 5, 2],
       [8, 5, 1, 6],
       [5, 9, 7, 5],
       [0, 5, 6, 7]])

>>> type(m)
<type 'numpy.ndarray'>

>>> md = LA.det(m)
>>> md
1772.9999999999995

eigenvectors/eigenvalue pairs:

>>> LA.eig(m)
(array([ 19.703+0.j   ,   0.097+4.198j,   0.097-4.198j,   5.103+0.j   ]), 
array([[-0.374+0.j   , -0.091+0.278j, -0.091-0.278j, -0.574+0.j   ],
       [-0.446+0.j   ,  0.671+0.j   ,  0.671+0.j   , -0.084+0.j   ],
       [-0.654+0.j   , -0.239-0.476j, -0.239+0.476j, -0.181+0.j   ],
       [-0.484+0.j   , -0.387+0.178j, -0.387-0.178j,  0.794+0.j   ]]))

matrix norm:

>>> LA.norm(m)
22.0227

qr factorization:

>>> LA.qr(a1)
(array([[ 0.5,  0.5,  0.5],
        [ 0.5,  0.5, -0.5],
        [ 0.5, -0.5,  0.5],
        [ 0.5, -0.5, -0.5]]), 
 array([[ 6.,  6.,  6.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]))

matrix rank:

>>> m = NP.random.rand(40).reshape(8, 5)
>>> m
array([[ 0.545,  0.459,  0.601,  0.34 ,  0.778],
       [ 0.799,  0.047,  0.699,  0.907,  0.381],
       [ 0.004,  0.136,  0.819,  0.647,  0.892],
       [ 0.062,  0.389,  0.183,  0.289,  0.809],
       [ 0.539,  0.213,  0.805,  0.61 ,  0.677],
       [ 0.269,  0.071,  0.377,  0.25 ,  0.692],
       [ 0.274,  0.206,  0.655,  0.062,  0.229],
       [ 0.397,  0.115,  0.083,  0.19 ,  0.701]])
>>> LA.matrix_rank(m)
5

matrix condition:

>>> a1 = NP.random.randint(1, 10, 12).reshape(4, 3)
>>> LA.cond(a1)
5.7093446189400954

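Inversion through the .I attribute requires a NumPy matrix, though (plain arrays do not have that attribute; numpy.linalg.inv is the array-side call for square input):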

>>> a1 = NP.matrix(a1)
>>> type(a1)
<class 'numpy.matrixlib.defmatrix.matrix'>

>>> a1.I
matrix([[ 0.028,  0.028,  0.028,  0.028],
        [ 0.028,  0.028,  0.028,  0.028],
        [ 0.028,  0.028,  0.028,  0.028]])
>>> a1 = NP.array(a1)
>>> a1.I

Traceback (most recent call last):
   File "<pyshell#230>", line 1, in <module>
   a1.I
   AttributeError: 'numpy.ndarray' object has no attribute 'I'

But the Moore-Penrose pseudoinverse seems to work just fine:

>>> LA.pinv(m)
matrix([[ 0.314,  0.407, -1.008, -0.553,  0.131,  0.373,  0.217,  0.785],
        [ 1.393,  0.084, -0.605,  1.777, -0.054, -1.658,  0.069, -1.203],
        [-0.042, -0.355,  0.494, -0.729,  0.292,  0.252,  1.079, -0.432],
        [-0.18 ,  1.068,  0.396,  0.895, -0.003, -0.896, -1.115, -0.666],
        [-0.224, -0.479,  0.303, -0.079, -0.066,  0.872, -0.175,  0.901]])

>>> m = NP.array(m)

>>> LA.pinv(m)
array([[ 0.314,  0.407, -1.008, -0.553,  0.131,  0.373,  0.217,  0.785],
       [ 1.393,  0.084, -0.605,  1.777, -0.054, -1.658,  0.069, -1.203],
       [-0.042, -0.355,  0.494, -0.729,  0.292,  0.252,  1.079, -0.432],
       [-0.18 ,  1.068,  0.396,  0.895, -0.003, -0.896, -1.115, -0.666],
       [-0.224, -0.479,  0.303, -0.079, -0.066,  0.872, -0.175,  0.901]])
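
For plain arrays, inversion does not need the matrix class. A minimal sketch, assuming a square, non-singular input for numpy.linalg.inv and reusing the 4 x 3 example values from the start of this answer for numpy.linalg.pinv (the variable names are illustrative only):

```python
import numpy as np
from numpy import linalg as LA

# Square, non-singular array: the ordinary inverse works on an ndarray.
sq = np.array([[1.0, 2.0], [3.0, 4.0]])
sq_inv = LA.inv(sq)
print(np.allclose(sq @ sq_inv, np.eye(2)))

# Non-square array: only a pseudoinverse exists.
rect = np.array([[4, 3, 5], [6, 7, 8], [1, 3, 13], [7, 21, 9]], dtype=float)
rect_pinv = LA.pinv(rect)
print(rect_pinv.shape)

# Defining property of the Moore-Penrose pseudoinverse: A @ A+ @ A == A
print(np.allclose(rect @ rect_pinv @ rect, rect))
```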

Answer 2

In 3.5, Python finally got a matrix multiplication operator. The syntax is a @ b.


Answer 3

There is a situation where the dot operator gives different answers for arrays than for matrices. For example, suppose the following:

>>> a=numpy.array([1, 2, 3])
>>> b=numpy.array([1, 2, 3])

Let’s convert them into matrices:

>>> am=numpy.mat(a)
>>> bm=numpy.mat(b)

Now, we can see a different output for the two cases:

>>> print(numpy.dot(a.T, b))
14
>>> print(am.T*bm)
[[1.  2.  3.]
 [2.  4.  6.]
 [3.  6.  9.]]
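
The difference comes from dimensionality: a 1-D array has no row or column orientation, so .T leaves it unchanged and np.dot computes the inner product, whereas a matrix is always 2-D. A small sketch of how to get either result with plain arrays (np.outer and the np.newaxis indexing are additions here, not part of the original answer):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

print(a.T.shape)            # transposing a 1-D array is a no-op
inner = np.dot(a.T, b)      # inner product, a scalar
outer = np.outer(a, b)      # outer product, a 3 x 3 array

# The same outer product via explicit 2-D shapes: (3, 1) @ (1, 3)
outer_2d = a[:, np.newaxis] @ b[np.newaxis, :]
print(inner)
print(outer)
```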

Answer 4

Reference from http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html

…, the use of the numpy.matrix class is discouraged, since it adds nothing that cannot be accomplished with 2D numpy.ndarray objects, and may lead to a confusion of which class is being used. For example,

>>> import numpy as np
>>> from scipy import linalg
>>> A = np.array([[1,2],[3,4]])
>>> A
    array([[1, 2],
           [3, 4]])
>>> linalg.inv(A)
array([[-2. ,  1. ],
      [ 1.5, -0.5]])
>>> b = np.array([[5,6]]) #2D array
>>> b
array([[5, 6]])
>>> b.T
array([[5],
      [6]])
>>> A*b #not matrix multiplication!
array([[ 5, 12],
      [15, 24]])
>>> A.dot(b.T) #matrix multiplication
array([[17],
      [39]])
>>> b = np.array([5,6]) #1D array
>>> b
array([5, 6])
>>> b.T  #not matrix transpose!
array([5, 6])
>>> A.dot(b)  #does not matter for multiplication
array([17, 39])

scipy.linalg operations can be applied equally to numpy.matrix or to 2D numpy.ndarray objects.


Answer 5

This trick could be what you are looking for. It is a kind of simple operator overload.

You can then use something like the suggested Infix class like this:

a = np.random.rand(3,4)
b = np.random.rand(4,3)
x = Infix(lambda x,y: np.dot(x,y))
c = a |x| b
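
The Infix class itself is not shown in the answer. A minimal sketch of one common way to write it follows; the class body is an assumption based on the usual operator-overloading recipe, and the `__array_ufunc__ = None` line is added so that NumPy arrays defer to the class instead of trying to broadcast over it:

```python
import numpy as np

class Infix:
    """Hypothetical helper: wraps a two-argument function for use as a |op| b."""

    # Ask ndarray operators to return NotImplemented so Python falls back to __ror__.
    __array_ufunc__ = None

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Handles `left |op`: remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # Handles `(left |op) | right`: apply the stored function.
        return self.func(right)

a = np.random.rand(3, 4)
b = np.random.rand(4, 3)
x = Infix(lambda u, v: np.dot(u, v))
c = a |x| b
print(c.shape)
```

On Python 3.5+ the built-in @ operator makes this workaround unnecessary for matrix multiplication, but the same pattern can wrap any two-argument function.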

Answer 6

A pertinent quote from PEP 465 – A dedicated infix operator for matrix multiplication, as mentioned by @petr-viktorin, clarifies the problem the OP was getting at:

[…] numpy provides two different types with different __mul__ methods. For numpy.ndarray objects, * performs elementwise multiplication, and matrix multiplication must use a function call (numpy.dot). For numpy.matrix objects, * performs matrix multiplication, and elementwise multiplication requires function syntax. Writing code using numpy.ndarray works fine. Writing code using numpy.matrix also works fine. But trouble begins as soon as we try to integrate these two pieces of code together. Code that expects an ndarray and gets a matrix, or vice-versa, may crash or return incorrect results

The introduction of the @ infix operator should help to unify and simplify python matrix code.
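
To make the quoted point concrete, here is a small sketch contrasting the two classes on the same data. It uses np.asmatrix to build the matrix; since the use of numpy.matrix is discouraged, this is for illustration only:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
M = np.asmatrix(A)

elementwise = A * A          # ndarray: * is element-wise
matrix_prod = M * M          # matrix: * is the matrix product
with_operator = A @ A        # ndarray: @ is the matrix product

print(elementwise)
print(matrix_prod)
```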


Answer 7

The function matmul (since numpy 1.10.1) works fine for both types and returns the result as a numpy matrix:

import numpy as np

A = np.mat('1 2 3; 4 5 6; 7 8 9; 10 11 12')
B = np.array(np.mat('1 1 1 1; 1 1 1 1; 1 1 1 1'))
print (A, type(A))
print (B, type(B))

C = np.matmul(A, B)
print (C, type(C))

Output:

(matrix([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]]), <class 'numpy.matrixlib.defmatrix.matrix'>)
(array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]]), <type 'numpy.ndarray'>)
(matrix([[ 6,  6,  6,  6],
        [15, 15, 15, 15],
        [24, 24, 24, 24],
        [33, 33, 33, 33]]), <class 'numpy.matrixlib.defmatrix.matrix'>)

Since Python 3.5, as mentioned earlier, you can also use the new matrix multiplication operator @, like

C = A @ B

and get the same result as above.


Convert a 1D array to a 2D array in numpy

Question: Convert a 1D array to a 2D array in numpy

I want to convert a 1-dimensional array into a 2-dimensional array by specifying the number of columns in the 2D array. Something that would work like this:

> import numpy as np
> A = np.array([1,2,3,4,5,6])
> B = vec2matrix(A,ncol=2)
> B
array([[1, 2],
       [3, 4],
       [5, 6]])

Does numpy have a function that works like my made-up function “vec2matrix”? (I understand that you can index a 1D array like a 2D array, but that isn’t an option in the code I have – I need to make this conversion.)


Answer 0

You want to reshape the array.

B = np.reshape(A, (-1, 2))

where -1 infers the size of the new dimension from the size of the input array.
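
As a sketch of the made-up vec2matrix from the question, built on np.reshape (the function name and the ncol parameter come from the question; the wrapper itself is hypothetical):

```python
import numpy as np

def vec2matrix(vec, ncol):
    """Hypothetical wrapper: reshape a 1-D array into rows of `ncol` columns."""
    # -1 lets NumPy infer the number of rows from the array size.
    return np.reshape(vec, (-1, ncol))

A = np.array([1, 2, 3, 4, 5, 6])
B = vec2matrix(A, ncol=2)
print(B)

# If the size is not divisible by ncol, reshape raises ValueError.
try:
    vec2matrix(A, ncol=4)
    failed = False
except ValueError:
    failed = True
print(failed)
```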


Answer 1

You have two options:

  • If you no longer want the original shape, the easiest is just to assign a new shape to the array

    a.shape = (a.size//ncols, ncols)
    

    You can replace a.size//ncols with -1 to compute the proper shape automatically. Make sure that a.shape[0]*a.shape[1] == a.size, or the assignment will raise an error.

  • You can get a new array with the np.reshape function, which works mostly like the version presented above

    new = np.reshape(a, (-1, ncols))
    

    When it’s possible, new will be just a view of the initial array a, meaning that the data are shared. In some cases, though, the new array will be a copy instead. Note that np.reshape also accepts an optional keyword order that lets you switch from row-major C order to column-major Fortran order. np.reshape is the function version of the a.reshape method.
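
The view-versus-copy behaviour described in the second option can be checked with np.shares_memory. A minimal sketch, with array names chosen only for illustration:

```python
import numpy as np

a = np.arange(6)

# Reshaping a contiguous array gives a view: writes go through to `a`.
view = np.reshape(a, (-1, 2))
view[0, 0] = 99
print(np.shares_memory(a, view), a[0])

# Reshaping a transposed (non-contiguous) array forces a copy.
t = np.arange(6).reshape(2, 3).T
flat = t.reshape(-1)
print(np.shares_memory(t, flat))
```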

If you can’t satisfy the requirement a.shape[0]*a.shape[1] == a.size, you’re stuck with having to create a new array. You can use the np.resize function and mix it with np.reshape, such as

>>> a =np.arange(9)
>>> np.resize(a, 10).reshape(5,2)

Answer 2

Try something like:

B = np.reshape(A,(-1,ncols))

You’ll need to make sure that you can divide the number of elements in your array by ncols though. You can also play with the order in which the numbers are pulled into B using the order keyword.
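
A short sketch of the order keyword mentioned above, with ncols fixed at 2 for illustration:

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6])
ncols = 2

row_major = np.reshape(A, (-1, ncols))              # default order='C': fill row by row
col_major = np.reshape(A, (-1, ncols), order='F')   # order='F': fill column by column

print(row_major)
print(col_major)
```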


Answer 3

If your sole purpose is to convert a 1d array X to a 2d array just do:

X = np.reshape(X,(1, X.size))

Answer 4

import numpy as np
array = np.arange(8) 
print("Original array : \n", array)
array = np.arange(8).reshape(2, 4)
print("New array : \n", array)

Answer 5

some_array.shape = (1,)+some_array.shape

or get a new one

another_array = numpy.reshape(some_array, (1,)+some_array.shape)

This adds one dimension, which is equivalent to adding a pair of brackets at the outermost level.
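
NumPy also offers np.newaxis indexing and np.expand_dims for the same purpose; these are alternatives not mentioned in the answer, sketched here for comparison:

```python
import numpy as np

some_array = np.arange(3)

via_reshape = np.reshape(some_array, (1,) + some_array.shape)
via_newaxis = some_array[np.newaxis, :]
via_expand = np.expand_dims(some_array, axis=0)
as_column = some_array[:, np.newaxis]    # trailing axis instead: shape (3, 1)

print(via_reshape.shape, via_newaxis.shape, via_expand.shape, as_column.shape)
```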


Answer 6

You can use flatten() from the numpy package.

import numpy as np
a = np.array([[1, 2],
       [3, 4],
       [5, 6]])
a_flat = a.flatten()
print(f"original array: {a} \nflattened array = {a_flat}")

Output:

original array: [[1 2]
 [3 4]
 [5 6]] 
flattened array = [1 2 3 4 5 6]

Answer 7

Change 1D array into 2D array without using Numpy.

l = [i for i in range(1,21)]
part = 3
new = []
start, end = 0, part


while end <= len(l):
    temp = []
    for i in range(start, end):
        temp.append(l[i])
    new.append(temp)
    start += part
    end += part
print("new values:  ", new)


# for uneven cases: collect the leftover elements into one final chunk
temp = []
while start < len(l):
    temp.append(l[start])
    start += 1
if temp:
    new.append(temp)
print("new values for uneven cases:   ", new)
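
The same chunking can be written more compactly with slicing. A sketch, where the helper name chunk is made up for this example; the last chunk is simply shorter when the length is not a multiple of the chunk size:

```python
def chunk(seq, size):
    """Hypothetical helper: split `seq` into consecutive lists of length `size`."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]

l = list(range(1, 21))
new = chunk(l, 3)
print(new)
```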

How to get element-wise matrix multiplication (Hadamard product) in numpy?

Question: How to get element-wise matrix multiplication (Hadamard product) in numpy?

I have two matrices

a = np.matrix([[1,2], [3,4]])
b = np.matrix([[5,6], [7,8]])

and I want to get the element-wise product, [[1*5,2*6], [3*7,4*8]], equaling

[[5,12], [21,32]]

I have tried

print(np.dot(a,b)) 

and

print(a*b)

but both give the result

[[19 22], [43 50]]

which is the matrix product, not the element-wise product. How can I get the element-wise product (aka Hadamard product) using built-in functions?


Answer 0

For elementwise multiplication of matrix objects, you can use numpy.multiply:

import numpy as np
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
np.multiply(a,b)

Result

array([[ 5, 12],
       [21, 32]])

However, you should really use array instead of matrix. matrix objects have all sorts of horrible incompatibilities with regular ndarrays. With ndarrays, you can just use * for elementwise multiplication:

a * b

If you’re on Python 3.5+, you don’t even lose the ability to perform matrix multiplication with an operator, because @ does matrix multiplication now:

a @ b  # matrix multiplication
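
For the np.matrix objects from the question specifically, a sketch of the difference between * and np.multiply (np.matrix is kept here only because the question uses it):

```python
import numpy as np

a = np.matrix([[1, 2], [3, 4]])
b = np.matrix([[5, 6], [7, 8]])

matrix_product = a * b            # for np.matrix, * is the matrix product
hadamard = np.multiply(a, b)      # element-wise (Hadamard) product

# With plain arrays the roles are swapped: * is element-wise, @ is the matrix product.
hadamard_arr = np.asarray(a) * np.asarray(b)
print(hadamard)
```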

Answer 1

Just do this:

import numpy as np

a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

a * b

Answer 2

import numpy as np
x = np.array([[1,2,3], [4,5,6]])
y = np.array([[-1, 2, 0], [-2, 5, 1]])

x*y
Out: 
array([[-1,  4,  0],
       [-8, 25,  6]])

%timeit x*y
1000000 loops, best of 3: 421 ns per loop

np.multiply(x,y)
Out: 
array([[-1,  4,  0],
       [-8, 25,  6]])

%timeit np.multiply(x, y)
1000000 loops, best of 3: 457 ns per loop

Both np.multiply and * yield element-wise multiplication, known as the Hadamard product.

%timeit is an IPython magic command.


Answer 3

Try this:

a = np.matrix([[1,2], [3,4]])
b = np.matrix([[5,6], [7,8]])

# This results in a 'numpy.ndarray'
result = np.array(a) * np.array(b)

Here, np.array(a) returns a 2D array of type ndarray, and multiplying two ndarrays performs element-wise multiplication. So the result would be:

result = [[5, 12], [21, 32]]

If you want to get a matrix, then do it with this:

result = np.mat(result)

NumPy matrix to array

Question: NumPy matrix to array

I am using numpy. I have a matrix with 1 column and N rows and I want to get an array with N elements from it.

For example, if I have M = matrix([[1], [2], [3], [4]]), I want to get A = array([1,2,3,4]).

To achieve it, I use A = np.array(M.T)[0]. Does anyone know a more elegant way to get the same result?

Thanks!


Answer 0

If you’d like something a bit more readable, you can do this:

A = np.squeeze(np.asarray(M))

Equivalently, you could also do: A = np.asarray(M).reshape(-1), but that’s a bit less easy to read.
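
A quick sketch comparing these conversions on the matrix from the question. The A1 attribute of np.matrix is an additional shortcut not mentioned in the answer:

```python
import numpy as np

M = np.matrix([[1], [2], [3], [4]])

A_squeeze = np.squeeze(np.asarray(M))
A_reshape = np.asarray(M).reshape(-1)
A_attr = M.A1    # flattened ndarray, provided by np.matrix

print(A_squeeze, A_squeeze.shape)
```

One difference to keep in mind: for a 1 x 1 matrix, np.squeeze returns a 0-dimensional array, while reshape(-1) still returns a 1-D array of length 1.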


Answer 1


Answer 2

A, = np.array(M.T)

It depends on what you mean by elegance, I suppose, but that’s what I would do.


Answer 3

You can try the following variant:

result=np.array(M).flatten()

Answer 4

np.array(M).ravel()

if you care about speed. But if you care about memory:

np.asarray(M).ravel()

Answer 5

Or you could try to avoid some temporaries with

A = M.view(np.ndarray)
A.shape = -1

Answer 6

First, Mv = numpy.asarray(M.T), which gives you a 1×4 array that is still 2D.

Then, perform A = Mv[0,:], which gives you what you want. You could put them together, as numpy.asarray(M.T)[0,:].


Answer 7

This will convert the matrix into an array:

A = np.ravel(M).T

Answer 8

The ravel() and flatten() functions from numpy are two techniques that I would try here. I would like to add to the posts made by Joe, Siraj, bubble and Kevad.

Ravel:

A = M.ravel()
print(A, A.shape)
>>> [1 2 3 4] (4,)

Flatten:

M = np.array([[1], [2], [3], [4]])
A = M.flatten()
print(A, A.shape)
>>> [1 2 3 4] (4,)

numpy.ravel() is faster, since it is a library-level function that avoids copying the array whenever it can return a view. However, any change in array A will carry over to the original array M if you are using numpy.ravel().

numpy.flatten() is slower than numpy.ravel(). But if you are using numpy.flatten() to create A, then changes in A will not get carried over to the original array M.
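
The view-versus-copy difference can be verified directly. A minimal sketch using np.shares_memory; note that ravel returns a view only when the memory layout allows it and otherwise falls back to a copy:

```python
import numpy as np

M = np.array([[1], [2], [3], [4]])

A_ravel = M.ravel()      # view of M's data (M is contiguous)
A_flat = M.flatten()     # always an independent copy

A_ravel[0] = 100         # visible in M
A_flat[1] = -1           # not visible in M

print(np.shares_memory(M, A_ravel), np.shares_memory(M, A_flat))
print(M.ravel())
```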

numpy.squeeze() and M.reshape(-1) are slower than numpy.flatten() and numpy.ravel().

%timeit M.ravel()
>>> 1000000 loops, best of 3: 309 ns per loop

%timeit M.flatten()
>>> 1000000 loops, best of 3: 650 ns per loop

%timeit M.reshape(-1)
>>> 1000000 loops, best of 3: 755 ns per loop

%timeit np.squeeze(M)
>>> 1000000 loops, best of 3: 886 ns per loop

Difference between numpy.array shape (R, 1) and (R,)

Question: Difference between numpy.array shape (R, 1) and (R,)

In numpy, some operations return arrays of shape (R, 1) while others return (R,). This makes matrix multiplication more tedious, since an explicit reshape is required. For example, given a matrix M, suppose we want to do numpy.dot(M[:,0], numpy.ones((1, R))), where R is the number of rows (of course, the same issue also occurs column-wise). We will get a “matrices are not aligned” error, since M[:,0] has shape (R,) but numpy.ones((1, R)) has shape (1, R).

So my questions are:

  1. What’s the difference between shape (R, 1) and (R,)? I know that literally it’s a list of numbers versus a list of lists where every list contains only one number. I’m just wondering why numpy isn’t designed to favor shape (R, 1) over (R,) for easier matrix multiplication.

  2. Are there better ways to do the above example, without explicitly reshaping like this: numpy.dot(M[:,0].reshape(R, 1), numpy.ones((1, R)))?
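
For reference, a sketch of a few ways to keep or restore the second axis without calling reshape; M and R here are small stand-ins for the question’s variables:

```python
import numpy as np

R = 3
M = np.arange(6).reshape(R, 2)

print(M[:, 0].shape)                 # integer index drops the axis: (R,)
col_slice = M[:, 0:1]                # slice keeps the axis: (R, 1)
col_list = M[:, [0]]                 # list index keeps the axis: (R, 1)
col_new = M[:, 0][:, np.newaxis]     # add the axis back explicitly: (R, 1)

product = np.dot(col_slice, np.ones((1, R)))   # (R, 1) x (1, R) -> (R, R)
same = np.outer(M[:, 0], np.ones(R))           # outer product from 1-D inputs
print(product.shape)
```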


Answer 0

1. The meaning of shapes in NumPy

You write, “I know literally it’s list of numbers and list of lists where all list contains only a number” but that’s a bit of an unhelpful way to think about it.

The best way to think about NumPy arrays is that they consist of two parts, a data buffer which is just a block of raw elements, and a view which describes how to interpret the data buffer.

For example, if we create an array of 12 integers:

>>> a = numpy.arange(12)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

Then a consists of a data buffer, arranged something like this:

┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

and a view which describes how to interpret the data:

>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a.dtype
dtype('int64')
>>> a.itemsize
8
>>> a.strides
(8,)
>>> a.shape
(12,)

Here the shape (12,) means the array is indexed by a single index which runs from 0 to 11. Conceptually, if we label this single index i, the array a looks like this:

i= 0    1    2    3    4    5    6    7    8    9   10   11
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

If we reshape an array, this doesn’t change the data buffer. Instead, it creates a new view that describes a different way to interpret the data. So after:

>>> b = a.reshape((3, 4))

the array b has the same data buffer as a, but now it is indexed by two indices which run from 0 to 2 and 0 to 3 respectively. If we label the two indices i and j, the array b looks like this:

i= 0    0    0    0    1    1    1    1    2    2    2    2
j= 0    1    2    3    0    1    2    3    0    1    2    3
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

which means that:

>>> b[2,1]
9
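
A quick way to convince yourself that b really shares a's buffer is to poke at it directly. This is only a minimal sketch: np.shares_memory and the .base attribute are standard NumPy, while the helper variable names are made up for illustration.

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, 4))

# Both arrays are views onto the same block of memory.
shares_buffer = np.shares_memory(a, b)

# Strides are in bytes: stepping one row skips 4 elements, one column skips 1.
row_stride, col_stride = b.strides

# Writing through b is visible through a, because no data was copied.
b[2, 1] = -1
value_seen_through_a = a[9]
```

Only the shape and strides differ between a and b; the buffer itself is untouched by the reshape.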

You can see that the second index changes quickly and the first index changes slowly. If you prefer this to be the other way round, you can specify the order parameter:

>>> c = a.reshape((3, 4), order='F')

which results in an array indexed like this:

i= 0    1    2    0    1    2    0    1    2    0    1    2
j= 0    0    0    1    1    1    2    2    2    3    3    3
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

which means that:

>>> c[2,1]
5
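
The two index tables above boil down to two formulas for the position in the flat buffer: 4*i + j for the default (C) order and i + 3*j for order='F'. A small sketch that checks both (the flag names are illustrative only):

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, 4))             # C order: last index varies fastest
c = a.reshape((3, 4), order='F')  # Fortran order: first index varies fastest

# Position of element (i, j) in the flat buffer under each convention.
c_order_ok = all(b[i, j] == 4 * i + j for i in range(3) for j in range(4))
f_order_ok = all(c[i, j] == i + 3 * j for i in range(3) for j in range(4))

# The Fortran-order reshape equals the transpose of a C-order reshape to (4, 3).
same_as_transpose = np.array_equal(c, a.reshape((4, 3)).T)
```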

It should now be clear what it means for an array to have a shape with one or more dimensions of size 1. After:

>>> d = a.reshape((12, 1))

the array d is indexed by two indices, the first of which runs from 0 to 11, and the second index is always 0:

i= 0    1    2    3    4    5    6    7    8    9   10   11
j= 0    0    0    0    0    0    0    0    0    0    0    0
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

and so:

>>> d[10,0]
10

A dimension of length 1 is “free” (in some sense), so there’s nothing stopping you from going to town:

>>> e = a.reshape((1, 2, 1, 6, 1))

giving an array indexed like this:

i= 0    0    0    0    0    0    0    0    0    0    0    0
j= 0    0    0    0    0    0    1    1    1    1    1    1
k= 0    0    0    0    0    0    0    0    0    0    0    0
l= 0    1    2    3    4    5    0    1    2    3    4    5
m= 0    0    0    0    0    0    0    0    0    0    0    0
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

and so:

>>> e[0,1,0,0,0]
6
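
Length-1 dimensions are just as easy to remove or add as they are to create with reshape. A short sketch using squeeze and np.newaxis, both standard NumPy (the variable names are illustrative):

```python
import numpy as np

a = np.arange(12)
e = a.reshape((1, 2, 1, 6, 1))

squeezed = e.squeeze()       # drop every length-1 axis
column = a[:, np.newaxis]    # append a length-1 axis
row = a[np.newaxis, :]       # prepend a length-1 axis

shapes = (squeezed.shape, column.shape, row.shape)
```

Here shapes ends up as ((2, 6), (12, 1), (1, 12)), and none of these operations copies the data.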

See the NumPy internals documentation for more details about how arrays are implemented.

2. What to do?

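Since numpy.reshape usually just creates a new view (it copies only when the requested shape cannot be expressed as strides over the existing buffer), you shouldn’t be scared about using it whenever necessary. It’s the right tool to use when you want to index an array in a different way.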

However, in a long computation it’s usually possible to arrange to construct arrays with the “right” shape in the first place, and so minimize the number of reshapes and transposes. But without seeing the actual context that led to the need for a reshape, it’s hard to say what should be changed.

The example in your question is:

numpy.dot(M[:,0], numpy.ones((1, R)))

but this is not realistic. First, this expression:

M[:,0].sum()

computes the result more simply. Second, is there really something special about column 0? Perhaps what you actually need is:

M.sum(axis=0)
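
To make the link between "multiply by a vector of ones" and "sum along an axis" concrete, here is a small sketch. The 2x3 example matrix is an assumption chosen only for illustration; it is not from the question.

```python
import numpy as np

M = np.arange(6).reshape((2, 3))   # [[0, 1, 2], [3, 4, 5]]

# Multiplying by ones on the left sums over rows (axis 0) ...
column_sums = np.dot(np.ones(2), M)
# ... and multiplying by ones on the right sums over columns (axis 1).
row_sums = np.dot(M, np.ones(3))

matches_axis0 = np.array_equal(column_sums, M.sum(axis=0))
matches_axis1 = np.array_equal(row_sums, M.sum(axis=1))
```

The sum calls say what is meant directly and need no reshaping at all.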

Answer 1

The difference between (R,) and (1,R) is literally the number of indices that you need to use. ones((1,R)) is a 2-D array that happens to have only one row. ones(R) is a vector. Generally if it doesn’t make sense for the variable to have more than one row/column, you should be using a vector, not a matrix with a singleton dimension.

For your specific case, there are a couple of options:

1) Just make the second argument a vector. The following works fine:

    np.dot(M[:,0], np.ones(R))

2) If you want MATLAB-like matrix operations, use the class matrix instead of ndarray. All matrices are forced into being 2-D arrays, and operator * does matrix multiplication instead of element-wise (so you don’t need dot). In my experience, this is more trouble than it is worth, but it may be nice if you are used to MATLAB.
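
As a rough illustration of option 1 and of the (R,) versus (1, R) distinction, assuming a square R x R matrix M so the shapes line up (R = 3 and the contents of M are made up here):

```python
import numpy as np

R = 3
M = np.arange(R * R).reshape((R, R))

vector = np.ones(R)       # shape (R,): needs one index
row = np.ones((1, R))     # shape (1, R): needs two indices

# 1-D dot 1-D gives a plain scalar, the same value as M[:, 0].sum().
scalar = np.dot(M[:, 0], vector)

# With a 2-D second argument the result keeps a dimension: shape (1,).
kept_dimension = np.dot(M[:, 0], row.T)
```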


Answer 2

The shape is a tuple. If there is only 1 dimension, the shape is a single number followed by a trailing comma with nothing after it, e.g. (2,). For 2 or more dimensions, there is a number after every comma.

import numpy as np

# 1 dimension with 2 elements, shape = (2,).
# Note there's nothing after the comma.
z=np.array([  # start dimension
    10,       # not a dimension
    20        # not a dimension
])            # end dimension
print(z.shape)

(2,)

# 2 dimensions: 2 rows with 1 element each, shape = (2, 1)
w=np.array([  # start outer dimension 
    [10],     # element is in an inner dimension
    [20]      # element is in an inner dimension
])            # end outer dimension
print(w.shape)

(2, 1)


Answer 3

For its base array class, 2d arrays are no more special than 1d or 3d ones. There are some operations that preserve the dimensions, some that reduce them, and others that combine or even expand them.

M=np.arange(9).reshape(3,3)
M[:,0].shape # (3,) selects one column, returns a 1d array
M[0,:].shape # same, one row, 1d array
M[:,[0]].shape # (3,1), index with a list (or array), returns 2d
M[:,[0,1]].shape # (3,2)

In [20]: np.dot(M[:,0].reshape(3,1),np.ones((1,3)))

Out[20]: 
array([[ 0.,  0.,  0.],
       [ 3.,  3.,  3.],
       [ 6.,  6.,  6.]])

In [21]: np.dot(M[:,[0]],np.ones((1,3)))
Out[21]: 
array([[ 0.,  0.,  0.],
       [ 3.,  3.,  3.],
       [ 6.,  6.,  6.]])

Other expressions that give the same array

np.dot(M[:,0][:,np.newaxis],np.ones((1,3)))
np.dot(np.atleast_2d(M[:,0]).T,np.ones((1,3)))
np.einsum('i,j',M[:,0],np.ones((3)))
M1=M[:,0]; R=np.ones((3)); np.dot(M1[:,None], R[None,:])

MATLAB started out with just 2D arrays. Newer versions allow more dimensions, but retain the lower bound of 2. But you still have to pay attention to the difference between a row matrix and a column matrix, i.e. one with shape (1,3) versus (3,1). How often have you written [1,2,3].'? I was going to write row vector and column vector, but with that 2d constraint, there aren’t any vectors in MATLAB – at least not in the mathematical sense of a vector as being 1d.

Have you looked at np.atleast_2d (also _1d and _3d versions)?
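
For reference, a small sketch of what the atleast_* helpers do to shapes (the inputs are arbitrary examples):

```python
import numpy as np

v = np.arange(3)

shape_1d = np.atleast_1d(5).shape                # scalar becomes 1-D
shape_2d = np.atleast_2d(v).shape                # 1-D becomes a single row
shape_3d = np.atleast_3d(v).shape                # 1-D is padded on both sides
untouched = np.atleast_2d(np.ones((2, 2))).shape # already 2-D, unchanged
```

Note that atleast_2d turns a 1-D array into a row, shape (1, 3), so a column still needs an explicit .T or [:, np.newaxis].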


Answer 4

1) The reason not to prefer a shape of (R, 1) over (R,) is that it unnecessarily complicates things. Besides, why would it be preferable to have shape (R, 1) by default for a length-R vector instead of (1, R)? It’s better to keep it simple and be explicit when you require additional dimensions.

2) For your example, you are computing an outer product so you can do this without a reshape call by using np.outer:

np.outer(M[:,0], np.ones((1, R)))
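
A small sketch comparing np.outer with two equivalent spellings. The 3x3 M and R = 4 are assumptions picked only so the example runs:

```python
import numpy as np

R = 4
M = np.arange(9).reshape((3, 3))
col = M[:, 0]                      # shape (3,)

via_outer = np.outer(col, np.ones(R))
via_broadcast = col[:, np.newaxis] * np.ones(R)
via_dot = np.dot(col[:, np.newaxis], np.ones((1, R)))

all_equal = (np.array_equal(via_outer, via_broadcast)
             and np.array_equal(via_outer, via_dot))
```

np.outer flattens its inputs first, which is why it accepts either a (R,) or a (1, R) second argument.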

Answer 5

There are a lot of good answers here already. But for me it was hard to find an example where the shape of an array can break the whole program.

So here is one:

import numpy as np
a = np.array([1,2,3,4])
b = np.array([10,20,30,40])


from sklearn.linear_model import LinearRegression
regr = LinearRegression()
regr.fit(a,b)

This will fail with error:

ValueError: Expected 2D array, got 1D array instead

but if we add reshape to a:

a = np.array([1,2,3,4]).reshape(-1,1)

this works correctly!
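
The same "samples as rows, features as columns" convention shows up outside scikit-learn too. As a stand-in that needs only NumPy (this swaps LinearRegression for np.linalg.lstsq and fits without an intercept, so it is an illustration rather than the same model):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# -1 means "infer this length": 4 samples, 1 feature -> shape (4, 1).
X = a.reshape(-1, 1)

# Least-squares fit of b ~ X @ coef; rcond=None selects the current default.
coef, residuals, rank, singular_values = np.linalg.lstsq(X, b, rcond=None)
```

Since b is exactly 10 times a, the single fitted coefficient comes out as 10 (up to floating-point error).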


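What are the differences between numpy arrays and matrices? Which one should I use?

Question: What are the differences between numpy arrays and matrices? Which one should I use?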

What are the advantages and disadvantages of each?

From what I’ve seen, either one can work as a replacement for the other if need be, so should I bother using both or should I stick to just one of them?

Will the style of the program influence my choice? I am doing some machine learning using numpy, so there are indeed lots of matrices, but also lots of vectors (arrays).


Answer 0

According to the official documentation, using the matrix class is no longer recommended, since it may be removed in the future.

https://numpy.org/doc/stable/reference/generated/numpy.matrix.html

As other answers already state, you can achieve all the same operations with NumPy arrays.


Answer 1

Numpy matrices are strictly 2-dimensional, while numpy arrays (ndarrays) are N-dimensional. Matrix objects are a subclass of ndarray, so they inherit all the attributes and methods of ndarrays.

The main advantage of numpy matrices is that they provide a convenient notation for matrix multiplication: if a and b are matrices, then a*b is their matrix product.

import numpy as np

a = np.mat('4 3; 2 1')
b = np.mat('1 2; 3 4')
print(a)
# [[4 3]
#  [2 1]]
print(b)
# [[1 2]
#  [3 4]]
print(a*b)
# [[13 20]
#  [ 5  8]]

On the other hand, as of Python 3.5, NumPy supports infix matrix multiplication using the @ operator, so you can achieve the same convenience of matrix multiplication with ndarrays in Python >= 3.5.

import numpy as np

a = np.array([[4, 3], [2, 1]])
b = np.array([[1, 2], [3, 4]])
print(a@b)
# [[13 20]
#  [ 5  8]]

Both matrix objects and ndarrays have .T to return the transpose, but matrix objects also have .H for the conjugate transpose, and .I for the inverse.

In contrast, numpy arrays consistently abide by the rule that operations are applied element-wise (except for the new @ operator). Thus, if a and b are numpy arrays, then a*b is the array formed by multiplying the components element-wise:

c = np.array([[4, 3], [2, 1]])
d = np.array([[1, 2], [3, 4]])
print(c*d)
# [[4 6]
#  [6 4]]

To obtain the result of matrix multiplication, you use np.dot (or @ in Python >= 3.5, as shown above):

print(np.dot(c,d))
# [[13 20]
#  [ 5  8]]

The ** operator also behaves differently:

print(a**2)
# [[22 15]
#  [10  7]]
print(c**2)
# [[16  9]
#  [ 4  1]]

Since a is a matrix, a**2 returns the matrix product a*a. Since c is an ndarray, c**2 returns an ndarray with each component squared element-wise.

There are other technical differences between matrix objects and ndarrays (having to do with np.ravel, item selection and sequence behavior).

The main advantage of numpy arrays is that they are more general than 2-dimensional matrices. What happens when you want a 3-dimensional array? Then you have to use an ndarray, not a matrix object. Thus, learning to use matrix objects is more work — you have to learn matrix object operations, and ndarray operations.

Writing a program that mixes both matrices and arrays makes your life difficult because you have to keep track of what type of object your variables are, lest multiplication return something you don’t expect.

In contrast, if you stick solely with ndarrays, then you can do everything matrix objects can do, and more, except with slightly different functions/notation.
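
As a rough cheat sheet for those "slightly different functions", here is a sketch of ndarray counterparts to the matrix conveniences mentioned above (a*b, a**2, .I and .H). It reuses the c and d arrays from the earlier examples and assumes Python >= 3.5 for @:

```python
import numpy as np

c = np.array([[4, 3], [2, 1]])
d = np.array([[1, 2], [3, 4]])

product = c @ d                          # matrix:  a * b
squared = np.linalg.matrix_power(c, 2)   # matrix:  a ** 2
inverse = np.linalg.inv(c)               # matrix:  a.I
conj_transpose = c.conj().T              # matrix:  a.H
```

The results match the matrix outputs shown above: product is [[13, 20], [5, 8]] and squared is [[22, 15], [10, 7]].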

If you are willing to give up the visual appeal of NumPy matrix product notation (which can be achieved almost as elegantly with ndarrays in Python >= 3.5), then I think NumPy arrays are definitely the way to go.

PS. Of course, you really don’t have to choose one at the expense of the other, since np.asmatrix and np.asarray allow you to convert one to the other (as long as the array is 2-dimensional).


There is a synopsis of the differences between NumPy arrays vs NumPy matrixes here.


Answer 2

Scipy.org recommends that you use arrays:

‘array’ or ‘matrix’? Which should I use? – Short answer

Use arrays.

  • They are the standard vector/matrix/tensor type of numpy. Many numpy functions return arrays, not matrices.

  • There is a clear distinction between element-wise operations and linear algebra operations.

  • You can have standard vectors or row/column vectors if you like.

The only disadvantage of using the array type is that you will have to use dot instead of * to multiply (reduce) two tensors (scalar product, matrix vector multiplication etc.).


Answer 3

Just to add one case to unutbu’s list.

One of the biggest practical differences for me of numpy ndarrays compared to numpy matrices or matrix languages like matlab, is that the dimension is not preserved in reduce operations. Matrices are always 2d, while the mean of an array, for example, has one dimension less.

For example, to demean the rows of a matrix or array:

with matrix

>>> m = np.mat([[1,2],[2,3]])
>>> m
matrix([[1, 2],
        [2, 3]])
>>> mm = m.mean(1)
>>> mm
matrix([[ 1.5],
        [ 2.5]])
>>> mm.shape
(2, 1)
>>> m - mm
matrix([[-0.5,  0.5],
        [-0.5,  0.5]])

with array

>>> a = np.array([[1,2],[2,3]])
>>> a
array([[1, 2],
       [2, 3]])
>>> am = a.mean(1)
>>> am.shape
(2,)
>>> am
array([ 1.5,  2.5])
>>> a - am #wrong
array([[-0.5, -0.5],
       [ 0.5,  0.5]])
>>> a - am[:, np.newaxis]  #right
array([[-0.5,  0.5],
       [-0.5,  0.5]])
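
With plain arrays, an alternative to am[:, np.newaxis] is to ask the reduction to keep the reduced axis via keepdims=True, a standard argument of ndarray.mean. A minimal sketch with the same data:

```python
import numpy as np

a = np.array([[1, 2], [2, 3]])

# keepdims=True leaves the reduced axis in place with length 1: shape (2, 1).
row_means = a.mean(axis=1, keepdims=True)

# Broadcasting now subtracts each row's mean from that row.
demeaned = a - row_means
```

This gives the same result as the "right" line above, without having to restore the axis by hand.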

I also think that mixing arrays and matrices gives rise to many “happy” debugging hours. However, scipy.sparse matrices are always matrices in terms of operators like multiplication.


Answer 4

As others have mentioned, perhaps the main advantage of matrix was that it provided a convenient notation for matrix multiplication.

However, in Python 3.5 there is finally a dedicated infix operator for matrix multiplication: @.

With recent NumPy versions, it can be used with ndarrays:

A = numpy.ones((1, 3))
B = numpy.ones((3, 3))
A @ B

So nowadays, even more, when in doubt, you should stick to ndarray.
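
For what it's worth, @ also handles the 1-D cases without any reshaping. A short sketch (v is an extra vector added here for illustration):

```python
import numpy as np

A = np.ones((1, 3))
B = np.ones((3, 3))
v = np.ones(3)

AB = A @ B    # (1, 3) @ (3, 3) -> shape (1, 3)
Bv = B @ v    # (3, 3) @ (3,)   -> shape (3,)
vv = v @ v    # (3,)   @ (3,)   -> scalar inner product
```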


Transpose/Unzip Function (inverse of zip)?

Question: Transpose/Unzip Function (inverse of zip)?

I have a list of 2-item tuples and I’d like to convert them to 2 lists where the first contains the first item in each tuple and the second list holds the second item.

For example:

original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
# and I want to become...
result = (['a', 'b', 'c', 'd'], [1, 2, 3, 4])

Is there a builtin function that does that?


Answer 0

zip is its own inverse! Provided you use the special * operator.

>>> list(zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)]))
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

The way this works is by calling zip with the arguments:

zip(('a', 1), ('b', 2), ('c', 3), ('d', 4))

… except the arguments are passed to zip directly (after being converted to a tuple), so there’s no need to worry about the number of arguments getting too big.
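
A short sketch of how this usually looks in Python 3, where zip returns a lazy iterator rather than a list (the variable names are illustrative):

```python
original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

# Materialise the transposed rows as a list of tuples.
transposed = list(zip(*original))

# Or unpack straight into one name per "column".
letters, numbers = zip(*original)

# Zipping the columns back together recovers the original pairs.
round_trip = list(zip(letters, numbers))
```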


Answer 1

You could also do

result = ([ a for a,b in original ], [ b for a,b in original ])

It should scale better. Especially if Python makes good on not expanding the list comprehensions unless needed.

(Incidentally, it makes a 2-tuple (pair) of lists, rather than a list of tuples, like zip does.)

If generators instead of actual lists are ok, this would do that:

result = (( a for a,b in original ), ( b for a,b in original ))

The generators don’t munch through the list until you ask for each element, but on the other hand, they do keep references to the original list.


Answer 2

If you have lists that are not the same length, you may not want to use zip as per Patrick’s answer. This works:

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]

But with different length lists, zip truncates each item to the length of the shortest list:

>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', )])
[('a', 'b', 'c', 'd', 'e')]

You can use map with no function to fill empty results with None:

>>> map(None, *[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', )])
[('a', 'b', 'c', 'd', 'e'), (1, 2, 3, 4, None)]

zip() is marginally faster though.
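
The interpreter sessions above are Python 2, and map(None, ...) no longer works in Python 3. The closest Python 3 counterpart is itertools.zip_longest, sketched here with the same data:

```python
from itertools import zip_longest

ragged = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e',)]

# zip stops at the shortest tuple, so the numbers are dropped entirely.
truncated = list(zip(*ragged))

# zip_longest pads the missing slot with None (or a custom fillvalue).
padded = list(zip_longest(*ragged))
```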


Answer 3

I like to use zip(*iterable) (which is the piece of code you’re looking for) in my programs like so:

def unzip(iterable):
    return zip(*iterable)

I find unzip more readable.


Answer 4

>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> tuple([list(tup) for tup in zip(*original)])
(['a', 'b', 'c', 'd'], [1, 2, 3, 4])

Gives a tuple of lists as in the question.

list1, list2 = [list(tup) for tup in zip(*original)]

Unpacks the two lists.


Answer 5

Naive approach

def transpose_finite_iterable(iterable):
    return zip(*iterable)  # `itertools.izip` for Python 2 users

This works fine for a finite iterable (e.g. a sequence such as list/tuple/str) of (potentially infinite) iterables, which can be illustrated like this:

| |a_00| |a_10| ... |a_n0| |
| |a_01| |a_11| ... |a_n1| |
| |... | |... | ... |... | |
| |a_0i| |a_1i| ... |a_ni| |
| |... | |... | ... |... | |

where

  • n in ℕ,
  • a_ij corresponds to j-th element of i-th iterable,

and after applying transpose_finite_iterable we get

| |a_00| |a_01| ... |a_0i| ... |
| |a_10| |a_11| ... |a_1i| ... |
| |... | |... | ... |... | ... |
| |a_n0| |a_n1| ... |a_ni| ... |

Python example of such case where a_ij == j, n == 2

>>> from itertools import count
>>> iterable = [count(), count()]
>>> result = transpose_finite_iterable(iterable)
>>> next(result)
(0, 0)
>>> next(result)
(1, 1)

But we can’t use transpose_finite_iterable again to get back to the structure of the original iterable, because result is an infinite iterable of finite iterables (tuples in our case):

>>> transpose_finite_iterable(result)
... hangs ...
Traceback (most recent call last):
  File "...", line 1, in ...
  File "...", line 2, in transpose_finite_iterable
MemoryError
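
The hang happens because the * in zip(*result) has to consume result completely before zip is even called. If a bounded prefix is enough, one workaround (a side note, not the approach this answer goes on to take) is to cut the infinite iterable with itertools.islice first:

```python
from itertools import count, islice

# An infinite iterable of finite 2-tuples: (0, 100), (1, 101), ...
pairs = zip(count(), count(100))

# Take only the first five pairs, then transpose that finite prefix.
prefix = islice(pairs, 5)
firsts, seconds = zip(*prefix)
```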

So how can we deal with this case?

… and here comes the deque

After a look at the docs for the itertools.tee function, there is a Python recipe there that, with some modification, can help in our case:

from collections import deque


def transpose_finite_iterables(iterable):
    iterator = iter(iterable)
    try:
        first_elements = next(iterator)
    except StopIteration:
        return ()
    queues = [deque([element])
              for element in first_elements]

    def coordinate(queue):
        while True:
            if not queue:
                try:
                    elements = next(iterator)
                except StopIteration:
                    return
                for sub_queue, element in zip(queues, elements):
                    sub_queue.append(element)
            yield queue.popleft()

    return tuple(map(coordinate, queues))

Let’s check:

>>> from itertools import count
>>> iterable = [count(), count()]
>>> result = transpose_finite_iterables(transpose_finite_iterable(iterable))
>>> result
(<generator object transpose_finite_iterables.<locals>.coordinate at ...>, <generator object transpose_finite_iterables.<locals>.coordinate at ...>)
>>> next(result[0])
0
>>> next(result[0])
1
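
For the special case where the number of "columns" is known up front (two here), a similar lazy effect can be sketched with itertools.tee itself. This is a simplified alternative under that assumption, not a replacement for the general function above:

```python
from itertools import count, islice, tee

# An infinite iterable of 2-tuples.
pairs = zip(count(), count(10))

# tee gives two independent iterators over the same underlying stream,
# buffering items that one side has consumed and the other has not.
left, right = tee(pairs)
firsts = (x for x, _ in left)
seconds = (y for _, y in right)

first_three = list(islice(firsts, 3))
second_three = list(islice(seconds, 3))
```

As with the deque version, memory use grows with how far one column runs ahead of the other.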

Synthesis

Now we can define a general function for working with iterables of iterables, some of which are finite and others potentially infinite, using the functools.singledispatch decorator:

from collections import (abc,
                         deque)
from functools import singledispatch


@singledispatch
def transpose(object_):
    """
    Transposes given object.
    """
    raise TypeError('Unsupported object type: {type}.'
                    .format(type=type(object_)))


@transpose.register(abc.Iterable)
def transpose_finite_iterables(object_):
    """
    Transposes given iterable of finite iterables.
    """
    iterator = iter(object_)
    try:
        first_elements = next(iterator)
    except StopIteration:
        return ()
    queues = [deque([element])
              for element in first_elements]

    def coordinate(queue):
        while True:
            if not queue:
                try:
                    elements = next(iterator)
                except StopIteration:
                    return
                for sub_queue, element in zip(queues, elements):
                    sub_queue.append(element)
            yield queue.popleft()

    return tuple(map(coordinate, queues))


def transpose_finite_iterable(object_):
    """
    Transposes given finite iterable of iterables.
    """
    yield from zip(*object_)

try:
    transpose.register(abc.Collection, transpose_finite_iterable)
except AttributeError:
    # Python3.5-
    transpose.register(abc.Mapping, transpose_finite_iterable)
    transpose.register(abc.Sequence, transpose_finite_iterable)
    transpose.register(abc.Set, transpose_finite_iterable)

This function can be considered its own inverse (mathematicians call functions of this kind “involutions”) in the class of binary operators over finite non-empty iterables.
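
The involution property can be checked on finite, rectangular input by transposing twice with plain zip. This is only a minimal sketch: the helper name transpose_twice is hypothetical and is not part of the code above, and rows of unequal length would be truncated by zip.

```python
def transpose_twice(rows):
    """Transpose a finite, rectangular collection of rows twice (hypothetical helper)."""
    once = list(zip(*rows))    # columns of the original
    twice = list(zip(*once))   # rows again, as tuples
    return twice


rows = [(1, 2, 3), (4, 5, 6)]
round_trip = transpose_twice(rows)
```

For this input, round_trip compares equal to rows, which is what "its own inverse" means here.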


As a bonus of singledispatching we can handle numpy arrays like

import numpy as np
...
transpose.register(np.ndarray, np.transpose)

and then use it like

>>> array = np.arange(4).reshape((2,2))
>>> array
array([[0, 1],
       [2, 3]])
>>> transpose(array)
array([[0, 2],
       [1, 3]])

Note

Since transpose returns iterators, anyone who wants a tuple of lists, as in the original question, can additionally apply the built-in map function:

>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> tuple(map(list, transpose(original)))
(['a', 'b', 'c', 'd'], [1, 2, 3, 4])

Advertisement

I’ve added a generalized solution to the lz package, available from version 0.5.0, which can be used like this:

>>> from lz.transposition import transpose
>>> list(map(tuple, transpose(zip(range(10), range(10, 20)))))
[(0, 1, 2, 3, 4, 5, 6, 7, 8, 9), (10, 11, 12, 13, 14, 15, 16, 17, 18, 19)]

P.S.

There is no solution (at least no obvious one) for handling a potentially infinite iterable of potentially infinite iterables, but this case is less common.


Answer 6

It’s only another way to do it but it helped me a lot so I write it here:

Having this data structure:

X=[1,2,3,4]
Y=['a','b','c','d']
XY=zip(X,Y)

Resulting in:

In: XY
Out: [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]

The more pythonic way to unzip it and go back to the original is this one in my opinion:

x,y=zip(*XY)

But this returns tuples, so if you need lists you can use:

x,y=(list(x),list(y))
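
Under Python 3, zip returns a one-shot iterator rather than a list, so the same round trip might be sketched as follows. Materializing XY with list is an assumption made here so that the pairs can be inspected and reused.

```python
X = [1, 2, 3, 4]
Y = ['a', 'b', 'c', 'd']
XY = list(zip(X, Y))       # materialize the pairs so they can be reused

x, y = zip(*XY)            # unzip: two tuples
x, y = list(x), list(y)    # convert to lists if lists are needed
```

After this, x and y hold the same values as X and Y.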

Answer 7

Consider using more_itertools.unzip:

>>> from more_itertools import unzip
>>> original = [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
>>> [list(x) for x in unzip(original)]
[['a', 'b', 'c', 'd'], [1, 2, 3, 4]]     

Answer 8

Since it returns tuples (and can use tons of memory), the zip(*zipped) trick seems more clever than useful, to me.

Here’s a function that will actually give you the inverse of zip.

def unzip(zipped):
    """Inverse of built-in zip function.
    Args:
        zipped: a list of tuples

    Returns:
        a tuple of lists

    Example:
        a = [1, 2, 3]
        b = [4, 5, 6]
        zipped = list(zip(a, b))

        assert zipped == [(1, 4), (2, 5), (3, 6)]

        unzipped = unzip(zipped)

        assert unzipped == ([1, 2, 3], [4, 5, 6])

    """

    unzipped = ()
    if len(zipped) == 0:
        return unzipped

    dim = len(zipped[0])

    for i in range(dim):
        unzipped = unzipped + ([tup[i] for tup in zipped], )

    return unzipped

Answer 9

None of the previous answers efficiently provide the required output, which is a tuple of lists, rather than a list of tuples. For the former, you can use tuple with map. Here’s the difference:

res1 = list(zip(*original))              # [('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
res2 = tuple(map(list, zip(*original)))  # (['a', 'b', 'c', 'd'], [1, 2, 3, 4])

In addition, most of the previous solutions assume Python 2.7, where zip returns a list rather than an iterator.

For Python 3.x, you will need to pass the result to a function such as list or tuple to exhaust the iterator. For memory-efficient iterators, you can omit the outer list and tuple calls for the respective solutions.


Answer 10

While zip(*seq) is very useful, it may be unsuitable for very long sequences as it will create a tuple of values to be passed in. For example, I’ve been working with a coordinate system with over a million entries and find it significantly faster to create the sequences directly.

A generic approach would be something like this:

from collections import deque
seq = ((a1, b1, …), (a2, b2, …), …)
width = len(seq[0])
output = [deque() for _ in range(width)]  # one independent deque per column
for element in seq:
    for s, item in zip(output, element):
        s.append(item)

But, depending on what you want to do with the result, the choice of collection can make a big difference. In my actual use case, using sets and no internal loop, is noticeably faster than all other approaches.

And, as others have noted, if you are doing this with datasets, it might make sense to use Numpy or Pandas collections instead.
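
The column-by-column approach described above could be sketched in runnable form like this. The helper name unzip_columns and the sample data are hypothetical, and plain lists are used instead of deques purely for illustration; each column gets its own container so that appends do not leak between columns.

```python
def unzip_columns(rows):
    """Split an iterable of equal-length rows into one list per column.

    Hypothetical helper written for this example.
    """
    iterator = iter(rows)
    try:
        first = next(iterator)
    except StopIteration:
        return []                            # no rows, so no columns
    columns = [[value] for value in first]   # one independent list per column
    for row in iterator:
        for column, value in zip(columns, row):
            column.append(value)
    return columns


points = ((x, x * x) for x in range(5))      # a generator works too
xs, ys = unzip_columns(points)
```

Because the rows are consumed one at a time, the input never has to be unpacked into a single argument tuple the way zip(*seq) requires.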


Answer 11

While numpy arrays and pandas may be preferable, this function imitates the behavior of zip(*args) when called as unzip(args).

Generators can be passed as args, because the function iterates through the values one at a time. Wrap cls and/or main_cls to micromanage container initialization.

def unzip(items, cls=list, main_cls=tuple):
    """Zip function in reverse.

    :param items: Zipped-like iterable.
    :type  items: iterable

    :param cls: Callable that returns iterable with callable append attribute.
        Defaults to `list`.
    :type  cls: callable, optional

    :param main_cls: Callable that returns iterable with callable append
        attribute. Defaults to `tuple`.
    :type  main_cls: callable, optional

    :returns: Unzipped items in instances returned from `cls`, in an instance
        returned from `main_cls`.

    :Example:

        assert unzip(zip(["a","b","c"],[1,2,3])) == (["a","b","c"],[1,2,3])
        assert unzip([("a",1),("b",2),("c",3)]) == (["a","b","c"],[1,2,3])
        assert unzip([("a",1)], deque, list) == [deque(["a"]),deque([1])]
        assert unzip((["a"],["b"]), lambda i: deque(i,1)) == (deque(["b"]),)
    """
    items = iter(items)

    try:
        i = next(items)
    except StopIteration:
        return main_cls()

    unzipped = main_cls(cls([v]) for v in i)

    for i in items:
        for c,v in zip(unzipped,i):
            c.append(v)

    return unzipped

How to define a two-dimensional array in Python

Question: How to define a two-dimensional array in Python

I want to define a two-dimensional array without an initialized length like this:

Matrix = [][]

but it does not work…

I’ve tried the code below, but it is wrong too:

Matrix = [5][5]

Error:

Traceback ...

IndexError: list index out of range

What is my mistake?


Answer 0

You’re technically trying to index an uninitialized array. You have to first initialize the outer list with lists before adding items; Python calls this “list comprehension”.

# Creates a list containing 5 lists, each of 8 items, all set to 0
w, h = 8, 5
Matrix = [[0 for x in range(w)] for y in range(h)] 

You can now add items to the list:

Matrix[0][0] = 1
Matrix[6][0] = 3 # error! range... 
Matrix[0][6] = 3 # valid

Note that the matrix is “y” address major, in other words, the “y index” comes before the “x index”.

print(Matrix[0][0]) # prints 1
x, y = 0, 6
print(Matrix[x][y]) # prints 3; be careful with indexing!

Although you can name them as you wish, I look at it this way to avoid some confusion that could arise with the indexing, if you use “x” for both the inner and outer lists, and want a non-square Matrix.
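
The row-first indexing can be checked with a small sketch. The names grid and out_of_range are chosen only for this example; the dimensions are the same 8 by 5 as above.

```python
w, h = 8, 5  # width (columns per row) and height (number of rows)
grid = [[0 for _ in range(w)] for _ in range(h)]

grid[0][6] = 3          # row 0, column 6: valid because 6 < w

try:
    grid[6][0] = 3      # row 6 does not exist because h == 5
    out_of_range = False
except IndexError:
    out_of_range = True
```

The first index selects one of the h inner lists and the second index selects one of its w items, which is why grid[6][0] fails while grid[0][6] succeeds.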


Answer 1

If you really want a matrix, you might be better off using numpy. Matrix operations in numpy most often use an array type with two dimensions. There are many ways to create a new array; one of the most useful is the zeros function, which takes a shape parameter and returns an array of the given shape, with the values initialized to zero:

>>> import numpy
>>> numpy.zeros((5, 5))
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.]])

Here are some other ways to create 2-d arrays and matrices (with output removed for compactness):

numpy.arange(25).reshape((5, 5))         # create a 1-d range and reshape
numpy.array(range(25)).reshape((5, 5))   # pass a Python range and reshape
numpy.array([5] * 25).reshape((5, 5))    # pass a Python list and reshape
numpy.empty((5, 5))                      # allocate, but don't initialize
numpy.ones((5, 5))                       # initialize with ones

numpy provides a matrix type as well, but it is no longer recommended for any use, and may be removed from numpy in the future.


Answer 2

Here is a shorter notation for initializing a list of lists:

matrix = [[0]*5 for i in range(5)]

Unfortunately shortening this to something like 5*[5*[0]] doesn’t really work because you end up with 5 copies of the same list, so when you modify one of them they all change, for example:

>>> matrix = 5*[5*[0]]
>>> matrix
[[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
>>> matrix[4][4] = 2
>>> matrix
[[0, 0, 0, 0, 2], [0, 0, 0, 0, 2], [0, 0, 0, 0, 2], [0, 0, 0, 0, 2], [0, 0, 0, 0, 2]]
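
The aliasing can also be confirmed with identity checks. This is a small sketch; the variable names are chosen only for this example.

```python
shared = 5 * [5 * [0]]                       # outer list repeats ONE inner list
independent = [[0] * 5 for _ in range(5)]    # a fresh inner list for each row

# Every row of `shared` is the very same object.
shared_same_object = all(row is shared[0] for row in shared)

# No two rows of `independent` are the same object.
independent_same_object = any(
    independent[i] is independent[j]
    for i in range(5)
    for j in range(i + 1, 5)
)

independent[4][4] = 2   # affects only the last row
```

With the comprehension, a new inner list is built on every iteration, so a write to one row leaves the others untouched.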

Answer 3

If you want to create an empty matrix, the correct syntax is

matrix = [[]]

And if you want to generate a matrix of size 5 filled with 0,

matrix = [[0 for i in range(5)] for j in range(5)]  # on Python 2, xrange also works

Answer 4

If all you want is a two dimensional container to hold some elements, you could conveniently use a dictionary instead:

Matrix = {}

Then you can do:

Matrix[1,2] = 15
print(Matrix[1,2])

This works because 1,2 is a tuple, and you’re using it as a key to index the dictionary. The result is similar to a dumb sparse matrix.

As indicated by osa and Josap Valls, you can also use Matrix = collections.defaultdict(lambda:0) so that the missing elements have a default value of 0.

Vatsal further points out that this method is probably not very efficient for large matrices and should only be used in non-performance-critical parts of the code.
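
A short sketch of the defaultdict variant follows. It uses defaultdict(int), which behaves like the lambda: 0 form because int() returns 0; the names sparse, value_missing and stored_cells are chosen only for this example.

```python
from collections import defaultdict

sparse = defaultdict(int)    # missing cells default to 0
sparse[1, 2] = 15            # the key is the tuple (1, 2)

value_present = sparse[1, 2]
value_missing = sparse.get((7, 7), 0)   # .get does not insert the key
stored_cells = len(sparse)              # only cells actually set are stored
```

One caveat worth knowing: reading a missing cell with sparse[7, 7] would insert that key with value 0, so .get is the safer choice when the matrix should stay sparse.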


Answer 5

In Python you will be creating a list of lists. You do not have to declare the dimensions ahead of time, but you can. For example:

matrix = []
matrix.append([])
matrix.append([])
matrix[0].append(2)
matrix[1].append(3)

Now matrix[0][0] == 2 and matrix[1][0] == 3. You can also use the list comprehension syntax. This example uses it twice over to build a “two-dimensional list”:

from itertools import count, takewhile
matrix = [[i for i in takewhile(lambda j: j < (k+1) * 10, count(k*10))] for k in range(10)]
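
For comparison, the same 10 by 10 table can probably be built more directly with range. This is a sketch and the name simpler is chosen only for this example.

```python
from itertools import count, takewhile

# The comprehension from the answer: row k holds k*10 .. k*10 + 9.
original = [[i for i in takewhile(lambda j: j < (k + 1) * 10, count(k * 10))]
            for k in range(10)]

# An equivalent construction that states the bounds of each row explicitly.
simpler = [list(range(k * 10, (k + 1) * 10)) for k in range(10)]
```

Both build ten rows of ten consecutive integers, so simpler[3][4] is 34.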

Answer 6

The accepted answer is good and correct, but it took me a while to understand that I could also use it to create a completely empty array.

l =  [[] for _ in range(3)]

results in

[[], [], []]

Answer 7

You should make a list of lists, and the best way is to use nested comprehensions:

>>> import pprint
>>> matrix = [[0 for i in range(5)] for j in range(5)]
>>> pprint.pprint(matrix)
[[0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0],
 [0, 0, 0, 0, 0]]

In your [5][5] example, you are creating a list containing the single integer 5 and then trying to access the item at index 5. That naturally raises an IndexError, because the list has only one item (at index 0):

>>> l = [5]
>>> l[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

Answer 8

rows = int(input())
cols = int(input())

matrix = []
for i in range(rows):
  row = []
  for j in range(cols):
    row.append(0)
  matrix.append(row)

print(matrix)

Why such long code, and in Python of all languages, you ask?

A long time ago, when I was not comfortable with Python, I saw the single-line answers for writing a 2D matrix and told myself I was not going to use a 2D matrix in Python again. (Those single lines were pretty scary and gave me no information about what Python was doing. Note also that I was not aware of these shorthands.)

Anyway, here’s the code for a beginner who’s coming from a C, C++, or Java background.

Note to Python lovers and experts: please do not downvote just because I wrote detailed code.


Answer 9

A rewrite for easy reading:

# 2D array/ matrix

# 5 rows, 5 cols
rows_count = 5
cols_count = 5

# create
#     creation looks reverse
#     create an array of "cols_count" cols, for each of the "rows_count" rows
#        all elements are initialized to 0
two_d_array = [[0 for j in range(cols_count)] for i in range(rows_count)]

# index is from 0 to 4
#     for both rows & cols
#     since 5 rows, 5 cols

# use
two_d_array[0][0] = 1
print(two_d_array[0][0])  # prints 1   # 1st row, 1st col (top-left element of matrix)

two_d_array[1][0] = 2
print(two_d_array[1][0])  # prints 2   # 2nd row, 1st col

two_d_array[1][4] = 3
print(two_d_array[1][4])  # prints 3   # 2nd row, last col

two_d_array[4][4] = 4
print(two_d_array[4][4])  # prints 4   # last row, last col (bottom-right element of matrix)

Answer 10

Use:

matrix = [[0]*5 for i in range(5)]

The *5 for the first dimension works because at this level the data is immutable.


Answer 11

To declare a matrix of zeros (ones):

numpy.zeros((x, y))

e.g.

>>> numpy.zeros((3, 5))
array([[ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.]])

or numpy.ones((x, y)) e.g.

>>> numpy.ones((3, 5))
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.,  1.]])

Even three dimensions are possible. (http://www.astro.ufl.edu/~warner/prog/python.html see –> Multi-dimensional arrays)


Answer 12

This is how I usually create 2D arrays in python.

col = 3
row = 4
array = [[0] * col for _ in range(row)]

I find this syntax easy to remember compared to using two for loops in a list comprehension.


Answer 13

I’m on my first Python script, and I was a little confused by the square matrix example so I hope the below example will help you save some time:

# Creates a 2 x 5 matrix (on Python 2, xrange also works)
Matrix = [[0 for y in range(5)] for x in range(2)]

so that

Matrix[1][4] = 2 # Valid
Matrix[4][1] = 3 # IndexError: list index out of range

Answer 14

Using NumPy you can initialize empty matrix like this:

import numpy as np
mm = np.matrix([])

And later append data like this:

mm = np.append(mm, [[1,2]], axis=1)

Answer 15

I read in comma separated files like this:

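data = []
for line in infile:
    # split must be called on the line itself; drop the trailing newline first
    row = line.rstrip('\n').split(',')
    data.append(row)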

The list “data” is then a list of lists with index data[row][col]
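
The standard library's csv module does the same job and also copes with quoted fields. In this sketch, io.StringIO stands in for an open file, which is an assumption made only so the example is self-contained.

```python
import csv
import io

# io.StringIO stands in for an open file here (assumption for the example).
infile = io.StringIO("1,2,3\n4,5,6\n")

data = [row for row in csv.reader(infile)]   # a list of lists of strings
cell = data[1][2]                            # row 1, column 2
```

Note that the values come back as strings; convert them with int or float if numbers are needed.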


Answer 16

If you want to be able to think of it as a 2D array rather than being forced to think in terms of a list of lists (much more natural in my opinion), you can do the following:

import numpy
Nx=3; Ny=4
my2Dlist= numpy.zeros((Nx,Ny)).tolist()

The result is a list (not a NumPy array), and you can overwrite the individual positions with numbers, strings, whatever.


Answer 17

That’s what a dictionary is made for!

matrix = {}

You can define keys and values in two ways:

matrix[0,0] = value

or

matrix = { (0,0)  : value }

Result:

   [ value,  value,  value,  value,  value],
   [ value,  value,  value,  value,  value],
   ...
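
A dictionary keyed by (row, column) tuples can be turned into a row-by-row listing when the dimensions are known. This is a sketch in which rows, cols and the stored values are assumptions made for the example.

```python
rows, cols = 2, 3   # assumed dimensions for the example

matrix = {}
for r in range(rows):
    for c in range(cols):
        matrix[r, c] = r * cols + c      # any value could be stored here

# Rebuild a list-of-lists view from the dictionary.
as_rows = [[matrix[r, c] for c in range(cols)] for r in range(rows)]
```

The dictionary itself has no notion of shape, so the dimensions have to be tracked separately.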

Answer 18

Use:

import copy

def ndlist(*args, init=0):
    dp = init
    for x in reversed(args):
        dp = [copy.deepcopy(dp) for _ in range(x)]
    return dp

l = ndlist(1,2,3,4) # 4 dimensional list initialized with 0's
l[0][1][2][3] = 1

I do think NumPy is the way to go. The above is a generic one if you don’t want to use NumPy.
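
Here is a brief usage sketch of ndlist, repeated so that the block is self-contained. The variable names grid and cells are chosen only for this example.

```python
import copy


def ndlist(*args, init=0):
    dp = init
    for x in reversed(args):
        # deepcopy gives every position its own copy of the nested value
        dp = [copy.deepcopy(dp) for _ in range(x)]
    return dp


grid = ndlist(2, 3)          # 2 rows, 3 columns, filled with 0
grid[0][1] = 7               # changes one cell only

cells = ndlist(2, init=[])   # a mutable initial value is copied per position
cells[0].append(1)
```

Because of the deepcopy, rows are independent of each other, and so are mutable initial values such as lists.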


Answer 19

by using list :

matrix_in_python  = [['Roy',80,75,85,90,95],['John',75,80,75,85,100],['Dave',80,80,80,90,95]]

by using dict: you can also store this info in the hash table for fast searching like

matrix = { '1':[0,0] , '2':[0,1],'3':[0,2],'4' : [1,0],'5':[1,1],'6':[1,2],'7':[2,0],'8':[2,1],'9':[2,2]};

matrix['1'] will give you the result in O(1) time

*nb: you need to deal with a collision in the hash table


Answer 20

If you don’t have size information before you start, create two one-dimensional lists.

list 1: To store rows
list 2: Actual two-dimensional matrix

Store the entire row in the 1st list. Once done, append list 1 into list 2:

from random import randint

coordinates = []
points = int(input("Enter No Of Coordinates >"))  # raw_input on Python 2
for i in range(points):
    # list 1: build one complete row
    temp = [randint(0, 1000), randint(0, 1000)]
    # list 2: append the row to the two-dimensional matrix
    coordinates.append(temp)

print(coordinates)

Output:

Enter No Of Coordinates >4
[[522, 96], [378, 276], [349, 741], [238, 439]]

Answer 21

# Creates a list containing 5 references to the same inner list of 5 zeros
Matrix = [[0]*5]*5

Be careful with this short expression: all five rows are the same list object. See the full explanation in @F.J’s answer.


Answer 22

l=[[0]*(L) for _ in range(W)]

Will be faster than:

l = [[0 for x in range(L)] for y in range(W)] 
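
The claim can be checked locally with timeit. This is a sketch in which the sizes and the function names are assumptions; timings vary by machine and Python version, so no numbers are quoted.

```python
import timeit

L, W = 100, 100   # arbitrary sizes chosen for the example


def with_multiplication():
    return [[0] * L for _ in range(W)]


def with_nested_comprehension():
    return [[0 for x in range(L)] for y in range(W)]


# Both constructions produce the same matrix.
same_result = with_multiplication() == with_nested_comprehension()

# Total seconds for 200 calls of each; compare the two on your own machine.
t_mul = timeit.timeit(with_multiplication, number=200)
t_nested = timeit.timeit(with_nested_comprehension, number=200)
```

Printing t_mul and t_nested after running this shows which form is faster in a given environment.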

Answer 23

You can create an empty two-dimensional list by nesting two or more pairs of square brackets ([]), separated by commas, inside an outer pair of square brackets, as below:

Matrix = [[], []]

Now suppose you want to append 1 to the first inner list, Matrix[0]; then you type:

Matrix[0].append(1)

Now, type Matrix and hit Enter. The output will be:

[[1], []]

Answer 24

Try this:

rows = int(input('Enter rows\n'))
my_list = []
for i in range(rows):
    my_list.append(list(map(int, input().split())))
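
The same parsing can be tried without interactive input by reading from a string. This is a sketch in which parse_matrix is a hypothetical helper and the sample text is made up for the example.

```python
def parse_matrix(text):
    """Parse whitespace-separated integers, one matrix row per line.

    Hypothetical helper; blank lines are skipped.
    """
    return [list(map(int, line.split()))
            for line in text.splitlines()
            if line.strip()]


parsed = parse_matrix("1 2 3\n4 5 6\n")
```

Each line becomes one row, so rows may have different lengths if the input does.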

Answer 25

If you need a matrix with predefined numbers, you can use the following code:

def matrix(rows, cols, start=0):
    return [[c + start + r * cols for c in range(cols)] for r in range(rows)]


assert matrix(2, 3, 1) == [[1, 2, 3], [4, 5, 6]]

Answer 26

Here is the code snippet for creating a matrix in python:

# get the input rows and cols
rows = int(input("rows : "))
cols = int(input("Cols : "))

# initialize the list
l=[[0]*cols for i in range(rows)]

# fill in some sample values (here i + j, not actually random)
for i in range(0,rows):
    for j in range(0,cols):
        l[i][j] = i+j

# print the list
for i in range(0,rows):
    print()
    for j in range(0,cols):
        print(l[i][j],end=" ")

Please suggest if I have missed something.