标签归档:matrix-multiplication

Python中的’@ =’符号是什么?

问题:Python中的’@ =’符号是什么?

我知道@是给装饰器用的,但是@=Python有什么用呢?只是保留一些未来的想法吗?

这只是我阅读时遇到的许多问题之一tokenizer.py

I know @ is for decorators, but what is @= for in Python? Is it just reservation for some future idea?

This is just one of my many questions while reading tokenizer.py.


回答 0

文档

@(在)操作者意图被用于矩阵乘法。没有内置的Python类型实现此运算符。

@运算符是在Python 3.5中引入的。@=正如您所期望的那样,是矩阵乘法,后跟赋值。它们映射到__matmul____rmatmul____imatmul__类似于如何++=映射__add____radd____iadd__

PEP 465中详细讨论了操作员及其背后的原理。

From the documentation:

The @ (at) operator is intended to be used for matrix multiplication. No builtin Python types implement this operator.

The @ operator was introduced in Python 3.5. @= is matrix multiplication followed by assignment, as you would expect. They map to __matmul__, __rmatmul__ or __imatmul__ similar to how + and += map to __add__, __radd__ or __iadd__.

The operator and the rationale behind it are discussed in detail in PEP 465.


回答 1

@=@是Python 3.5中引入的用于执行矩阵乘法的新运算符。它们的目的是澄清迄今为止​​与运算符之间存在的混淆,该运算符*根据该特定库/代码中采用的约定用于元素方式乘法或矩阵乘法。结果,将来,运营商*只能用于按元素乘法。

PEP0465中所述,引入了两个运算符:

  • 一个新的二进制运算符A @ B,与A * B
  • 就地版本A @= B,与A *= B

矩阵乘法与按元素乘法

为了快速突出区别,对于两个矩阵:

A = [[1, 2],    B = [[11, 12],
     [3, 4]]         [13, 14]]
  • 逐元素乘法将生成:

    A * B = [[1 * 11,   2 * 12], 
             [3 * 13,   4 * 14]]
  • 矩阵乘法将生成:

    A @ B  =  [[1 * 11 + 2 * 13,   1 * 12 + 2 * 14],
               [3 * 11 + 4 * 13,   3 * 12 + 4 * 14]]

在Numpy中使用

到目前为止,Numpy使用以下约定:

@运算符的引入使涉及矩阵乘法的代码更易于阅读。PEP0465举了一个例子:

# Current implementation of matrix multiplications using dot function
S = np.dot((np.dot(H, beta) - r).T,
            np.dot(inv(np.dot(np.dot(H, V), H.T)), np.dot(H, beta) - r))

# Current implementation of matrix multiplications using dot method
S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)

# Using the @ operator instead
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

显然,最后一种实现更易于阅读和解释为等式。

@= and @ are new operators introduced in Python 3.5 performing matrix multiplication. They are meant to clarify the confusion which existed so far with the operator * which was used either for element-wise multiplication or matrix multiplication depending on the convention employed in that particular library/code. As a result, in the future, the operator * is meant to be used for element-wise multiplication only.

As explained in PEP0465, two operators were introduced:

  • A new binary operator A @ B, used similarly as A * B
  • An in-place version A @= B, used similarly as A *= B

Matrix Multiplication vs Element-wise Multiplication

To quickly highlight the difference, for two matrices:

A = [[1, 2],    B = [[11, 12],
     [3, 4]]         [13, 14]]
  • Element-wise multiplication will yield:

    A * B = [[1 * 11,   2 * 12], 
             [3 * 13,   4 * 14]]
    
  • Matrix multiplication will yield:

    A @ B  =  [[1 * 11 + 2 * 13,   1 * 12 + 2 * 14],
               [3 * 11 + 4 * 13,   3 * 12 + 4 * 14]]
    

Usage in Numpy

So far, Numpy used the following convention:

Introduction of the @ operator makes the code involving matrix multiplications much easier to read. PEP0465 gives us an example:

# Current implementation of matrix multiplications using dot function
S = np.dot((np.dot(H, beta) - r).T,
            np.dot(inv(np.dot(np.dot(H, V), H.T)), np.dot(H, beta) - r))

# Current implementation of matrix multiplications using dot method
S = (H.dot(beta) - r).T.dot(inv(H.dot(V).dot(H.T))).dot(H.dot(beta) - r)

# Using the @ operator instead
S = (H @ beta - r).T @ inv(H @ V @ H.T) @ (H @ beta - r)

Clearly, the last implementation is much easier to read and interpret as an equation.


回答 2

@是Python3.5中新增的矩阵乘法运算符

参考:https : //docs.python.org/3/whatsnew/3.5.html#whatsnew-pep-465

C = A @ B

@ is the new operator for Matrix Multiplication added in Python3.5

Reference: https://docs.python.org/3/whatsnew/3.5.html#whatsnew-pep-465

Example

C = A @ B

NumPy矩阵与数组类的乘法有何不同?

问题:NumPy矩阵与数组类的乘法有何不同?

numpy文档建议使用数组而不是矩阵来处理矩阵。但是,与八度(我直到最近才使用)不同,*不执行矩阵乘法,您需要使用函数matrixmultipy()。我觉得这使代码非常不可读。

是否有人分享我的观点并找到了解决方案?

The numpy docs recommend using array instead of matrix for working with matrices. However, unlike octave (which I was using till recently), * doesn’t perform matrix multiplication, you need to use the function matrixmultipy(). I feel this makes the code very unreadable.

Does anybody share my views, and has found a solution?


回答 0

避免使用的主要原因 matrix该类的是:a)本质上是二维的,并且b)与“常规” numpy数组相比,存在额外的开销。如果您要做的只是线性代数,那么请务必使用矩阵类…就我个人而言,我发现它比它值得的麻烦更多。

对于数组(Python 3.5之前的版本),请使用dot代替matrixmultiply

例如

import numpy as np
x = np.arange(9).reshape((3,3))
y = np.arange(3)

print np.dot(x,y)

或在新版本的numpy中,只需使用 x.dot(y)

就个人而言,我发现它比*表示矩阵乘法的运算符更具可读性…

对于Python 3.5中的数组,请使用x @ y

The main reason to avoid using the matrix class is that a) it’s inherently 2-dimensional, and b) there’s additional overhead compared to a “normal” numpy array. If all you’re doing is linear algebra, then by all means, feel free to use the matrix class… Personally I find it more trouble than it’s worth, though.

For arrays (prior to Python 3.5), use dot instead of matrixmultiply.

E.g.

import numpy as np
x = np.arange(9).reshape((3,3))
y = np.arange(3)

print np.dot(x,y)

Or in newer versions of numpy, simply use x.dot(y)

Personally, I find it much more readable than the * operator implying matrix multiplication…

For arrays in Python 3.5, use x @ y.


回答 1

与在NumPy 矩阵上进行操作相比,在NumPy 数组上进行操作要了解的关键事项是:

  • NumPy矩阵是NumPy数组的子类

  • NumPy 数组操作是基于元素的(一旦考虑了广播)

  • NumPy 矩阵运算遵循线性代数的一般规则

一些代码片段来说明:

>>> from numpy import linalg as LA
>>> import numpy as NP

>>> a1 = NP.matrix("4 3 5; 6 7 8; 1 3 13; 7 21 9")
>>> a1
matrix([[ 4,  3,  5],
        [ 6,  7,  8],
        [ 1,  3, 13],
        [ 7, 21,  9]])

>>> a2 = NP.matrix("7 8 15; 5 3 11; 7 4 9; 6 15 4")
>>> a2
matrix([[ 7,  8, 15],
        [ 5,  3, 11],
        [ 7,  4,  9],
        [ 6, 15,  4]])

>>> a1.shape
(4, 3)

>>> a2.shape
(4, 3)

>>> a2t = a2.T
>>> a2t.shape
(3, 4)

>>> a1 * a2t         # same as NP.dot(a1, a2t) 
matrix([[127,  84,  85,  89],
        [218, 139, 142, 173],
        [226, 157, 136, 103],
        [352, 197, 214, 393]])

但是如果将以下两个NumPy矩阵转换为数组,则此操作将失败:

>>> a1 = NP.array(a1)
>>> a2t = NP.array(a2t)

>>> a1 * a2t
Traceback (most recent call last):
   File "<pyshell#277>", line 1, in <module>
   a1 * a2t
   ValueError: operands could not be broadcast together with shapes (4,3) (3,4) 

尽管使用NP.dot语法可以处理数组 ; 该操作类似于矩阵乘法:

>> NP.dot(a1, a2t)
array([[127,  84,  85,  89],
       [218, 139, 142, 173],
       [226, 157, 136, 103],
       [352, 197, 214, 393]])

那么您是否需要NumPy矩阵?即,NumPy数组是否足以进行线性代数计算(前提是您知道正确的语法,即NP.dot)?

规则似乎是,如果参数(数组)的形状(mxn)与给定的线性代数运算兼容,那么您就可以了,否则,NumPy抛出。

我遇到的唯一exceptions(可能还有其他exceptions)是计算矩阵逆

下面是我称为纯线性代数运算(实际上是从Numpy的线性代数模块)并传递给NumPy数组的代码片段

数组的行列式

>>> m = NP.random.randint(0, 10, 16).reshape(4, 4)
>>> m
array([[6, 2, 5, 2],
       [8, 5, 1, 6],
       [5, 9, 7, 5],
       [0, 5, 6, 7]])

>>> type(m)
<type 'numpy.ndarray'>

>>> md = LA.det(m)
>>> md
1772.9999999999995

特征向量/特征值对:

>>> LA.eig(m)
(array([ 19.703+0.j   ,   0.097+4.198j,   0.097-4.198j,   5.103+0.j   ]), 
array([[-0.374+0.j   , -0.091+0.278j, -0.091-0.278j, -0.574+0.j   ],
       [-0.446+0.j   ,  0.671+0.j   ,  0.671+0.j   , -0.084+0.j   ],
       [-0.654+0.j   , -0.239-0.476j, -0.239+0.476j, -0.181+0.j   ],
       [-0.484+0.j   , -0.387+0.178j, -0.387-0.178j,  0.794+0.j   ]]))

矩阵范数

>>>> LA.norm(m)
22.0227

qr因式分解

>>> LA.qr(a1)
(array([[ 0.5,  0.5,  0.5],
        [ 0.5,  0.5, -0.5],
        [ 0.5, -0.5,  0.5],
        [ 0.5, -0.5, -0.5]]), 
 array([[ 6.,  6.,  6.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]))

矩阵等级

>>> m = NP.random.rand(40).reshape(8, 5)
>>> m
array([[ 0.545,  0.459,  0.601,  0.34 ,  0.778],
       [ 0.799,  0.047,  0.699,  0.907,  0.381],
       [ 0.004,  0.136,  0.819,  0.647,  0.892],
       [ 0.062,  0.389,  0.183,  0.289,  0.809],
       [ 0.539,  0.213,  0.805,  0.61 ,  0.677],
       [ 0.269,  0.071,  0.377,  0.25 ,  0.692],
       [ 0.274,  0.206,  0.655,  0.062,  0.229],
       [ 0.397,  0.115,  0.083,  0.19 ,  0.701]])
>>> LA.matrix_rank(m)
5

矩阵条件

>>> a1 = NP.random.randint(1, 10, 12).reshape(4, 3)
>>> LA.cond(a1)
5.7093446189400954

反演需要一个NumPy矩阵

>>> a1 = NP.matrix(a1)
>>> type(a1)
<class 'numpy.matrixlib.defmatrix.matrix'>

>>> a1.I
matrix([[ 0.028,  0.028,  0.028,  0.028],
        [ 0.028,  0.028,  0.028,  0.028],
        [ 0.028,  0.028,  0.028,  0.028]])
>>> a1 = NP.array(a1)
>>> a1.I

Traceback (most recent call last):
   File "<pyshell#230>", line 1, in <module>
   a1.I
   AttributeError: 'numpy.ndarray' object has no attribute 'I'

但是Moore-Penrose伪逆似乎工作得很好

>>> LA.pinv(m)
matrix([[ 0.314,  0.407, -1.008, -0.553,  0.131,  0.373,  0.217,  0.785],
        [ 1.393,  0.084, -0.605,  1.777, -0.054, -1.658,  0.069, -1.203],
        [-0.042, -0.355,  0.494, -0.729,  0.292,  0.252,  1.079, -0.432],
        [-0.18 ,  1.068,  0.396,  0.895, -0.003, -0.896, -1.115, -0.666],
        [-0.224, -0.479,  0.303, -0.079, -0.066,  0.872, -0.175,  0.901]])

>>> m = NP.array(m)

>>> LA.pinv(m)
array([[ 0.314,  0.407, -1.008, -0.553,  0.131,  0.373,  0.217,  0.785],
       [ 1.393,  0.084, -0.605,  1.777, -0.054, -1.658,  0.069, -1.203],
       [-0.042, -0.355,  0.494, -0.729,  0.292,  0.252,  1.079, -0.432],
       [-0.18 ,  1.068,  0.396,  0.895, -0.003, -0.896, -1.115, -0.666],
       [-0.224, -0.479,  0.303, -0.079, -0.066,  0.872, -0.175,  0.901]])

the key things to know for operations on NumPy arrays versus operations on NumPy matrices are:

  • NumPy matrix is a subclass of NumPy array

  • NumPy array operations are element-wise (once broadcasting is accounted for)

  • NumPy matrix operations follow the ordinary rules of linear algebra

some code snippets to illustrate:

>>> from numpy import linalg as LA
>>> import numpy as NP

>>> a1 = NP.matrix("4 3 5; 6 7 8; 1 3 13; 7 21 9")
>>> a1
matrix([[ 4,  3,  5],
        [ 6,  7,  8],
        [ 1,  3, 13],
        [ 7, 21,  9]])

>>> a2 = NP.matrix("7 8 15; 5 3 11; 7 4 9; 6 15 4")
>>> a2
matrix([[ 7,  8, 15],
        [ 5,  3, 11],
        [ 7,  4,  9],
        [ 6, 15,  4]])

>>> a1.shape
(4, 3)

>>> a2.shape
(4, 3)

>>> a2t = a2.T
>>> a2t.shape
(3, 4)

>>> a1 * a2t         # same as NP.dot(a1, a2t) 
matrix([[127,  84,  85,  89],
        [218, 139, 142, 173],
        [226, 157, 136, 103],
        [352, 197, 214, 393]])

but this operations fails if these two NumPy matrices are converted to arrays:

>>> a1 = NP.array(a1)
>>> a2t = NP.array(a2t)

>>> a1 * a2t
Traceback (most recent call last):
   File "<pyshell#277>", line 1, in <module>
   a1 * a2t
   ValueError: operands could not be broadcast together with shapes (4,3) (3,4) 

though using the NP.dot syntax works with arrays; this operations works like matrix multiplication:

>> NP.dot(a1, a2t)
array([[127,  84,  85,  89],
       [218, 139, 142, 173],
       [226, 157, 136, 103],
       [352, 197, 214, 393]])

so do you ever need a NumPy matrix? ie, will a NumPy array suffice for linear algebra computation (provided you know the correct syntax, ie, NP.dot)?

the rule seems to be that if the arguments (arrays) have shapes (m x n) compatible with the a given linear algebra operation, then you are ok, otherwise, NumPy throws.

the only exception i have come across (there are likely others) is calculating matrix inverse.

below are snippets in which i have called a pure linear algebra operation (in fact, from Numpy’s Linear Algebra module) and passed in a NumPy array

determinant of an array:

>>> m = NP.random.randint(0, 10, 16).reshape(4, 4)
>>> m
array([[6, 2, 5, 2],
       [8, 5, 1, 6],
       [5, 9, 7, 5],
       [0, 5, 6, 7]])

>>> type(m)
<type 'numpy.ndarray'>

>>> md = LA.det(m)
>>> md
1772.9999999999995

eigenvectors/eigenvalue pairs:

>>> LA.eig(m)
(array([ 19.703+0.j   ,   0.097+4.198j,   0.097-4.198j,   5.103+0.j   ]), 
array([[-0.374+0.j   , -0.091+0.278j, -0.091-0.278j, -0.574+0.j   ],
       [-0.446+0.j   ,  0.671+0.j   ,  0.671+0.j   , -0.084+0.j   ],
       [-0.654+0.j   , -0.239-0.476j, -0.239+0.476j, -0.181+0.j   ],
       [-0.484+0.j   , -0.387+0.178j, -0.387-0.178j,  0.794+0.j   ]]))

matrix norm:

>>>> LA.norm(m)
22.0227

qr factorization:

>>> LA.qr(a1)
(array([[ 0.5,  0.5,  0.5],
        [ 0.5,  0.5, -0.5],
        [ 0.5, -0.5,  0.5],
        [ 0.5, -0.5, -0.5]]), 
 array([[ 6.,  6.,  6.],
        [ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]))

matrix rank:

>>> m = NP.random.rand(40).reshape(8, 5)
>>> m
array([[ 0.545,  0.459,  0.601,  0.34 ,  0.778],
       [ 0.799,  0.047,  0.699,  0.907,  0.381],
       [ 0.004,  0.136,  0.819,  0.647,  0.892],
       [ 0.062,  0.389,  0.183,  0.289,  0.809],
       [ 0.539,  0.213,  0.805,  0.61 ,  0.677],
       [ 0.269,  0.071,  0.377,  0.25 ,  0.692],
       [ 0.274,  0.206,  0.655,  0.062,  0.229],
       [ 0.397,  0.115,  0.083,  0.19 ,  0.701]])
>>> LA.matrix_rank(m)
5

matrix condition:

>>> a1 = NP.random.randint(1, 10, 12).reshape(4, 3)
>>> LA.cond(a1)
5.7093446189400954

inversion requires a NumPy matrix though:

>>> a1 = NP.matrix(a1)
>>> type(a1)
<class 'numpy.matrixlib.defmatrix.matrix'>

>>> a1.I
matrix([[ 0.028,  0.028,  0.028,  0.028],
        [ 0.028,  0.028,  0.028,  0.028],
        [ 0.028,  0.028,  0.028,  0.028]])
>>> a1 = NP.array(a1)
>>> a1.I

Traceback (most recent call last):
   File "<pyshell#230>", line 1, in <module>
   a1.I
   AttributeError: 'numpy.ndarray' object has no attribute 'I'

but the Moore-Penrose pseudoinverse seems to works just fine

>>> LA.pinv(m)
matrix([[ 0.314,  0.407, -1.008, -0.553,  0.131,  0.373,  0.217,  0.785],
        [ 1.393,  0.084, -0.605,  1.777, -0.054, -1.658,  0.069, -1.203],
        [-0.042, -0.355,  0.494, -0.729,  0.292,  0.252,  1.079, -0.432],
        [-0.18 ,  1.068,  0.396,  0.895, -0.003, -0.896, -1.115, -0.666],
        [-0.224, -0.479,  0.303, -0.079, -0.066,  0.872, -0.175,  0.901]])

>>> m = NP.array(m)

>>> LA.pinv(m)
array([[ 0.314,  0.407, -1.008, -0.553,  0.131,  0.373,  0.217,  0.785],
       [ 1.393,  0.084, -0.605,  1.777, -0.054, -1.658,  0.069, -1.203],
       [-0.042, -0.355,  0.494, -0.729,  0.292,  0.252,  1.079, -0.432],
       [-0.18 ,  1.068,  0.396,  0.895, -0.003, -0.896, -1.115, -0.666],
       [-0.224, -0.479,  0.303, -0.079, -0.066,  0.872, -0.175,  0.901]])

回答 2

在3.5中,Python终于有了一个矩阵乘法运算符。语法为a @ b

In 3.5, Python finally got a matrix multiplication operator. The syntax is a @ b.


回答 3

在处理数组和处理矩阵时,点运算符会给出不同的答案。例如,假设以下内容:

>>> a=numpy.array([1, 2, 3])
>>> b=numpy.array([1, 2, 3])

让我们将它们转换成矩阵:

>>> am=numpy.mat(a)
>>> bm=numpy.mat(b)

现在,我们可以看到两种情况的不同输出:

>>> print numpy.dot(a.T, b)
14
>>> print am.T*bm
[[1.  2.  3.]
 [2.  4.  6.]
 [3.  6.  9.]]

There is a situation where the dot operator will give different answers when dealing with arrays as with dealing with matrices. For example, suppose the following:

>>> a=numpy.array([1, 2, 3])
>>> b=numpy.array([1, 2, 3])

Lets convert them into matrices:

>>> am=numpy.mat(a)
>>> bm=numpy.mat(b)

Now, we can see a different output for the two cases:

>>> print numpy.dot(a.T, b)
14
>>> print am.T*bm
[[1.  2.  3.]
 [2.  4.  6.]
 [3.  6.  9.]]

回答 4

来自http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html的参考

…,使用的numpy.matrix气馁,因为它增加了什么,无法与2D来完成numpy.ndarray对象,并可能导致混乱,其中正在使用的类。例如,

>>> import numpy as np
>>> from scipy import linalg
>>> A = np.array([[1,2],[3,4]])
>>> A
    array([[1, 2],
           [3, 4]])
>>> linalg.inv(A)
array([[-2. ,  1. ],
      [ 1.5, -0.5]])
>>> b = np.array([[5,6]]) #2D array
>>> b
array([[5, 6]])
>>> b.T
array([[5],
      [6]])
>>> A*b #not matrix multiplication!
array([[ 5, 12],
      [15, 24]])
>>> A.dot(b.T) #matrix multiplication
array([[17],
      [39]])
>>> b = np.array([5,6]) #1D array
>>> b
array([5, 6])
>>> b.T  #not matrix transpose!
array([5, 6])
>>> A.dot(b)  #does not matter for multiplication
array([17, 39])

scipy.linalg操作可以同等地应用于numpy.matrix或2D numpy.ndarray对象。

Reference from http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html

…, the use of the numpy.matrix class is discouraged, since it adds nothing that cannot be accomplished with 2D numpy.ndarray objects, and may lead to a confusion of which class is being used. For example,

>>> import numpy as np
>>> from scipy import linalg
>>> A = np.array([[1,2],[3,4]])
>>> A
    array([[1, 2],
           [3, 4]])
>>> linalg.inv(A)
array([[-2. ,  1. ],
      [ 1.5, -0.5]])
>>> b = np.array([[5,6]]) #2D array
>>> b
array([[5, 6]])
>>> b.T
array([[5],
      [6]])
>>> A*b #not matrix multiplication!
array([[ 5, 12],
      [15, 24]])
>>> A.dot(b.T) #matrix multiplication
array([[17],
      [39]])
>>> b = np.array([5,6]) #1D array
>>> b
array([5, 6])
>>> b.T  #not matrix transpose!
array([5, 6])
>>> A.dot(b)  #does not matter for multiplication
array([17, 39])

scipy.linalg operations can be applied equally to numpy.matrix or to 2D numpy.ndarray objects.


回答 5

这个技巧可能就是您想要的。这是一种简单的运算符重载。

然后,您可以使用类似建议的Infix类的东西:

a = np.random.rand(3,4)
b = np.random.rand(4,3)
x = Infix(lambda x,y: np.dot(x,y))
c = a |x| b

This trick could be what you are looking for. It is a kind of simple operator overload.

You can then use something like the suggested Infix class like this:

a = np.random.rand(3,4)
b = np.random.rand(4,3)
x = Infix(lambda x,y: np.dot(x,y))
c = a |x| b

回答 6

来自PEP 465的相关报价 @ petr-viktorin提到的用于矩阵乘法的专用中缀运算符,阐明了OP遇到的问题:

numpy提供了两种使用不同__mul__方法的不同类型。对于numpy.ndarray对象,*执行元素乘法,矩阵乘法必须使用函数调用(numpy.dot)。对于numpy.matrix对象,*执行矩阵乘法,而元素乘法则需要函数语法。使用编写代码numpy.ndarray效果很好。使用编写代码numpy.matrix也可以。但是,一旦我们尝试将这两段代码集成在一起,麻烦就会开始。预期为ndarray并得到matrix或相反的代码可能会崩溃或返回错误的结果

@infix运算符的引入应有助于统一和简化python矩阵代码。

A pertinent quote from PEP 465 – A dedicated infix operator for matrix multiplication , as mentioned by @petr-viktorin, clarifies the problem the OP was getting at:

[…] numpy provides two different types with different __mul__ methods. For numpy.ndarray objects, * performs elementwise multiplication, and matrix multiplication must use a function call (numpy.dot). For numpy.matrix objects, * performs matrix multiplication, and elementwise multiplication requires function syntax. Writing code using numpy.ndarray works fine. Writing code using numpy.matrix also works fine. But trouble begins as soon as we try to integrate these two pieces of code together. Code that expects an ndarray and gets a matrix, or vice-versa, may crash or return incorrect results

The introduction of the @ infix operator should help to unify and simplify python matrix code.


回答 7

函数matmul(自numpy 1.10.1起)对两种类型均适用,并以numpy矩阵类返回结果:

import numpy as np

A = np.mat('1 2 3; 4 5 6; 7 8 9; 10 11 12')
B = np.array(np.mat('1 1 1 1; 1 1 1 1; 1 1 1 1'))
print (A, type(A))
print (B, type(B))

C = np.matmul(A, B)
print (C, type(C))

输出:

(matrix([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]]), <class 'numpy.matrixlib.defmatrix.matrix'>)
(array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]]), <type 'numpy.ndarray'>)
(matrix([[ 6,  6,  6,  6],
        [15, 15, 15, 15],
        [24, 24, 24, 24],
        [33, 33, 33, 33]]), <class 'numpy.matrixlib.defmatrix.matrix'>)

由于python 3.5 如前所述,您还可以使用新的矩阵乘法运算符,@例如

C = A @ B

并获得与上述相同的结果。

Function matmul (since numpy 1.10.1) works fine for both types and return result as a numpy matrix class:

import numpy as np

A = np.mat('1 2 3; 4 5 6; 7 8 9; 10 11 12')
B = np.array(np.mat('1 1 1 1; 1 1 1 1; 1 1 1 1'))
print (A, type(A))
print (B, type(B))

C = np.matmul(A, B)
print (C, type(C))

Output:

(matrix([[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9],
        [10, 11, 12]]), <class 'numpy.matrixlib.defmatrix.matrix'>)
(array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]]), <type 'numpy.ndarray'>)
(matrix([[ 6,  6,  6,  6],
        [15, 15, 15, 15],
        [24, 24, 24, 24],
        [33, 33, 33, 33]]), <class 'numpy.matrixlib.defmatrix.matrix'>)

Since python 3.5 as mentioned early you also can use a new matrix multiplication operator @ like

C = A @ B

and get the same result as above.


numpy dot()和Python 3.5+矩阵乘法@之间的区别

问题:numpy dot()和Python 3.5+矩阵乘法@之间的区别

我最近使用Python 3.5,注意到新的矩阵乘法运算符(@)有时与numpy点运算符的行为有所不同。例如,对于3d阵列:

import numpy as np

a = np.random.rand(8,13,13)
b = np.random.rand(8,13,13)
c = a @ b  # Python 3.5+
d = np.dot(a, b)

@运算符返回形状的阵列:

c.shape
(8, 13, 13)

np.dot()函数返回时:

d.shape
(8, 13, 8, 13)

如何用numpy点重现相同的结果?还有其他重大区别吗?

I recently moved to Python 3.5 and noticed the new matrix multiplication operator (@) sometimes behaves differently from the numpy dot operator. In example, for 3d arrays:

import numpy as np

a = np.random.rand(8,13,13)
b = np.random.rand(8,13,13)
c = a @ b  # Python 3.5+
d = np.dot(a, b)

The @ operator returns an array of shape:

c.shape
(8, 13, 13)

while the np.dot() function returns:

d.shape
(8, 13, 8, 13)

How can I reproduce the same result with numpy dot? Are there any other significant differences?


回答 0

@运营商称阵列的__matmul__方法,而不是dot。此方法在API中也作为函数存在np.matmul

>>> a = np.random.rand(8,13,13)
>>> b = np.random.rand(8,13,13)
>>> np.matmul(a, b).shape
(8, 13, 13)

从文档中:

matmul区别在于dot两个重要方面。

  • 标量不能相乘。
  • 将矩阵堆栈一起广播,就好像矩阵是元素一样。

最后一点很清楚,当传递3D(或更高维)数组时,dotmatmul方法的行为会有所不同。从文档中引用更多内容:

对于matmul

如果任何一个参数为ND,N> 2,则将其视为驻留在最后两个索引中的一组矩阵,并进行相应广播。

对于np.dot

对于2-D数组,它等效于矩阵乘法,对于1-D数组,其等效于向量的内积(无复共轭)。对于N维,它是a的最后一个轴和b的倒数第二个轴的和积

The @ operator calls the array’s __matmul__ method, not dot. This method is also present in the API as the function np.matmul.

>>> a = np.random.rand(8,13,13)
>>> b = np.random.rand(8,13,13)
>>> np.matmul(a, b).shape
(8, 13, 13)

From the documentation:

matmul differs from dot in two important ways.

  • Multiplication by scalars is not allowed.
  • Stacks of matrices are broadcast together as if the matrices were elements.

The last point makes it clear that dot and matmul methods behave differently when passed 3D (or higher dimensional) arrays. Quoting from the documentation some more:

For matmul:

If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.

For np.dot:

For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation). For N dimensions it is a sum product over the last axis of a and the second-to-last of b


回答 1

@ajcr的答案说明了dotand matmul(由@符号调用)之间的区别。通过看一个简单的例子,可以清楚地看到两者在“矩阵堆栈”或张量上进行操作时的行为有何不同。

为了弄清差异,采用4×4数组,然后将dot乘积和matmul乘积返回3x4x2的“矩阵堆栈”或张量。

import numpy as np
fourbyfour = np.array([
                       [1,2,3,4],
                       [3,2,1,4],
                       [5,4,6,7],
                       [11,12,13,14]
                      ])


threebyfourbytwo = np.array([
                             [[2,3],[11,9],[32,21],[28,17]],
                             [[2,3],[1,9],[3,21],[28,7]],
                             [[2,3],[1,9],[3,21],[28,7]],
                            ])

print('4x4*3x4x2 dot:\n {}\n'.format(np.dot(fourbyfour,twobyfourbythree)))
print('4x4*3x4x2 matmul:\n {}\n'.format(np.matmul(fourbyfour,twobyfourbythree)))

每个操作的结果如下所示。注意点积如何

… a的最后一个轴与b的倒数第二个和的乘积

以及如何通过一起广播矩阵来形成矩阵乘积。

4x4*3x4x2 dot:
 [[[232 152]
  [125 112]
  [125 112]]

 [[172 116]
  [123  76]
  [123  76]]

 [[442 296]
  [228 226]
  [228 226]]

 [[962 652]
  [465 512]
  [465 512]]]

4x4*3x4x2 matmul:
 [[[232 152]
  [172 116]
  [442 296]
  [962 652]]

 [[125 112]
  [123  76]
  [228 226]
  [465 512]]

 [[125 112]
  [123  76]
  [228 226]
  [465 512]]]

The answer by @ajcr explains how the dot and matmul (invoked by the @ symbol) differ. By looking at a simple example, one clearly sees how the two behave differently when operating on ‘stacks of matricies’ or tensors.

To clarify the differences take a 4×4 array and return the dot product and matmul product with a 3x4x2 ‘stack of matricies’ or tensor.

import numpy as np
fourbyfour = np.array([
                       [1,2,3,4],
                       [3,2,1,4],
                       [5,4,6,7],
                       [11,12,13,14]
                      ])


threebyfourbytwo = np.array([
                             [[2,3],[11,9],[32,21],[28,17]],
                             [[2,3],[1,9],[3,21],[28,7]],
                             [[2,3],[1,9],[3,21],[28,7]],
                            ])

print('4x4*3x4x2 dot:\n {}\n'.format(np.dot(fourbyfour,threebyfourbytwo)))
print('4x4*3x4x2 matmul:\n {}\n'.format(np.matmul(fourbyfour,threebyfourbytwo)))

The products of each operation appear below. Notice how the dot product is,

…a sum product over the last axis of a and the second-to-last of b

and how the matrix product is formed by broadcasting the matrix together.

4x4*3x4x2 dot:
 [[[232 152]
  [125 112]
  [125 112]]

 [[172 116]
  [123  76]
  [123  76]]

 [[442 296]
  [228 226]
  [228 226]]

 [[962 652]
  [465 512]
  [465 512]]]

4x4*3x4x2 matmul:
 [[[232 152]
  [172 116]
  [442 296]
  [962 652]]

 [[125 112]
  [123  76]
  [228 226]
  [465 512]]

 [[125 112]
  [123  76]
  [228 226]
  [465 512]]]

回答 2

仅供参考,@其numpy的等价物dot,并matmul都大致一样快。(用我的一个项目perfplot创建的图。)

复制剧情的代码:

import perfplot
import numpy


def setup(n):
    A = numpy.random.rand(n, n)
    x = numpy.random.rand(n)
    return A, x


def at(data):
    A, x = data
    return A @ x


def numpy_dot(data):
    A, x = data
    return numpy.dot(A, x)


def numpy_matmul(data):
    A, x = data
    return numpy.matmul(A, x)


perfplot.show(
    setup=setup,
    kernels=[at, numpy_dot, numpy_matmul],
    n_range=[2 ** k for k in range(12)],
    logx=True,
    logy=True,
)

Just FYI, @ and its numpy equivalents dot and matmul are all equally fast. (Plot created with perfplot, a project of mine.)

Code to reproduce the plot:

import perfplot
import numpy


def setup(n):
    A = numpy.random.rand(n, n)
    x = numpy.random.rand(n)
    return A, x


def at(data):
    A, x = data
    return A @ x


def numpy_dot(data):
    A, x = data
    return numpy.dot(A, x)


def numpy_matmul(data):
    A, x = data
    return numpy.matmul(A, x)


perfplot.show(
    setup=setup,
    kernels=[at, numpy_dot, numpy_matmul],
    n_range=[2 ** k for k in range(15)],
)

回答 3

在数学上,我认为numpy中的更有意义

(a,b)_ {i,j,k,a,b,c} =

因为当a和b是向量时它给出点积,或者当a和b是矩阵时给出矩阵乘积


对于numpy中的matmul操作,它由结果的一部分组成,可以定义为

> matmul(a,b)_ {i,j,k,c} =

因此,您可以看到matmul(a,b)返回的数组形状较小,从而减少了内存消耗,并在应用程序中更有意义。特别是结合广播,您可以获得

matmul(a,b)_ {i,j,k,l} =

例如。


从以上两个定义中,您可以看到使用这两个操作的要求。假设a.shape =(s1,s2,s3,s4)b.shape =(t1,t2,t3,t4)

  • 要使用点(a,b),您需要

    1. t3 = s4 ;
  • 要使用matmul(a,b),您需要

    1. t3 = s4
    2. t2 = s2或t2和s2之一为1
    3. t1 = s1或t1和s1之一为1

使用以下代码说服自己。

代码样例

import numpy as np
for it in xrange(10000):
    a = np.random.rand(5,6,2,4)
    b = np.random.rand(6,4,3)
    c = np.matmul(a,b)
    d = np.dot(a,b)
    #print 'c shape: ', c.shape,'d shape:', d.shape

    for i in range(5):
        for j in range(6):
            for k in range(2):
                for l in range(3):
                    if not c[i,j,k,l] == d[i,j,k,j,l]:
                        print it,i,j,k,l,c[i,j,k,l]==d[i,j,k,j,l] #you will not see them

In mathematics, I think the dot in numpy makes more sense

dot(a,b)_{i,j,k,a,b,c} =

since it gives the dot product when a and b are vectors, or the matrix multiplication when a and b are matrices


As for matmul operation in numpy, it consists of parts of dot result, and it can be defined as

>matmul(a,b)_{i,j,k,c} =

So, you can see that matmul(a,b) returns an array with a small shape, which has smaller memory consumption and make more sense in applications. In particular, combining with broadcasting, you can get

matmul(a,b)_{i,j,k,l} =

for example.


From the above two definitions, you can see the requirements to use those two operations. Assume a.shape=(s1,s2,s3,s4) and b.shape=(t1,t2,t3,t4)

  • To use dot(a,b) you need

    1. t3=s4;
  • To use matmul(a,b) you need

    1. t3=s4
    2. t2=s2, or one of t2 and s2 is 1
    3. t1=s1, or one of t1 and s1 is 1

Use the following piece of code to convince yourself.

Code sample

import numpy as np
for it in xrange(10000):
    a = np.random.rand(5,6,2,4)
    b = np.random.rand(6,4,3)
    c = np.matmul(a,b)
    d = np.dot(a,b)
    #print 'c shape: ', c.shape,'d shape:', d.shape

    for i in range(5):
        for j in range(6):
            for k in range(2):
                for l in range(3):
                    if not c[i,j,k,l] == d[i,j,k,j,l]:
                        print it,i,j,k,l,c[i,j,k,l]==d[i,j,k,j,l] #you will not see them

回答 4

这是与的比较,np.einsum以显示索引的投影方式

np.allclose(np.einsum('ijk,ijk->ijk', a,b), a*b)        # True 
np.allclose(np.einsum('ijk,ikl->ijl', a,b), a@b)        # True
np.allclose(np.einsum('ijk,lkm->ijlm',a,b), a.dot(b))   # True

Here is a comparison with np.einsum to show how the indices are projected

np.allclose(np.einsum('ijk,ijk->ijk', a,b), a*b)        # True 
np.allclose(np.einsum('ijk,ikl->ijl', a,b), a@b)        # True
np.allclose(np.einsum('ijk,lkm->ijlm',a,b), a.dot(b))   # True

回答 5

我对MATMUL和DOT的经验

尝试使用MATMUL时,我经常收到“ ValueError:传递的值的形状为(200,1),索引表示(200,3)”。我想要一个快速的解决方法,并发现DOT可以提供相同的功能。使用DOT我没有任何错误。我得到正确的答案

与MATMUL

X.shape
>>>(200, 3)

type(X)

>>>pandas.core.frame.DataFrame

w

>>>array([0.37454012, 0.95071431, 0.73199394])

YY = np.matmul(X,w)

>>>  ValueError: Shape of passed values is (200, 1), indices imply (200, 3)"

与DOT

YY = np.dot(X,w)
# no error message
YY
>>>array([ 2.59206877,  1.06842193,  2.18533396,  2.11366346,  0.28505879, 

YY.shape

>>> (200, )

My experience with MATMUL and DOT

I was constantly getting “ValueError: Shape of passed values is (200, 1), indices imply (200, 3)” when trying to use MATMUL. I wanted a quick workaround and found DOT to deliver the same functionality. I don’t get any error using DOT. I get the correct answer

with MATMUL

X.shape
>>>(200, 3)

type(X)

>>>pandas.core.frame.DataFrame

w

>>>array([0.37454012, 0.95071431, 0.73199394])

YY = np.matmul(X,w)

>>>  ValueError: Shape of passed values is (200, 1), indices imply (200, 3)"

with DOT

YY = np.dot(X,w)
# no error message
YY
>>>array([ 2.59206877,  1.06842193,  2.18533396,  2.11366346,  0.28505879, …

YY.shape

>>> (200, )

如何在numpy中获得按元素矩阵乘法(Hadamard积)?

问题:如何在numpy中获得按元素矩阵乘法(Hadamard积)?

我有两个矩阵

a = np.matrix([[1,2], [3,4]])
b = np.matrix([[5,6], [7,8]])

我想得到元素乘积[[1*5,2*6], [3*7,4*8]],等于

[[5,12], [21,32]]

我努力了

print(np.dot(a,b)) 

print(a*b)

但两者都给出结果

[[19 22], [43 50]]

这是矩阵乘积,而不是元素乘积。如何使用内置函数获取按元素分类的产品(又名Hadamard产品)?

I have two matrices

a = np.matrix([[1,2], [3,4]])
b = np.matrix([[5,6], [7,8]])

and I want to get the element-wise product, [[1*5,2*6], [3*7,4*8]], equaling

[[5,12], [21,32]]

I have tried

print(np.dot(a,b)) 

and

print(a*b)

but both give the result

[[19 22], [43 50]]

which is the matrix product, not the element-wise product. How can I get the the element-wise product (aka Hadamard product) using built-in functions?


回答 0

对于matrix对象的元素乘法,可以使用numpy.multiply

import numpy as np
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
np.multiply(a,b)

结果

array([[ 5, 12],
       [21, 32]])

但是,您应该真正使用array而不是matrixmatrix对象与常规ndarray具有各种可怕的不兼容性。使用ndarrays时,您可以仅使用*元素级乘法:

a * b

如果您使用的是Python 3.5+,则您甚至都不会失去使用运算符执行矩阵乘法的能力,因为@矩阵乘法现在可以

a @ b  # matrix multiplication

For elementwise multiplication of matrix objects, you can use numpy.multiply:

import numpy as np
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
np.multiply(a,b)

Result

array([[ 5, 12],
       [21, 32]])

However, you should really use array instead of matrix. matrix objects have all sorts of horrible incompatibilities with regular ndarrays. With ndarrays, you can just use * for elementwise multiplication:

a * b

If you’re on Python 3.5+, you don’t even lose the ability to perform matrix multiplication with an operator, because @ does matrix multiplication now:

a @ b  # matrix multiplication

回答 1

只是这样做:

import numpy as np

a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

a * b

just do this:

import numpy as np

a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

a * b

回答 2

import numpy as np
x = np.array([[1,2,3], [4,5,6]])
y = np.array([[-1, 2, 0], [-2, 5, 1]])

x*y
Out: 
array([[-1,  4,  0],
       [-8, 25,  6]])

%timeit x*y
1000000 loops, best of 3: 421 ns per loop

np.multiply(x,y)
Out: 
array([[-1,  4,  0],
       [-8, 25,  6]])

%timeit np.multiply(x, y)
1000000 loops, best of 3: 457 ns per loop

两者np.multiply*都会产生元素明智的乘法,称为Hadamard积

%timeit 是ipython的魔力

import numpy as np
x = np.array([[1,2,3], [4,5,6]])
y = np.array([[-1, 2, 0], [-2, 5, 1]])

x*y
Out: 
array([[-1,  4,  0],
       [-8, 25,  6]])

%timeit x*y
1000000 loops, best of 3: 421 ns per loop

np.multiply(x,y)
Out: 
array([[-1,  4,  0],
       [-8, 25,  6]])

%timeit np.multiply(x, y)
1000000 loops, best of 3: 457 ns per loop

Both np.multiply and * would yield element wise multiplication known as the Hadamard Product

%timeit is ipython magic


回答 3

试试这个:

a = np.matrix([[1,2], [3,4]])
b = np.matrix([[5,6], [7,8]])

#This would result a 'numpy.ndarray'
result = np.array(a) * np.array(b)

在此,np.array(a)返回类型为2D的2D数组,ndarray并且ndarray将导致元素相乘。因此结果将是:

result = [[5, 12], [21, 32]]

如果您想获取矩阵,请执行以下操作:

result = np.mat(result)

Try this:

a = np.matrix([[1,2], [3,4]])
b = np.matrix([[5,6], [7,8]])

#This would result a 'numpy.ndarray'
result = np.array(a) * np.array(b)

Here, np.array(a) returns a 2D array of type ndarray and multiplication of two ndarray would result element wise multiplication. So the result would be:

result = [[5, 12], [21, 32]]

If you wanna get a matrix, the do it with this:

result = np.mat(result)