“Cloning” row or column vectors

Question 1

Sometimes it is useful to “clone” a row or column vector into a matrix. By cloning I mean converting a row vector such as

[1,2,3]

into a matrix

[[1,2,3]
 [1,2,3]
 [1,2,3]
]

or a column vector such as

[[1]
 [2]
 [3]
]

into

[[1,1,1]
 [2,2,2]
 [3,3,3]
]

In MATLAB or Octave this is done pretty easily:

 x = [1,2,3]
 a = ones(3,1) * x
 a =

    1   2   3
    1   2   3
    1   2   3

 b = (x') * ones(1,3)
 b =

    1   1   1
    2   2   2
    3   3   3

I tried to reproduce this in NumPy, but without success:

In [14]: x = array([1,2,3])
In [14]: ones((3,1)) * x
Out[14]:
array([[ 1.,  2.,  3.],
       [ 1.,  2.,  3.],
       [ 1.,  2.,  3.]])
# so far so good
In [16]: x.transpose() * ones((1,3))
Out[16]: array([[ 1.,  2.,  3.]])
# DAMN
# I end up with 
In [17]: (ones((3,1)) * x).transpose()
Out[17]:
array([[ 1.,  1.,  1.],
       [ 2.,  2.,  2.],
       [ 3.,  3.,  3.]])

Why didn’t the first method (In [16]) work? Is there a more elegant way to achieve this in Python?

Question 2

Here’s an elegant, Pythonic way to do it:

>>> array([[1,2,3],]*3)
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

>>> array([[1,2,3],]*3).transpose()
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

The problem with [16] seems to be that transpose has no effect on a 1-D array. You probably want a matrix instead:

>>> x = array([1,2,3])
>>> x
array([1, 2, 3])
>>> x.transpose()
array([1, 2, 3])
>>> matrix([1,2,3])
matrix([[1, 2, 3]])
>>> matrix([1,2,3]).transpose()
matrix([[1],
        [2],
        [3]])
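
If you would rather stay with plain arrays than switch to matrix, one option is to give the vector an explicit second axis before multiplying. The following is only a minimal sketch of that idea; the names col, rows and cols are illustrative and not from the original post.

```python
import numpy as np

x = np.array([1, 2, 3])

# A 1-D array has a single axis, so transposing it changes nothing.
assert x.T.shape == (3,)

# Add an explicit axis to obtain a (3, 1) column vector.
col = x[:, np.newaxis]          # equivalent to x.reshape(-1, 1)

# Broadcasting against ones now clones in either direction.
rows = np.ones((3, 1), dtype=x.dtype) * x      # each row is x
cols = col * np.ones((1, 3), dtype=x.dtype)    # each column is x

print(rows)
print(cols)
```

Passing dtype=x.dtype to ones keeps the result as integers; without it the product is promoted to float, as in the question's output.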

Question 3

Use numpy.tile:

>>> tile(array([1,2,3]), (3, 1))
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

or for repeating columns:

>>> tile(array([[1,2,3]]).transpose(), (1, 3))
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

Question 4

First, note that with NumPy’s broadcasting operations it is usually not necessary to duplicate rows and columns. See the NumPy documentation on broadcasting for a description.

But if you do need to, repeat and newaxis are probably the best way:

In [12]: x = array([1,2,3])

In [13]: repeat(x[:,newaxis], 3, 1)
Out[13]: 
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])

In [14]: repeat(x[newaxis,:], 3, 0)
Out[14]: 
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

This example is for a row vector, but applying it to a column vector is hopefully obvious. repeat seems to spell this out well, but you can also do it via multiplication, as in your example:

In [15]: x = array([[1, 2, 3]])  # note the double brackets

In [16]: (ones((3,1))*x).transpose()
Out[16]: 
array([[ 1.,  1.,  1.],
       [ 2.,  2.,  2.],
       [ 3.,  3.,  3.]])
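
To illustrate the broadcasting remark at the start of this answer, here is a small sketch showing that arithmetic with a vector gives the same result as cloning it first. The 3x3 operand a is a made-up example, not something from the question.

```python
import numpy as np

x = np.array([1, 2, 3])
a = np.arange(9).reshape(3, 3)   # hypothetical 3x3 operand

# x is treated as a row and stretched across every row of a.
row_sum = a + x

# x[:, np.newaxis] has shape (3, 1) and is stretched across every column.
col_sum = a + x[:, np.newaxis]

# Same results as explicitly cloning x with repeat first.
assert np.array_equal(row_sum, a + np.repeat(x[np.newaxis, :], 3, axis=0))
assert np.array_equal(col_sum, a + np.repeat(x[:, np.newaxis], 3, axis=1))
```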

Question 5

Let:

>>> n = 1000
>>> x = np.arange(n)
>>> reps = 10000

Zero-cost allocations

A view does not take any additional memory. Thus, these declarations are instantaneous:

# New axis
x[np.newaxis, ...]

# Broadcast to specific shape
np.broadcast_to(x, (reps, n))
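
One way to check that nothing is copied is to inspect the strides and memory sharing of the result. This is only a sketch, and it uses much smaller sizes than the n and reps above so that the final copy stays cheap.

```python
import numpy as np

n, reps = 4, 3          # small illustrative sizes, not the ones timed here
x = np.arange(n)

view = np.broadcast_to(x, (reps, n))

# The view reuses x's buffer: the stride along the repeated axis is 0.
print(view.shape)
print(view.strides[0])
print(np.shares_memory(view, x))

# broadcast_to returns a read-only view; copy it if you need to write.
print(view.flags.writeable)
writable = view.copy()
writable[0, 0] = -1     # modifies the copy only, x is untouched
```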

Forced allocation

If you want to force the contents to reside in memory:

>>> %timeit np.array(np.broadcast_to(x, (reps, n)))
10.2 ms ± 62.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

>>> %timeit np.repeat(x[np.newaxis, :], reps, axis=0)
9.88 ms ± 52.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

>>> %timeit np.tile(x, (reps, 1))
9.97 ms ± 77.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

All three methods are roughly the same speed.

Computation

>>> a = np.arange(reps * n).reshape(reps, n)
>>> x_tiled = np.tile(x, (reps, 1))

>>> %timeit np.broadcast_to(x, (reps, n)) * a
17.1 ms ± 284 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

>>> %timeit x[np.newaxis, :] * a
17.5 ms ± 300 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

>>> %timeit x_tiled * a
17.6 ms ± 240 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

All three methods are roughly the same speed.

Conclusion

If you want to replicate before a computation, consider using one of the “zero-cost allocation” methods. You won’t suffer the performance penalty of “forced allocation”.

Question 6

I think using broadcasting in NumPy is the best and fastest approach.

I ran the following comparison:

import numpy as np
b = np.random.randn(1000)
In [105]: %timeit c = np.tile(b[:, np.newaxis], (1,100))
1000 loops, best of 3: 354 µs per loop

In [106]: %timeit c = np.repeat(b[:, np.newaxis], 100, axis=1)
1000 loops, best of 3: 347 µs per loop

In [107]: %timeit c = np.array([b,]*100).transpose()
100 loops, best of 3: 5.56 ms per loop

The newaxis-based versions (tile and repeat) are about 15 times faster than building the array from a Python list.

Question 7

One clean solution is to use NumPy’s outer-product function with a vector of ones:

np.outer(np.ones(n), x)

gives n repeating rows. Switch the argument order to get repeating columns. To get an equal number of rows and columns you might do

np.outer(np.ones_like(x), x)
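
A short sketch of both argument orders follows, using a made-up value for n. One thing to watch for is that np.ones defaults to float64, so the result is float even when x holds integers, whereas ones_like inherits the dtype of x.

```python
import numpy as np

x = np.array([1, 2, 3])
n = 2  # hypothetical number of copies

rows = np.outer(np.ones(n), x)   # shape (n, 3): x repeated as rows
cols = np.outer(x, np.ones(n))   # shape (3, n): x repeated as columns

# np.ones is float64 by default, so rows and cols are float arrays.
print(rows.dtype)

# ones_like takes x's dtype, which keeps the result integer.
square = np.outer(np.ones_like(x), x)
print(square.dtype == x.dtype)
```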

Question 8

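You can use

np.tile(x, 3).reshape((3, len(x)))

tile generates the repetitions of the vector as one flat array, and reshape gives it the shape you want. The row count in reshape must equal the number of repetitions, and the column count must equal len(x); otherwise the rows will not line up with the original vector.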

Question 9

If you have a pandas DataFrame and want to preserve the dtypes, even categoricals, this is a fast way to do it:

import numpy as np
import pandas as pd
df = pd.DataFrame({1: [1, 2, 3], 2: [4, 5, 6]})
number_repeats = 50
new_df = df.reindex(np.tile(df.index, number_repeats))

Question 10

import numpy as np
x=np.array([1,2,3])
y=np.multiply(np.ones((len(x),len(x))),x).T
print(y)

yields:

[[ 1.  1.  1.]
 [ 2.  2.  2.]
 [ 3.  3.  3.]]
