从ND到一维阵列

Question 1

Say I have an array a:

a = np.array([[1,2,3], [4,5,6]])

array([[1, 2, 3],
       [4, 5, 6]])

I would like to convert it to a 1D array (i.e. a column vector):

b = np.reshape(a, (1,np.product(a.shape)))

but this returns

array([[1, 2, 3, 4, 5, 6]])

which is not the same as:

array([1, 2, 3, 4, 5, 6])

I can take the first element of this array to manually convert it to a 1D array:

b = np.reshape(a, (1,np.product(a.shape)))[0]

but this requires me to know how many dimensions the original array has (and concatenate [0]’s when working with higher dimensions)

Is there a dimensions-independent way of getting a column/row vector from an arbitrary ndarray?

Question 2

Use np.ravel (for a 1D view) or np.ndarray.flatten (for a 1D copy) or np.ndarray.flat (for an 1D iterator):

In [12]: a = np.array([[1,2,3], [4,5,6]])

In [13]: b = a.ravel()

In [14]: b
Out[14]: array([1, 2, 3, 4, 5, 6])

Note that ravel() returns a view of a when possible. So modifying b also modifies a. ravel() returns a view when the 1D elements are contiguous in memory, but would return a copy if, for example, a were made from slicing another array using a non-unit step size (e.g. a = x[::2]).

If you want a copy rather than a view, use

In [15]: c = a.flatten()

If you just want an iterator, use np.ndarray.flat:

In [20]: d = a.flat

In [21]: d
Out[21]: <numpy.flatiter object at 0x8ec2068>

In [22]: list(d)
Out[22]: [1, 2, 3, 4, 5, 6]

Question 3

In [14]: b = np.reshape(a, (np.product(a.shape),))

In [15]: b
Out[15]: array([1, 2, 3, 4, 5, 6])

or, simply:

In [16]: a.flatten()
Out[16]: array([1, 2, 3, 4, 5, 6])

Question 4

One of the simplest way is to use flatten(), like this example :

 import numpy as np

 batch_y =train_output.iloc[sample, :]
 batch_y = np.array(batch_y).flatten()

My array it was like this :

After using flatten():

array([6, 6, 5, ..., 5, 3, 6])

It’s also the solution of errors of this type :

Cannot feed value of shape (100, 1) for Tensor 'input/Y:0', which has shape '(?,)'

Question 5

For list of array with different size use following:

import numpy as np

# ND array list with different size
a = [[1],[2,3,4,5],[6,7,8]]

# stack them
b = np.hstack(a)

print(b)

Output:

[1 2 3 4 5 6 7 8]

Question 6

I wanted to see a benchmark result of functions mentioned in answers including unutbu’s.

Also want to point out that numpy doc recommend to use arr.reshape(-1) in case view is preferable. (even though ravel is tad faster in the following result)

TL;DR: np.ravel is the most performant (by very small amount).

Benchmark

Functions:

np.ravel: returns view, if possible
np.reshape(-1): returns view, if possible
np.flatten: returns copy
np.flat: returns numpy.flatiter. similar to iterable

numpy version: ‘1.18.0’

Execution times on different `ndarray` sizes

+-------------+----------+-----------+-----------+-------------+
|  function   |   10x10  |  100x100  | 1000x1000 | 10000x10000 |
+-------------+----------+-----------+-----------+-------------+
| ravel       | 0.002073 |  0.002123 |  0.002153 |    0.002077 |
| reshape(-1) | 0.002612 |  0.002635 |  0.002674 |    0.002701 |
| flatten     | 0.000810 |  0.007467 |  0.587538 |  107.321913 |
| flat        | 0.000337 |  0.000255 |  0.000227 |    0.000216 |
+-------------+----------+-----------+-----------+-------------+

Conclusion

ravel and reshape(-1)‘s execution time was consistent and independent from ndarray size. However, ravel is tad faster, but reshape provides flexibility in reshaping size. (maybe that’s why numpy doc recommend to use it instead. Or there could be some cases where reshape returns view and ravel doesn’t).
If you are dealing with large size ndarray, using flatten can cause a performance issue. Recommend not to use it. Unless you need a copy of the data to do something else.

Used code

import timeit
setup = '''
import numpy as np
nd = np.random.randint(10, size=(10, 10))
'''

timeit.timeit('nd = np.reshape(nd, -1)', setup=setup, number=1000)
timeit.timeit('nd = np.ravel(nd)', setup=setup, number=1000)
timeit.timeit('nd = nd.flatten()', setup=setup, number=1000)
timeit.timeit('nd.flat', setup=setup, number=1000)

Question 7

Although this isn’t using the np array format, (to lazy to modify my code) this should do what you want… If, you truly want a column vector you will want to transpose the vector result. It all depends on how you are planning to use this.

def getVector(data_array,col):
    vector = []
    imax = len(data_array)
    for i in range(imax):
        vector.append(data_array[i][col])
    return ( vector )
a = ([1,2,3], [4,5,6])
b = getVector(a,1)
print(b)

Out>[2,5]

So if you need to transpose, you can do something like this:

def transposeArray(data_array):
    # need to test if this is a 1D array 
    # can't do a len(data_array[0]) if it's 1D
    two_d = True
    if isinstance(data_array[0], list):
        dimx = len(data_array[0])
    else:
        dimx = 1
        two_d = False
    dimy = len(data_array)
    # init output transposed array
    data_array_t = [[0 for row in range(dimx)] for col in range(dimy)]
    # fill output transposed array
    for i in range(dimx):
        for j in range(dimy):
            if two_d:
                data_array_t[j][i] = data_array[i][j]
            else:
                data_array_t[j][i] = data_array[j]
    return data_array_t

从ND到一维阵列

问题：从ND到一维阵列

回答 0

回答 1

回答 2

回答 3

对于具有不同大小的数组列表，请使用以下命令：

输出：

For list of array with different size use following:

Output:

回答 4

基准测试

不同`ndarray`大小的执行时间

结论

使用的代码

Benchmark

Execution times on different `ndarray` sizes

Conclusion

Used code

回答 5

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

7行代码 Python热力图可视化分析缺失数据处理

Python 流程图 — 一键转化代码为流程图

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

如何用空格填充Python字符串？

如何在Python中附加文件？

从其他文件夹导入文件

查找列表的平均值

Python中的循环列表迭代器

将缺失的日期添加到熊猫数据框

从ND到一维阵列

问题：从ND到一维阵列

回答 0

回答 1

回答 2

回答 3

对于具有不同大小的数组列表，请使用以下命令：

输出：

For list of array with different size use following:

Output:

回答 4

基准测试

不同ndarray大小的执行时间

结论

使用的代码

Benchmark

Execution times on different ndarray sizes

Conclusion

Used code

回答 5

相关文章

排行榜展示

文章展示

不同`ndarray`大小的执行时间

Execution times on different `ndarray` sizes