Numpy index slice without losing dimension information

Question: Numpy index slice without losing dimension information

I’m using numpy and want to index a row without losing the dimension information.

import numpy as np
X = np.zeros((100,10))
X.shape        # >> (100, 10)
xslice = X[10,:]
xslice.shape   # >> (10,)  

In this example, xslice is now 1-dimensional, but I want it to have shape (1, 10). In R, I would use X[10,:,drop=F]. Is there something similar in numpy? I couldn’t find it in the documentation and didn’t see a similar question asked.

Thanks!


Answer 0

It’s probably easiest to do x[None, 10, :] or equivalently (but more readable) x[np.newaxis, 10, :].

As far as why it’s not the default, personally, I find that constantly having arrays with singleton dimensions gets annoying very quickly. I’d guess the numpy devs felt the same way.

Also, numpy handles broadcasting arrays very well, so there’s usually little reason to retain the dimension of the array the slice came from. If you did, then things like:

a = np.zeros((100,100,10))
b = np.zeros((100,10))
a[0,:,:] = b

either wouldn’t work or would be much more difficult to implement.

(Or at least that’s my guess at the numpy devs’ reasoning behind dropping dimension info when slicing.)
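
The following is a small sketch of the two points above, not a statement of the numpy developers' intent. It shows that integer indexing drops the axis, that np.newaxis puts a length-1 axis back, and that the assignment from the example works because a[0, :, :] already has shape (100, 10). The names row_1d and row_2d are made up for illustration, and b is filled with ones (rather than zeros) only so the effect of the assignment is visible.

```python
import numpy as np

X = np.zeros((100, 10))

# Plain integer indexing drops the indexed axis.
row_1d = X[10, :]

# None / np.newaxis inserts a new length-1 axis at that position.
row_2d = X[np.newaxis, 10, :]

# a[0, :, :] has shape (100, 10), so a (100, 10) array can be
# assigned to it directly, with no singleton axis to manage.
a = np.zeros((100, 100, 10))
b = np.ones((100, 10))
a[0, :, :] = b

print(row_1d.shape)  # (10,)
print(row_2d.shape)  # (1, 10)
print(a[0].sum())    # 1000.0
```

Both row_1d and row_2d are views into X here, since only integers, slices, and np.newaxis are used in the index.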


Answer 1

Another solution is to do

X[[10],:]

or

I = np.array([10])
X[I,:]

The dimensionality of an array is preserved when indexing is performed by a list (or an array) of indexes. This is nice because it leaves you with the choice between keeping the dimension and squeezing.
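
As a rough illustration of that choice, the sketch below indexes with a one-element list and a one-element array, then squeezes the result back down. One caveat worth knowing, which the answer does not mention: indexing with a list or array is advanced indexing, so the result is a copy rather than a view of X. The variable names are invented for this example.

```python
import numpy as np

X = np.arange(1000.0).reshape(100, 10)

kept = X[[10], :]                 # one-element list keeps the row axis
kept_arr = X[np.array([10]), :]   # same with a one-element array
squeezed = kept.squeeze(axis=0)   # drop the length-1 axis when not wanted

print(kept.shape, kept_arr.shape, squeezed.shape)  # (1, 10) (1, 10) (10,)

# Advanced indexing returns a copy: writing to `kept` leaves X untouched.
kept[0, 0] = -1.0
print(X[10, 0])  # 100.0
```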


Answer 2

I found a few reasonable solutions.

1) use numpy.take(X,[10],0)

2) use this strange indexing X[10:11:, :]

Ideally, this should be the default. I never understood why dimensions are ever dropped. But that’s a discussion for numpy…
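
A quick check of both options, written as a sketch with made-up variable names. It also shows that the trailing colon in 10:11: is just an empty step, so it selects the same thing as 10:11.

```python
import numpy as np

X = np.arange(1000).reshape(100, 10)

taken = np.take(X, [10], axis=0)  # option 1: take with a one-element index list
sliced = X[10:11:, :]             # option 2: slice with an empty step
plain = X[10:11, :]               # the same slice without the trailing colon

print(taken.shape, sliced.shape)      # (1, 10) (1, 10)
print(np.array_equal(taken, sliced))  # True
print(np.array_equal(sliced, plain))  # True
```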


Answer 3

Here’s an alternative I like better. Instead of indexing with a single number, index with a range. That is, use X[10:11,:]. (Note that 10:11 does not include 11).

import numpy as np
X = np.zeros((100,10))
X.shape        # >> (100, 10)
xslice = X[10:11,:]
xslice.shape   # >> (1,10)

This stays easy to read with more dimensions too: there is no juggling of None and no working out which axis gets which index. There is also no extra bookkeeping about array size. Just use i:i+1 for any non-negative i that you would have used in regular indexing (for i = -1, use -1: instead, because -1:0 is an empty slice).

b = np.ones((2, 3, 4))
b.shape # >> (2, 3, 4)
b[1:2,:,:].shape  # >> (1, 3, 4)
b[:, 2:3, :].shape  # >> (2, 1, 4)
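
The i:i+1 pattern can be wrapped in a small function. The helper below, take_keepdims, is hypothetical (it is not part of numpy) and is only a sketch. It normalises negative indices first, because a naive i:i+1 with i = -1 becomes -1:0, which selects nothing.

```python
import numpy as np

def take_keepdims(arr, i, axis=0):
    """Hypothetical helper: pick index i along axis, keeping that axis with length 1."""
    n = arr.shape[axis]
    if not -n <= i < n:
        raise IndexError(f"index {i} is out of bounds for axis {axis} with size {n}")
    i = i % n  # map negative indices to their non-negative equivalent
    index = [slice(None)] * arr.ndim
    index[axis] = slice(i, i + 1)
    return arr[tuple(index)]

b = np.ones((2, 3, 4))
print(take_keepdims(b, 1, axis=0).shape)   # (1, 3, 4)
print(take_keepdims(b, -1, axis=1).shape)  # (2, 1, 4)
print(b[:, -1:0, :].shape)                 # (2, 0, 4): the naive slice is empty
```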

Answer 4

To add to gnebehay’s solution involving indexing by lists or arrays, it is also possible to use tuples:

X[(10,),:]
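
One thing to watch for, shown in the sketch below with invented variable names: the tuple only keeps the axis when it sits inside a larger index. A bare X[(10,)] is read as the index tuple itself, so it behaves exactly like X[10] and drops the axis.

```python
import numpy as np

X = np.zeros((100, 10))

with_tuple = X[(10,), :]  # tuple nested inside the index acts like a list
with_list = X[[10], :]
bare_tuple = X[(10,)]     # a bare tuple is the index itself, same as X[10]

print(with_tuple.shape)  # (1, 10)
print(with_list.shape)   # (1, 10)
print(bare_tuple.shape)  # (10,)
```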

Answer 5

This is especially annoying if you’re indexing by an array that might be length 1 at runtime. For that case, there’s np.ix_:

some_array[np.ix_(row_index,column_index)]
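
Here is a minimal sketch of that case, with example data chosen only for illustration. With np.ix_, the result is always 2-D (rows by columns), even when row_index happens to have length 1. Passing the two index arrays directly broadcasts them against each other instead and gives a 1-D result.

```python
import numpy as np

some_array = np.arange(20).reshape(4, 5)
row_index = np.array([2])          # happens to have length 1 at runtime
column_index = np.array([0, 3, 4])

block = some_array[np.ix_(row_index, column_index)]
paired = some_array[row_index, column_index]  # broadcast pairwise instead

print(block.shape)   # (1, 3)
print(block)         # [[10 13 14]]
print(paired.shape)  # (3,)
```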