NumPy数组的就地类型转换

问题:NumPy数组的就地类型转换

给定一个NumPy数组int32,如何将其转换为float32 原位?所以基本上,我想做

a = a.astype(numpy.float32)

而不复制阵列。好大

这样做的原因是我有两种算法来计算a。其中一个返回一个数组int32,另一个返回一个数组float32(这是两种不同算法固有的)。所有进一步的计算都假定a是的数组float32

目前,我在C函数中通过via进行转换ctypes。有没有办法在Python中做到这一点?

Given a NumPy array of int32, how do I convert it to float32 in place? So basically, I would like to do

a = a.astype(numpy.float32)

without copying the array. It is big.

The reason for doing this is that I have two algorithms for the computation of a. One of them returns an array of int32, the other returns an array of float32 (and this is inherent to the two different algorithms). All further computations assume that a is an array of float32.

Currently I do the conversion in a C function called via ctypes. Is there a way to do this in Python?


回答 0

您可以使用不同的dtype创建视图,然后就地复制到视图中:

import numpy as np
x = np.arange(10, dtype='int32')
y = x.view('float32')
y[:] = x

print(y)

Yield

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)

要显示转换是否就位,请注意 复制x到已y更改x

print(x)

版画

array([         0, 1065353216, 1073741824, 1077936128, 1082130432,
       1084227584, 1086324736, 1088421888, 1090519040, 1091567616])

You can make a view with a different dtype, and then copy in-place into the view:

import numpy as np
x = np.arange(10, dtype='int32')
y = x.view('float32')
y[:] = x

print(y)

yields

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.], dtype=float32)

To show the conversion was in-place, note that copying from x to y altered x:

print(x)

prints

array([         0, 1065353216, 1073741824, 1077936128, 1082130432,
       1084227584, 1086324736, 1088421888, 1090519040, 1091567616])

回答 1

更新:此功能仅在可能的情况下避免复制,因此这不是此问题的正确答案。unutbu的答案是正确的。


a = a.astype(numpy.float32, copy=False)

numpy astype具有复制标志。我们为什么不应该使用它?

Update: This function only avoids copy if it can, hence this is not the correct answer for this question. unutbu’s answer is the right one.


a = a.astype(numpy.float32, copy=False)

numpy astype has a copy flag. Why shouldn’t we use it ?


回答 2

您可以更改数组类型而无需进行如下转换:

a.dtype = numpy.float32

但首先,您必须将所有整数更改为将被解释为相应浮点数的值。一种很慢的方法是使用python的struct模块,如下所示:

def toi(i):
    return struct.unpack('i',struct.pack('f',float(i)))[0]

…应用于数组的每个成员。

但是,也许更快的方法是利用numpy的ctypeslib工具(我不熟悉)

-编辑-

由于ctypeslib似乎不起作用,所以我将使用典型numpy.astype方法进行转换,但以内存限制内的块大小进行处理:

a[0:10000] = a[0:10000].astype('float32').view('int32')

…然后在完成后更改dtype。

这是一个功能,可以完成所有兼容dtypes的任务(仅适用于具有相同大小项目的dtypes),并通过用户控制块大小来处理任意形状的数组:

import numpy

def astype_inplace(a, dtype, blocksize=10000):
    oldtype = a.dtype
    newtype = numpy.dtype(dtype)
    assert oldtype.itemsize is newtype.itemsize
    for idx in xrange(0, a.size, blocksize):
        a.flat[idx:idx + blocksize] = \
            a.flat[idx:idx + blocksize].astype(newtype).view(oldtype)
    a.dtype = newtype

a = numpy.random.randint(100,size=100).reshape((10,10))
print a
astype_inplace(a, 'float32')
print a

You can change the array type without converting like this:

a.dtype = numpy.float32

but first you have to change all the integers to something that will be interpreted as the corresponding float. A very slow way to do this would be to use python’s struct module like this:

def toi(i):
    return struct.unpack('i',struct.pack('f',float(i)))[0]

…applied to each member of your array.

But perhaps a faster way would be to utilize numpy’s ctypeslib tools (which I am unfamiliar with)

– edit –

Since ctypeslib doesnt seem to work, then I would proceed with the conversion with the typical numpy.astype method, but proceed in block sizes that are within your memory limits:

a[0:10000] = a[0:10000].astype('float32').view('int32')

…then change the dtype when done.

Here is a function that accomplishes the task for any compatible dtypes (only works for dtypes with same-sized items) and handles arbitrarily-shaped arrays with user-control over block size:

import numpy

def astype_inplace(a, dtype, blocksize=10000):
    oldtype = a.dtype
    newtype = numpy.dtype(dtype)
    assert oldtype.itemsize is newtype.itemsize
    for idx in xrange(0, a.size, blocksize):
        a.flat[idx:idx + blocksize] = \
            a.flat[idx:idx + blocksize].astype(newtype).view(oldtype)
    a.dtype = newtype

a = numpy.random.randint(100,size=100).reshape((10,10))
print a
astype_inplace(a, 'float32')
print a

回答 3

import numpy as np
arr_float = np.arange(10, dtype=np.float32)
arr_int = arr_float.view(np.float32)

使用view()和参数’dtype’更改数组。

import numpy as np
arr_float = np.arange(10, dtype=np.float32)
arr_int = arr_float.view(np.float32)

use view() and parameter ‘dtype’ to change the array in place.


回答 4

用这个:

In [105]: a
Out[105]: 
array([[15, 30, 88, 31, 33],
       [53, 38, 54, 47, 56],
       [67,  2, 74, 10, 16],
       [86, 33, 15, 51, 32],
       [32, 47, 76, 15, 81]], dtype=int32)

In [106]: float32(a)
Out[106]: 
array([[ 15.,  30.,  88.,  31.,  33.],
       [ 53.,  38.,  54.,  47.,  56.],
       [ 67.,   2.,  74.,  10.,  16.],
       [ 86.,  33.,  15.,  51.,  32.],
       [ 32.,  47.,  76.,  15.,  81.]], dtype=float32)

Use this:

In [105]: a
Out[105]: 
array([[15, 30, 88, 31, 33],
       [53, 38, 54, 47, 56],
       [67,  2, 74, 10, 16],
       [86, 33, 15, 51, 32],
       [32, 47, 76, 15, 81]], dtype=int32)

In [106]: float32(a)
Out[106]: 
array([[ 15.,  30.,  88.,  31.,  33.],
       [ 53.,  38.,  54.,  47.,  56.],
       [ 67.,   2.,  74.,  10.,  16.],
       [ 86.,  33.,  15.,  51.,  32.],
       [ 32.,  47.,  76.,  15.,  81.]], dtype=float32)

回答 5

a = np.subtract(a, 0., dtype=np.float32)

a = np.subtract(a, 0., dtype=np.float32)