It creates a width x height x 9 matrix filled with zeros. I'd like to know if there's a function or an easy way to initialize it with NaNs instead.
Answer 0
You rarely need loops for vector operations in numpy. You can create an uninitialized array and assign to all entries at once:
>>> a = numpy.empty((3,3,))
>>> a[:] = numpy.nan
>>> a
array([[ NaN,  NaN,  NaN],
       [ NaN,  NaN,  NaN],
       [ NaN,  NaN,  NaN]])
I have timed the alternatives a[:] = numpy.nan here and a.fill(numpy.nan) as posted by Blaenk:
$ python -mtimeit "import numpy as np; a = np.empty((100,100));" "a.fill(np.nan)"
10000 loops, best of 3: 54.3 usec per loop
$ python -mtimeit "import numpy as np; a = np.empty((100,100));" "a[:] = np.nan"
10000 loops, best of 3: 88.8 usec per loop
The timings show that ndarray.fill(..) is the faster alternative. OTOH, I like numpy's convenience implementation where you can assign values to whole slices at a time; the code's intention is very clear.
Note that ndarray.fill performs its operation in-place, so numpy.empty((3,3,)).fill(numpy.nan) will instead return None.
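To make the in-place behaviour concrete, here is a minimal sketch. The 4 x 5 x 9 shape is only an illustrative stand-in for the width x height x 9 array from the question.

```python
import numpy as np

# Illustrative shape standing in for width x height x 9 (an assumption).
shape = (4, 5, 9)

# Chaining does not work: fill() mutates the array and returns None.
result = np.empty(shape).fill(np.nan)

# Allocate first, then fill in place.
a = np.empty(shape)
a.fill(np.nan)

# The slice-assignment form gives the same contents.
b = np.empty(shape)
b[:] = np.nan

print(result)             # None
print(np.isnan(a).all())  # True
print(np.isnan(b).all())  # True
```

Keeping the allocation and the fill on separate lines is what lets you hold on to the array.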
I compared the suggested alternatives for speed and found that, for large enough vectors/matrices to fill, all alternatives except val * ones and array(n * [val]) are equally fast.
Code to reproduce the plot:
import numpy
import perfplot

val = 42.0


def fill(n):
    a = numpy.empty(n)
    a.fill(val)
    return a


def colon(n):
    a = numpy.empty(n)
    a[:] = val
    return a


def full(n):
    return numpy.full(n, val)


def ones_times(n):
    return val * numpy.ones(n)


def list(n):
    return numpy.array(n * [val])


perfplot.show(
    setup=lambda n: n,
    kernels=[fill, colon, full, ones_times, list],
    n_range=[2 ** k for k in range(20)],
    logx=True,
    logy=True,
    xlabel="len(a)",
)
Answer 3
Are you familiar with numpy.nan?
You can create your own method such as:
def nans(shape, dtype=float):
    a = numpy.empty(shape, dtype)
    a.fill(numpy.nan)
    return a
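The same helper could also be written on top of numpy.full, which allocates and fills in one call. This is only a sketch: the name nans_full is hypothetical, and it assumes a floating-point dtype, since NaN cannot be stored in an integer array.

```python
import numpy as np


def nans_full(shape, dtype=float):
    """Hypothetical variant of nans() built on numpy.full."""
    # NaN needs a floating-point (or complex) dtype to be representable.
    return np.full(shape, np.nan, dtype=dtype)


grid = nans_full((3, 3))
print(grid.shape)            # (3, 3)
print(np.isnan(grid).all())  # True
```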
But @u0b34a0f6ae's accepted answer is 3x faster (CPU cycles, not brain cycles to remember numpy syntax ;):
$ python -mtimeit "import numpy as np; X = np.empty((100,100));" "X[:] = np.nan;"
100000 loops, best of 3: 8.9 usec per loop
$ python -mtimeit "import numpy as np; X = np.ones((100,100));" "X *= np.nan;"
10000 loops, best of 3: 24.9 usec per loop
Another alternative is numpy.broadcast_to(val,n) which returns in constant time regardless of the size and is also the most memory efficient (it returns a view of the repeated element). The caveat is that the returned value is read-only.
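A short sketch of what the read-only caveat looks like in practice. The size n and the variable names are illustrative assumptions.

```python
import numpy as np

val = np.nan
n = 1_000_000  # illustrative size

# No data is copied: the result is a view with a zero stride.
view = np.broadcast_to(val, (n,))
print(view.shape, view.strides, view.flags.writeable)

# Writing to the view is rejected because it is read-only.
try:
    view[0] = 1.0
    write_failed = False
except ValueError:
    write_failed = True

# If a writable array is needed, copy() materializes it
# (at which point the memory and time savings are gone).
writable = view.copy()
writable[0] = 1.0
```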
Below is a comparison of the performances of all the other methods that have been proposed using the same benchmark as in Nico Schlömer’s answer.
As said, numpy.empty() is the way to go. However, for objects, fill() might not do exactly what you think it does:
In[36]: a = numpy.empty(5,dtype=object)
In[37]: a.fill([])
In[38]: a
Out[38]: array([[], [], [], [], []], dtype=object)
In[39]: a[0].append(4)
In[40]: a
Out[40]: array([[4], [4], [4], [4], [4]], dtype=object)
One way around can be e.g.:
In[41]: a = numpy.empty(5,dtype=object)
In[42]: a[:] = [[] for x in range(5)]
In[43]: a[0].append(4)
In[44]: a
Out[44]: array([[4], [], [], [], []], dtype=object)
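Another way to give each slot its own list is an explicit loop, which makes it obvious that a new object is created per element. A minimal sketch, where empty_lists is a hypothetical helper name:

```python
import numpy as np


def empty_lists(n):
    """Hypothetical helper: 1-D object array holding n separate lists."""
    a = np.empty(n, dtype=object)
    for i in range(n):
        a[i] = []  # a fresh list object for every slot
    return a


a = empty_lists(5)
a[0].append(4)

print(a[0])            # [4]
print(a[1])            # []
print(a[0] is a[1])    # False
```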