Question: convert nan value to zero

I have a 2D numpy array. Some of the values in this array are NaN. I want to perform certain operations using this array. For example consider the array:

[[   0.   43.   67.    0.   38.]
 [ 100.   86.   96.  100.   94.]
 [  76.   79.   83.   89.   56.]
 [  88.   NaN   67.   89.   81.]
 [  94.   79.   67.   89.   69.]
 [  88.   79.   58.   72.   63.]
 [  76.   79.   71.   67.   56.]
 [  71.   71.   NaN   56.  100.]]

I am trying to take each row, one at a time, sort it in descending order, take the three largest values from the row, and compute their average. The code I tried is:

# nparr is a 2D numpy array
for entry in nparr:
    sortedentry = sorted(entry, reverse=True)
    highest_3_values = sortedentry[:3]
    avg_highest_3 = float(sum(highest_3_values)) / 3

This does not work for rows containing NaN. My question is: is there a quick way to convert all NaN values in the 2D numpy array to zero, so that I have no problems with sorting and the other operations I am trying to do?


Answer 0

This should work:

import numpy as np

a = np.array([[1, 2, 3], [0, 3, np.nan]])
where_are_NaNs = np.isnan(a)  # boolean mask, True where a is NaN
a[where_are_NaNs] = 0         # overwrite the NaN entries in place

In the above case where_are_NaNs is:

In [12]: where_are_NaNs
Out[12]: 
array([[False, False, False],
       [False, False,  True]], dtype=bool)

Answer 1

Where A is your 2D array:

import numpy as np
A[np.isnan(A)] = 0

The function isnan produces a bool array indicating where the NaN values are. A boolean array can be used to index an array of the same shape. Think of it like a mask.
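
As a minimal sketch of how the mask fits the task in the question (the array is the one from the question; the name top3_mean is illustrative and not from the original post), the NaN values can be zeroed first and the three largest values of each row averaged afterwards:

```python
import numpy as np

nparr = np.array([
    [0, 43, 67, 0, 38],
    [100, 86, 96, 100, 94],
    [76, 79, 83, 89, 56],
    [88, np.nan, 67, 89, 81],
    [94, 79, 67, 89, 69],
    [88, 79, 58, 72, 63],
    [76, 79, 71, 67, 56],
    [71, 71, np.nan, 56, 100],
])

# Boolean mask: True where the value is NaN; assign 0 through it in place
nparr[np.isnan(nparr)] = 0

# Sort each row ascending, keep the last three columns, then average them
top3_mean = np.sort(nparr, axis=1)[:, -3:].mean(axis=1)
print(top3_mean)
```

Sorting along axis=1 avoids the Python-level loop over rows, though the loop from the question would also work once the NaN values are gone.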


Answer 2


Answer 3

You could use np.where to find where you have NaN and replace those entries:

import numpy as np

a = np.array([[   0,   43,   67,    0,   38],
              [ 100,   86,   96,  100,   94],
              [  76,   79,   83,   89,   56],
              [  88,   np.nan,   67,   89,   81],
              [  94,   79,   67,   89,   69],
              [  88,   79,   58,   72,   63],
              [  76,   79,   71,   67,   56],
              [  71,   71,   np.nan,   56,  100]])

b = np.where(np.isnan(a), 0, a)

In [20]: b
Out[20]: 
array([[   0.,   43.,   67.,    0.,   38.],
       [ 100.,   86.,   96.,  100.,   94.],
       [  76.,   79.,   83.,   89.,   56.],
       [  88.,    0.,   67.,   89.,   81.],
       [  94.,   79.,   67.,   89.,   69.],
       [  88.,   79.,   58.,   72.,   63.],
       [  76.,   79.,   71.,   67.,   56.],
       [  71.,   71.,    0.,   56.,  100.]])
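
One difference from mask assignment that may matter: np.where builds a new array and leaves its input alone. A small sketch with a made-up 2x2 array (the values are only for illustration):

```python
import numpy as np

a = np.array([[1.0, np.nan], [np.nan, 4.0]])

# np.where returns a new array; a itself is not modified
b = np.where(np.isnan(a), 0, a)

print(np.isnan(a).sum())  # number of NaN values still present in a
print(b)
```

This is convenient when the original data, NaN included, is still needed later.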

Answer 4

A code example for drake’s answer to use nan_to_num:

>>> import numpy as np
>>> A = np.array([[1, 2, 3], [0, 3, np.nan]])
>>> A = np.nan_to_num(A)
>>> A
array([[ 1.,  2.,  3.],
       [ 0.,  3.,  0.]])

Answer 5

You can use numpy.nan_to_num:

numpy.nan_to_num(x): Replace nan with zero and inf with finite numbers.

Example (see doc):

>>> np.set_printoptions(precision=8)
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> np.nan_to_num(x)
array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
        -1.28000000e+002,   1.28000000e+002])
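
If the very large finite stand-ins for inf are not wanted, the replacement values can be chosen explicitly. A hedged sketch, assuming NumPy 1.17 or later (where the nan, posinf and neginf keyword arguments are available); the choice of 0.0 for the infinities is only an example:

```python
import numpy as np

x = np.array([np.inf, -np.inf, np.nan, -128, 128])

# nan / posinf / neginf keywords require NumPy 1.17+
y = np.nan_to_num(x, nan=0.0, posinf=0.0, neginf=0.0)
print(y)
```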

Answer 6

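nan is never equal to nan, so a scalar z can be tested with:

if z != z: z = 0

For a 2D array, looping over the rows and rebinding the loop variable would not change the array. Apply the same comparison elementwise and assign through the resulting mask instead:

nparr[nparr != nparr] = 0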
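
The self-inequality test can be checked on a whole array at once. A minimal sketch with a made-up array, comparing the result with np.isnan:

```python
import numpy as np

nparr = np.array([[1.0, np.nan], [3.0, 4.0]])

# Elementwise comparison: True only where the value is NaN
mask = nparr != nparr
same_as_isnan = bool((mask == np.isnan(nparr)).all())

nparr[mask] = 0
print(same_as_isnan)
print(nparr)
```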

Answer 7

You can use a lambda function. An example for a 1D array:

import numpy as np
a = [np.nan, 2, 3]
# list() is needed on Python 3, where map returns an iterator
list(map(lambda v: 0 if np.isnan(v) else v, a))

This will give you the result:

[0, 2, 3]

Answer 8

For your purposes, if all the items are stored as str, you can call sorted as you already do, then check the first element and replace it with '0' if it is NaN. Note that strings sort lexicographically (so '100' would sort below '89'), which means this only works when all the numbers have the same number of digits:

>>> l1 = ['88','NaN','67','89','81']
>>> n = sorted(l1,reverse=True)
>>> n
['NaN', '89', '88', '81', '67']
>>> import math
>>> if math.isnan(float(n[0])):
...     n[0] = '0'
... 
>>> n
['0', '89', '88', '81', '67']
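
An alternative sketch for string input, assuming it is acceptable to convert the values to floats first (the names values and top3 are illustrative): once the strings are parsed, the sort is numeric rather than character by character.

```python
import numpy as np

l1 = ['88', 'NaN', '67', '89', '81']

# float('NaN') parses to nan, so the strings become a float array
values = np.array([float(s) for s in l1])
values = np.nan_to_num(values)        # nan -> 0.0

# Sort ascending, reverse, and keep the three largest values
top3 = np.sort(values)[::-1][:3]
print(top3)
```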
