In [56]: a = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
In [57]: np.bincount(a)
Out[57]: array([8, 4]) #count of zeros is at index 0 : 8
#count of ones is at index 1 : 4
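As a small sketch of how to read that output, the counts can be paired with the values they belong to. Note that np.bincount only accepts non-negative integers, and the helper name `value_counts` below is hypothetical, not a NumPy function.

```python
import numpy as np

def value_counts(arr):
    """Map each value present in a 1-D array of non-negative ints to its count."""
    counts = np.bincount(arr)
    # Index i of `counts` holds the number of occurrences of the value i.
    return {value: int(count) for value, count in enumerate(counts) if count > 0}

a = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
counts = value_counts(a)
print(counts)  # {0: 8, 1: 4}
```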
Answer 4
Convert your array y to a list l, then call l.count(1) and l.count(0):
>>> y = numpy.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
>>> l = list(y)
>>> l.count(1)
4
>>> l.count(0)
8
No one has suggested using numpy.bincount(input, minlength) with minlength = np.size(input), but it seems to be a good solution, and it was clearly the fastest in the timings below:
In [1]: choices = np.random.randint(0, 100, 10000)
In [2]: %timeit [ np.sum(choices == k) for k in range(min(choices), max(choices)+1) ]
100 loops, best of 3: 2.67 ms per loop
In [3]: %timeit np.unique(choices, return_counts=True)
1000 loops, best of 3: 388 µs per loop
In [4]: %timeit np.bincount(choices, minlength=np.size(choices))
100000 loops, best of 3: 16.3 µs per loop
That is a large speedup, roughly 24x in this run, of numpy.bincount(x, minlength=np.size(x)) over numpy.unique(x, return_counts=True)!
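To make the role of minlength concrete, here is a minimal sketch (the array `small` is made up for illustration and is not the benchmark data). minlength only pads the result with trailing zeros and never truncates it, so the output always has at least max(x) + 1 slots.

```python
import numpy as np

small = np.array([0, 2, 2, 5])  # illustrative data, not from the benchmark

plain = np.bincount(small)                 # length is max(small) + 1 = 6
padded = np.bincount(small, minlength=10)  # zero-padded up to length 10

# Cross-check against np.unique, which reports only the values that occur.
values, counts = np.unique(small, return_counts=True)

print(plain)
print(padded)
print(values, counts)
```

Indexing the bincount result with the values returned by np.unique gives back the same counts, which is a quick way to confirm that the two approaches agree.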
numpy.sum(MyArray == x)  # sums the boolean array MyArray == x, i.e. counts the occurrences of x (0 or 1 here) in MyArray
As a complete example:
import numpy
MyArray = numpy.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])  # array we want to search in
x = 0  # the value to count (can come from an iterator, a list, etc.)
numpy.sum(MyArray == x)  # counts the occurrences of x in MyArray
Now, if MyArray has multiple dimensions and you want to count the occurrences of a particular row of values (called a pattern hereafter):
MyArray=numpy.array([[6, 1],[4, 5],[0, 7],[5, 1],[2, 5],[1, 2],[3, 2],[0, 2],[2, 5],[5, 1],[3, 0]])
x=numpy.array([5,1]) # the value I want to count (can be iterator, in a list, etc.)
temp = numpy.ascontiguousarray(MyArray).view(numpy.dtype((numpy.void, MyArray.dtype.itemsize * MyArray.shape[1]))) # convert the 2d-array into an array of analyzable patterns
xt=numpy.ascontiguousarray(x).view(numpy.dtype((numpy.void, x.dtype.itemsize * x.shape[0]))) # convert what you search into one analyzable pattern
numpy.sum(temp==xt) # count of the searched pattern in the list of patterns
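As a cross-check of the byte-view approach, the same count can be sketched with plain broadcasting: compare every row with x element by element and keep the rows where all columns match. This is a swapped-in alternative for illustration, not the method shown above; the names `row_matches` and `pattern_count` are made up.

```python
import numpy as np

MyArray = np.array([[6, 1], [4, 5], [0, 7], [5, 1], [2, 5], [1, 2],
                    [3, 2], [0, 2], [2, 5], [5, 1], [3, 0]])
x = np.array([5, 1])  # the row pattern to count

# Broadcasting compares x against every row; all(axis=1) keeps full-row matches.
row_matches = (MyArray == x).all(axis=1)
pattern_count = int(row_matches.sum())
print(pattern_count)  # 2
```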
It involves one more step, but a more flexible solution which would also work for 2d arrays and more complicated filters is to create a boolean mask and then use .sum() on the mask.
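A minimal sketch of the mask idea, assuming a small made-up 2D array `grid`: each condition produces a boolean array, conditions can be combined with & and |, and summing the mask counts the True entries.

```python
import numpy as np

grid = np.array([[0, 3, 7],
                 [5, 0, 2],
                 [9, 4, 0]])  # illustrative data

zero_mask = grid == 0                     # simple filter
between_mask = (grid >= 2) & (grid <= 5)  # more complicated filter

n_zero = int(zero_mask.sum())          # count over the whole array
n_between = int(between_mask.sum())
zeros_per_row = zero_mask.sum(axis=1)  # count along one axis

print(n_zero, n_between, zeros_per_row)
```

Passing axis to .sum() is what makes the mask approach convenient for 2D arrays, since it yields per-row or per-column counts without any extra loop.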
If you don’t want to use numpy or a collections module you can use a dictionary:
d = dict()
a = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]
for item in a:
    try:
        d[item] += 1
    except KeyError:
        d[item] = 1
Result:
>>> d
{0: 8, 1: 4}
Of course you can also use an if/else statement.
I think collections.Counter does almost the same thing, but this is more transparent.
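For comparison, here is a sketch of the if/else variant next to collections.Counter on the same list. Both are expected to produce the same mapping.

```python
from collections import Counter

a = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]

# if/else variant of the counting loop
d = {}
for item in a:
    if item in d:
        d[item] += 1
    else:
        d[item] = 1

# Counter does the same bookkeeping internally
counter = Counter(a)

print(d)
print(dict(counter))
```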
Answer 24
For generic entries:
x = np.array([11, 2, 3, 5, 3, 2, 16, 10, 10, 3, 11, 4, 5, 16, 3, 11, 4])
n = {i:len([j for j in np.where(x==i)[0]]) for i in set(x)}
ix = {i:[j for j in np.where(x==i)[0]] for i in set(x)}
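Here n maps each distinct value to its count and ix maps it to the positions where it occurs. As a hedged sketch, the same two dictionaries can be built with np.unique and np.flatnonzero; this is an alternative formulation, and the conversion to plain int keys is a choice made here for readability.

```python
import numpy as np

x = np.array([11, 2, 3, 5, 3, 2, 16, 10, 10, 3, 11, 4, 5, 16, 3, 11, 4])

# Distinct values and how often each occurs
values, counts = np.unique(x, return_counts=True)
n = {int(v): int(c) for v, c in zip(values, counts)}

# Positions of each distinct value
ix = {int(v): np.flatnonzero(x == v).tolist() for v in values}

print(n)
print(ix)
```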
If you are interested in the fastest execution, you know in advance which value(s) to look for, and your array is 1D, or you are otherwise interested in the result on the flattened array (in which case the input of the function should be np.ravel(arr) rather than just arr), then Numba is your friend:
import numba as nb

@nb.jit
def count_nb(arr, value):
    result = 0
    for x in arr:
        if x == value:
            result += 1
    return result
or, for very large arrays where parallelization may be beneficial:
@nb.jit(parallel=True)
def count_nbp(arr, value):
    result = 0
    for i in nb.prange(arr.size):
        if arr[i] == value:
            result += 1
    return result
Benchmarking these against np.count_nonzero() (which also has the problem of creating a temporary array, something that could be avoided) and an np.unique()-based solution,
the following plots are obtained (the second row of plots is a zoom on the faster approaches):
The plots show that the Numba-based solutions are noticeably faster than their NumPy counterparts and that, for very large inputs, the parallel approach is faster than the naive one.
If you are dealing with very large arrays, using a generator expression could be an option. The nice thing here is that this approach works for both arrays and lists, and you don't need any additional package. It also keeps memory use low, because no temporary array or list is built.
my_array = np.array([0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1])
sum(1 for val in my_array if val==0)
Out: 8
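A small sketch checking that the generator expression gives the same answer for a list and for an array, with np.count_nonzero shown for comparison. The helper name `count_value` is hypothetical. The vectorized call does allocate a temporary boolean array, but it is typically faster in practice because the loop runs in compiled code.

```python
import numpy as np

def count_value(values, target):
    """Count occurrences of target in any iterable without building a temporary list."""
    return sum(1 for val in values if val == target)

my_list = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1]
my_array = np.array(my_list)

from_list = count_value(my_list, 0)
from_array = count_value(my_array, 0)
vectorized = int(np.count_nonzero(my_array == 0))  # allocates a boolean temporary

print(from_list, from_array, vectorized)
```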