如何选择给定条件的数组元素？

Question 1

Suppose I have a numpy array x = [5, 2, 3, 1, 4, 5], y = ['f', 'o', 'o', 'b', 'a', 'r']. I want to select the elements in y corresponding to elements in x that are greater than 1 and less than 5.

I tried

x = array([5, 2, 3, 1, 4, 5])
y = array(['f','o','o','b','a','r'])
output = y[x > 1 & x < 5] # desired output is ['o','o','a']

but this doesn’t work. How would I do this?

Question 2

Your expression works if you add parentheses:

>>> y[(1 < x) & (x < 5)]
array(['o', 'o', 'a'], 
      dtype='|S1')

Question 3

IMO OP does not actually want but actually wants because they are comparing logical values such as True and False – see this SO post on logical vs. bitwise to see the difference.

>>> x = array([5, 2, 3, 1, 4, 5])
>>> y = array(['f','o','o','b','a','r'])
>>> output = y[np.logical_and(x > 1, x < 5)] # desired output is ['o','o','a']
>>> output
array(['o', 'o', 'a'],
      dtype='|S1')

And equivalent way to do this is with by setting the axis argument appropriately.

>>> output = y[np.all([x > 1, x < 5], axis=0)] # desired output is ['o','o','a']
>>> output
array(['o', 'o', 'a'],
      dtype='|S1')

by the numbers:

>>> %timeit (a < b) & (b < c)
The slowest run took 32.97 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.15 µs per loop

>>> %timeit np.logical_and(a < b, b < c)
The slowest run took 32.59 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.17 µs per loop

>>> %timeit np.all([a < b, b < c], 0)
The slowest run took 67.47 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.06 µs per loop

so using np.all() is slower, but & and logical_and are about the same.

Question 4

Add one detail to @J.F. Sebastian’s and @Mark Mikofski’s answers:
If one wants to get the corresponding indices (rather than the actual values of array), the following code will do:

For satisfying multiple (all) conditions:

select_indices = np.where( np.logical_and( x > 1, x < 5) )[0] #   1 < x <5

For satisfying multiple (or) conditions:

select_indices = np.where( np.logical_or( x < 1, x > 5 ) )[0] # x <1 or x >5

Question 5

I like to use np.vectorize for such tasks. Consider the following:

>>> # Arrays
>>> x = np.array([5, 2, 3, 1, 4, 5])
>>> y = np.array(['f','o','o','b','a','r'])

>>> # Function containing the constraints
>>> func = np.vectorize(lambda t: t>1 and t<5)

>>> # Call function on x
>>> y[func(x)]
>>> array(['o', 'o', 'a'], dtype='<U1')

The advantage is you can add many more types of constraints in the vectorized function.

Hope it helps.

Question 6

Actually I would do it this way:

L1 is the index list of elements satisfying condition 1;(maybe you can use somelist.index(condition1) or np.where(condition1) to get L1.)

Similarly, you get L2, a list of elements satisfying condition 2;

Then you find intersection using intersect(L1,L2).

You can also find intersection of multiple lists if you get multiple conditions to satisfy.

Then you can apply index in any other array, for example, x.

Question 7

For 2D arrays, you can do this. Create a 2D mask using the condition. Typecast the condition mask to int or float, depending on the array, and multiply it with the original array.

In [8]: arr
Out[8]: 
array([[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.]])

In [9]: arr*(arr % 2 == 0).astype(np.int) 
Out[9]: 
array([[ 0.,  2.,  0.,  4.,  0.],
       [ 6.,  0.,  8.,  0., 10.]])

如何选择给定条件的数组元素？

问题：如何选择给定条件的数组元素？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

7行代码 Python热力图可视化分析缺失数据处理

Python 流程图 — 一键转化代码为流程图

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

Python是强类型的吗？

函数式编程中的“ pythonic”等同于“ fold”函数是什么？

如何保护Python代码？

用多重继承调用父类init，正确的方法是什么？

pyenv，virtualenv，anaconda有什么区别？

如何使用boto3将S3对象保存到文件

如何选择给定条件的数组元素？

问题：如何选择给定条件的数组元素？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

相关文章

排行榜展示

文章展示