问题:脾气暴躁的地方有多个条件

我有一组距离称为dists。我想选择两个值之间的距离。我编写了以下代码行:

 dists[(np.where(dists >= r)) and (np.where(dists <= r + dr))]

但是,这仅针对条件选择

 (np.where(dists <= r + dr))

如果我通过使用临时变量按顺序执行命令,则效果很好。为什么上面的代码不起作用,如何使它起作用?

干杯

I have an array of distances called dists. I want to select dists which are between two values. I wrote the following line of code to do that:

 dists[(np.where(dists >= r)) and (np.where(dists <= r + dr))]

However this selects only for the condition

 (np.where(dists <= r + dr))

If I do the commands sequentially by using a temporary variable it works fine. Why does the above code not work, and how do I get it to work?

Cheers


回答 0

您的特定情况下,最好的方法将两个条件更改为一个条件:

dists[abs(dists - r - dr/2.) <= dr/2.]

它仅创建一个布尔数组,在我看来是更易于阅读,因为它说,(尽管我将重新定义r为您感兴趣的区域的中心,而不是开始的位置,所以r = r + dr/2.)但这并不能回答您的问题。


问题的答案:如果您只是想过滤出不符合标准的元素,则
实际上并不需要:wheredists

dists[(dists >= r) & (dists <= r+dr)]

因为&将会为您提供基本元素and(括号是必需的)。

或者,如果您where出于某些原因要使用,可以执行以下操作:

 dists[(np.where((dists >= r) & (dists <= r + dr)))]

原因:
不起作用的原因是因为np.where返回的是索引列表,而不是布尔数组。您试图and在两个数字列表之间移动,这些数字当然没有您期望的True/ False值。如果ab都是两个True值,则a and b返回b。所以说这样的话[0,1,2] and [2,3,4]只会给你[2,3,4]。它在起作用:

In [230]: dists = np.arange(0,10,.5)
In [231]: r = 5
In [232]: dr = 1

In [233]: np.where(dists >= r)
Out[233]: (array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),)

In [234]: np.where(dists <= r+dr)
Out[234]: (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),)

In [235]: np.where(dists >= r) and np.where(dists <= r+dr)
Out[235]: (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),)

您期望比较的只是布尔数组,例如

In [236]: dists >= r
Out[236]: 
array([False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True], dtype=bool)

In [237]: dists <= r + dr
Out[237]: 
array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True, False, False, False, False, False,
       False, False], dtype=bool)

In [238]: (dists >= r) & (dists <= r + dr)
Out[238]: 
array([False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True, False, False, False, False, False,
       False, False], dtype=bool)

现在,您可以调用np.where组合的布尔数组:

In [239]: np.where((dists >= r) & (dists <= r + dr))
Out[239]: (array([10, 11, 12]),)

In [240]: dists[np.where((dists >= r) & (dists <= r + dr))]
Out[240]: array([ 5. ,  5.5,  6. ])

或者使用花式索引简单地用布尔数组对原始数组进行索引

In [241]: dists[(dists >= r) & (dists <= r + dr)]
Out[241]: array([ 5. ,  5.5,  6. ])

The best way in your particular case would just be to change your two criteria to one criterion:

dists[abs(dists - r - dr/2.) <= dr/2.]

It only creates one boolean array, and in my opinion is easier to read because it says, (Though I’d redefine r to be the center of your region of interest instead of the beginning, so r = r + dr/2.) But that doesn’t answer your question.


The answer to your question:
You don’t actually need where if you’re just trying to filter out the elements of dists that don’t fit your criteria:

dists[(dists >= r) & (dists <= r+dr)]

Because the & will give you an elementwise and (the parentheses are necessary).

Or, if you do want to use where for some reason, you can do:

 dists[(np.where((dists >= r) & (dists <= r + dr)))]

Why:
The reason it doesn’t work is because np.where returns a list of indices, not a boolean array. You’re trying to get and between two lists of numbers, which of course doesn’t have the True/False values that you expect. If a and b are both True values, then a and b returns b. So saying something like [0,1,2] and [2,3,4] will just give you [2,3,4]. Here it is in action:

In [230]: dists = np.arange(0,10,.5)
In [231]: r = 5
In [232]: dr = 1

In [233]: np.where(dists >= r)
Out[233]: (array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),)

In [234]: np.where(dists <= r+dr)
Out[234]: (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),)

In [235]: np.where(dists >= r) and np.where(dists <= r+dr)
Out[235]: (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),)

What you were expecting to compare was simply the boolean array, for example

In [236]: dists >= r
Out[236]: 
array([False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True], dtype=bool)

In [237]: dists <= r + dr
Out[237]: 
array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True, False, False, False, False, False,
       False, False], dtype=bool)

In [238]: (dists >= r) & (dists <= r + dr)
Out[238]: 
array([False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True, False, False, False, False, False,
       False, False], dtype=bool)

Now you can call np.where on the combined boolean array:

In [239]: np.where((dists >= r) & (dists <= r + dr))
Out[239]: (array([10, 11, 12]),)

In [240]: dists[np.where((dists >= r) & (dists <= r + dr))]
Out[240]: array([ 5. ,  5.5,  6. ])

Or simply index the original array with the boolean array using fancy indexing

In [241]: dists[(dists >= r) & (dists <= r + dr)]
Out[241]: array([ 5. ,  5.5,  6. ])

回答 1

公认的答案已经很好地解释了这个问题。但是,应用多个条件的Numpythonic方法更多是使用numpy逻辑函数。在这种情况下,您可以使用np.logical_and

np.where(np.logical_and(np.greater_equal(dists,r),np.greater_equal(dists,r + dr)))

The accepted answer explained the problem well enough. However, the the more Numpythonic approach for applying multiple conditions is to use numpy logical functions. In this ase you can use np.logical_and:

np.where(np.logical_and(np.greater_equal(dists,r),np.greater_equal(dists,r + dr)))

回答 2

这里要指出的一件有趣的事情是:在这种情况下,通常也可以使用ORAND的方式,但有一点点变化。代替“ and”和“ or”,而使用Ampersand(&)Pipe Operator(|),它将起作用。

当我们使用‘and’时

ar = np.array([3,4,5,14,2,4,3,7])
np.where((ar>3) and (ar<6), 'yo', ar)

Output:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

当我们使用&符时

ar = np.array([3,4,5,14,2,4,3,7])
np.where((ar>3) & (ar<6), 'yo', ar)

Output:
array(['3', 'yo', 'yo', '14', '2', 'yo', '3', '7'], dtype='<U11')

当我们尝试应用大熊猫Dataframe的多个过滤器时,情况也是如此。现在,其背后的原因必须与逻辑运算符和按位运算符有关,并且为了对它们有更多的了解,我建议在stackoverflow中仔细研究一下此答案或类似的Q / A。

更新

用户问,为什么需要在括号内给出(ar> 3)和(ar <6)。好吧,这就是事情。在我开始讨论这里发生的事情之前,需要了解Python中的运算符优先级。

类似于BODMAS所涉及的内容,python还优先执行应首先执行的操作。首先执行括号内的项目,然后按位运算符开始工作。我将在下面显示两种情况,当您确实使用和不使用“(”,“)”时会发生什么。

情况1:

np.where( ar>3 & ar<6, 'yo', ar)
np.where( np.array([3,4,5,14,2,4,3,7])>3 & np.array([3,4,5,14,2,4,3,7])<6, 'yo', ar)

由于这里没有括号,因此按位运算符(&)在这里变得困惑,您甚至要求它获得逻辑与,因为在运算符优先级表中(如果看到的话)&被赋予了优先于<>运算符。这是从最低优先级到最高优先级的表格。

在此处输入图片说明

它甚至不执行<>操作被要求执行逻辑与操作。这就是为什么它会导致该错误。

您可以查看以下链接以了解更多信息:运算符优先级

现在转到案例2:

如果您确实使用了支架,那么您会清楚地看到会发生什么。

np.where( (ar>3) & (ar<6), 'yo', ar)
np.where( (array([False,  True,  True,  True, False,  True, False,  True])) & (array([ True,  True,  True, False,  True,  True,  True, False])), 'yo', ar)

真假两个数组。而且,您可以轻松地对其执行逻辑AND操作。这给你:

np.where( array([False,  True,  True, False, False,  True, False, False]),  'yo', ar)

休息一下,np.where,对于给定的情况,在任何情况下,True都会分配第一个值(即“ yo”),如果为False,则分配另一个值(即在此保留原始值)。

就这样。我希望我能很好地解释查询。

One interesting thing to point here; the usual way of using OR and AND too will work in this case, but with a small change. Instead of “and” and instead of “or”, rather use Ampersand(&) and Pipe Operator(|) and it will work.

When we use ‘and’:

ar = np.array([3,4,5,14,2,4,3,7])
np.where((ar>3) and (ar<6), 'yo', ar)

Output:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

When we use Ampersand(&):

ar = np.array([3,4,5,14,2,4,3,7])
np.where((ar>3) & (ar<6), 'yo', ar)

Output:
array(['3', 'yo', 'yo', '14', '2', 'yo', '3', '7'], dtype='<U11')

And this is same in the case when we are trying to apply multiple filters in case of pandas Dataframe. Now the reasoning behind this has to do something with Logical Operators and Bitwise Operators and for more understanding about same, I’d suggest to go through this answer or similar Q/A in stackoverflow.

UPDATE

A user asked, why is there a need for giving (ar>3) and (ar<6) inside the parenthesis. Well here’s the thing. Before I start talking about what’s happening here, one needs to know about Operator precedence in Python.

Similar to what BODMAS is about, python also gives precedence to what should be performed first. Items inside the parenthesis are performed first and then the bitwise operator comes to work. I’ll show below what happens in both the cases when you do use and not use “(“, “)”.

Case1:

np.where( ar>3 & ar<6, 'yo', ar)
np.where( np.array([3,4,5,14,2,4,3,7])>3 & np.array([3,4,5,14,2,4,3,7])<6, 'yo', ar)

Since there are no brackets here, the bitwise operator(&) is getting confused here that what are you even asking it to get logical AND of, because in the operator precedence table if you see, & is given precedence over < or > operators. Here’s the table from from lowest precedence to highest precedence.

enter image description here

It’s not even performing the < and > operation and being asked to perform a logical AND operation. So that’s why it gives that error.

One can check out the following link to learn more about: operator precedence

Now to Case 2:

If you do use the bracket, you clearly see what happens.

np.where( (ar>3) & (ar<6), 'yo', ar)
np.where( (array([False,  True,  True,  True, False,  True, False,  True])) & (array([ True,  True,  True, False,  True,  True,  True, False])), 'yo', ar)

Two arrays of True and False. And you can easily perform logical AND operation on them. Which gives you:

np.where( array([False,  True,  True, False, False,  True, False, False]),  'yo', ar)

And rest you know, np.where, for given cases, wherever True, assigns first value(i.e. here ‘yo’) and if False, the other(i.e. here, keeping the original).

That’s all. I hope I explained the query well.


回答 3

我喜欢np.vectorize用于此类任务。考虑以下:

>>> # function which returns True when constraints are satisfied.
>>> func = lambda d: d >= r and d<= (r+dr) 
>>>
>>> # Apply constraints element-wise to the dists array.
>>> result = np.vectorize(func)(dists) 
>>>
>>> result = np.where(result) # Get output.

您也可以使用np.argwhere代替以np.where获得清晰的输出。但这是您的电话:)

希望能帮助到你。

I like to use np.vectorize for such tasks. Consider the following:

>>> # function which returns True when constraints are satisfied.
>>> func = lambda d: d >= r and d<= (r+dr) 
>>>
>>> # Apply constraints element-wise to the dists array.
>>> result = np.vectorize(func)(dists) 
>>>
>>> result = np.where(result) # Get output.

You can also use np.argwhere instead of np.where for clear output. But that is your call :)

Hope it helps.


回答 4

尝试:

np.intersect1d(np.where(dists >= r)[0],np.where(dists <= r + dr)[0])

Try:

np.intersect1d(np.where(dists >= r)[0],np.where(dists <= r + dr)[0])

回答 5

这应该工作:

dists[((dists >= r) & (dists <= r+dr))]

最优雅的方式~~

This should work:

dists[((dists >= r) & (dists <= r+dr))]

The most elegant way~~


回答 6

尝试:

import numpy as np
dist = np.array([1,2,3,4,5])
r = 2
dr = 3
np.where(np.logical_and(dist> r, dist<=r+dr))

输出:(array([2,3]),)

您可以查看逻辑功能以获取更多详细信息。

Try:

import numpy as np
dist = np.array([1,2,3,4,5])
r = 2
dr = 3
np.where(np.logical_and(dist> r, dist<=r+dr))

Output: (array([2, 3]),)

You can see Logic functions for more details.


回答 7

我已经解决了这个简单的例子

import numpy as np

ar = np.array([3,4,5,14,2,4,3,7])

print [X for X in list(ar) if (X >= 3 and X <= 6)]

>>> 
[3, 4, 5, 4, 3]

I have worked out this simple example

import numpy as np

ar = np.array([3,4,5,14,2,4,3,7])

print [X for X in list(ar) if (X >= 3 and X <= 6)]

>>> 
[3, 4, 5, 4, 3]

声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。