Finding the index of elements based on a condition using python list comprehension

Question: Finding the index of elements based on a condition using python list comprehension

Coming from a Matlab background, the following Python code seems very long-winded:

>>> a = [1, 2, 3, 1, 2, 3]
>>> [index for index,value in enumerate(a) if value > 2]
[2, 5]

When in Matlab I can write:

>> a = [1, 2, 3, 1, 2, 3];
>> find(a>2)
ans =
     3     6

Is there a shorthand way of writing this in Python, or do I just stick with the long version?


Thank you for all the suggestions and explanation of the rationale for Python’s syntax.

After finding the following on the numpy website, I think I have found a solution I like:

http://docs.scipy.org/doc/numpy/user/basics.indexing.html#boolean-or-mask-index-arrays

Applying the information from that website to my problem above would give the following:

>>> from numpy import array
>>> a = array([1, 2, 3, 1, 2, 3])
>>> b = a > 2
>>> b
array([False, False,  True, False, False,  True], dtype=bool)
>>> r = array(range(len(b)))
>>> r[b]
array([2, 5])

The following should then work (but I haven’t got a Python interpreter on hand to test it):

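import numpy

# numpy.array is a function, so the subclass derives from numpy.ndarray
class my_array(numpy.ndarray):
    def __new__(cls, data):
        # view the input data as an instance of the subclass
        return numpy.asarray(data).view(cls)

    def find(self, b):
        r = numpy.arange(len(b))
        return r[b]


>>> a = my_array([1, 2, 3, 1, 2, 3])
>>> a.find(a > 2)
array([2, 5])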

Answer 0

  • In Python, you wouldn’t use indexes for this at all, but just deal with the values: [value for value in a if value > 2]. Usually, dealing with indexes means you’re not doing something the best way.

  • If you do need an API similar to Matlab’s, you would use numpy, a package for multidimensional arrays and numerical math in Python which is heavily inspired by Matlab. You would be using a numpy array instead of a list.

    >>> import numpy
    >>> a = numpy.array([1, 2, 3, 1, 2, 3])
    >>> a
    array([1, 2, 3, 1, 2, 3])
    >>> numpy.where(a > 2)
    (array([2, 5]),)
    >>> a > 2
    array([False, False,  True, False, False,  True], dtype=bool)
    >>> a[numpy.where(a > 2)]
    array([3, 3])
    >>> a[a > 2]
    array([3, 3])
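
As a closer analogue of Matlab's find, here is a minimal sketch using numpy.flatnonzero, which returns the indices of the True entries of a mask directly as a 1-D array. The variable names are only illustrative, and the results are 0-based, unlike Matlab's 1-based indices.

```python
import numpy as np

a = np.array([1, 2, 3, 1, 2, 3])

# flatnonzero returns the indices of the True entries of the mask
idx = np.flatnonzero(a > 2)
print(idx.tolist())         # [2, 5]

# adding 1 reproduces the 1-based result Matlab prints in the question
print((idx + 1).tolist())   # [3, 6]
```

For a 1-D array this gives the same indices as numpy.where(a > 2)[0].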
    

Answer 1

Another way:

>>> [i for i in range(len(a)) if a[i] > 2]
[2, 5]

In general, remember that while find is a ready-made function, list comprehensions are a general, and thus very powerful, solution. Nothing prevents you from writing a find function in Python and using it later as you wish. For example:

>>> def find_indices(lst, condition):
...   return [i for i, elem in enumerate(lst) if condition(elem)]
... 
>>> find_indices(a, lambda e: e > 2)
[2, 5]

Note that I’m using lists here to mimic Matlab. It would be more Pythonic to use generators and iterators.
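
The generator-based variant hinted at above could look roughly like this. It is only a sketch: the name iter_indices is made up for illustration and is not part of any library.

```python
def iter_indices(iterable, condition):
    """Lazily yield the index of every element that satisfies condition."""
    for i, elem in enumerate(iterable):
        if condition(elem):
            yield i


a = [1, 2, 3, 1, 2, 3]

# build the full list only when a list is really needed
all_matches = list(iter_indices(a, lambda e: e > 2))
print(all_matches)    # [2, 5]

# or stop at the first match without scanning the rest of the input
first_match = next(iter_indices(a, lambda e: e > 2), None)
print(first_match)    # 2
```

Because the indices are produced on demand, this also works on inputs that are not lists, such as files or other generators.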


Answer 2

For me it works well:

>>> import numpy as np
>>> a = np.array([1, 2, 3, 1, 2, 3])
>>> np.where(a > 2)[0]
array([2, 5])
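
The trailing [0] is needed because np.where, when called with only a condition, returns a tuple holding one index array per dimension. A small sketch with a made-up 2-D array m shows where that tuple comes from:

```python
import numpy as np

m = np.array([[1, 5],
              [7, 2]])

# for a 2-D input, np.where yields one index array per axis
rows, cols = np.where(m > 2)
print(rows.tolist(), cols.tolist())   # [0, 1] [1, 0]

# for a 1-D input the tuple has a single element, hence the [0]
a = np.array([1, 2, 3, 1, 2, 3])
result = np.where(a > 2)
print(len(result))                    # 1
print(result[0].tolist())             # [2, 5]
```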

Answer 3

Maybe another question is, “what are you going to do with those indices once you get them?” If you are going to use them to create another list, then in Python, they are an unnecessary middle step. If you want all the values that match a given condition, just use the builtin filter:

matchingVals = list(filter(lambda x: x > 2, a))  # list() is needed in Python 3, where filter returns an iterator

Or write your own list comprehension:

matchingVals = [x for x in a if x > 2]

If you want to remove them from the list, then the Pythonic way is not necessarily to remove them one by one, but to write a list comprehension as if you were creating a new list, and assign the result back in place using listvar[:] on the left-hand side:

a[:] = [x for x in a if x <= 2]
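
To illustrate why the slice assignment matters, here is a short sketch (the names alias and other are only for illustration) comparing it with plain rebinding when a second reference to the list exists:

```python
a = [1, 2, 3, 1, 2, 3]
alias = a                          # second name for the same list object

a[:] = [x for x in a if x <= 2]    # replaces the contents in place
print(alias)                       # [1, 2, 1, 2]

b = [1, 2, 3, 1, 2, 3]
other = b                          # second name for the same list object

b = [x for x in b if x <= 2]       # rebinds b to a brand-new list
print(other)                       # [1, 2, 3, 1, 2, 3]
print(b)                           # [1, 2, 1, 2]
```

With slice assignment every holder of the list sees the filtered contents, whereas rebinding leaves other references pointing at the old, unfiltered list.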

Matlab supplies find because its array-centric model works by selecting items using their array indices. You can do this in Python, certainly, but the more Pythonic way is using iterators and generators, as already mentioned by @EliBendersky.


Answer 4

Even if it’s a late answer: I think this is still a very good question and IMHO Python (without additional libraries or toolkits like numpy) is still lacking a convenient method to access the indices of list elements according to a manually defined filter.

You could manually define a function that provides that functionality:

def indices(lst, filtr=lambda x: bool(x)):
    # the parameter is named lst to avoid shadowing the built-in list
    return [i for i, x in enumerate(lst) if filtr(x)]

print(indices([1,0,3,5,1], lambda x: x==1))

Yields: [0, 4]

In my imagination, the perfect way would be to make a child class of list and add the indices function as a method. That way, only the filter function would need to be passed:

class MyList(list):
    def __init__(self, *args):
        list.__init__(self, *args)
    def indices(self, filtr=lambda x: bool(x)):
        return [i for i,x in enumerate(self) if filtr(x)]

my_list = MyList([1,0,3,5,1])
my_list.indices(lambda x: x==1)
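
One caveat with the subclass approach, sketched below with a trimmed-down copy of MyList (using bool as the default filter, which behaves the same as the lambda above): slicing a list subclass returns a plain list, so the indices method is lost unless the result is wrapped again.

```python
class MyList(list):
    def indices(self, filtr=bool):
        # indices of all elements for which filtr(x) is truthy
        return [i for i, x in enumerate(self) if filtr(x)]


my_list = MyList([1, 0, 3, 5, 1])
print(my_list.indices())                    # [0, 2, 3, 4]
print(my_list.indices(lambda x: x == 1))    # [0, 4]

sliced = my_list[1:4]                       # slicing returns a plain list
print(type(sliced).__name__)                # list
print(MyList(sliced).indices())             # [1, 2]
```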

I elaborated a bit more on that topic here: http://tinyurl.com/jajrr87