从列表或元组中明确选择项目

问题:从列表或元组中明确选择项目

我有以下Python列表(也可以是元组):

myList = ['foo', 'bar', 'baz', 'quux']

我可以说

>>> myList[0:3]
['foo', 'bar', 'baz']
>>> myList[::2]
['foo', 'baz']
>>> myList[1::2]
['bar', 'quux']

如何显式挑选索引没有特定模式的项目?例如,我要选择[0,2,3]。或者,从1000个很大的清单中,我要选择[87, 342, 217, 998, 500]。是否有一些Python语法可以做到这一点?看起来像这样:

>>> myBigList[87, 342, 217, 998, 500]

I have the following Python list (can also be a tuple):

myList = ['foo', 'bar', 'baz', 'quux']

I can say

>>> myList[0:3]
['foo', 'bar', 'baz']
>>> myList[::2]
['foo', 'baz']
>>> myList[1::2]
['bar', 'quux']

How do I explicitly pick out items whose indices have no specific patterns? For example, I want to select [0,2,3]. Or from a very big list of 1000 items, I want to select [87, 342, 217, 998, 500]. Is there some Python syntax that does that? Something that looks like:

>>> myBigList[87, 342, 217, 998, 500]

回答 0

list( myBigList[i] for i in [87, 342, 217, 998, 500] )

我将答案与python 2.5.2进行了比较:

  • 19.7微秒: [ myBigList[i] for i in [87, 342, 217, 998, 500] ]

  • 20.6 USEC: map(myBigList.__getitem__, (87, 342, 217, 998, 500))

  • 22.7 USEC: itemgetter(87, 342, 217, 998, 500)(myBigList)

  • 24.6 USEC: list( myBigList[i] for i in [87, 342, 217, 998, 500] )

请注意,在Python 3中,第1个已更改为与第4个相同。


另一种选择是以a开头,numpy.array它允许通过列表或a进行索引numpy.array

>>> import numpy
>>> myBigList = numpy.array(range(1000))
>>> myBigList[(87, 342, 217, 998, 500)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: invalid index
>>> myBigList[[87, 342, 217, 998, 500]]
array([ 87, 342, 217, 998, 500])
>>> myBigList[numpy.array([87, 342, 217, 998, 500])]
array([ 87, 342, 217, 998, 500])

tuple不工作方式相同那些片。

list( myBigList[i] for i in [87, 342, 217, 998, 500] )

I compared the answers with python 2.5.2:

  • 19.7 usec: [ myBigList[i] for i in [87, 342, 217, 998, 500] ]

  • 20.6 usec: map(myBigList.__getitem__, (87, 342, 217, 998, 500))

  • 22.7 usec: itemgetter(87, 342, 217, 998, 500)(myBigList)

  • 24.6 usec: list( myBigList[i] for i in [87, 342, 217, 998, 500] )

Note that in Python 3, the 1st was changed to be the same as the 4th.


Another option would be to start out with a numpy.array which allows indexing via a list or a numpy.array:

>>> import numpy
>>> myBigList = numpy.array(range(1000))
>>> myBigList[(87, 342, 217, 998, 500)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: invalid index
>>> myBigList[[87, 342, 217, 998, 500]]
array([ 87, 342, 217, 998, 500])
>>> myBigList[numpy.array([87, 342, 217, 998, 500])]
array([ 87, 342, 217, 998, 500])

The tuple doesn’t work the same way as those are slices.


回答 1

那这个呢:

from operator import itemgetter
itemgetter(0,2,3)(myList)
('foo', 'baz', 'quux')

What about this:

from operator import itemgetter
itemgetter(0,2,3)(myList)
('foo', 'baz', 'quux')

回答 2

它不是内置的,但是如果您愿意,可以创建一个将元组作为“索引”的list的子类:

class MyList(list):

    def __getitem__(self, index):
        if isinstance(index, tuple):
            return [self[i] for i in index]
        return super(MyList, self).__getitem__(index)


seq = MyList("foo bar baaz quux mumble".split())
print seq[0]
print seq[2,4]
print seq[1::2]

印刷

foo
['baaz', 'mumble']
['bar', 'quux']

It isn’t built-in, but you can make a subclass of list that takes tuples as “indexes” if you’d like:

class MyList(list):

    def __getitem__(self, index):
        if isinstance(index, tuple):
            return [self[i] for i in index]
        return super(MyList, self).__getitem__(index)


seq = MyList("foo bar baaz quux mumble".split())
print seq[0]
print seq[2,4]
print seq[1::2]

printing

foo
['baaz', 'mumble']
['bar', 'quux']

回答 3

也许列表理解是按顺序进行的:

L = ['a', 'b', 'c', 'd', 'e', 'f']
print [ L[index] for index in [1,3,5] ]

生成:

['b', 'd', 'f']

那是您要找的东西吗?

Maybe a list comprehension is in order:

L = ['a', 'b', 'c', 'd', 'e', 'f']
print [ L[index] for index in [1,3,5] ]

Produces:

['b', 'd', 'f']

Is that what you are looking for?


回答 4

>>> map(myList.__getitem__, (2,2,1,3))
('baz', 'baz', 'bar', 'quux')

您也可以创建自己的List类,该类支持将元组用作__getitem__要执行的操作的参数myList[(2,2,1,3)]

>>> map(myList.__getitem__, (2,2,1,3))
('baz', 'baz', 'bar', 'quux')

You can also create your own List class which supports tuples as arguments to __getitem__ if you want to be able to do myList[(2,2,1,3)].


回答 5

我只想指出,即使itemgetter的语法看起来也很整洁,但是在大型列表上执行时有点慢。

import timeit
from operator import itemgetter
start=timeit.default_timer()
for i in range(1000000):
    itemgetter(0,2,3)(myList)
print ("Itemgetter took ", (timeit.default_timer()-start))

物品获取者1.065209062149279

start=timeit.default_timer()
for i in range(1000000):
    myList[0],myList[2],myList[3]
print ("Multiple slice took ", (timeit.default_timer()-start))

多个切片花费0.6225321444745759

I just want to point out, even syntax of itemgetter looks really neat, but it’s kinda slow when perform on large list.

import timeit
from operator import itemgetter
start=timeit.default_timer()
for i in range(1000000):
    itemgetter(0,2,3)(myList)
print ("Itemgetter took ", (timeit.default_timer()-start))

Itemgetter took 1.065209062149279

start=timeit.default_timer()
for i in range(1000000):
    myList[0],myList[2],myList[3]
print ("Multiple slice took ", (timeit.default_timer()-start))

Multiple slice took 0.6225321444745759


回答 6

另一个可能的解决方案:

sek=[]
L=[1,2,3,4,5,6,7,8,9,0]
for i in [2, 4, 7, 0, 3]:
   a=[L[i]]
   sek=sek+a
print (sek)

Another possible solution:

sek=[]
L=[1,2,3,4,5,6,7,8,9,0]
for i in [2, 4, 7, 0, 3]:
   a=[L[i]]
   sek=sek+a
print (sek)

回答 7

当你有一个布尔numpy数组时,就像 mask

[mylist[i] for i in np.arange(len(mask), dtype=int)[mask]]

适用于任何序列或np.array的lambda:

subseq = lambda myseq, mask : [myseq[i] for i in np.arange(len(mask), dtype=int)[mask]]

newseq = subseq(myseq, mask)

like often when you have a boolean numpy array like mask

[mylist[i] for i in np.arange(len(mask), dtype=int)[mask]]

A lambda that works for any sequence or np.array:

subseq = lambda myseq, mask : [myseq[i] for i in np.arange(len(mask), dtype=int)[mask]]

newseq = subseq(myseq, mask)