How to get indices of a sorted array in Python

Question: How to get indices of a sorted array in Python

I have a numerical list:

myList = [1, 2, 3, 100, 5]

Now suppose I sort this list to obtain [1, 2, 3, 5, 100]. What I want is the indices of the elements from the original list in sorted order, i.e. [0, 1, 2, 4, 3], à la MATLAB's sort function, which returns both values and indices.


Answer 0

If you are using numpy, you have the argsort() function available:

>>> import numpy
>>> numpy.argsort(myList)
array([0, 1, 2, 4, 3])

http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html

This returns the indices that would sort the array or list.
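As a minimal sketch of how those indices relate to MATLAB-style output (values plus indices), the index array can be used to reorder a NumPy array. The variable names below are illustrative and not part of the original answer.

```python
import numpy as np

my_list = [1, 2, 3, 100, 5]
arr = np.asarray(my_list)

# Indices that would sort the array.
order = np.argsort(arr)

# Indexing with those indices reproduces the sorted values.
sorted_values = arr[order]

print(order.tolist())          # [0, 1, 2, 4, 3]
print(sorted_values.tolist())  # [1, 2, 3, 5, 100]
```

Converting the list with np.asarray first is what makes the arr[order] indexing possible; a plain Python list cannot be indexed with an index array.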


Answer 1

Something like this:

>>> myList = [1, 2, 3, 100, 5]
>>> [i[0] for i in sorted(enumerate(myList), key=lambda x:x[1])]
[0, 1, 2, 4, 3]

enumerate(myList) yields tuples of (index, value):

[(0, 1), (1, 2), (2, 3), (3, 100), (4, 5)]

You sort these tuples by passing them to sorted and specifying a function that extracts the sort key (the second element of each tuple; that's what the lambda is for). Finally, the original index of each sorted element is extracted using the [i[0] for i in ...] list comprehension.
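The same steps can be spelled out one at a time. This is only a sketch; the helper name argsort_enumerate is made up for illustration.

```python
def argsort_enumerate(seq):
    """Return the indices that would sort seq (hypothetical helper name)."""
    # Step 1: pair every value with its original index.
    pairs = list(enumerate(seq))  # [(index, value), ...]
    # Step 2: sort the pairs by value; sorted() is stable, so ties keep
    # their original relative order.
    pairs_by_value = sorted(pairs, key=lambda pair: pair[1])
    # Step 3: keep only the original index of each pair.
    return [index for index, _ in pairs_by_value]


my_list = [1, 2, 3, 100, 5]
indices = argsort_enumerate(my_list)
print(indices)  # [0, 1, 2, 4, 3]
```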


Answer 2

>>> myList = [1, 2, 3, 100, 5]
>>> sorted(range(len(myList)), key=myList.__getitem__)
[0, 1, 2, 4, 3]

Answer 3

The answers with enumerate are nice, but I personally don’t like the lambda used to sort by the value. The following just reverses the index and the value, and sorts that. So it’ll first sort by value, then by index.

sorted((e,i) for i,e in enumerate(myList))
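Note that this expression returns (value, index) pairs rather than bare indices. A small sketch of pulling the indices out of the result, with variable names chosen only for illustration:

```python
my_list = [1, 2, 3, 100, 5]

# Sort (value, index) pairs; equal values are ordered by their index.
pairs = sorted((value, index) for index, value in enumerate(my_list))

# Keep only the index from each pair.
indices = [index for _, index in pairs]

print(pairs)    # [(1, 0), (2, 1), (3, 2), (5, 4), (100, 3)]
print(indices)  # [0, 1, 2, 4, 3]
```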

Answer 4

Updated answer with enumerate and itemgetter:

a = [1, 2, 3, 100, 5]
sorted(enumerate(a), key=lambda x: x[1])
# [(0, 1), (1, 2), (2, 3), (4, 5), (3, 100)]

enumerate pairs each element with its index: the first element of each tuple is the index and the second is the value. The pairs are then sorted by the second element of each tuple, x[1], where x is the tuple.

Or using itemgetter from the operator module:

from operator import itemgetter
sorted(enumerate(a), key=itemgetter(1))
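If separate index and value sequences are wanted, closer to what MATLAB's sort returns, the sorted pairs can be transposed with zip. This is a sketch that assumes a is the list from the question and is non-empty.

```python
from operator import itemgetter

a = [1, 2, 3, 100, 5]

# Sort (index, value) pairs by value.
pairs = sorted(enumerate(a), key=itemgetter(1))

# zip(*pairs) transposes the pairs into one tuple of indices and one of
# values. The two-name unpacking fails for an empty input.
indices, values = zip(*pairs)

print(list(indices))  # [0, 1, 2, 4, 3]
print(list(values))   # [1, 2, 3, 5, 100]
```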

Answer 5

I did a quick performance check on these with perfplot (a project of mine) and found that it’s hard to recommend anything else but numpy (note the log scale):


Code to reproduce the plot:

import perfplot
import numpy


def sorted_enumerate(seq):
    return [i for (v, i) in sorted((v, i) for (i, v) in enumerate(seq))]


def sorted_enumerate_key(seq):
    return [x for x, y in sorted(enumerate(seq), key=lambda x: x[1])]


def sorted_range(seq):
    return sorted(range(len(seq)), key=seq.__getitem__)


def numpy_argsort(x):
    return numpy.argsort(x)


perfplot.save(
    "argsort.png",
    setup=lambda n: numpy.random.rand(n),
    kernels=[sorted_enumerate, sorted_enumerate_key, sorted_range, numpy_argsort],
    n_range=[2 ** k for k in range(15)],
    xlabel="len(x)",
)
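For readers without perfplot installed, a rough comparison of the pure-Python variants can be sketched with the standard library's timeit. The input size and repeat count below are arbitrary choices, not the settings behind the plot, and the timings will vary by machine.

```python
import random
import timeit


def sorted_enumerate(seq):
    return [i for (v, i) in sorted((v, i) for (i, v) in enumerate(seq))]


def sorted_enumerate_key(seq):
    return [i for i, v in sorted(enumerate(seq), key=lambda pair: pair[1])]


def sorted_range(seq):
    return sorted(range(len(seq)), key=seq.__getitem__)


random.seed(0)
data = [random.random() for _ in range(1000)]  # arbitrary test size

kernels = [sorted_enumerate, sorted_enumerate_key, sorted_range]

# All kernels should agree before their timings are compared.
results = [kernel(data) for kernel in kernels]
all_agree = all(result == results[0] for result in results)

for kernel in kernels:
    seconds = timeit.timeit(lambda: kernel(data), number=20)
    print(f"{kernel.__name__}: {seconds:.4f} s for 20 runs")
```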

Answer 6

If you do not want to use numpy,

sorted(range(len(seq)), key=seq.__getitem__)

is fastest, as demonstrated here.


Answer 7

Essentially you need to do an argsort. Which implementation you need depends on whether you want to use external libraries (e.g. NumPy) or stay pure-Python without dependencies.

The question you need to ask yourself is: Do you want the

  • indices that would sort the array/list
  • indices that the elements would have in the sorted array/list

Unfortunately the example in the question doesn’t make it clear what is desired because both will give the same result:

>>> import numpy as np
>>> arr = np.array([1, 2, 3, 100, 5])

>>> np.argsort(np.argsort(arr))
array([0, 1, 2, 4, 3], dtype=int64)

>>> np.argsort(arr)
array([0, 1, 2, 4, 3], dtype=int64)

Choosing the argsort implementation

If you have NumPy at your disposal you can simply use the function numpy.argsort or method numpy.ndarray.argsort.

An implementation without NumPy was mentioned in some other answers already, so I'll just recap the fastest solution according to the benchmark answer here:

def argsort(l):
    return sorted(range(len(l)), key=l.__getitem__)

Getting the indices that would sort the array/list

To get the indices that would sort the array/list you can simply call argsort on the array or list. I'm using the NumPy version here, but the Python implementation should give the same results:

>>> arr = np.array([3, 1, 2, 4])
>>> np.argsort(arr)
array([1, 2, 0, 3], dtype=int64)

The result contains the indices that are needed to get the sorted array.

Since the sorted array would be [1, 2, 3, 4] the argsorted array contains the indices of these elements in the original.

  • The smallest value is 1 and it is at index 1 in the original so the first element of the result is 1.
  • The 2 is at index 2 in the original so the second element of the result is 2.
  • The 3 is at index 0 in the original so the third element of the result is 0.
  • The largest value is 4 and it is at index 3 in the original so the last element of the result is 3.
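The walk-through above can be checked with the pure-Python argsort from earlier in this answer. This is a small sketch; the names values, order, and reordered are illustrative.

```python
def argsort(l):
    return sorted(range(len(l)), key=l.__getitem__)


values = [3, 1, 2, 4]
order = argsort(values)

# Picking elements in the order given by the indices yields the sorted list.
reordered = [values[i] for i in order]

print(order)      # [1, 2, 0, 3]
print(reordered)  # [1, 2, 3, 4]
```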

Getting the indices that the elements would have in the sorted array/list

In this case you would need to apply argsort twice:

>>> arr = np.array([3, 1, 2, 4])
>>> np.argsort(np.argsort(arr))
array([2, 0, 1, 3], dtype=int64)

In this case:

  • the first element of the original is 3, which is the third-smallest value so it would have index 2 in the sorted array/list so the first element is 2.
  • the second element of the original is 1, which is the smallest value so it would have index 0 in the sorted array/list so the second element is 0.
  • the third element of the original is 2, which is the second-smallest value so it would have index 1 in the sorted array/list so the third element is 1.
  • the fourth element of the original is 4 which is the largest value so it would have index 3 in the sorted array/list so the last element is 3.
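Applying argsort twice amounts to inverting the permutation returned by the first argsort. As a sketch of that relationship (the inverse-permutation variant is an alternative added here for illustration, not something the answer itself proposes):

```python
import numpy as np

arr = np.array([3, 1, 2, 4])

order = np.argsort(arr)    # indices that would sort arr
ranks = np.argsort(order)  # position each element takes after sorting

# Alternative: invert the permutation directly instead of sorting twice.
inverse = np.empty_like(order)
inverse[order] = np.arange(len(arr))

print(order.tolist())    # [1, 2, 0, 3]
print(ranks.tolist())    # [2, 0, 1, 3]
print(inverse.tolist())  # [2, 0, 1, 3]
```

The direct inversion avoids the second sort, which may matter for large arrays.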

Answer 8

The other answers are WRONG.

Running argsort once is not the solution. For example, the following code:

import numpy as np
x = [3,1,2]
np.argsort(x)

yields array([1, 2, 0], dtype=int64) which is not what we want.

The answer should be to run argsort twice:

import numpy as np
x = [3,1,2]
np.argsort(np.argsort(x))

gives array([2, 0, 1], dtype=int64) as expected.
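Whether one or two applications is "right" depends on which of the two readings of the question is meant. A pure-Python sketch that shows both results side by side for the same input, with an argsort helper assumed to be defined as in the other answers:

```python
def argsort(seq):
    return sorted(range(len(seq)), key=seq.__getitem__)


x = [3, 1, 2]

# Which original index supplies each slot of the sorted list.
sorting_indices = argsort(x)

# Which slot of the sorted list each original element lands in.
ranks = argsort(sorting_indices)

print(sorting_indices)  # [1, 2, 0]
print(ranks)            # [2, 0, 1]
```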


Answer 9

import numpy as np

For the indices:

S = [11, 2, 44, 55, 66, 0, 10, 3, 33]
r = np.argsort(S)
# r is array([5, 1, 7, 6, 0, 8, 2, 3, 4])

argsort returns the indices that would sort S.

For the values:

np.sort(S)
# array([ 0,  2,  3, 10, 11, 33, 44, 55, 66])

Answer 10

We create another array of indices from 0 to n-1, zip it with the original array, and then sort the result by the original values:

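ar = [1, 2, 3, 4, 5]
new_ar = list(zip(ar, range(len(ar))))  # (value, index) pairs
new_ar.sort()                           # sorts by value, then by index
indices = [i for _, i in new_ar]        # original index of each sorted element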