Python 实用宝典

Question 1

The documentation doesn’t guarantee that sorted() is stable. Is this documented anywhere else?

I’m guessing it might be stable since the sort method on lists is guaranteed to be stable (Notes 9th point: “Starting with Python 2.3, the sort() method is guaranteed to be stable”), and sorted is functionally similar. However, I’m not able to find any definitive source that says so.

Purpose: I need to sort based on a primary key and also a secondary key in cases where the primary key is equal in both records. If sorted() is guaranteed to be stable, I can sort on the secondary key, then sort on the primary key and get the result I need.

PS: To avoid any confusion, I’m using stable in the sense of “a sort is stable if it guarantees not to change the relative order of elements that compare equal”.
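
To make the plan above concrete, here is a minimal sketch of the two-pass approach. The records and their field layout (department, grade, name) are made up for illustration. The technique relies on sorted() being stable, which is exactly what the question asks about.

```python
from operator import itemgetter

# Hypothetical records: (department, grade, name).
records = [
    ("sales", 2, "Ann"),
    ("dev", 1, "Bob"),
    ("sales", 1, "Cy"),
    ("dev", 2, "Dee"),
    ("dev", 1, "Eve"),
]

# Pass 1: sort on the secondary key (grade).
by_grade = sorted(records, key=itemgetter(1))

# Pass 2: sort on the primary key (department). Because the sort is
# stable, records with the same department keep their grade order.
result = sorted(by_grade, key=itemgetter(0))

for record in result:
    print(record)
# ('dev', 1, 'Bob')
# ('dev', 1, 'Eve')
# ('dev', 2, 'Dee')
# ('sales', 1, 'Cy')
# ('sales', 2, 'Ann')
```

Note that Bob still comes before Eve, as in the input: both passes leave ties in their original relative order.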

Question 2

Yes, the intention of the manual is indeed to guarantee that sorted is stable and indeed that it uses exactly the same algorithm as the sort method. I do realize that the docs aren’t 100% clear about this identity; doc patches are always happily accepted!

Question 3

They are stable.

By the way: you can sometimes sidestep the question of whether sort and sorted are stable by combining a multi-pass sort into a single pass.

For example, if you want to sort objects based on their last_name, first_name attributes, you can do it in one pass:

sorted_list = sorted(
    your_sequence_of_items,
    key=lambda item: (item.last_name, item.first_name))

taking advantage of tuple comparison.
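
For a runnable version, the sketch below compares the one-pass tuple key with the two-pass approach from the question. The Person type and the sample names are assumptions made up for the example.

```python
from collections import namedtuple

# Hypothetical record type with the two attributes used above.
Person = namedtuple("Person", ["last_name", "first_name"])

people = [
    Person("Smith", "Zoe"),
    Person("Jones", "Al"),
    Person("Smith", "Adam"),
]

# One pass: tuples compare element by element, so last_name decides
# first and first_name only breaks ties.
one_pass = sorted(people, key=lambda p: (p.last_name, p.first_name))

# Two passes: secondary key first, then primary key (needs stability).
two_pass = sorted(people, key=lambda p: p.first_name)
two_pass = sorted(two_pass, key=lambda p: p.last_name)

print(one_pass == two_pass)  # True
```

The tuple key works whenever every key sorts in the same direction. Mixed directions (say, one key ascending and another descending) are one case where the multi-pass approach is still handy.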

This answer, as-is, covers the original question. For further sorting-related questions, there is the Python Sorting How-To.

Question 4

The documentation changed in the meantime (relevant commit) and the current documentation of sorted explicitly guarantees it:

The built-in sorted() function is guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).

This part of the documentation was added to Python 2.7 and Python 3.4(+) so any compliant implementation of that language version should have a stable sorted.

Note that in CPython, list.sort has been stable since Python 2.3:

Tim Peters rewrote his list.sort() implementation – this one is a “stable sort” (equal inputs appear in the same order in the output) and faster than before.

I’m not 100% sure about sorted. Nowadays it simply uses list.sort, but I haven’t checked its history. It’s likely that it has “always” used list.sort.

Question 5

The “What’s New” docs for Python 2.4 effectively make the point that sorted() first creates a list, then calls sort() on it, providing you with the guarantee you need though not in the “official” docs. You could also just check the source, if you’re really concerned.
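
In pure Python, that description corresponds roughly to the sketch below. The name sorted_sketch is hypothetical, and this is not the actual CPython source (which is written in C). It only illustrates why sorted() would inherit the stability of list.sort().

```python
def sorted_sketch(iterable, *, key=None, reverse=False):
    """Rough pure-Python equivalent of sorted(): copy, then sort in place."""
    result = list(iterable)                 # build a new list from any iterable
    result.sort(key=key, reverse=reverse)   # list.sort() is stable
    return result


words = ["pear", "fig", "apple", "kiwi"]
print(sorted_sketch(words, key=len))  # ['fig', 'pear', 'kiwi', 'apple']
print(words)                          # the input list is left unchanged
```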

Question 6

The Python 3.6 doc on sorting now states that

Sorts are guaranteed to be stable

Furthermore, in that document, there is a link to the stable Timsort, which states that

Timsort has been Python’s standard sorting algorithm since version 2.3
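
If you want to convince yourself empirically, a small check along these lines can help. It is only a sanity check on made-up random data, not a proof. The guarantee itself comes from the documentation quoted above.

```python
import random

random.seed(0)

# Pairs of (key, original_position); only 5 distinct keys, so many ties.
items = [(random.randrange(5), pos) for pos in range(1000)]

# Sort on the key only, so ties are not broken by the position.
ordered = sorted(items, key=lambda pair: pair[0])

# Stable means: among neighbours with equal keys, the original
# positions must still be increasing.
is_stable = all(
    a[1] < b[1]
    for a, b in zip(ordered, ordered[1:])
    if a[0] == b[0]
)
print(is_stable)  # True
```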

Question 7

I have a numerical list:

myList = [1, 2, 3, 100, 5]

Sorting this list gives [1, 2, 3, 5, 100]. What I want is the indices of the elements of the original list in sorted order, i.e. [0, 1, 2, 4, 3], à la MATLAB’s sort function, which returns both values and indices.

Question 8

If you are using numpy, you have the argsort() function available:

>>> import numpy
>>> numpy.argsort(myList)
array([0, 1, 2, 4, 3])

http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html

This returns the arguments that would sort the array or list.

Question 9

Something like this:

>>> myList = [1, 2, 3, 100, 5]
>>> [i[0] for i in sorted(enumerate(myList), key=lambda x:x[1])]
[0, 1, 2, 4, 3]

enumerate(myList) yields tuples of (index, value):

[(0, 1), (1, 2), (2, 3), (3, 100), (4, 5)]

You sort these tuples by passing them to sorted and specifying a function that extracts the sort key (the second element of each tuple; that’s what the lambda is for). Finally, the original index of each sorted element is extracted using the [i[0] for i in ...] list comprehension.
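
The same steps, spelled out one at a time. This is just the one-liner above unrolled; the intermediate variable names are my own.

```python
my_list = [1, 2, 3, 100, 5]

# Step 1: pair every value with its original index.
pairs = list(enumerate(my_list))
print(pairs)           # [(0, 1), (1, 2), (2, 3), (3, 100), (4, 5)]

# Step 2: sort the pairs by value (the second element of each pair).
pairs_by_value = sorted(pairs, key=lambda pair: pair[1])
print(pairs_by_value)  # [(0, 1), (1, 2), (2, 3), (4, 5), (3, 100)]

# Step 3: keep only the original indices.
indices = [index for index, _ in pairs_by_value]
print(indices)         # [0, 1, 2, 4, 3]
```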

Question 10

myList = [1, 2, 3, 100, 5]
sorted(range(len(myList)), key=myList.__getitem__)

[0, 1, 2, 4, 3]

Question 11

The answers with enumerate are nice, but I personally don’t like the lambda used to sort by the value. The following just reverses the index and the value, and sorts that. So it’ll first sort by value, then by index.

sorted((e,i) for i,e in enumerate(myList))
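
Note that this expression returns (value, index) pairs rather than bare indices. A short sketch of how to split them, which also gives you both the sorted values and the indices, much like the MATLAB behaviour mentioned in the question (variable names are my own):

```python
my_list = [1, 2, 3, 100, 5]

# Sort (value, index) pairs; tuples compare by value first, then index.
value_index_pairs = sorted(
    (value, index) for index, value in enumerate(my_list)
)

# Split the pairs into the two results.
sorted_values = [value for value, _ in value_index_pairs]
indices = [index for _, index in value_index_pairs]

print(sorted_values)  # [1, 2, 3, 5, 100]
print(indices)        # [0, 1, 2, 4, 3]
```

One caveat: because whole tuples are compared, the values themselves must be comparable with each other, which the key-based variants do not change but which is worth keeping in mind for mixed-type lists.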

Question 12

Updated answer with enumerate and itemgetter:

sorted(enumerate(a), key=lambda x: x[1])
# [(0, 1), (1, 2), (2, 3), (4, 5), (3, 100)]

enumerate pairs each index with its value: the first element of each tuple is the index and the second is the value. The pairs are then sorted on the second element, x[1], where x is the tuple.

Or using itemgetter from the operator module:

from operator import itemgetter
sorted(enumerate(a), key=itemgetter(1))

Question 13

I did a quick performance check on these with perfplot (a project of mine) and found that it’s hard to recommend anything else but numpy (note the log scale):

Code to reproduce the plot:

import perfplot
import numpy


def sorted_enumerate(seq):
    return [i for (v, i) in sorted((v, i) for (i, v) in enumerate(seq))]


def sorted_enumerate_key(seq):
    return [x for x, y in sorted(enumerate(seq), key=lambda x: x[1])]


def sorted_range(seq):
    return sorted(range(len(seq)), key=seq.__getitem__)


def numpy_argsort(x):
    return numpy.argsort(x)


perfplot.save(
    "argsort.png",
    setup=lambda n: numpy.random.rand(n),
    kernels=[sorted_enumerate, sorted_enumerate_key, sorted_range, numpy_argsort],
    n_range=[2 ** k for k in range(15)],
    xlabel="len(x)",
)

Question 14

If you do not want to use numpy,

sorted(range(len(seq)), key=seq.__getitem__)

is fastest, as demonstrated here.

Question 15

Essentially you need an argsort. Which implementation to use depends on whether you are willing to use external libraries (e.g. NumPy) or want to stay pure-Python without dependencies.

The question you need to ask yourself is: Do you want the

indices that would sort the array/list
indices that the elements would have in the sorted array/list

Unfortunately the example in the question doesn’t make it clear what is desired because both will give the same result:

>>> import numpy as np
>>> arr = np.array([1, 2, 3, 100, 5])

>>> np.argsort(np.argsort(arr))
array([0, 1, 2, 4, 3], dtype=int64)

>>> np.argsort(arr)
array([0, 1, 2, 4, 3], dtype=int64)

Choosing the `argsort` implementation

If you have NumPy at your disposal you can simply use the function numpy.argsort or method numpy.ndarray.argsort.

An implementation without NumPy was mentioned in some other answers already, so I’ll just recap the fastest solution according to the benchmark answer here

def argsort(l):
    return sorted(range(len(l)), key=l.__getitem__)
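
A short usage sketch for this pure-Python argsort, with made-up sample data. It shows how the result can be used to reorder the original list and what happens with ties.

```python
def argsort(seq):
    # Sort the indices 0..n-1, using the element at each index as the key.
    return sorted(range(len(seq)), key=seq.__getitem__)


data = [3, 1, 2, 4]
order = argsort(data)
print(order)                     # [1, 2, 0, 3]

# Indexing the original with the result gives the sorted sequence.
print([data[i] for i in order])  # [1, 2, 3, 4]

# sorted() is stable, so equal values keep their original index order.
print(argsort([2, 1, 2, 1]))     # [1, 3, 0, 2]
```

As far as I know, numpy.argsort uses a sort kind by default that is not guaranteed to be stable, so with ties the NumPy result may differ from this one unless you pass kind="stable".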

Getting the indices that would sort the array/list

To get the indices that would sort the array/list, you can simply call argsort on the array or list. I’m using the NumPy version here, but the pure-Python implementation should give the same results:

>>> arr = np.array([3, 1, 2, 4])
>>> np.argsort(arr)
array([1, 2, 0, 3], dtype=int64)

The result contains the indices that are needed to get the sorted array.

Since the sorted array would be [1, 2, 3, 4] the argsorted array contains the indices of these elements in the original.

The smallest value is 1 and it is at index 1 in the original so the first element of the result is 1.
The 2 is at index 2 in the original so the second element of the result is 2.
The 3 is at index 0 in the original so the third element of the result is 0.
The largest value is 4 and it is at index 3 in the original, so the last element of the result is 3.

Getting the indices that the elements would have in the sorted array/list

In this case you would need to apply argsort twice:

>>> arr = np.array([3, 1, 2, 4])
>>> np.argsort(np.argsort(arr))
array([2, 0, 1, 3], dtype=int64)

In this case:

the first element of the original is 3, which is the third-smallest value so it would have index 2 in the sorted array/list so the first element is 2.
the second element of the original is 1, which is the smallest value so it would have index 0 in the sorted array/list so the second element is 0.
the third element of the original is 2, which is the second-smallest value so it would have index 1 in the sorted array/list so the third element is 1.
the fourth element of the original is 4 which is the largest value so it would have index 3 in the sorted array/list so the last element is 3.
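
Without NumPy, the same "position in the sorted list" result can be sketched by inverting the permutation that argsort returns. The helper name ranks is hypothetical, and argsort is the pure-Python version from earlier.

```python
def argsort(seq):
    # Indices that would sort seq.
    return sorted(range(len(seq)), key=seq.__getitem__)


def ranks(seq):
    # order[position] is the original index of the element that ends up
    # at `position`; invert that mapping to get each element's position.
    order = argsort(seq)
    result = [0] * len(seq)
    for position, original_index in enumerate(order):
        result[original_index] = position
    return result


data = [3, 1, 2, 4]
print(ranks(data))             # [2, 0, 1, 3]
print(argsort(argsort(data)))  # [2, 0, 1, 3], same as applying argsort twice
```

Inverting the permutation takes one sort plus a linear pass, whereas applying argsort twice sorts twice. For small lists the difference is unlikely to matter.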

Question 16

The other answers are WRONG.

Running argsort once is not the solution. For example, the following code:

import numpy as np
x = [3,1,2]
np.argsort(x)

yields array([1, 2, 0], dtype=int64) which is not what we want.

The answer should be to run argsort twice:

import numpy as np
x = [3,1,2]
np.argsort(np.argsort(x))

gives array([2, 0, 1], dtype=int64) as expected.

Question 17

import numpy as np

For the indices:

S = [11, 2, 44, 55, 66, 0, 10, 3, 33]
r = np.argsort(S)
# r is array([5, 1, 7, 6, 0, 8, 2, 3, 4])

argsort returns the indices that would sort S.

For the values:

np.sort(S)
# array([ 0,  2,  3, 10, 11, 33, 44, 55, 66])

Question 18

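We create a second array of indexes from 0 to n-1, zip it with the original array, and then sort the pairs by the original values:

ar = [1, 2, 3, 4, 5]
new_ar = list(zip(ar, range(len(ar))))  # (value, index) pairs
new_ar.sort()  # tuples compare by value first, then by index
indices = [i for _, i in new_ar]  # original indices in sorted order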
