Tag archive: sorting

Rank items in an array using Python/NumPy, without sorting the array twice

Question: Rank items in an array using Python/NumPy, without sorting the array twice

I have an array of numbers and I’d like to create another array that represents the rank of each item in the first array. I’m using Python and NumPy.

For example:

array = [4,2,7,1]
ranks = [2,1,3,0]

Here’s the best method I’ve come up with:

array = numpy.array([4,2,7,1])
temp = array.argsort()
ranks = numpy.arange(len(array))[temp.argsort()]

Are there any better/faster methods that avoid sorting the array twice?


Answer 0

Use indexed assignment (fancy indexing) on the left-hand side in the last step:

array = numpy.array([4,2,7,1])
temp = array.argsort()
ranks = numpy.empty_like(temp)
ranks[temp] = numpy.arange(len(array))

This avoids sorting twice by inverting the permutation in the last step.
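As a quick check of the idea, here is a minimal sketch that wraps the inverse-permutation trick in a helper and compares it with the double-argsort result on the question's data. The helper name `rank_via_inverse` is my own and not part of NumPy.

```python
import numpy as np

def rank_via_inverse(values):
    """Return 0-based ranks using one argsort plus a scatter assignment."""
    values = np.asarray(values)
    order = values.argsort()               # order[k] = index of the k-th smallest item
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(values))  # the item at order[k] receives rank k
    return ranks

array = np.array([4, 2, 7, 1])
ranks = rank_via_inverse(array)
print(ranks)                                             # [2 1 3 0]
print(np.array_equal(ranks, array.argsort().argsort()))  # True
```

The assignment is a single linear pass over the array, whereas the second argsort is another O(n log n) sort.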


Answer 1

Use argsort twice, first to obtain the order of the array, then to obtain ranking:

array = numpy.array([4,2,7,1])
order = array.argsort()
ranks = order.argsort()

When dealing with 2D (or higher dimensional) arrays, be sure to pass an axis argument to argsort to order over the correct axis.
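To illustrate the remark about the axis argument, here is a small sketch with a made-up 2D array. Note that argsort sorts along the last axis by default, so for a 2D array omitting the argument ranks within rows.

```python
import numpy as np

A = np.array([[4, 2, 7, 1],
              [1, 9, 3, 5]])

# Rank within each row: both argsort calls must use the same axis.
row_ranks = A.argsort(axis=1).argsort(axis=1)
print(row_ranks)
# [[2 1 3 0]
#  [0 3 1 2]]

# Rank within each column instead.
col_ranks = A.argsort(axis=0).argsort(axis=0)
print(col_ranks)
# [[1 0 1 0]
#  [0 1 0 1]]
```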


Answer 2

This question is a few years old, and the accepted answer is great, but I think the following is still worth mentioning. If you don’t mind the dependency on scipy, you can use scipy.stats.rankdata:

In [22]: from scipy.stats import rankdata

In [23]: a = [4, 2, 7, 1]

In [24]: rankdata(a)
Out[24]: array([ 3.,  2.,  4.,  1.])

In [25]: (rankdata(a) - 1).astype(int)
Out[25]: array([2, 1, 3, 0])

A nice feature of rankdata is that the method argument provides several options for handling ties. For example, there are three occurrences of 20 and two occurrences of 40 in b:

In [26]: b = [40, 20, 70, 10, 20, 50, 30, 40, 20]

The default assigns the average rank to the tied values:

In [27]: rankdata(b)
Out[27]: array([ 6.5,  3. ,  9. ,  1. ,  3. ,  8. ,  5. ,  6.5,  3. ])

method='ordinal' assigns consecutive ranks:

In [28]: rankdata(b, method='ordinal')
Out[28]: array([6, 2, 9, 1, 3, 8, 5, 7, 4])

method='min' assigns the minimum rank of the tied values to all the tied values:

In [29]: rankdata(b, method='min')
Out[29]: array([6, 2, 9, 1, 2, 8, 5, 6, 2])

See the docstring for more options.


Answer 3

I tried to extend both solutions to arrays A of more than one dimension, assuming you process the array row by row (axis=1).

I extended the first code with a loop over the rows; it can probably be improved:

temp = A.argsort(axis=1)
rank = np.empty_like(temp)
rangeA = np.arange(temp.shape[1])
for iRow in range(temp.shape[0]):  # range replaces Python 2's xrange
    rank[iRow, temp[iRow, :]] = rangeA

And the second one, following k.rooijers' suggestion, becomes:

temp = A.argsort(axis=1)
rank = temp.argsort(axis=1)

I randomly generated 400 arrays with shape (1000,100); the first code took about 7.5, the second one 3.8.


Answer 4

For a vectorized version of an averaged rank, see below. I love np.unique; it really widens the scope of what code can and cannot be efficiently vectorized. Aside from avoiding Python for-loops, this approach also avoids the implicit double loop over ‘a’.

import numpy as np

# Alternative test input: a = np.array([4, 1, 6, 8, 4, 1, 6])
a = np.array([4, 2, 7, 2, 1])
rank = a.argsort().argsort()

# inverse[i] is the position of a[i] within unique
unique, inverse = np.unique(a, return_inverse=True)

# Sum the ranks and count the occurrences of each distinct value
unique_rank_sum = np.zeros_like(unique)
np.add.at(unique_rank_sum, inverse, rank)
unique_count = np.zeros_like(unique)
np.add.at(unique_count, inverse, 1)

# np.float has been removed from NumPy; the builtin float does the same job
unique_rank_mean = unique_rank_sum.astype(float) / unique_count

rank_mean = unique_rank_mean[inverse]

print(rank_mean)

Answer 5

Apart from the elegance and shortness of solutions, there is also the question of performance. Here is a little benchmark:

import numpy as np
from scipy.stats import rankdata
l = list(reversed(range(1000)))

%%timeit -n10000 -r5
x = (rankdata(l) - 1).astype(int)
>>> 128 µs ± 2.72 µs per loop (mean ± std. dev. of 5 runs, 10000 loops each)

%%timeit -n10000 -r5
a = np.array(l)
r = a.argsort().argsort()
>>> 69.1 µs ± 464 ns per loop (mean ± std. dev. of 5 runs, 10000 loops each)

%%timeit -n10000 -r5
a = np.array(l)
temp = a.argsort()
r = np.empty_like(temp)
r[temp] = np.arange(len(a))
>>> 63.7 µs ± 1.27 µs per loop (mean ± std. dev. of 5 runs, 10000 loops each)

Answer 6

Using argsort() twice will do it:

>>> array = [4,2,7,1]
>>> ranks = numpy.array(array).argsort().argsort()
>>> ranks
array([2, 1, 3, 0])

Answer 7

I tried the above methods, but they failed because I had many zeros. Yes, even with floats, duplicate items may be important.

So I wrote a modified 1D solution by adding a tie-checking step:

import numpy as np

def ranks(v):
    t = np.argsort(v)
    r = np.empty(len(v), int)
    r[t] = np.arange(len(v))
    # Walk the sorted order; a value equal to its predecessor shares its rank
    for i in range(1, len(r)):
        if v[t[i]] <= v[t[i-1]]:
            r[t[i]] = r[t[i-1]]
    return r

# test it (v is your 1D input array)
print(sorted(zip(ranks(v), v)))

I believe it’s as efficient as it can be.
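For comparison, the same tie handling (every tied value gets the lowest rank of its group) can probably be had without a Python-level loop. The sketch below is an alternative technique, not the code from this answer: it sorts once with np.sort and then uses np.searchsorted to count, for each value, how many elements are strictly smaller. The helper name `ranks_min` is hypothetical.

```python
import numpy as np

def ranks_min(v):
    """0-based ranks where tied values all share the lowest rank of the tie."""
    v = np.asarray(v)
    # side="left" returns the number of sorted elements strictly below each value
    return np.searchsorted(np.sort(v), v, side="left")

v = np.array([0.0, 0.0, 3.0, 1.0, 0.0])
print(ranks_min(v))  # [0 0 4 3 0]
```

If you need other tie rules (average, dense, ordinal), scipy.stats.rankdata mentioned in an earlier answer covers them.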


Answer 8

I liked the method by k.rooijers, but as rcoup wrote, repeated numbers are ranked according to array position. This was no good for me, so I modified the version to postprocess the ranks and merge any repeated numbers into a combined average rank:

import numpy as np
a = np.array([4,2,7,2,1])
r = np.array(a.argsort().argsort(), dtype=float)
f = a==a  # all True: marks the values not yet processed
for i in range(len(a)):
   if not f[i]: continue
   s = a == a[i]  # mask of all entries tied with a[i]
   ls = np.sum(s)
   if ls > 1:
      tr = np.sum(r[s])
      r[s] = float(tr)/ls  # replace the tied ranks by their average
   f[s] = False

print(r)  # ranks: 3, 1.5, 4, 1.5, 0

I hope this might help others too. I tried to find another solution to this, but couldn’t find any…


Answer 9

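argsort and indexing are symmetric operations.

Try indexing twice instead of calling argsort twice, since indexing is faster than argsort:

import numpy as np

array = np.array([4,2,7,1])
order = array.argsort()
ranks = np.arange(array.shape[0])[order][order]

Be careful, though: this reproduces the ranks for this particular input, but indexing by order twice is not the inverse permutation in general, so check the result against order.argsort() before relying on it.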
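The shortcut is worth testing on a second input. In the sketch below (the test array is my own choice), indexing by `order` twice gives something different from the double-argsort ranks, while the inverse-permutation assignment from the first answer agrees with them.

```python
import numpy as np

array = np.array([1, 2, 3, 0])
order = array.argsort()                      # [3 0 1 2]

double_index = np.arange(array.shape[0])[order][order]
true_ranks = order.argsort()

inverse = np.empty_like(order)
inverse[order] = np.arange(array.shape[0])   # inverse permutation of order

print(double_index)  # [2 3 0 1]
print(true_ranks)    # [1 2 3 0]
print(inverse)       # [1 2 3 0]
```

The two happen to coincide for [4, 2, 7, 1] because applying that particular permutation three times returns every element to its starting position.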

Answer 10

More general version of one of the answers:

In [140]: x = np.random.randn(10, 3)

In [141]: i = np.argsort(x, axis=0)

In [142]: ranks = np.empty_like(i)

In [143]: np.put_along_axis(ranks, i, np.repeat(np.arange(x.shape[0])[:,None], x.shape[1], axis=1), axis=0)

See “How to use numpy.argsort() as indices in more than 2 dimensions?” to generalize to more dimensions.


How to use a custom comparison function in Python 3?

Question: How to use a custom comparison function in Python 3?

In Python 2.x, I could pass a custom function to the sorted and .sort functions:

>>> x=['kar','htar','har','ar']
>>>
>>> sorted(x)
['ar', 'har', 'htar', 'kar']
>>> 
>>> sorted(x,cmp=customsort)
['kar', 'htar', 'har', 'ar']

Because in my language, consonants come in this order:

"k","kh",....,"ht",..."h",...,"a"

But in Python 3.x, it looks like I cannot pass the cmp keyword:

>>> sorted(x,cmp=customsort)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'cmp' is an invalid keyword argument for this function

Are there any alternatives, or should I write my own sorted function too?

Note: I simplified by using “k”, “kh”, etc. The actual characters are Unicode and even more complicated; sometimes vowels come before and after consonants. I have already written the custom comparison function, so that part is OK. The only problem is that I cannot pass my custom comparison function to sorted or .sort.


Answer 0

Use the key argument (and follow the recipe on how to convert your old cmp function to a key function).

functools has a function cmp_to_key mentioned at docs.python.org/3.6/library/functools.html#functools.cmp_to_key


Answer 1

Use the key keyword and functools.cmp_to_key to transform your comparison function:

sorted(x, key=functools.cmp_to_key(customsort))
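To make that one-liner runnable end to end, here is a minimal sketch. The question never shows customsort, so the comparison function and the single-character letter order `ALPHABET` below are stand-ins I made up purely for illustration.

```python
from functools import cmp_to_key

# Hypothetical letter order, chosen only so the example reproduces the
# ordering shown in the question; a real alphabet would be more involved.
ALPHABET = "khtar"

def customsort(a, b):
    """Old-style comparison: negative if a < b, zero if equal, positive if a > b."""
    key_a = [ALPHABET.index(ch) for ch in a]
    key_b = [ALPHABET.index(ch) for ch in b]
    return (key_a > key_b) - (key_a < key_b)

x = ['kar', 'htar', 'har', 'ar']
print(sorted(x))                              # ['ar', 'har', 'htar', 'kar']
print(sorted(x, key=cmp_to_key(customsort)))  # ['kar', 'htar', 'har', 'ar']
```

The same wrapper works with list.sort, for example x.sort(key=cmp_to_key(customsort)).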

Answer 2

Instead of a customsort(), you need a function that translates each word into something that Python already knows how to sort. For example, you could translate each word into a list of numbers where each number represents where each letter occurs in your alphabet. Something like this:

my_alphabet = ['a', 'b', 'c']

def custom_key(word):
   numbers = []
   for letter in word:
      numbers.append(my_alphabet.index(letter))
   return numbers

x=['cbaba', 'ababa', 'bbaa']
x.sort(key=custom_key)

Since your language includes multi-character letters, your custom_key function will obviously need to be more complicated. That should give you the general idea though.
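One way the multi-character case might look is sketched below. The alphabet is invented for the example (loosely following the order hinted at in the question), and the greedy longest-match splitting is an assumption about how words break into letters; real collation rules may need more than this.

```python
# Hypothetical alphabet in collation order; "kh" and "ht" count as single letters.
MY_ALPHABET = ["k", "kh", "ht", "h", "a", "r"]
# Try longer letters first so "kh" is not read as "k" followed by "h".
LETTERS_BY_LENGTH = sorted(MY_ALPHABET, key=len, reverse=True)
RANK = {letter: i for i, letter in enumerate(MY_ALPHABET)}

def custom_key(word):
    """Translate a word into the list of alphabet positions of its letters."""
    numbers = []
    pos = 0
    while pos < len(word):
        for letter in LETTERS_BY_LENGTH:
            if word.startswith(letter, pos):
                numbers.append(RANK[letter])
                pos += len(letter)
                break
        else:
            raise ValueError(f"no letter matches {word[pos:]!r}")
    return numbers

x = ["ar", "har", "htar", "khar", "kar"]
print(sorted(x, key=custom_key))  # ['kar', 'khar', 'htar', 'har', 'ar']
```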


Answer 3

A complete Python 3 cmp_to_key lambda example:

from functools import cmp_to_key

nums = [28, 50, 17, 12, 121]
nums.sort(key=cmp_to_key(lambda x, y: 1 if str(x)+str(y) < str(y)+str(x) else -1))

Compare this with ordinary object sorting:

class NumStr:
    def __init__(self, v):
        self.v = v
    def __lt__(self, other):
        return self.v + other.v < other.v + self.v


A = [NumStr("12"), NumStr("121")]
A.sort()
print(A[0].v, A[1].v)

A = [obj.v for obj in A]
print(A)

Answer 4

I don’t know if this will help, but you may check out the locale module. It looks like you can set the locale to your language and use locale.strcoll to compare strings using your language’s sorting rules.
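A rough sketch of how that could look is below. It uses the "C" locale only because that one exists everywhere; you would substitute the name of your own locale, and whether that locale is installed (and whether its rules match what you need) depends on your system.

```python
import locale
from functools import cmp_to_key

# "C" is always available and compares by code point. Replace it with your
# own locale name if that locale is installed on your system.
locale.setlocale(locale.LC_COLLATE, "C")

words = ["banana", "Apple", "cherry"]

# Either use strcoll as an old-style comparison function ...
by_strcoll = sorted(words, key=cmp_to_key(locale.strcoll))
# ... or use strxfrm, which turns each string into a sortable key directly.
by_strxfrm = sorted(words, key=locale.strxfrm)

print(by_strcoll)
print(by_strxfrm)
```

Both calls should give the same order; strxfrm is usually the cheaper choice because each key is computed once per item.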


Answer 5

Use the key argument instead. It takes a function that takes the value being processed and returns a single value giving the key to use to sort by.

sorted(x, key=somekeyfunc)

pandas groupby sort within groups

Question: pandas groupby sort within groups

I want to group my dataframe by two columns and then sort the aggregated results within the groups.

In [167]:
df

Out[167]:
count   job source
0   2   sales   A
1   4   sales   B
2   6   sales   C
3   3   sales   D
4   7   sales   E
5   5   market  A
6   3   market  B
7   2   market  C
8   4   market  D
9   1   market  E

In [168]:
df.groupby(['job','source']).agg({'count':sum})

Out[168]:
            count
job     source  
market  A   5
        B   3
        C   2
        D   4
        E   1
sales   A   2
        B   4
        C   6
        D   3
        E   7

I would now like to sort the count column in descending order within each of the groups. And then take only the top three rows. To get something like:

            count
job     source  
market  A   5
        D   4
        B   3
sales   E   7
        C   6
        B   4

Answer 0

What you want to do is actually again a groupby (on the result of the first groupby): sort and take the first three elements per group.

Starting from the result of the first groupby:

In [60]: df_agg = df.groupby(['job','source']).agg({'count':sum})

We group by the first level of the index:

In [63]: g = df_agg['count'].groupby('job', group_keys=False)

Then we want to sort (‘order’) each group and take the first three elements:

In [64]: res = g.apply(lambda x: x.sort_values(ascending=False).head(3))

However, there is a shortcut function for this, nlargest:

In [65]: g.nlargest(3)
Out[65]:
job     source
market  A         5
        D         4
        B         3
sales   E         7
        C         6
        B         4
dtype: int64

So in one go, this looks like:

df_agg['count'].groupby('job', group_keys=False).nlargest(3)

Answer 1

You could also just do it in one go, by doing the sort first and using head to take the first 3 of each group.

In[34]: df.sort_values(['job','count'],ascending=False).groupby('job').head(3)

Out[35]: 
   count     job source
4      7   sales      E
2      6   sales      C
1      4   sales      B
5      5  market      A
8      4  market      D
6      3  market      B

Answer 2

Here’s another example of taking the top 3 in sorted order, and of sorting within the groups:

In [43]: import pandas as pd                                                                                                                                                       

In [44]:  df = pd.DataFrame({"name":["Foo", "Foo", "Baar", "Foo", "Baar", "Foo", "Baar", "Baar"], "count_1":[5,10,12,15,20,25,30,35], "count_2" :[100,150,100,25,250,300,400,500]})

In [45]: df                                                                                                                                                                        
Out[45]: 
   count_1  count_2  name
0        5      100   Foo
1       10      150   Foo
2       12      100  Baar
3       15       25   Foo
4       20      250  Baar
5       25      300   Foo
6       30      400  Baar
7       35      500  Baar


### Top 3 on sorted order:
In [46]: df.groupby(["name"])["count_1"].nlargest(3)                                                                                                                               
Out[46]: 
name   
Baar  7    35
      6    30
      4    20
Foo   5    25
      3    15
      1    10
dtype: int64


### Sorting within groups based on column "count_1":
In [48]: df.groupby(["name"]).apply(lambda x: x.sort_values(["count_1"], ascending = False)).reset_index(drop=True)
Out[48]: 
   count_1  count_2  name
0       35      500  Baar
1       30      400  Baar
2       20      250  Baar
3       12      100  Baar
4       25      300   Foo
5       15       25   Foo
6       10      150   Foo
7        5      100   Foo

Answer 3

Try this instead.

A simple way to do a ‘groupby’ and sort in descending order:

df.groupby(['companyName'])['overallRating'].sum().sort_values(ascending=False).head(20)

Answer 4

If you don’t need to sum a column, then use @tvashtar’s answer. If you do need to sum, then you can use @joris’ answer or this one which is very similar to it.

df.groupby(['job']).apply(lambda x: (x.groupby('source')
                                      .sum()
                                      .sort_values('count', ascending=False))
                                     .head(3))

Python data structure sort list alphabetically

Question: Python data structure sort list alphabetically

I am a bit confused regarding data structures in Python: (), [], and {}. I am trying to sort a simple list, and I am probably failing because I cannot identify the type of data.

My list is simple: ['Stem', 'constitute', 'Sedge', 'Eflux', 'Whim', 'Intrigue']

My question is what type of data this is, and how to sort the words alphabetically?


Answer 0

[] denotes a list, () denotes a tuple and {} denotes a dictionary. You should take a look at the official Python tutorial as these are the very basics of programming in Python.

What you have is a list of strings. You can sort it like this:

In [1]: lst = ['Stem', 'constitute', 'Sedge', 'Eflux', 'Whim', 'Intrigue']

In [2]: sorted(lst)
Out[2]: ['Eflux', 'Intrigue', 'Sedge', 'Stem', 'Whim', 'constitute']

As you can see, words that start with an uppercase letter get preference over those starting with a lowercase letter. If you want to sort them independently, do this:

In [4]: sorted(lst, key=str.lower)
Out[4]: ['constitute', 'Eflux', 'Intrigue', 'Sedge', 'Stem', 'Whim']

You can also sort the list in reverse order by doing this:

In [12]: sorted(lst, reverse=True)
Out[12]: ['constitute', 'Whim', 'Stem', 'Sedge', 'Intrigue', 'Eflux']

In [13]: sorted(lst, key=str.lower, reverse=True)
Out[13]: ['Whim', 'Stem', 'Sedge', 'Intrigue', 'Eflux', 'constitute']

Please note: If you work with Python 3, then str is the correct data type for every string that contains human-readable text. However, if you still need to work with Python 2, then you might deal with unicode strings which have the data type unicode in Python 2, and not str. In such a case, if you have a list of unicode strings, you must write key=unicode.lower instead of key=str.lower.
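As a side note for text beyond ASCII, Python 3 also offers str.casefold, a more aggressive form of lowercasing intended for caseless comparison. The sketch below uses German sample words of my own choosing to show where the two keys differ.

```python
words = ["Straße", "STRASSE", "strand", "Zebra", "apple"]

# str.lower leaves "ß" untouched, so "straße" sorts after "strasse".
by_lower = sorted(words, key=str.lower)
print(by_lower)     # ['apple', 'strand', 'STRASSE', 'Straße', 'Zebra']

# str.casefold maps "ß" to "ss", so both spellings get the same key and
# keep their original relative order (sorted is stable).
by_casefold = sorted(words, key=str.casefold)
print(by_casefold)  # ['apple', 'strand', 'Straße', 'STRASSE', 'Zebra']
```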


Answer 1

Python has a built-in function called sorted, which will give you a sorted list from any iterable you feed it, such as a list ([1,2,3]); a dict ({1:2,3:4}), although it will just return a sorted list of the keys; a set ({1,2,3,4}); or a tuple ((1,2,3,4)).

>>> x = [3,2,1]
>>> sorted(x)
[1, 2, 3]
>>> x
[3, 2, 1]

Lists also have a sort method that will perform the sort in-place (x.sort() returns None but changes the x object).

>>> x = [3,2,1]
>>> x.sort()
>>> x
[1, 2, 3]

Both also take a key argument, which should be a callable (function/lambda) you can use to change what to sort by.
For example, to get a list of (key,value)-pairs from a dict which is sorted by value you can use the following code:

>>> x = {3:2,2:1,1:5}
>>> sorted(x.items(), key=lambda kv: kv[1])  # items() yields `(key,value)`-pairs (a view in Python 3)
[(2, 1), (3, 2), (1, 5)]
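A short sketch of sorted applied to each kind of iterable mentioned above (the sample values are arbitrary); in every case the result is a new list.

```python
from_list = sorted([3, 1, 2])
from_tuple = sorted((3, 1, 2))        # still returns a list, not a tuple
from_set = sorted({3, 1, 2})
from_dict = sorted({3: "c", 1: "a"})  # iterating a dict yields its keys
from_string = sorted("bca")           # a string is an iterable of characters

print(from_list, from_tuple, from_set)  # [1, 2, 3] [1, 2, 3] [1, 2, 3]
print(from_dict)                        # [1, 3]
print(from_string)                      # ['a', 'b', 'c']
```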

Answer 2

You’re dealing with a python list, and sorting it is as easy as doing this.

my_list = ['Stem', 'constitute', 'Sedge', 'Eflux', 'Whim', 'Intrigue']
my_list.sort()

Answer 3

You can use the built-in sorted function.

print(sorted(['Stem', 'constitute', 'Sedge', 'Eflux', 'Whim', 'Intrigue']))

Answer 4

ListName.sort() will sort it alphabetically (uppercase letters sort before lowercase ones). Pass reverse=True inside the parentheses to reverse the order of the items: ListName.sort(reverse=True)


Answer 5

>>> a = ()
>>> type(a)
<type 'tuple'>
>>> a = []
>>> type(a)
<type 'list'>
>>> a = {}
>>> type(a)
<type 'dict'>
>>> a =  ['Stem', 'constitute', 'Sedge', 'Eflux', 'Whim', 'Intrigue'] 
>>> a.sort()
>>> a
['Eflux', 'Intrigue', 'Sedge', 'Stem', 'Whim', 'constitute']
>>> 

From list of integers, get number closest to a given value

Question: From list of integers, get number closest to a given value

Given a list of integers, I want to find which number is the closest to a number I give in input:

>>> myList = [4, 1, 88, 44, 3]
>>> myNumber = 5
>>> takeClosest(myList, myNumber)
...
4

Is there any quick way to do this?


Answer 0

If we are not sure that the list is sorted, we could use the built-in min() function, to find the element which has the minimum distance from the specified number.

>>> min(myList, key=lambda x:abs(x-myNumber))
4

Note that it also works with dicts with int keys, like {1: "a", 2: "b"}. This method takes O(n) time.


If the list is already sorted, or if you can afford to sort it just once, use the bisection method illustrated in @Lauritz’s answer, which takes only O(log n) time per lookup. Note, however, that checking whether a list is already sorted is O(n) and sorting is O(n log n).
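A small sketch of the min() approach on both a list and a dict follows; the dict contents are made up for the example. When two candidates are equally close, min() returns whichever it meets first.

```python
my_list = [4, 1, 88, 44, 3]
my_number = 5

closest = min(my_list, key=lambda x: abs(x - my_number))
print(closest)  # 4

# Iterating over a dict yields its keys, so the same call finds the closest key.
labels = {1: "a", 2: "b", 10: "c"}
closest_key = min(labels, key=lambda k: abs(k - my_number))
print(closest_key, labels[closest_key])  # 2 b
```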


Answer 1

I’ll rename the function take_closest to conform with PEP8 naming conventions.

If you mean quick-to-execute as opposed to quick-to-write, min should not be your weapon of choice, except in one very narrow use case. The min solution needs to examine every number in the list and do a calculation for each number. Using bisect.bisect_left instead is almost always faster.

The “almost” comes from the fact that bisect_left requires the list to be sorted to work. Hopefully, your use case is such that you can sort the list once and then leave it alone. Even if not, as long as you don’t need to sort before every time you call take_closest, the bisect module will likely come out on top. If you’re in doubt, try both and look at the real-world difference.

from bisect import bisect_left

def take_closest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.

    If two numbers are equally close, return the smallest number.
    """
    pos = bisect_left(myList, myNumber)
    if pos == 0:
        return myList[0]
    if pos == len(myList):
        return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
        return after
    else:
        return before

Bisect works by repeatedly halving a list and finding out which half myNumber has to be in by looking at the middle value. This means it has a running time of O(log n) as opposed to the O(n) running time of the highest voted answer. If we compare the two methods and supply both with a sorted myList, these are the results:

$ python -m timeit -s "
from closest import take_closest
from random import randint
a = range(-1000, 1000, 10)" "take_closest(a, randint(-1100, 1100))"

100000 loops, best of 3: 2.22 usec per loop

$ python -m timeit -s "
from closest import with_min
from random import randint
a = range(-1000, 1000, 10)" "with_min(a, randint(-1100, 1100))"

10000 loops, best of 3: 43.9 usec per loop

So in this particular test, bisect is almost 20 times faster. For longer lists, the difference will be greater.

What if we level the playing field by removing the precondition that myList must be sorted? Let’s say we sort a copy of the list every time take_closest is called, while leaving the min solution unaltered. Using the 200-item list in the above test, the bisect solution is still the fastest, though only by about 30%.

This is a strange result, considering that the sorting step is O(n log(n))! The only reason min is still losing is that the sorting is done in highly optimized C code, while min has to plod along calling a lambda function for every item. As myList grows in size, the min solution will eventually be faster. Note that we had to stack everything in its favour for the min solution to win.
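
As a rough cross-check of the comparison above, the sketch below puts the two approaches side by side on the same sorted list. with_min is named in the timing commands but never shown, so the version here is an assumption based on the min-with-key idea, and the parameter names are illustrative.

```python
from bisect import bisect_left


def take_closest(sorted_values, target):
    """Return the value closest to target; sorted_values must be sorted."""
    pos = bisect_left(sorted_values, target)
    if pos == 0:
        return sorted_values[0]
    if pos == len(sorted_values):
        return sorted_values[-1]
    before = sorted_values[pos - 1]
    after = sorted_values[pos]
    # On a tie the smaller neighbour wins.
    return after if after - target < target - before else before


def with_min(values, target):
    """Assumed O(n) counterpart: scan every item and keep the nearest."""
    return min(values, key=lambda x: abs(x - target))


values = list(range(-1000, 1000, 10))
targets = (-1100, -4, 5, 996, 1100)
results = [(take_closest(values, t), with_min(values, t)) for t in targets]
```

On a sorted list both functions should agree, including on ties, because min keeps the first (smallest) of two equally close items. The timing difference is best measured on your own data.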


回答 2

>>> takeClosest = lambda num,collection:min(collection,key=lambda x:abs(x-num))
>>> takeClosest(5,[4,1,88,44,3])
4

一个拉姆达是写一个“匿名”功能(即没有名称的功能)的一种特殊方式。您可以为它分配任何名称,因为lambda是一个表达式。

上面的“长篇”写法是:

def takeClosest(num,collection):
   return min(collection,key=lambda x:abs(x-num))
>>> takeClosest = lambda num,collection:min(collection,key=lambda x:abs(x-num))
>>> takeClosest(5,[4,1,88,44,3])
4

A lambda is a special way of writing an “anonymous” function (a function that doesn’t have a name). You can assign it any name you want because a lambda is an expression.

The “long” way of writing the above would be:

def takeClosest(num,collection):
   return min(collection,key=lambda x:abs(x-num))

回答 3

def closest(list, Number):
    aux = []
    for valor in list:
        aux.append(abs(Number-valor))

    return aux.index(min(aux))

此代码将为您提供列表中最接近的Number的索引。

KennyTM提供的解决方案是最好的整体解决方案,但是在您无法使用它的情况下(例如brython),此功能可以完成工作

def closest(list, Number):
    aux = []
    for valor in list:
        aux.append(abs(Number-valor))

    return aux.index(min(aux))

This code returns the index of the item in the list that is closest to Number.

The solution given by KennyTM is the best overall, but in cases where you cannot use it (like Brython), this function will do the job.
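
Where min with a key function is available, the same index can be found without building the helper list. This is a minimal sketch with illustrative names (closest_index, values, target), not part of the original answer.

```python
def closest_index(values, target):
    """Return the index of the item in values that is nearest to target."""
    # min() over the indices, keyed by distance, avoids the helper list.
    return min(range(len(values)), key=lambda i: abs(values[i] - target))


numbers = [4, 1, 88, 44, 3]
index = closest_index(numbers, 5)
```

If several items are equally close, the first one in list order is returned, which matches what aux.index(min(aux)) does.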


回答 4

遍历列表,然后将当前最接近的数字与进行比较abs(currentNumber - myNumber)

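def takeClosest(myList, myNumber):
    closest = myList[0]
    for currentNumber in myList[1:]:
        # Compare distances to myNumber, not raw values or indices
        if abs(currentNumber - myNumber) < abs(closest - myNumber):
            closest = currentNumber
    return closest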

Iterate over the list and compare the current closest number with abs(currentNumber - myNumber):

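def takeClosest(myList, myNumber):
    closest = myList[0]
    for currentNumber in myList[1:]:
        # Compare distances to myNumber, not raw values or indices
        if abs(currentNumber - myNumber) < abs(closest - myNumber):
            closest = currentNumber
    return closest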

回答 5

重要的是要注意,Lauritz的使用bisect的建议思想实际上并未在MyList中找到与MyNumber最接近的值。相反,bisect会在MyList中的MyNumber之后按顺序查找下一个值。因此,在OP的情况下,您实际上得到的是返回的位置44而不是位置4。

>>> from bisect import bisect_left
>>> myList = [1, 3, 4, 44, 88]
>>> myNumber = 5
>>> pos = (bisect_left(myList, myNumber))
>>> myList[pos]
...
44

要获得最接近5的值,您可以尝试将列表转换为数组,并使用numpy的argmin这样。

>>> import numpy as np
>>> myNumber = 5   
>>> myList = [1, 3, 4, 44, 88] 
>>> myArray = np.array(myList)
>>> pos = (np.abs(myArray-myNumber)).argmin()
>>> myArray[pos]
...
4

我不知道这会有多快,我的猜测是“不太”。

It’s important to note that bisect_left on its own, as used in Lauritz’s suggestion, does not find the closest value in myList to myNumber. It returns the insertion point, which is the position of the first value that is not less than myNumber. So in the OP’s case, indexing with that position gives 44 rather than 4, and the neighbouring values still have to be compared.

>>> from bisect import bisect_left
>>> myList = [1, 3, 4, 44, 88]
>>> myNumber = 5
>>> pos = (bisect_left(myList, myNumber))
>>> myList[pos]
...
44

To get the value that’s closest to 5 you could try converting the list to an array and using argmin from numpy like so.

>>> import numpy as np
>>> myNumber = 5   
>>> myList = [1, 3, 4, 44, 88] 
>>> myArray = np.array(myList)
>>> pos = (np.abs(myArray-myNumber)).argmin()
>>> myArray[pos]
...
4

I don’t know how fast this would be though, my guess would be “not very”.
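
A minimal sketch, assuming the list is already sorted, of how the insertion point from bisect_left can still lead to the closest value without NumPy: compare the items on either side of it. The variable names are illustrative.

```python
from bisect import bisect_left

my_list = [1, 3, 4, 44, 88]
my_number = 5

pos = bisect_left(my_list, my_number)  # insertion point, not the answer yet
# Candidates are the items just before and at the insertion point.
candidates = my_list[max(pos - 1, 0):pos + 1]
closest = min(candidates, key=lambda x: abs(x - my_number))
```

The slice holds at most two items, so the final min is constant work and the lookup stays O(log n) overall.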


回答 6

扩展了古斯塔沃·利马(Gustavo Lima)的答案。无需创建全新的列表即可完成相同的操作。随着FOR循环的进行,列表中的值可以用差分代替。

def f_ClosestVal(v_List, v_Number):
    """Takes an unsorted LIST of INTs and RETURNS INDEX of value closest to an INT"""
    for _index, i in enumerate(v_List):
        v_List[_index] = abs(v_Number - i)
    return v_List.index(min(v_List))

myList = [1, 88, 44, 4, 4, -2, 3]
v_Num = 5
print(f_ClosestVal(myList, v_Num)) ## Gives "3," the index of the first "4" in the list.

Expanding upon Gustavo Lima’s answer. The same thing can be done without creating an entirely new list. The values in the list can be replaced with the differentials as the FOR loop progresses.

def f_ClosestVal(v_List, v_Number):
    """Takes an unsorted LIST of INTs and RETURNS INDEX of value closest to an INT"""
    for _index, i in enumerate(v_List):
        v_List[_index] = abs(v_Number - i)
    return v_List.index(min(v_List))

myList = [1, 88, 44, 4, 4, -2, 3]
v_Num = 5
print(f_ClosestVal(myList, v_Num)) ## Gives "3," the index of the first "4" in the list.
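
One consequence of overwriting the values is that the caller's list no longer holds the original numbers afterwards. The sketch below, with an illustrative function name, shows that side effect and one way to keep the data by passing a copy.

```python
def closest_index_in_place(values, target):
    """Same idea as above: overwrite each item with its distance to target."""
    for index, item in enumerate(values):
        values[index] = abs(target - item)
    return values.index(min(values))


original = [1, 88, 44, 4, 4, -2, 3]
working_copy = list(original)  # pass a copy if the data is needed later
result = closest_index_in_place(working_copy, 5)
```

Here working_copy ends up holding the distances while original is untouched. Making the copy gives back the memory saving, so this only helps when the input really is disposable.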

回答 7

如果我可以补充@Lauritz的答案

为了避免出现运行错误,请不要忘记在该bisect_left行之前添加一个条件:

if (myNumber > myList[-1] or myNumber < myList[0]):
    return False

因此完整的代码如下所示:

from bisect import bisect_left

def takeClosest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.
    If two numbers are equally close, return the smallest number.
    If number is outside of min or max return False
    """
    if (myNumber > myList[-1] or myNumber < myList[0]):
        return False
    pos = bisect_left(myList, myNumber)
    if pos == 0:
        return myList[0]
    if pos == len(myList):
        return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
        return after
    else:
        return before

If I may add to @Lauritz’s answer

To avoid a runtime error, don’t forget to add a condition before the bisect_left line:

if (myNumber > myList[-1] or myNumber < myList[0]):
    return False

so the full code will look like:

from bisect import bisect_left

def takeClosest(myList, myNumber):
    """
    Assumes myList is sorted. Returns closest value to myNumber.
    If two numbers are equally close, return the smallest number.
    If number is outside of min or max return False
    """
    if (myNumber > myList[-1] or myNumber < myList[0]):
        return False
    pos = bisect_left(myList, myNumber)
    if pos == 0:
        return myList[0]
    if pos == len(myList):
        return myList[-1]
    before = myList[pos - 1]
    after = myList[pos]
    if after - myNumber < myNumber - before:
        return after
    else:
        return before
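
For context, here is a small check of what bisect_left does with numbers outside the range of a sorted list. As far as the standard library behaviour goes, it does not raise; it returns 0 or len(list), which the pos == 0 and pos == len(myList) branches already cover. Whether returning False is preferable to clamping to the nearest end is a design choice.

```python
from bisect import bisect_left

my_list = [1, 3, 4, 44, 88]

below = bisect_left(my_list, -10)   # smaller than every item
above = bisect_left(my_list, 1000)  # larger than every item
```

With the guard in place, the two boundary branches can only be reached when myNumber equals the first item.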

为什么“ return list.sort()”返回None,而不返回列表?

问题:为什么“ return list.sort()”返回None,而不返回列表?

我已经能够验证findUniqueWords结果是否为sorted list。但是,它不返回列表。为什么?

def findUniqueWords(theList):
    newList = []
    words = []

    # Read a line at a time
    for item in theList:

        # Remove any punctuation from the line
        cleaned = cleanUp(item)

        # Split the line into separate words
        words = cleaned.split()

        # Evaluate each word
        for word in words:

            # Count each unique word
            if word not in newList:
                newList.append(word)

    answer = newList.sort()
    return answer

I’ve been able to verify that the findUniqueWords does result in a sorted list. However, it does not return the list. Why?

def findUniqueWords(theList):
    newList = []
    words = []

    # Read a line at a time
    for item in theList:

        # Remove any punctuation from the line
        cleaned = cleanUp(item)

        # Split the line into separate words
        words = cleaned.split()

        # Evaluate each word
        for word in words:

            # Count each unique word
            if word not in newList:
                newList.append(word)

    answer = newList.sort()
    return answer

回答 0

list.sort对列表进行适当排序,即不返回新列表。写就好了

newList.sort()
return newList

list.sort sorts the list in place, i.e. it doesn’t return a new list. Just write

newList.sort()
return newList
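
Applied to the function from the question, the fix might look like the sketch below. clean_up is a hypothetical stand-in for the question's cleanUp helper, which is not shown in the original, and the snake_case names are illustrative.

```python
import string


def clean_up(line):
    """Hypothetical stand-in for the question's cleanUp: strip punctuation."""
    return line.translate(str.maketrans("", "", string.punctuation))


def find_unique_words(lines):
    unique_words = []
    for line in lines:
        for word in clean_up(line).split():
            if word not in unique_words:
                unique_words.append(word)
    unique_words.sort()   # sorts in place and returns None
    return unique_words   # so return the list itself


words = find_unique_words(["the cat, the hat.", "a cat!"])
```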

回答 1

问题在这里:

answer = newList.sort()

sort不返回排序列表;而是将列表排序到位。

用:

answer = sorted(newList)

The problem is here:

answer = newList.sort()

sort does not return the sorted list; rather, it sorts the list in place.

Use:

answer = sorted(newList)

回答 2

这是 Guido van Rossum在Python开发清单中的一封电子邮件,解释了为什么他选择不返回self影响该对象的操作并且不返回新对象的原因。

这来自一种编码样式(在各种其他语言中很流行,我相信尤其是Lisp令人反感),其中可以将单个对象上的一系列副作用链接成这样:

 x.compress().chop(y).sort(z)

这将与

  x.compress()
  x.chop(y)
  x.sort(z)

我发现链接对可读性构成威胁;它要求读者必须对每种方法都非常熟悉。第二种形式清楚地表明,每个调用都作用于同一对象,因此,即使您不太了解类及其方法,您也可以理解第二种和第三种调用都适用于x(并且所有呼叫都是出于副作用),而不是其他。

我想为返回新值的操作保留链接,例如字符串处理操作:

 y = x.rstrip("\n").split(":").lower()

Here is an email from Guido van Rossum on Python’s dev list explaining why he chose not to return self from operations that modify the object rather than returning a new one.

This comes from a coding style (popular in various other languages, I believe especially Lisp revels in it) where a series of side effects on a single object can be chained like this:

 x.compress().chop(y).sort(z)

which would be the same as

  x.compress()
  x.chop(y)
  x.sort(z)

I find the chaining form a threat to readability; it requires that the reader must be intimately familiar with each of the methods. The second form makes it clear that each of these calls acts on the same object, and so even if you don’t know the class and its methods very well, you can understand that the second and third call are applied to x (and that all calls are made for their side-effects), and not to something else.

I’d like to reserve chaining for operations that return new values, like string processing operations:

 y = x.rstrip("\n").split(":").lower()

回答 3

蟒习惯性地返回None从功能和变异的数据的方法,例如list.sortlist.appendrandom.shuffle,与想法是,它暗示一个事实,即它是变异。

如果要进行迭代并返回其项目的新排序列表,请使用sorted内置函数。

Python habitually returns None from functions and methods that mutate the data, such as list.sort, list.append, and random.shuffle, the idea being that the None return value hints that the object was mutated in place.

If you want to take an iterable and return a new, sorted list of its items, use the sorted builtin function.
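
A short illustration of that convention, using only the three functions named above plus sorted. The variable names are illustrative.

```python
import random

items = [3, 1, 2]

sort_result = items.sort()              # mutates items
append_result = items.append(0)         # mutates items
shuffle_result = random.shuffle(items)  # mutates items

new_list = sorted(items)  # leaves items alone, returns a new list
```

All three mutating calls hand back None, while sorted returns a fresh list that is a different object from items.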


回答 4

Python有两种排序方式:sort 方法(或“成员函数”)和sort 函数。sort方法对命名对象的内容进行操作-将其视为对象要对其自身重新排序的操作。排序功能是对由对象表示的数据进行的操作,并按排序顺序返回内容相同的新对象。

给定一个名为l的整数列表,如果我们调用,则列表本身将重新排序l.sort()

>>> l = [1, 5, 2341, 467, 213, 123]
>>> l.sort()
>>> l
[1, 5, 123, 213, 467, 2341]

此方法没有返回值。但是,如果我们尝试分配的结果l.sort()呢?

>>> l = [1, 5, 2341, 467, 213, 123]
>>> r = l.sort()
>>> print(r)
None

r现在实际上等于什么。这是程序员在离开Python一段时间后很可能会忘记的那些怪异的,有些令人讨厌的细节(这就是为什么我要编写这个东西,所以不会再忘记了)。

sorted()另一方面,该函数不会对的内容做任何事情l,但会返回一个新的,排序后的列表,其内容与以下内容相同l

>>> l = [1, 5, 2341, 467, 213, 123]
>>> r = sorted(l)
>>> l
[1, 5, 2341, 467, 213, 123]
>>> r
[1, 5, 123, 213, 467, 2341]

请注意,返回的值是不是一个深拷贝,所以要谨慎了包含列表照常中的元素侧effecty操作:

>>> spam = [8, 2, 4, 7]
>>> eggs = [3, 1, 4, 5]
>>> l = [spam, eggs]
>>> r = sorted(l)
>>> l
[[8, 2, 4, 7], [3, 1, 4, 5]]
>>> r
[[3, 1, 4, 5], [8, 2, 4, 7]]
>>> spam.sort()
>>> eggs.sort()
>>> l
[[2, 4, 7, 8], [1, 3, 4, 5]]
>>> r
[[1, 3, 4, 5], [2, 4, 7, 8]]

Python has two kinds of sorts: a sort method (or “member function”) and a sort function. The sort method operates on the contents of the object named — think of it as an action that the object is taking to re-order itself. The sort function is an operation over the data represented by an object and returns a new object with the same contents in a sorted order.

Given a list of integers named l the list itself will be reordered if we call l.sort():

>>> l = [1, 5, 2341, 467, 213, 123]
>>> l.sort()
>>> l
[1, 5, 123, 213, 467, 2341]

This method has no return value. But what if we try to assign the result of l.sort()?

>>> l = [1, 5, 2341, 467, 213, 123]
>>> r = l.sort()
>>> print(r)
None

r is now None rather than a list. This is one of those weird, somewhat annoying details that a programmer is likely to forget about after a period of absence from Python (which is why I am writing this, so I don’t forget again).

The function sorted(), on the other hand, will not do anything to the contents of l, but will return a new, sorted list with the same contents as l:

>>> l = [1, 5, 2341, 467, 213, 123]
>>> r = sorted(l)
>>> l
[1, 5, 2341, 467, 213, 123]
>>> r
[1, 5, 123, 213, 467, 2341]

Be aware that the returned value is not a deep copy, so be cautious about side-effecty operations over elements contained within the list as usual:

>>> spam = [8, 2, 4, 7]
>>> eggs = [3, 1, 4, 5]
>>> l = [spam, eggs]
>>> r = sorted(l)
>>> l
[[8, 2, 4, 7], [3, 1, 4, 5]]
>>> r
[[3, 1, 4, 5], [8, 2, 4, 7]]
>>> spam.sort()
>>> eggs.sort()
>>> l
[[2, 4, 7, 8], [1, 3, 4, 5]]
>>> r
[[1, 3, 4, 5], [2, 4, 7, 8]]

回答 5

要了解为什么它不返回列表:

sort()不返回任何值,而sort()方法仅以特定顺序对给定列表的元素进行排序- 升序降序而不返回任何值。

所以问题在于answer = newList.sort()答案是否定的。

相反,您应该先单独调用newList.sort(),然后再return newList。

sort()方法的语法为:

list.sort(key=..., reverse=...)

另外,您也可以出于相同目的使用Python的内置函数sorted()。

sorted(list, key=..., reverse=...)

注意:sort()和sorted()之间最简单的区别是:sort()不返回任何值,而sorted()返回可迭代的列表。

所以就你而言answer = sorted(newList)

To understand why it does not return the list:

The sort() method sorts the elements of a given list in place, in ascending or descending order, and returns None.

So the problem is with answer = newList.sort(), which leaves answer set to None.

Instead, call newList.sort() on its own line and then return newList.

The syntax of the sort() method is:

list.sort(key=..., reverse=...)

Alternatively, you can use Python’s built-in function sorted() for the same purpose.

sorted(list, key=..., reverse=...)

Note: The simplest difference between sort() and sorted() is that sort() returns None, while sorted() returns a new sorted list.

So in your case answer = sorted(newList).
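
To make the two signatures concrete, here is a small sketch that uses the key and reverse arguments with both forms. The word list is made up for illustration.

```python
words = ["pear", "Apple", "fig", "banana"]

# sorted() builds a new list; the original is left untouched.
by_length_desc = sorted(words, key=len, reverse=True)

# list.sort() reorders the same list and returns None.
returned = words.sort(key=str.lower)
```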


回答 6

如果要返回排序列表,可以使用sorted()方法。比较方便

l1 = []
n = int(input())

for i in range(n):
  user = int(input())
  l1.append(user)
print(sorted(l1, reverse=True))

list.sort()方法就地修改列表,并返回None。

如果您仍然想使用排序,则可以执行此操作。

l1 = []
n = int(input())

for i in range(n):
  user = int(input())
  l1.append(user)
l1.sort(reverse=True)
print(l1)

You can use the built-in sorted() function if you want the sorted list returned. It’s more convenient.

l1 = []
n = int(input())

for i in range(n):
  user = int(input())
  l1.append(user)
print(sorted(l1, reverse=True))

The list.sort() method modifies the list in place and returns None.

If you still want to use sort, you can do this:

l1 = []
n = int(input())

for i in range(n):
  user = int(input())
  l1.append(user)
l1.sort(reverse=True)
print(l1)

sorted(key = lambda:…)后面的语法

问题:sorted(key = lambda:…)后面的语法

我不太明白该sorted()参数背后的语法:

key=lambda variable: variable[0]

是不是lambda随心所欲?为什么variable在一个什么样的表述中两次dict

I don’t quite understand the syntax behind the sorted() argument:

key=lambda variable: variable[0]

Isn’t lambda arbitrary? Why is variable stated twice in what looks like a dict?


回答 0

key是一个函数,在比较集合的项目之前将调用该函数。传递给的参数key必须是可调用的。

使用lambda创建一个匿名函数(可调用)。在sorted可调用的情况下仅采用一个参数。Python lambda很简单。它只能做并真正返回一件事。

语法lambda是单词,lambda后跟参数名称列表,然后是单个代码块。参数列表和代码块用冒号表示。这类似于在python其他构建体,以及诸如whileforif等。它们都是通常具有代码块的语句。Lambda只是带有代码块的语句的另一个实例。

我们可以将lambda与def的使用进行比较,以创建一个函数。

adder_lambda = lambda parameter1,parameter2: parameter1+parameter2
def adder_regular(parameter1, parameter2): return parameter1+parameter2

lambda只是为我们提供了一种无需分配名称的方法。这非常适合用作函数的参数。

variable 在此使用两次,因为在冒号的左手边它是参数的名称,而在右手边它在代码块中用于计算某些内容。

key is a function that will be called to transform the collection’s items before they are compared. The parameter passed to key must be something that is callable.

The use of lambda creates an anonymous function (which is callable). In the case of sorted, the callable takes only one parameter. Python’s lambda is pretty simple: its body is a single expression, and the value of that expression is what it returns.

The syntax of lambda is the word lambda followed by the list of parameter names, then a colon, then a single expression. The colon separates the parameter list from the body, which looks similar to constructs such as while, for, and if. Unlike those statements, though, a lambda is itself an expression whose body is one expression rather than a block of statements, which is why it can appear inline as an argument.

We can compare the use of lambda with that of def to create a function.

adder_lambda = lambda parameter1,parameter2: parameter1+parameter2
def adder_regular(parameter1, parameter2): return parameter1+parameter2

lambda just gives us a way of doing this without assigning a name, which makes it great for passing as an argument to a function.

variable is used twice here because on the left-hand side of the colon it is the name of a parameter, and on the right-hand side it is used in the expression that computes the result.
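
To tie this back to the key argument from the question, the sketch below passes the same "take element 0" logic to sorted in three equivalent ways. The pairs list and the first_item name are illustrative; operator.itemgetter is a standard-library alternative to the lambda.

```python
from operator import itemgetter

pairs = [(2, "b"), (1, "c"), (3, "a")]


def first_item(variable):
    return variable[0]


by_lambda = sorted(pairs, key=lambda variable: variable[0])
by_def = sorted(pairs, key=first_item)
by_itemgetter = sorted(pairs, key=itemgetter(0))
```

All three produce the same ordering; the lambda simply saves defining and naming a one-line function.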


回答 1

我认为这里的所有答案都很好地涵盖了lambda函数在sorted()上下文中的作用的核心,但是我仍然感觉缺乏对直观理解的描述,所以这里是我的两分钱。

为了完整起见,我先说一下显而易见的事情:sorted()返回已排序元素的列表,以及是否要以特定方式排序或是否要对元素的复杂列表进行排序(例如,嵌套列表或元组列表),我们可以调用key参数。

对我来说,关键参数的直观理解,为什么它必须是可调用的以及使用lambda作为(匿名)可调用函数来完成此操作的过程分为两个部分。

  1. 最终,使用lamba意味着您不必编写(定义)整个函数,就像一个例子所提供的那样。Lambda函数可以被创建,使用和立即销毁-因此它们不会使您的代码与只会被使用一次的更多代码捆绑在一起。据我了解,这是lambda函数的核心实用程序,它在此类角色中的应用广泛。它的语法纯属约定,从本质上讲,这通常是程序语法的本质。学习语法并完成它。

Lambda语法如下:

lambda input_variable(s): tasty one liner

例如

In [1]: f00 = lambda x: x/2

In [2]: f00(10)
Out[2]: 5.0

In [3]: (lambda x: x/2)(10)
Out[3]: 5.0

In [4]: (lambda x, y: x / y)(10, 2)
Out[4]: 5.0

In [5]: (lambda: 'amazing lambda')() # func with no args!
Out[5]: 'amazing lambda'

  2. key参数背后的想法是,它应该接受一组指令,这些指令本质上将把“ sorted()”功能指向那些应该用于排序的列表元素。当它说时key=,它的真正含义是:当我一次遍历列表中的一个元素时(即对于列表中的e),我将把当前元素传递给我在key参数中提供的函数,并使用该元素。创建一个转换后的列表,该列表将通知我最终排序列表的顺序。

看看这个:

mylist = [3,6,3,2,4,8,23]
sorted(mylist, key=WhatToSortBy)

基本示例:

sorted(mylist)

[2, 3, 3, 4, 6, 8, 23] # 所有数字按从小到大的顺序排列。

范例1:

mylist = [3,6,3,2,4,8,23]
sorted(mylist, key=lambda x: x%2==0)

[3,3,23,6,2,4,8]#此排序结果对您来说是否直观?

请注意,我的lambda函数告诉sorted在排序之前检查(e)是偶数还是奇数。

可是等等!您可能(或者也许应该)想知道两件事-首先,为什么我的赔率比我的赔率还要高(因为我的关键值似乎是在告诉我的排序函数通过使用中的mod运算符来优先考虑偶数x%2==0)。第二,为什么我的偶数不正常?2先于6吧?通过分析此结果,我们将更深入地了解sorted()’key’参数如何工作,尤其是与匿名lambda函数结合使用时。

首先,您会注意到虽然赔率先于偶数,但偶数本身并未排序。为什么是这样??让我们阅读文档

关键函数从Python 2.4开始,list.sort()和sorted()都添加了一个关键参数,以指定要在进行比较之前在每个列表元素上调用的函数。

我们必须在这里在各行之间进行一些阅读,但这告诉我们,sort函数仅被调用一次,并且如果我们指定key参数,那么我们将按key函数指向我们的值进行排序。

那么使用模数返回的示例又是什么呢?布尔值:True == 1False == 0。那么排序如何处理这个键?它基本上将原始列表转换为1和0的序列。

[3,6,3,2,4,8,23]变为[0,1,0,1,1,1,0]

现在我们到了某个地方。对转换后的列表进行排序会得到什么?

[0,0,0,1,1,1,1]

好的,现在我们知道了为什么赔率要高于平均赔率了。但是,下一个问题是:为什么最终列表中的6仍然排在2之前?嗯,这很容易-因为排序只发生一次!即那些1仍然代表原始列表值,它们处于彼此相对的原始位置。由于排序仅发生一次,并且我们不调用任何排序函数来将原始偶数值从低到高排序,因此这些值相对于彼此保持原始顺序。

那么最后的问题是:当我打印出最终的排序列表时,我如何在概念上思考布尔值的顺序如何转换回原始值?

Sorted()是一种内置方法(事实)使用称为Timsort的混合排序算法结合了合并排序和插入排序的各个方面。在我看来,当您调用它时,有一种机制可以将这些值保存在内存中,并将它们与由(…!)lambda函数确定的布尔标识(掩码)捆绑在一起。顺序由通过lambda函数计算出的布尔身份确定,但请记住,这些子列表(一个和一个零)本身并不按其原始值排序。因此,最终列表虽然由奇数和偶数组织,但不会按子列表排序(在这种情况下,偶数是乱序的)。赔率排序的事实是因为它们在原始列表中已经是巧合的。从所有这些中得出的结论是,当lambda进行该转换时,将保留子列表的原始顺序。

那么,这一切与原始问题有何关系,更重要的是,我们对如何使用关键参数和lambda实现sorted()的直觉?

该lambda函数可以被视为指向我们需要排序的值的指针,它是将值映射到由lambda函数转换后的布尔值的指针,还是其在嵌套列表tuple中的特定元素, dict等,同样由lambda函数确定。

让我们尝试预测当我运行以下代码时会发生什么。

mylist = [(3, 5, 8), (6, 2, 8), ( 2, 9, 4), (6, 8, 5)]
sorted(mylist, key=lambda x: x[1])

我的sorted电话显然说:“请对该列表进行排序”。关键参数通过对mylist中的每个元素(x)说,返回该元素的索引1,然后按lambda函数。由于我们有一个元组列表,因此我们可以从该元组返回一个索引元素。这样我们得到:

[(6,2,8),(3,5,8),(6,8,5),(2,9,4)]

运行该代码,您会发现这就是命令。尝试索引整数列表,您会发现代码中断。

这是一个冗长的解释,但是我希望这有助于“整理”您对使用lambda函数作为sorted()及以后的关键参数的直觉。

I think all of the answers here cover the core of what the lambda function does in the context of sorted() quite nicely, however I still feel like a description that leads to an intuitive understanding is lacking, so here is my two cents.

For the sake of completeness, I’ll state the obvious up front: sorted() returns a list of sorted elements and if we want to sort in a particular way or if we want to sort a complex list of elements (e.g. nested lists or a list of tuples) we can invoke the key argument.

For me, the intuitive understanding of the key argument, why it has to be callable, and the use of lambda as the (anonymous) callable function to accomplish this comes in two parts.

  1. Using lambda ultimately means you don’t have to write (define) an entire function, like the one sblom provided an example of. Lambda functions are created, used, and immediately destroyed – so they don’t funk up your code with more code that will only ever be used once. This, as I understand it, is the core utility of the lambda function and its applications for such roles are broad. Its syntax is purely by convention, which is in essence the nature of programmatic syntax in general. Learn the syntax and be done with it.

Lambda syntax is as follows:

lambda input_variable(s): tasty one liner

e.g.

In [1]: f00 = lambda x: x/2

In [2]: f00(10)
Out[2]: 5.0

In [3]: (lambda x: x/2)(10)
Out[3]: 5.0

In [4]: (lambda x, y: x / y)(10, 2)
Out[4]: 5.0

In [5]: (lambda: 'amazing lambda')() # func with no args!
Out[5]: 'amazing lambda'

  2. The idea behind the key argument is that it should take in a set of instructions that will essentially point the ‘sorted()’ function at those list elements which should be used to sort by. When it says key=, what it really means is: As I iterate through the list one element at a time (i.e. for e in list), I’m going to pass the current element to the function I provide in the key argument and use that to create a transformed list which will inform me on the order of the final sorted list.

Check it out:

mylist = [3,6,3,2,4,8,23]
sorted(mylist, key=WhatToSortBy)

Base example:

sorted(mylist)

[2, 3, 3, 4, 6, 8, 23] # all numbers are in order from small to large.

Example 1:

mylist = [3,6,3,2,4,8,23]
sorted(mylist, key=lambda x: x%2==0)

[3, 3, 23, 6, 2, 4, 8] # Does this sorted result make intuitive sense to you?

Notice that my lambda function told sorted to check if (e) was even or odd before sorting.

BUT WAIT! You may (or perhaps should) be wondering two things – first, why are my odds coming before my evens (since my key value seems to be telling my sorted function to prioritize evens by using the mod operator in x%2==0). Second, why are my evens out of order? 2 comes before 6 right? By analyzing this result, we’ll learn something deeper about how the sorted() ‘key’ argument works, especially in conjunction with the anonymous lambda function.

Firstly, you’ll notice that while the odds come before the evens, the evens themselves are not sorted. Why is this?? Lets read the docs:

Key Functions Starting with Python 2.4, both list.sort() and sorted() added a key parameter to specify a function to be called on each list element prior to making comparisons.

We have to do a little bit of reading between the lines here, but what this tells us is that the sort function is only called once, and if we specify the key argument, then we sort by the value that key function points us to.

So what does the example using a modulo return? A boolean value: True == 1, False == 0. So how does sorted deal with this key? It basically transforms the original list to a sequence of 1s and 0s.

[3,6,3,2,4,8,23] becomes [0,1,0,1,1,1,0]

Now we’re getting somewhere. What do you get when you sort the transformed list?

[0,0,0,1,1,1,1]

Okay, so now we know why the odds come before the evens. But the next question is: Why does the 6 still come before the 2 in my final list? Well that’s easy: it’s because sorting only happens once, on the keys, and Python’s sort is stable! i.e. those 1s still represent the original list values, and items with equal keys keep their original positions relative to each other. Since we don’t call any kind of sort function to order the original even values from low to high, those values remain in their original order relative to one another.

The final question is then this: How do I think conceptually about how the order of my boolean values get transformed back in to the original values when I print out the final sorted list?

sorted() is a built-in function that (fun fact) uses a hybrid sorting algorithm called Timsort that combines aspects of merge sort and insertion sort. When you call it with a key, the key function is called once for each element, and each value is paired with the key computed for it (here, the boolean from the lambda function). The order is determined by those keys alone, so the sublists (of ones and zeros) are not themselves sorted by their original values. Hence, the final list, while organized by odds and evens, is not sorted within each sublist (the evens in this case are out of order). The fact that the odds are ordered is because they were already in order by coincidence in the original list. The takeaway from all this is that because the sort is stable, the original order within each sublist is retained.

So how does this all relate back to the original question, and more importantly, our intuition on how we should implement sorted() with its key argument and lambda?

That lambda function can be thought of as a pointer that points to the values we need to sort by, whether it’s a pointer mapping a value to its boolean transformed by the lambda function, or a particular element in a nested list, tuple, dict, etc., again determined by the lambda function.

Let’s try and predict what happens when I run the following code.

mylist = [(3, 5, 8), (6, 2, 8), ( 2, 9, 4), (6, 8, 5)]
sorted(mylist, key=lambda x: x[1])

My sorted call obviously says, “Please sort this list”. The key argument makes that a little more specific by saying, for each element (x) in mylist, return index 1 of that element, then sort all of the elements of the original list ‘mylist’ by the sorted order of the list calculated by the lambda function. Since we have a list of tuples, we can return an indexed element from that tuple. So we get:

[(6, 2, 8), (3, 5, 8), (6, 8, 5), (2, 9, 4)]

Run that code, and you’ll find that this is the order. Try indexing a list of integers and you’ll find that the code breaks.

This was a long winded explanation, but I hope this helps to ‘sort’ your intuition on the use of lambda functions as the key argument in sorted() and beyond.
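
The odd/even walkthrough above can be reproduced in a few lines. This is a sketch using the same mylist; the tuple key at the end is an addition not in the original answer, showing one way to also order the values within each group.

```python
mylist = [3, 6, 3, 2, 4, 8, 23]

# The key is computed once per element; here it is a boolean.
keys = [x % 2 == 0 for x in mylist]

# Equal keys keep their original relative order (the sort is stable).
odds_then_evens = sorted(mylist, key=lambda x: x % 2 == 0)

# A tuple key breaks ties: parity first, then the value itself.
fully_ordered = sorted(mylist, key=lambda x: (x % 2 == 0, x))
```

Tuples compare element by element, so the second item only matters when the parity is the same.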


回答 2

lambda是一个Python关键字,用于生成匿名函数

>>> (lambda x: x+2)(3)
5

lambda is a Python keyword that is used to generate anonymous functions.

>>> (lambda x: x+2)(3)
5

回答 3

variable左侧:是参数名称。采用variable右侧正在使用的参数。

意思几乎完全相同:

def some_method(variable):
  return variable[0]

The variable left of the : is a parameter name. The use of variable on the right is making use of the parameter.

Means almost exactly the same as:

def some_method(variable):
  return variable[0]

回答 4

使用key = lambda的sorted()函数的另一个示例。让我们考虑一下您有一个元组列表。在每个元组中,您都有汽车的品牌,型号和重量,并且您想要按品牌,型号或重量对这个元组列表进行排序。您可以使用lambda来完成。

cars = [('citroen', 'xsara', 1100), ('lincoln', 'navigator', 2000), ('bmw', 'x5', 1700)]

print(sorted(cars, key=lambda car: car[0]))
print(sorted(cars, key=lambda car: car[1]))
print(sorted(cars, key=lambda car: car[2]))

结果:

[('bmw', 'x5', 1700), ('citroen', 'xsara', 1100), ('lincoln', 'navigator', 2000)]
[('lincoln', 'navigator', 2000), ('bmw', 'x5', 1700), ('citroen', 'xsara', 1100)]
[('citroen', 'xsara', 1100), ('bmw', 'x5', 1700), ('lincoln', 'navigator', 2000)]

One more example of using the sorted() function with key=lambda. Suppose you have a list of tuples. Each tuple holds the brand, model and weight of a car, and you want to sort this list of tuples by brand, model or weight. You can do it with lambda.

cars = [('citroen', 'xsara', 1100), ('lincoln', 'navigator', 2000), ('bmw', 'x5', 1700)]

print(sorted(cars, key=lambda car: car[0]))
print(sorted(cars, key=lambda car: car[1]))
print(sorted(cars, key=lambda car: car[2]))

Results:

[('bmw', 'x5', 1700), ('citroen', 'xsara', 1100), ('lincoln', 'navigator', 2000)]
[('lincoln', 'navigator', 2000), ('bmw', 'x5', 1700), ('citroen', 'xsara', 1100)]
[('citroen', 'xsara', 1100), ('bmw', 'x5', 1700), ('lincoln', 'navigator', 2000)]

回答 5

lambda是匿名函数,不是任意函数。接受的参数将是您正在使用的变量以及对其进行排序的列。

lambda creates an anonymous function, not an arbitrary one. The parameter it accepts is the item you’re working with, and its expression picks out the column you’re sorting on.


回答 6

由于在的上下文中询问了lambda的用法,因此也请sorted()看看https://wiki.python.org/moin/HowTo/Sorting/#Key_Functions

Since the usage of lambda was asked in the context of sorted(), take a look at this as well https://wiki.python.org/moin/HowTo/Sorting/#Key_Functions


回答 7

换个说法,排序函数中的键(可选。要执行以决定顺序的函数。默认为None)需要一个函数,而您使用的是lambda。

要定义lambda,请指定要排序的对象属性,而python的内置排序函数将自动处理它。

如果要按多个属性排序,则分配key = lambda x:(property1,property2)。

要指定排序方式,请将sorted函数的第三个参数(可选。布尔值。False将按升序排序,True将按降序排序。默认值为False)传递reverse = true。

Just to rephrase, the key (Optional. A function to execute to decide the order. Default is None) in the sorted function expects a function, and you use lambda to supply it.

To define the lambda, you specify the object property you want to sort by, and Python’s built-in sorted function will automatically take care of the rest.

If you want to sort by multiple properties, return a tuple from the key, e.g. key = lambda x: (x.property1, x.property2), where property1 and property2 stand for whatever properties you sort by.

To specify the sort direction, pass reverse=True as a keyword argument to sorted (Optional. A Boolean. False will sort ascending, True will sort descending. Default is False).
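
A minimal sketch of sorting by more than one property and of the reverse flag. The people list is made up for illustration, and tuple indexes stand in for the properties.

```python
people = [("Lee", 30), ("Kim", 25), ("Lee", 22), ("Kim", 41)]

# Sort by name, then by age within each name.
by_name_then_age = sorted(people, key=lambda p: (p[0], p[1]))

# reverse is passed by keyword; it flips the whole ordering.
descending = sorted(people, key=lambda p: (p[0], p[1]), reverse=True)
```

Note that reverse applies to the combined key, so both the name and the age order are flipped together.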


回答 8

简单且不耗时的答案,并提供与所提问题相关的示例,请 按照以下示例操作:

user = [{"name": "Dough", "age": 55},
        {"name": "Ben", "age": 44},
        {"name": "Citrus", "age": 33},
        {"name": "Abdullah", "age": 22},
        ]
print(sorted(user, key=lambda el: el["name"]))
print(sorted(user, key=lambda y: y["age"]))

查看列表中的名称,它们以D,B,C和A开头。如果您注意到年龄,则分别是55、44、33和22。第一个打印代码

print(sorted(user, key=lambda el: el["name"]))

结果为:

[{'name': 'Abdullah', 'age': 22}, 
{'name': 'Ben', 'age': 44}, 
{'name': 'Citrus', 'age': 33}, 
{'name': 'Dough', 'age': 55}]

对名称进行排序,因为通过key = lambda el:el [“ name”]我们对名称进行排序,并且名称按字母顺序返回。

第二次印刷代码

print(sorted(user, key= lambda y: y["age"]))

结果:

[{'name': 'Abdullah', 'age': 22},
 {'name': 'Citrus', 'age': 33},
 {'name': 'Ben', 'age': 44}, 
 {'name': 'Dough', 'age': 55}]

按年龄排序,因此列表按年龄升序返回。

尝试使用此代码可以更好地理解。

A simple, quick answer with an example relevant to the question asked. Follow this example:

user = [{"name": "Dough", "age": 55},
        {"name": "Ben", "age": 44},
        {"name": "Citrus", "age": 33},
        {"name": "Abdullah", "age": 22},
        ]
print(sorted(user, key=lambda el: el["name"]))
print(sorted(user, key=lambda y: y["age"]))

Look at the names in the list: they start with D, B, C and A. And if you look at the ages, they are 55, 44, 33 and 22. The first print statement

print(sorted(user, key=lambda el: el["name"]))

Results to:

[{'name': 'Abdullah', 'age': 22}, 
{'name': 'Ben', 'age': 44}, 
{'name': 'Citrus', 'age': 33}, 
{'name': 'Dough', 'age': 55}]

sorts by name: with key=lambda el: el["name"] each dictionary is compared by its name, so the entries come back in alphabetical order.

The second print code

print(sorted(user, key= lambda y: y["age"]))

Result:

[{'name': 'Abdullah', 'age': 22},
 {'name': 'Citrus', 'age': 33},
 {'name': 'Ben', 'age': 44}, 
 {'name': 'Dough', 'age': 55}]

sorts by age, and hence the list returns by ascending order of age.

Try this code for better understanding.
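
The same two sorts can also be written with operator.itemgetter from the standard library instead of a lambda. This is an optional variation on the example above, and the names_in_order and ages_in_order variables are illustrative.

```python
from operator import itemgetter

user = [
    {"name": "Dough", "age": 55},
    {"name": "Ben", "age": 44},
    {"name": "Citrus", "age": 33},
    {"name": "Abdullah", "age": 22},
]

names_in_order = [u["name"] for u in sorted(user, key=itemgetter("name"))]
ages_in_order = [u["age"] for u in sorted(user, key=itemgetter("age"))]
```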


如何以完全相同的方式对两个列表(相互引用)进行排序

问题:如何以完全相同的方式对两个列表(相互引用)进行排序

说我有两个清单:

list1 = [3, 2, 4, 1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

如果我运行list1.sort(),它将进行排序,[1,1,2,3,4]但是还有没有一种list2同步的方法(因此我可以说item 4属于'three')?因此,预期输出为:

list1 = [1, 1, 2, 3, 4]
list2 = ['one', 'one2', 'two', 'three', 'four']

我的问题是我有一个非常复杂的程序,可以很好地处理列表,但是我有点需要开始引用一些数据。我知道这对字典来说是一个完美的情况,但是我在处理过程中尽量避免使用字典,因为我确实需要对键值进行排序(如果必须使用字典,我知道如何使用它们)。

基本上,该程序的性质是,数据按随机顺序排列(如上),我需要对其进行排序,处理然后发送结果(顺序无关紧要,但是用户需要知道哪个结果属于哪个结果)键)。我考虑过先将其放入字典中,然后再对列表进行排序,但是如果不保持顺序(如果将结果传达给用户,可能会产生影响),我将无法区分具有相同值的项。因此,理想情况下,一旦获得列表,我就想出一种将两个列表排序在一起的方法。这可能吗?

Say I have two lists:

list1 = [3, 2, 4, 1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

If I run list1.sort(), it’ll sort it to [1,1,2,3,4] but is there a way to get list2 in sync as well (so I can say item 4 belongs to 'three')? So, the expected output would be:

list1 = [1, 1, 2, 3, 4]
list2 = ['one', 'one2', 'two', 'three', 'four']

My problem is I have a pretty complex program that is working fine with lists but I sort of need to start referencing some data. I know this is a perfect situation for dictionaries but I’m trying to avoid dictionaries in my processing because I do need to sort the key values (if I must use dictionaries I know how to use them).

Basically, the nature of this program is that the data comes in a random order (like above); I need to sort it, process it and then send out the results (order doesn’t matter, but users need to know which result belongs to which key). I thought about putting it in a dictionary first and then sorting list one, but I would have no way of differentiating between items with the same value if order is not maintained (it may have an impact when communicating the results to users). So ideally, once I get the lists, I would rather figure out a way to sort both lists together. Is this possible?


回答 0

解决此问题的一种经典方法是使用“装饰,排序,未装饰”习惯用法,使用python的内置zip函数特别简单:

>>> list1 = [3,2,4,1, 1]
>>> list2 = ['three', 'two', 'four', 'one', 'one2']
>>> list1, list2 = zip(*sorted(zip(list1, list2)))
>>> list1
(1, 1, 2, 3, 4)
>>> list2 
('one', 'one2', 'two', 'three', 'four')

这些当然不再是列表,但是如果需要的话,很容易纠正:

>>> list1, list2 = (list(t) for t in zip(*sorted(zip(list1, list2))))
>>> list1
[1, 1, 2, 3, 4]
>>> list2
['one', 'one2', 'two', 'three', 'four']

值得一提的是,以上可能会为简洁而牺牲速度。就地版本,占用3行,对于我的小型列表来说,在我的机器上快了一点:

>>> %timeit zip(*sorted(zip(list1, list2)))
100000 loops, best of 3: 3.3 us per loop
>>> %timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100000 loops, best of 3: 2.84 us per loop

另一方面,对于较大的列表,单行版本可能会更快:

>>> %timeit zip(*sorted(zip(list1, list2)))
100 loops, best of 3: 8.09 ms per loop
>>> %timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100 loops, best of 3: 8.51 ms per loop

正如Quantum7指出的那样,JSF的建议仍然要快一些,但可能只会快一点,因为Python 内部在所有基于键的排序中使用了完全相同的DSU习惯用法。它发生在离裸机更近的地方。(这表明zip例程的优化程度如何!)

我认为zip基于方法的灵活性更高,可读性更高,所以我更喜欢它。

One classic approach to this problem is to use the “decorate, sort, undecorate” idiom, which is especially simple using python’s built-in zip function:

>>> list1 = [3,2,4,1, 1]
>>> list2 = ['three', 'two', 'four', 'one', 'one2']
>>> list1, list2 = zip(*sorted(zip(list1, list2)))
>>> list1
(1, 1, 2, 3, 4)
>>> list2 
('one', 'one2', 'two', 'three', 'four')

These of course are no longer lists, but that’s easily remedied, if it matters:

>>> list1, list2 = (list(t) for t in zip(*sorted(zip(list1, list2))))
>>> list1
[1, 1, 2, 3, 4]
>>> list2
['one', 'one2', 'two', 'three', 'four']

It’s worth noting that the above may sacrifice speed for terseness; the in-place version, which takes up 3 lines, is a tad faster on my machine for small lists:

>>> %timeit zip(*sorted(zip(list1, list2)))
100000 loops, best of 3: 3.3 us per loop
>>> %timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100000 loops, best of 3: 2.84 us per loop

On the other hand, for larger lists, the one-line version could be faster:

>>> %timeit zip(*sorted(zip(list1, list2)))
100 loops, best of 3: 8.09 ms per loop
>>> %timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100 loops, best of 3: 8.51 ms per loop

As Quantum7 points out, JSF’s suggestion is a bit faster still, but it will probably only ever be a little bit faster, because Python uses the very same DSU idiom internally for all key-based sorts. It’s just happening a little closer to the bare metal. (This shows just how well optimized the zip routines are!)

I think the zip-based approach is more flexible and is a little more readable, so I prefer it.
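
For Python 3, where zip returns an iterator and has no sort() method, the idiom can be wrapped in a small helper. The sketch below is one possible way to do it, not the answer's own code: the name sort_together is hypothetical, and passing key is an added choice so that only the first list's values are compared (ties keep their original order, and the second list may hold items that cannot be ordered).

```python
def sort_together(keys, values):
    """Return sorted copies of both lists, ordered by `keys`.

    Only the keys are compared, so equal keys keep their original
    relative order (Python's sort is stable).
    """
    if not keys:
        return [], []
    pairs = sorted(zip(keys, values), key=lambda pair: pair[0])
    sorted_keys, sorted_values = zip(*pairs)
    return list(sorted_keys), list(sorted_values)


list1 = [3, 2, 4, 1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']
list1, list2 = sort_together(list1, list2)
print(list1)  # [1, 1, 2, 3, 4]
print(list2)  # ['one', 'one2', 'two', 'three', 'four']
```

Without the key argument, ties in the first list would be broken by comparing the second list's items, which raises TypeError in Python 3 for unorderable types such as dicts.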


回答 1

您可以使用值作为键对索引进行排序:

indexes = range(len(list1))
indexes.sort(key=list1.__getitem__)

要获得给定排序索引的排序列表:

sorted_list1 = map(list1.__getitem__, indexes)
sorted_list2 = map(list2.__getitem__, indexes)

在您的情况下,您不应有list1list2而应有一个单对列表:

data = [(3, 'three'), (2, 'two'), (4, 'four'), (1, 'one'), (1, 'one2')]

易于创建;在Python中很容易排序:

data.sort() # sort using a pair as a key

仅按第一个值排序:

data.sort(key=lambda pair: pair[0])

You can sort indexes using values as keys:

indexes = list(range(len(list1)))  # list() so this also works on Python 3
indexes.sort(key=list1.__getitem__)

To get sorted lists given sorted indexes:

sorted_list1 = list(map(list1.__getitem__, indexes))
sorted_list2 = list(map(list2.__getitem__, indexes))

In your case you shouldn’t have list1, list2 but rather a single list of pairs:

data = [(3, 'three'), (2, 'two'), (4, 'four'), (1, 'one'), (1, 'one2')]

It is easy to create; it is easy to sort in Python:

data.sort() # sort using a pair as a key

Sort by the first value only:

data.sort(key=lambda pair: pair[0])

回答 2

我一直使用senderle给出的答案,直到发现为止np.argsort。下面是它的工作原理。

# idx works on np.array and not lists.
list1 = np.array([3,2,4,1])
list2 = np.array(["three","two","four","one"])
idx   = np.argsort(list1)

list1 = np.array(list1)[idx]
list2 = np.array(list2)[idx]

我发现此解决方案更加直观,并且效果很好。性能:

def sorting(l1, l2):
    # l1 and l2 has to be numpy arrays
    idx = np.argsort(l1)
    return l1[idx], l2[idx]

# list1 and list2 are np.arrays here...
%timeit sorting(list1, list2)
100000 loops, best of 3: 3.53 us per loop

# This works best when the lists are NOT np.array
%timeit zip(*sorted(zip(list1, list2)))
100000 loops, best of 3: 2.41 us per loop

# 0.01us better for np.array (I think this is negligible)
%timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100000 loops, best for 3 loops: 1.96 us per loop

尽管np.argsort不是最快的,但我发现它更易于使用。

I have used the answer given by senderle for a long time until I discovered np.argsort. Here is how it works.

# idx works on np.array and not lists.
list1 = np.array([3,2,4,1])
list2 = np.array(["three","two","four","one"])
idx   = np.argsort(list1)

list1 = np.array(list1)[idx]
list2 = np.array(list2)[idx]

I find this solution more intuitive, and it works really well. The performance:

def sorting(l1, l2):
    # l1 and l2 have to be numpy arrays
    idx = np.argsort(l1)
    return l1[idx], l2[idx]

# list1 and list2 are np.arrays here...
%timeit sorting(list1, list2)
100000 loops, best of 3: 3.53 us per loop

# This works best when the lists are NOT np.array
%timeit zip(*sorted(zip(list1, list2)))
100000 loops, best of 3: 2.41 us per loop

# 0.01us better for np.array (I think this is negligible)
%timeit tups = zip(list1, list2); tups.sort(); zip(*tups)
100000 loops, best for 3 loops: 1.96 us per loop

Even though np.argsort isn’t the fastest one, I find it easier to use.
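
One detail worth noting for the original question, which has a repeated key: np.argsort does not promise to keep equal keys in their original order unless a stable sort is requested. A minimal sketch, assuming NumPy is installed; the variable names are illustrative only.

```python
import numpy as np

keys = np.array([3, 2, 4, 1, 1])
labels = np.array(['three', 'two', 'four', 'one', 'one2'])

# kind='stable' keeps equal keys in their original relative order
idx = np.argsort(keys, kind='stable')
sorted_keys = keys[idx]
sorted_labels = labels[idx]

print(sorted_keys.tolist())    # [1, 1, 2, 3, 4]
print(sorted_labels.tolist())  # ['one', 'one2', 'two', 'three', 'four']
```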


回答 3

施瓦兹变换。内置的Python排序是稳定的,因此这两个1不会引起问题。

>>> l1 = [3, 2, 4, 1, 1]
>>> l2 = ['three', 'two', 'four', 'one', 'second one']
>>> zip(*sorted(zip(l1, l2)))
[(1, 1, 2, 3, 4), ('one', 'second one', 'two', 'three', 'four')]

Schwartzian transform. The built-in Python sorting is stable, so the two 1s don’t cause a problem.

>>> l1 = [3, 2, 4, 1, 1]
>>> l2 = ['three', 'two', 'four', 'one', 'second one']
>>> zip(*sorted(zip(l1, l2)))
[(1, 1, 2, 3, 4), ('one', 'second one', 'two', 'three', 'four')]

回答 4

关于什么:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

sortedRes = sorted(zip(list1, list2), key=lambda x: x[0]) # use 0 or 1 depending on what you want to sort
>>> [(1, 'one'), (1, 'one2'), (2, 'two'), (3, 'three'), (4, 'four')]

What about:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

sortedRes = sorted(zip(list1, list2), key=lambda x: x[0]) # use 0 or 1 depending on what you want to sort
>>> [(1, 'one'), (1, 'one2'), (2, 'two'), (3, 'three'), (4, 'four')]

回答 5

您可以使用zip()sort()函数来完成此操作:

Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1] on cygwin
>>> list1 = [3,2,4,1,1]
>>> list2 = ['three', 'two', 'four', 'one', 'one2']
>>> zipped = zip(list1, list2)
>>> zipped.sort()
>>> slist1 = [i for (i, s) in zipped]
>>> slist1
[1, 1, 2, 3, 4]
>>> slist2 = [s for (i, s) in zipped]
>>> slist2
['one', 'one2', 'two', 'three', 'four']

希望这可以帮助

You can use the zip() and sort() functions to accomplish this:

Python 2.6.5 (r265:79063, Jun 12 2010, 17:07:01)
[GCC 4.3.4 20090804 (release) 1] on cygwin
>>> list1 = [3,2,4,1,1]
>>> list2 = ['three', 'two', 'four', 'one', 'one2']
>>> zipped = zip(list1, list2)
>>> zipped.sort()
>>> slist1 = [i for (i, s) in zipped]
>>> slist1
[1, 1, 2, 3, 4]
>>> slist2 = [s for (i, s) in zipped]
>>> slist2
['one', 'one2', 'two', 'three', 'four']

Hope this helps


回答 6

除非在list2中有两个相同的值,否则可以在sorted()方法中使用key参数。

代码如下:

sorted(list2, key = lambda x: list1[list2.index(x)]) 

它根据list1中的对应值对list2进行排序,但请确保在使用此列表时,list2中的两个值都不会相等,因为list.index()函数会给出第一个值

You can use the key argument of sorted(), unless list2 contains two equal values.

The code is given below:

sorted(list2, key = lambda x: list1[list2.index(x)]) 

It sorts list2 according to the corresponding values in list1, but make sure that no two values in list2 compare equal when using this, because list.index() returns the index of the first matching value.
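
To make the caveat concrete, here is a small sketch with made-up data in which list2 contains a repeated value. The index-based key gives both copies the same key, while pairing the items with zip first gives each element its own key.

```python
list1 = [3, 1, 0, 2]
list2 = ['a', 'b', 'a', 'c']   # 'a' appears twice

# list2.index('a') is always 0, so both 'a' entries get the key list1[0] == 3
by_index = sorted(list2, key=lambda x: list1[list2.index(x)])

# Pairing first keeps every element attached to its own key
by_pairs = [label for _, label in sorted(zip(list1, list2), key=lambda p: p[0])]

print(by_index)  # ['b', 'c', 'a', 'a']  (the second 'a' is misplaced)
print(by_pairs)  # ['a', 'b', 'c', 'a']
```

The index-based version also calls list.index() once per element, which makes it quadratic in the list length.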


回答 7

一种方法是通过对标识[0,1,2,.. n]进行排序来跟踪每个索引的位置

这适用于任意数量的列表。

然后将每个项目移到其位置。最好使用接头。

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

index = list(range(len(list1)))
print(index)
'[0, 1, 2, 3, 4]'

index.sort(key = list1.__getitem__)
print(index)
'[3, 4, 1, 0, 2]'

list1[:] = [list1[i] for i in index]
list2[:] = [list2[i] for i in index]

print(list1)
print(list2)
'[1, 1, 2, 3, 4]'
"['one', 'one2', 'two', 'three', 'four']"

请注意,我们可以对列表进行迭代而无需对它们进行排序:

list1_iter = (list1[i] for i in index)

One way is to track where each index goes to by sorting the identity [0,1,2,..n]

This works for any number of lists.

Then move each item to its position. Slice assignment works best, since it updates the existing lists in place.

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

index = list(range(len(list1)))
print(index)
'[0, 1, 2, 3, 4]'

index.sort(key = list1.__getitem__)
print(index)
'[3, 4, 1, 0, 2]'

list1[:] = [list1[i] for i in index]
list2[:] = [list2[i] for i in index]

print(list1)
print(list2)
'[1, 1, 2, 3, 4]'
"['one', 'one2', 'two', 'three', 'four']"

Note we could have iterated the lists without even sorting them:

list1_iter = (list1[i] for i in index)
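
Since the answer says this works for any number of lists, the idea can be sketched as a helper that takes the key list plus any number of companions. The function name sort_by_first and the flags list are hypothetical, added only for illustration; unlike the slice assignment above, this version returns new lists.

```python
def sort_by_first(key_list, *others):
    """Return sorted copies of key_list and each list in others,
    all reordered by the sorted order of key_list."""
    # Sort the identity permutation using the key list's values as keys
    order = sorted(range(len(key_list)), key=key_list.__getitem__)
    return [[seq[i] for i in order] for seq in (key_list, *others)]


nums = [3, 2, 4, 1, 1]
names = ['three', 'two', 'four', 'one', 'one2']
flags = [True, False, True, False, True]

nums_s, names_s, flags_s = sort_by_first(nums, names, flags)
print(nums_s)   # [1, 1, 2, 3, 4]
print(names_s)  # ['one', 'one2', 'two', 'three', 'four']
print(flags_s)  # [False, True, False, True, True]
```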

回答 8

如果您使用的是numpy,则可以np.argsort用来获取排序的索引,并将这些索引应用于列表。这适用于您要排序的任何数量的列表。

import numpy as np

arr1 = np.array([4,3,1,32,21])
arr2 = arr1 * 10
sorted_idxs = np.argsort(arr1)

print(sorted_idxs)
>>> array([2, 1, 0, 4, 3])

print(arr1[sorted_idxs])
>>> array([ 1,  3,  4, 21, 32])

print(arr2[sorted_idxs])
>>> array([ 10,  30,  40, 210, 320])

If you are using numpy, you can use np.argsort to get the sorted indices and apply those indices to each array. This works for any number of arrays that you want to sort.

import numpy as np

arr1 = np.array([4,3,1,32,21])
arr2 = arr1 * 10
sorted_idxs = np.argsort(arr1)

print(sorted_idxs)
>>> array([2, 1, 0, 4, 3])

print(arr1[sorted_idxs])
>>> array([ 1,  3,  4, 21, 32])

print(arr2[sorted_idxs])
>>> array([ 10,  30,  40, 210, 320])

回答 9

算法解决方案:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']


lis = [(list1[i], list2[i]) for i in range(len(list1))]
list1.sort()
list2 = [x[1] for v in sorted(set(list1)) for x in lis if x[0] == v]  # iterate over the actual key values, not 0..n-1

输出: -> 输出速度: 0.2s

>>>list1
>>>[1, 1, 2, 3, 4]
>>>list2
>>>['one', 'one2', 'two', 'three', 'four']

an algorithmic solution:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']


lis = [(list1[i], list2[i]) for i in range(len(list1))]
list1.sort()
list2 = [x[1] for v in sorted(set(list1)) for x in lis if x[0] == v]  # iterate over the actual key values, not 0..n-1

Outputs: -> Output speed: 0.2s

>>>list1
>>>[1, 1, 2, 3, 4]
>>>list2
>>>['one', 'one2', 'two', 'three', 'four']

回答 10

在对另一个列表进行排序时,保留字符串列表顺序的另一种方法如下:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

# sort on list1 while retaining order of string list
sorted_list1 = [y for _,y in sorted(zip(list1,list2),key=lambda x: x[0])]
sorted_list2 = sorted(list1)

print(sorted_list1)
print(sorted_list2)

输出

['one', 'one2', 'two', 'three', 'four']
[1, 1, 2, 3, 4]

Another approach to retaining the order of a string list when sorting against another list is as follows:

list1 = [3,2,4,1, 1]
list2 = ['three', 'two', 'four', 'one', 'one2']

# sort on list1 while retaining order of string list
sorted_list1 = [y for _,y in sorted(zip(list1,list2),key=lambda x: x[0])]
sorted_list2 = sorted(list1)

print(sorted_list1)
print(sorted_list2)

output

['one', 'one2', 'two', 'three', 'four']
[1, 1, 2, 3, 4]

回答 11

我想扩展开放式jfs的答案,这对我的问题非常有用:将两个列表按经过装饰的第三个列表排序

我们可以以任何方式创建装饰列表,但是在这种情况下,我们将根据要排序的两个原始列表之一的元素来创建它:

# say we have the following list and we want to sort both by the algorithms name 
# (if we were to sort by the string_list, it would sort by the numerical 
# value in the strings)
string_list = ["0.123 Algo. XYZ", "0.345 Algo. BCD", "0.987 Algo. ABC"]
dict_list = [{"dict_xyz": "XYZ"}, {"dict_bcd": "BCD"}, {"dict_abc": "ABC"}]

# thus we need to create the decorator list, which we can now use to sort
decorated = [text[6:] for text in string_list]  
# decorated list to sort
>>> decorated
['Algo. XYZ', 'Algo. BCD', 'Algo. ABC']

现在我们可以应用jfs的解决方案将我们的两个列表按第三个排序

# create and sort the list of indices
sorted_indices = list(range(len(string_list)))
sorted_indices.sort(key=decorated.__getitem__)

# map sorted indices to the two, original lists
sorted_stringList = list(map(string_list.__getitem__, sorted_indices))
sorted_dictList = list(map(dict_list.__getitem__, sorted_indices))

# output
>>> sorted_stringList
['0.987 Algo. ABC', '0.345 Algo. BCD', '0.123 Algo. XYZ']
>>> sorted_dictList
[{'dict_abc': 'ABC'}, {'dict_bcd': 'BCD'}, {'dict_xyz': 'XYZ'}]

编辑:大家好,我对此发表了一篇文章,如果您愿意的话请查看 :)🐍🐍🐍

I would like to expand upon jfs’s answer, which worked great for my problem: sorting two lists by a third, decorated list.

We can create our decorated list in any way, but in this case we will create it from the elements of one of the two original lists that we want to sort:

# say we have the following list and we want to sort both by the algorithms name 
# (if we were to sort by the string_list, it would sort by the numerical 
# value in the strings)
string_list = ["0.123 Algo. XYZ", "0.345 Algo. BCD", "0.987 Algo. ABC"]
dict_list = [{"dict_xyz": "XYZ"}, {"dict_bcd": "BCD"}, {"dict_abc": "ABC"}]

# thus we need to create the decorator list, which we can now use to sort
decorated = [text[6:] for text in string_list]  
# decorated list to sort
>>> decorated
['Algo. XYZ', 'Algo. BCD', 'Algo. ABC']

Now we can apply jfs’s solution to sort our two lists by the third

# create and sort the list of indices
sorted_indices = list(range(len(string_list)))
sorted_indices.sort(key=decorated.__getitem__)

# map sorted indices to the two, original lists
sorted_stringList = list(map(string_list.__getitem__, sorted_indices))
sorted_dictList = list(map(dict_list.__getitem__, sorted_indices))

# output
>>> sorted_stringList
['0.987 Algo. ABC', '0.345 Algo. BCD', '0.123 Algo. XYZ']
>>> sorted_dictList
[{'dict_abc': 'ABC'}, {'dict_bcd': 'BCD'}, {'dict_xyz': 'XYZ'}]

Edit: Hey guys, I made a blog post about this, check it out if you feel like it :) 🐍🐍🐍


回答 12

newsource=[];newtarget=[]
for valueT in targetFiles:
    for valueS in sourceFiles:
            l1=len(valueS);l2=len(valueT);
            j=0
            while (j< l1):
                    if (str(valueT) == valueS[j:l1]) :
                            newsource.append(valueS)
                            newtarget.append(valueT)
                    j+=1

检查列表是否已排序的Python方法

问题:检查列表是否已排序的Python方法

有没有一种pythonic的方法来检查列表是否已经排序ASCDESC

listtimestamps = [1, 2, 3, 5, 6, 7]

诸如此类的东西isttimestamps.isSorted()会返回TrueFalse

我想输入一些消息的时间戳列表,并检查事务是否以正确的顺序出现。

Is there a Pythonic way to check if a list is already sorted in ASC or DESC order?

listtimestamps = [1, 2, 3, 5, 6, 7]

Something like listtimestamps.isSorted() that returns True or False.

I want to input a list of timestamps for some messages and check if the transactions appeared in the correct order.


回答 0

实际上,我们没有给出anijhaw寻找的答案。这是一行代码:

all(l[i] <= l[i+1] for i in xrange(len(l)-1))

对于Python 3:

all(l[i] <= l[i+1] for i in range(len(l)-1))

Actually, we are not giving the answer anijhaw is looking for. Here is the one-liner:

all(l[i] <= l[i+1] for i in xrange(len(l)-1))

For Python 3:

all(l[i] <= l[i+1] for i in range(len(l)-1))
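
The question asks about both ASC and DESC order, so the one-liner can be wrapped in a function with a direction flag. This is only a sketch: the reverse parameter is an assumption modelled on sorted(), not something from the answer above.

```python
def is_sorted(seq, reverse=False):
    """Return True if seq is in non-decreasing order
    (or non-increasing order when reverse=True)."""
    if reverse:
        return all(seq[i] >= seq[i + 1] for i in range(len(seq) - 1))
    return all(seq[i] <= seq[i + 1] for i in range(len(seq) - 1))


listtimestamps = [1, 2, 3, 5, 6, 7]
print(is_sorted(listtimestamps))                # True
print(is_sorted(listtimestamps, reverse=True))  # False
print(is_sorted([7, 6, 6, 1], reverse=True))    # True
```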

回答 1

我只会用

if sorted(lst) == lst:
    # code here

除非这是一个很大的列表,否则您可能需要创建一个自定义函数。

如果您只是要对它进行排序(如果未排序的话),那么请忘记对它进行排序。

lst.sort()

并不要考虑太多。

如果您想使用自定义功能,可以执行以下操作

def is_sorted(lst, key=lambda x: x):
    for i, el in enumerate(lst[1:]):
        if key(el) < key(lst[i]): # i is the index of the previous element
            return False
    return True

如果列表已经被排序了,那么它将是O(n)(那时O(n)处于for循环中!)因此,除非您希望大多数时间都不对列表进行排序(并且相当随机),否则,再次,只需对列表进行排序。

I would just use

if sorted(lst) == lst:
    # code here

unless it’s a very big list in which case you might want to create a custom function.

If you are just going to sort it if it’s not sorted, then forget the check and sort it.

lst.sort()

and don’t think about it too much.

If you want a custom function, you can do something like

def is_sorted(lst, key=lambda x: x):
    for i, el in enumerate(lst[1:]):
        if key(el) < key(lst[i]): # i is the index of the previous element
            return False
    return True

This will be O(n) if the list is already sorted though (and O(n) in a for loop at that!) so, unless you expect it to be not sorted (and fairly random) most of the time, I would, again, just sort the list.


回答 2

该迭代器形式比使用整数索引快10-15%:

# python2 only
if str is bytes:
    from itertools import izip as zip

def is_sorted(l):
    return all(a <= b for a, b in zip(l, l[1:]))

This iterator form is 10-15% faster than using integer indexing:

# python2 only
if str is bytes:
    from itertools import izip as zip

def is_sorted(l):
    return all(a <= b for a, b in zip(l, l[1:]))

回答 3

实现此目的的一种好方法是使用imap来自itertools以下函数:

from itertools import imap, tee
import operator

def is_sorted(iterable, compare=operator.le):
  a, b = tee(iterable)
  next(b, None)
  return all(imap(compare, a, b))

这种实现是快速的,并且适用于任何迭代。

A beautiful way to implement this is to use the imap function from itertools:

from itertools import imap, tee
import operator

def is_sorted(iterable, compare=operator.le):
  a, b = tee(iterable)
  next(b, None)
  return all(imap(compare, a, b))

This implementation is fast and works on any iterables.
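
itertools.imap only exists in Python 2. On Python 3 the built-in map is already lazy, so a port of the function above could look like the following sketch (same logic, with map substituted for imap).

```python
import operator
from itertools import tee


def is_sorted(iterable, compare=operator.le):
    a, b = tee(iterable)   # two independent iterators over the same data
    next(b, None)          # advance the second one by one element
    # map stops at the shorter iterator, so pairs are (x0, x1), (x1, x2), ...
    return all(map(compare, a, b))


print(is_sorted([1, 2, 2, 5]))                       # True
print(is_sorted(iter([1, 3, 2])))                    # False
print(is_sorted([5, 3, 3, 1], compare=operator.ge))  # True
```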


回答 4

我运行了一个基准测试sorted(lst, reverse=True) == lst对于长名单来说all(l[i] >= l[i+1] for i in xrange(len(l)-1))是最快的,对于短名单来说是最快的。这些基准测试是在MacBook Pro 2010 13英寸(Core2 Duo 2.66GHz,4GB 1067MHz DDR3 RAM,Mac OS X 10.6.5)上运行的。

更新:我修改了脚本,以便您可以在自己的系统上直接运行它。先前的版本存在错误。另外,我添加了已排序和未排序的输入。

  • 最适合短列表: all(l[i] >= l[i+1] for i in xrange(len(l)-1))
  • 最适合长排序列表: sorted(l, reverse=True) == l
  • 最适合简短的未排序列表: all(l[i] >= l[i+1] for i in xrange(len(l)-1))
  • 最适合长时间未排序的列表: all(l[i] >= l[i+1] for i in xrange(len(l)-1))

因此,在大多数情况下,都有明显的赢家。

更新: aaronsterling的答案(第6和第7名)实际上在所有情况下都是最快的。#7最快,因为它没有间接层来查找密钥。

#!/usr/bin/env python

import itertools
import time

def benchmark(f, *args):
    t1 = time.time()
    for i in xrange(1000000):
        f(*args)
    t2 = time.time()
    return t2-t1

L1 = range(4, 0, -1)
L2 = range(100, 0, -1)
L3 = range(0, 4)
L4 = range(0, 100)

# 1.
def isNonIncreasing(l, key=lambda x,y: x >= y): 
    return all(key(l[i],l[i+1]) for i in xrange(len(l)-1))
print benchmark(isNonIncreasing, L1) # 2.47253704071
print benchmark(isNonIncreasing, L2) # 34.5398209095
print benchmark(isNonIncreasing, L3) # 2.1916718483
print benchmark(isNonIncreasing, L4) # 2.19576501846

# 2.
def isNonIncreasing(l):
    return all(l[i] >= l[i+1] for i in xrange(len(l)-1))
print benchmark(isNonIncreasing, L1) # 1.86919999123
print benchmark(isNonIncreasing, L2) # 21.8603689671
print benchmark(isNonIncreasing, L3) # 1.95684289932
print benchmark(isNonIncreasing, L4) # 1.95272517204

# 3.
def isNonIncreasing(l, key=lambda x,y: x >= y): 
    return all(key(a,b) for (a,b) in itertools.izip(l[:-1],l[1:]))
print benchmark(isNonIncreasing, L1) # 2.65468883514
print benchmark(isNonIncreasing, L2) # 29.7504849434
print benchmark(isNonIncreasing, L3) # 2.78062295914
print benchmark(isNonIncreasing, L4) # 3.73436689377

# 4.
def isNonIncreasing(l):
    return all(a >= b for (a,b) in itertools.izip(l[:-1],l[1:]))
print benchmark(isNonIncreasing, L1) # 2.06947803497
print benchmark(isNonIncreasing, L2) # 15.6351969242
print benchmark(isNonIncreasing, L3) # 2.45671010017
print benchmark(isNonIncreasing, L4) # 3.48461818695

# 5.
def isNonIncreasing(l):
    return sorted(l, reverse=True) == l
print benchmark(isNonIncreasing, L1) # 2.01579380035
print benchmark(isNonIncreasing, L2) # 5.44593787193
print benchmark(isNonIncreasing, L3) # 2.01813793182
print benchmark(isNonIncreasing, L4) # 4.97615599632

# 6.
def isNonIncreasing(l, key=lambda x, y: x >= y): 
    for i, el in enumerate(l[1:]):
        if key(el, l[i-1]):
            return False
    return True
print benchmark(isNonIncreasing, L1) # 1.06842684746
print benchmark(isNonIncreasing, L2) # 1.67291283607
print benchmark(isNonIncreasing, L3) # 1.39491200447
print benchmark(isNonIncreasing, L4) # 1.80557894707

# 7.
def isNonIncreasing(l):
    for i, el in enumerate(l[1:]):
        if el >= l[i-1]:
            return False
    return True
print benchmark(isNonIncreasing, L1) # 0.883186101913
print benchmark(isNonIncreasing, L2) # 1.42852401733
print benchmark(isNonIncreasing, L3) # 1.09229516983
print benchmark(isNonIncreasing, L4) # 1.59502696991

I ran a benchmark and sorted(lst, reverse=True) == lst was the fastest for long lists, and all(l[i] >= l[i+1] for i in xrange(len(l)-1)) was the fastest for short lists. These benchmarks were run on a MacBook Pro 2010 13″ (Core2 Duo 2.66GHz, 4GB 1067MHz DDR3 RAM, Mac OS X 10.6.5).

UPDATE: I revised the script so that you can run it directly on your own system. The previous version had bugs. Also, I have added both sorted and unsorted inputs.

  • Best for short sorted lists: all(l[i] >= l[i+1] for i in xrange(len(l)-1))
  • Best for long sorted lists: sorted(l, reverse=True) == l
  • Best for short unsorted lists: all(l[i] >= l[i+1] for i in xrange(len(l)-1))
  • Best for long unsorted lists: all(l[i] >= l[i+1] for i in xrange(len(l)-1))

So in most cases there is a clear winner.

UPDATE: aaronsterling’s answers (#6 and #7) are actually the fastest in all cases. #7 is the fastest because it doesn’t have a layer of indirection to lookup the key.

#!/usr/bin/env python

import itertools
import time

def benchmark(f, *args):
    t1 = time.time()
    for i in xrange(1000000):
        f(*args)
    t2 = time.time()
    return t2-t1

L1 = range(4, 0, -1)
L2 = range(100, 0, -1)
L3 = range(0, 4)
L4 = range(0, 100)

# 1.
def isNonIncreasing(l, key=lambda x,y: x >= y): 
    return all(key(l[i],l[i+1]) for i in xrange(len(l)-1))
print benchmark(isNonIncreasing, L1) # 2.47253704071
print benchmark(isNonIncreasing, L2) # 34.5398209095
print benchmark(isNonIncreasing, L3) # 2.1916718483
print benchmark(isNonIncreasing, L4) # 2.19576501846

# 2.
def isNonIncreasing(l):
    return all(l[i] >= l[i+1] for i in xrange(len(l)-1))
print benchmark(isNonIncreasing, L1) # 1.86919999123
print benchmark(isNonIncreasing, L2) # 21.8603689671
print benchmark(isNonIncreasing, L3) # 1.95684289932
print benchmark(isNonIncreasing, L4) # 1.95272517204

# 3.
def isNonIncreasing(l, key=lambda x,y: x >= y): 
    return all(key(a,b) for (a,b) in itertools.izip(l[:-1],l[1:]))
print benchmark(isNonIncreasing, L1) # 2.65468883514
print benchmark(isNonIncreasing, L2) # 29.7504849434
print benchmark(isNonIncreasing, L3) # 2.78062295914
print benchmark(isNonIncreasing, L4) # 3.73436689377

# 4.
def isNonIncreasing(l):
    return all(a >= b for (a,b) in itertools.izip(l[:-1],l[1:]))
print benchmark(isNonIncreasing, L1) # 2.06947803497
print benchmark(isNonIncreasing, L2) # 15.6351969242
print benchmark(isNonIncreasing, L3) # 2.45671010017
print benchmark(isNonIncreasing, L4) # 3.48461818695

# 5.
def isNonIncreasing(l):
    return sorted(l, reverse=True) == l
print benchmark(isNonIncreasing, L1) # 2.01579380035
print benchmark(isNonIncreasing, L2) # 5.44593787193
print benchmark(isNonIncreasing, L3) # 2.01813793182
print benchmark(isNonIncreasing, L4) # 4.97615599632

# 6.
def isNonIncreasing(l, key=lambda x, y: x >= y): 
    for i, el in enumerate(l[1:]):
        if key(el, l[i-1]):
            return False
    return True
print benchmark(isNonIncreasing, L1) # 1.06842684746
print benchmark(isNonIncreasing, L2) # 1.67291283607
print benchmark(isNonIncreasing, L3) # 1.39491200447
print benchmark(isNonIncreasing, L4) # 1.80557894707

# 7.
def isNonIncreasing(l):
    for i, el in enumerate(l[1:]):
        if el >= l[i-1]:
            return False
    return True
print benchmark(isNonIncreasing, L1) # 0.883186101913
print benchmark(isNonIncreasing, L2) # 1.42852401733
print benchmark(isNonIncreasing, L3) # 1.09229516983
print benchmark(isNonIncreasing, L4) # 1.59502696991

回答 5

我会这样做的(从这里的很多答案中窃取了[亚伦·斯特林,伟业同志,保罗·麦奎尔(Paul McGuire)等等),大部分都是阿明·罗纳彻(Armin Ronacher):

from itertools import tee, izip

def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

def is_sorted(iterable, key=lambda a, b: a <= b):
    return all(key(a, b) for a, b in pairwise(iterable))

一件好事:您不必实现该系列的第二个可迭代对象(与列表切片不同)。

I’d do this (stealing from a lot of answers here [Aaron Sterling, Wai Yip Tung, sorta from Paul McGuire] and mostly Armin Ronacher):

from itertools import tee, izip

def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

def is_sorted(iterable, key=lambda a, b: a <= b):
    return all(key(a, b) for a, b in pairwise(iterable))

One nice thing: you don’t have to realize the second iterable for the series (unlike with a list slice).
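
On Python 3, izip is gone (zip is lazy), and from Python 3.10 the standard library ships itertools.pairwise, so the helper does not need to be hand-written. A sketch that falls back to the recipe above on older versions:

```python
try:
    from itertools import pairwise  # available in Python 3.10+
except ImportError:
    from itertools import tee

    def pairwise(iterable):
        # Same recipe as above, with zip in place of izip
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)


def is_sorted(iterable, key=lambda a, b: a <= b):
    return all(key(a, b) for a, b in pairwise(iterable))


print(is_sorted([1, 2, 3, 5, 6, 7]))                     # True
print(is_sorted(iter([1, 2, 5, 3])))                     # False
print(is_sorted([3, 2, 1], key=lambda a, b: a >= b))     # True
```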


回答 6

我使用基于numpy.diff()的这种单线:

def issorted(x):
    """Check if x is sorted"""
    return (numpy.diff(x) >= 0).all() # is diff between all consecutive entries >= 0?

我还没有真正针对任何其他方法计时,但是我认为它比任何纯Python方法都快,尤其是对于大n而言,因为numpy.diff中的循环(可能)直接在C中运行(n-1个减法,后跟n个) -1比较)。

但是,如果x是无符号的int,则需要小心,这可能会导致numpy.diff()中的无声整数下溢,从而导致误报。这是修改后的版本:

def issorted(x):
    """Check if x is sorted"""
    try:
        if x.dtype.kind == 'u':
            # x is unsigned int array, risk of int underflow in np.diff
            x = numpy.int64(x)
    except AttributeError:
        pass # no dtype, not an array
    return (numpy.diff(x) >= 0).all()

I use this one-liner based on numpy.diff():

def issorted(x):
    """Check if x is sorted"""
    return (numpy.diff(x) >= 0).all() # is diff between all consecutive entries >= 0?

I haven’t really timed it against any other method, but I assume it’s faster than any pure Python method, especially for large n, since the loop in numpy.diff (probably) runs directly in C (n-1 subtractions followed by n-1 comparisons).

However, you need to be careful if x is an unsigned int, which might cause silent integer underflow in numpy.diff(), resulting in a false positive. Here’s a modified version:

def issorted(x):
    """Check if x is sorted"""
    try:
        if x.dtype.kind == 'u':
            # x is unsigned int array, risk of int underflow in np.diff
            x = numpy.int64(x)
    except AttributeError:
        pass # no dtype, not an array
    return (numpy.diff(x) >= 0).all()
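
An alternative that sidesteps the underflow issue altogether is to compare neighbouring elements directly instead of subtracting them. This is a sketch of that variation, not the answer's own code; it assumes NumPy is installed and that x is one-dimensional.

```python
import numpy as np


def issorted(x):
    """Check if 1-D x is sorted in non-decreasing order, without subtraction."""
    x = np.asarray(x)
    # Compare each element with its successor; no arithmetic, so no wraparound
    return bool(np.all(x[:-1] <= x[1:]))


u = np.array([3, 1, 2], dtype=np.uint8)
print((np.diff(u) >= 0).all())  # True: 1 - 3 wraps around to 254 for uint8
print(issorted(u))              # False
```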

回答 7

这类似于最佳答案,但我更喜欢它,因为它避免了显式索引。假设列表名称为lst,则可以
(item, next_item)使用zip以下命令从列表中生成元组:

all(x <= y for x,y in zip(lst, lst[1:]))

在Python 3中,zip已经返回了生成器;在Python 2中,可以使用它itertools.izip来提高内存效率。

小型演示:

>>> lst = [1, 2, 3, 4]
>>> zip(lst, lst[1:])
[(1, 2), (2, 3), (3, 4)]
>>> all(x <= y for x,y in zip(lst, lst[1:]))
True
>>> 
>>> lst = [1, 2, 3, 2]
>>> zip(lst, lst[1:])
[(1, 2), (2, 3), (3, 2)]
>>> all(x <= y for x,y in zip(lst, lst[1:]))
False

(3, 2)评估元组时,最后一个失败。

奖励:检查无法索引的有限(!)生成器:

>>> def gen1():
...     yield 1
...     yield 2
...     yield 3
...     yield 4
...     
>>> def gen2():
...     yield 1
...     yield 2
...     yield 4
...     yield 3
... 
>>> g1_1 = gen1()
>>> g1_2 = gen1()
>>> next(g1_2)
1
>>> all(x <= y for x,y in zip(g1_1, g1_2))
True
>>>
>>> g2_1 = gen2()
>>> g2_2 = gen2()
>>> next(g2_2)
1
>>> all(x <= y for x,y in zip(g2_1, g2_2))
False

itertools.izip如果您使用的是Python 2,请确保在此处使用,否则您将失去不必从生成器创建列表的目的。

This is similar to the top answer, but I like it better because it avoids explicit indexing. Assuming your list has the name lst, you can generate (item, next_item) tuples from your list with zip:

all(x <= y for x,y in zip(lst, lst[1:]))

In Python 3, zip already returns a lazy iterator; in Python 2 you can use itertools.izip for better memory efficiency.

Small demo:

>>> lst = [1, 2, 3, 4]
>>> zip(lst, lst[1:])
[(1, 2), (2, 3), (3, 4)]
>>> all(x <= y for x,y in zip(lst, lst[1:]))
True
>>> 
>>> lst = [1, 2, 3, 2]
>>> zip(lst, lst[1:])
[(1, 2), (2, 3), (3, 2)]
>>> all(x <= y for x,y in zip(lst, lst[1:]))
False

The last one fails when the tuple (3, 2) is evaluated.

Bonus: checking finite (!) generators which cannot be indexed:

>>> def gen1():
...     yield 1
...     yield 2
...     yield 3
...     yield 4
...     
>>> def gen2():
...     yield 1
...     yield 2
...     yield 4
...     yield 3
... 
>>> g1_1 = gen1()
>>> g1_2 = gen1()
>>> next(g1_2)
1
>>> all(x <= y for x,y in zip(g1_1, g1_2))
True
>>>
>>> g2_1 = gen2()
>>> g2_2 = gen2()
>>> next(g2_2)
1
>>> all(x <= y for x,y in zip(g2_1, g2_2))
False

Make sure to use itertools.izip here if you are using Python 2, otherwise you would defeat the purpose of not having to create lists from the generators.


回答 8

SapphireSun是完全正确的。您可以使用lst.sort()。Python的排序实现(TimSort)检查列表是否已排序。如果这样,sort()将在线性时间内完成。听起来像是一种Python方式,可以确保对列表进行排序;)

SapphireSun is quite right. You can just use lst.sort(). Python’s sort implementation (TimSort) checks whether the list is already sorted. If so, sort() will complete in linear time. Sounds like a Pythonic way to ensure a list is sorted ;)
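
One rough way to see this behaviour is to count how many comparisons sort() makes. The sketch below uses a hypothetical wrapper class, Counted, whose __lt__ increments a counter; the exact counts are a CPython implementation detail, so none are shown here.

```python
import random


class Counted:
    """Wraps a value and counts every '<' comparison made during sorting."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.calls += 1
        return self.value < other.value


n = 1000

# Already-sorted input
items = [Counted(i) for i in range(n)]
Counted.calls = 0
items.sort()
comparisons_sorted = Counted.calls

# Shuffled input (fixed seed so the run is repeatable)
values = list(range(n))
random.Random(0).shuffle(values)
items = [Counted(v) for v in values]
Counted.calls = 0
items.sort()
comparisons_shuffled = Counted.calls

print(comparisons_sorted, comparisons_shuffled)
```

On sorted input the count should grow roughly in proportion to n, while the shuffled input needs on the order of n log n comparisons.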


回答 9

尽管我认为不能保证该sorted内置函数使用调用其cmp函数i+1, i,但对于CPython来说确实如此。

因此,您可以执行以下操作:

def my_cmp(x, y):
   cmpval = cmp(x, y)
   if cmpval < 0:
      raise ValueError
   return cmpval

def is_sorted(lst):
   try:
      sorted(lst, cmp=my_cmp)
      return True
   except ValueError:
      return False

print is_sorted([1,2,3,5,6,7])
print is_sorted([1,2,5,3,6,7])

还是这样(如果if语句-> EAFP出了错?;-)):

def my_cmp(x, y):
   assert(x >= y)
   return -1

def is_sorted(lst):
   try:
      sorted(lst, cmp=my_cmp)
      return True
   except AssertionError:
      return False

Although I don’t think there is a guarantee that the sorted built-in calls its cmp function with i+1, i, it does seem to do so for CPython.

So you could do something like:

def my_cmp(x, y):
   cmpval = cmp(x, y)
   if cmpval < 0:
      raise ValueError
   return cmpval

def is_sorted(lst):
   try:
      sorted(lst, cmp=my_cmp)
      return True
   except ValueError:
      return False

print is_sorted([1,2,3,5,6,7])
print is_sorted([1,2,5,3,6,7])

Or this way (without if statements -> EAFP gone wrong? ;-) ):

def my_cmp(x, y):
   assert(x >= y)
   return -1

def is_sorted(lst):
   try:
      sorted(lst, cmp=my_cmp)
      return True
   except AssertionError:
      return False

回答 10

完全不是Pythonic,但我们至少需要一个reduce()答案,对吗?

from functools import reduce  # a built-in on Python 2, must be imported on Python 3

def is_sorted(iterable):
    prev_or_inf = lambda prev, i: i if prev <= i else float('inf')
    return reduce(prev_or_inf, iterable, float('-inf')) < float('inf')

累加器变量仅存储该最后检查的值,如果任何值小于前一个值,则累加器将设置为无穷大(因此,最后的值仍然为无穷大,因为“先前的值”将始终大于当前的)。

Not very Pythonic at all, but we need at least one reduce() answer, right?

from functools import reduce  # a built-in on Python 2, must be imported on Python 3

def is_sorted(iterable):
    prev_or_inf = lambda prev, i: i if prev <= i else float('inf')
    return reduce(prev_or_inf, iterable, float('-inf')) < float('inf')

The accumulator variable simply stores that last-checked value, and if any value is smaller than the previous value, the accumulator is set to infinity (and thus will still be infinity at the end, since the ‘previous value’ will always be bigger than the current one).


回答 11

正如@aaronsterling所指出的那样,以下解决方案是最短的,并且在对数组进行排序且不太小时似乎是最快的:def is_sorted(lst):return(sorted(lst)== lst)

如果大多数情况下未对数组进行排序,则希望使用不扫描整个数组并在发现未排序前缀后立即返回False的解决方案。以下是我能找到的最快的解决方案,它并不是特别优雅:

def is_sorted(lst):
    it = iter(lst)
    try:
        prev = next(it)  # it.next() exists only on Python 2
    except StopIteration:
        return True
    for x in it:
        if prev > x:
            return False
        prev = x
    return True

使用Nathan Farrington的基准测试,在所有情况下都比使用sorted(lst)更好的运行时间,但在大型排序列表上运行时除外。

这是我计算机上的基准测试结果。

sorted(lst)== lst解决方案

  • L1:1.23838591576
  • L2:4.19063091278
  • L3:1.17996287346
  • L4:4.68399500847

第二种解决方案:

  • L1:0.81095790863
  • L2:0.802397012711
  • L3:1.06135106087
  • L4:8.82761001587

As noted by @aaronsterling the following solution is the shortest and seems fastest when the array is sorted and not too small: def is_sorted(lst): return (sorted(lst) == lst)

If the array is unsorted most of the time, it is preferable to use a solution that does not scan the entire array and returns False as soon as an out-of-order pair is found. The following is the fastest solution I could find, though it is not particularly elegant:

def is_sorted(lst):
    it = iter(lst)
    try:
        prev = next(it)  # it.next() exists only on Python 2
    except StopIteration:
        return True
    for x in it:
        if prev > x:
            return False
        prev = x
    return True

Using Nathan Farrington’s benchmark, this achieves better runtime than using sorted(lst) in all cases except when running on the large sorted list.

Here are the benchmark results on my computer.

sorted(lst)==lst solution

  • L1: 1.23838591576
  • L2: 4.19063091278
  • L3: 1.17996287346
  • L4: 4.68399500847

Second solution:

  • L1: 0.81095790863
  • L2: 0.802397012711
  • L3: 1.06135106087
  • L4: 8.82761001587
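Timings like the ones above can be reproduced with timeit. This is only a sketch under stated assumptions: the benchmark helper and the four input lists are hypothetical, because the lists used in the original benchmark (L1 to L4) are not shown in this answer, so the absolute numbers will differ.

```python
import timeit


def is_sorted_by_sorting(lst):
    """Sort a copy and compare it with the original list."""
    return sorted(lst) == lst


def is_sorted_by_scanning(lst):
    """Walk the list once and stop at the first out-of-order pair."""
    it = iter(lst)
    try:
        prev = next(it)
    except StopIteration:
        return True
    for x in it:
        if prev > x:
            return False
        prev = x
    return True


def benchmark(func, data, number=1000):
    """Return the total seconds taken by `number` calls of func(data)."""
    return timeit.timeit(lambda: func(data), number=number)


if __name__ == "__main__":
    # Hypothetical inputs chosen for illustration only.
    cases = {
        "small sorted": list(range(10)),
        "small unsorted": [5, 1] + list(range(8)),
        "large sorted": list(range(10000)),
        "large unsorted": [5, 1] + list(range(9998)),
    }
    for name, data in cases.items():
        for func in (is_sorted_by_sorting, is_sorted_by_scanning):
            print(name, func.__name__, benchmark(func, data))
```

The unsorted inputs place the out-of-order pair at the very start, which is the case where an early exit helps the most.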

回答 12

如果您想要最快的方法用于numpy数组,请使用numba,如果使用conda,则应已安装

该代码将很快,因为它将由numba编译

import numba
@numba.jit
def issorted(vec, ascending=True):
    if len(vec) < 2:
        return True
    if ascending:
        for i in range(1, len(vec)):
            if vec[i-1] > vec[i]:
                return False
        return True
    else:
        for i in range(1, len(vec)):
            if vec[i-1] < vec[i]:
                return False
        return True

然后:

>>> import numpy
>>> issorted(numpy.array([4, 9, 100]))
True

If you want the fastest way for numpy arrays, use numba, which should already be installed if you use conda.

The code will be fast because it is compiled by numba:

import numba
@numba.jit
def issorted(vec, ascending=True):
    if len(vec) < 2:
        return True
    if ascending:
        for i in range(1, len(vec)):
            if vec[i-1] > vec[i]:
                return False
        return True
    else:
        for i in range(1, len(vec)):
            if vec[i-1] < vec[i]:
                return False
        return True

and then:

>>> import numpy
>>> issorted(numpy.array([4, 9, 100]))
True

回答 13

只是添加另一种方式(即使它需要一个附加模块)iteration_utilities.all_monotone

>>> from iteration_utilities import all_monotone
>>> listtimestamps = [1, 2, 3, 5, 6, 7]
>>> all_monotone(listtimestamps)
True

>>> all_monotone([1,2,1])
False

要检查DESC订单:

>>> all_monotone(listtimestamps, decreasing=True)
False

>>> all_monotone([3,2,1], decreasing=True)
True

strict如果需要严格检查(如果连续元素不应相等)单调序列,则还有一个参数。

在您的情况下这不是问题,但是如果您的序列包含nan值,那么某些方法将失败,例如sorted:

def is_sorted_using_sorted(iterable):
    return sorted(iterable) == iterable

>>> is_sorted_using_sorted([3, float('nan'), 1])  # definitely False, right?
True

>>> all_monotone([3, float('nan'), 1])
False

请注意,iteration_utilities.all_monotone与此处提到的其他解决方案相比,该方法的执行速度更快,尤其是对于未排序的输入(请参阅基准)。

Just to add another way (even if it requires an additional module): iteration_utilities.all_monotone:

>>> from iteration_utilities import all_monotone
>>> listtimestamps = [1, 2, 3, 5, 6, 7]
>>> all_monotone(listtimestamps)
True

>>> all_monotone([1,2,1])
False

To check for DESC order:

>>> all_monotone(listtimestamps, decreasing=True)
False

>>> all_monotone([3,2,1], decreasing=True)
True

There is also a strict parameter if you need to check for strictly monotonic sequences (where successive elements must not be equal).

It’s not a problem in your case, but if your sequence contains nan values then some methods will fail, for example the one based on sorted:

def is_sorted_using_sorted(iterable):
    return sorted(iterable) == iterable

>>> is_sorted_using_sorted([3, float('nan'), 1])  # definitely False, right?
True

>>> all_monotone([3, float('nan'), 1])
False

Note that iteration_utilities.all_monotone performs faster compared to the other solutions mentioned here especially for unsorted inputs (see benchmark).
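The nan pitfall comes from the fact that every ordering comparison involving nan evaluates to False. The sketch below (with hypothetical helper names, not taken from iteration_utilities) shows that even two hand-written pairwise checks disagree on such input, depending on whether they look for a proof of order or a proof of disorder.

```python
def is_sorted_all(seq):
    """True only if every adjacent pair satisfies a <= b."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


def is_sorted_not_any(seq):
    """True unless some adjacent pair satisfies a > b."""
    return not any(a > b for a, b in zip(seq, seq[1:]))


nan = float("nan")
data = [3, nan, 1]

print(is_sorted_all(data))      # False, because 3 <= nan is False
print(is_sorted_not_any(data))  # True, because neither 3 > nan nor nan > 1 holds
```

If nan values can occur, the all(a <= b ...) form is the safer of the two, since it reports such sequences as not sorted.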


回答 14

from itertools import tee

def is_sorted(l):
    l1, l2 = tee(l)
    next(l2, None)
    return all(a <= b for a, b in zip(l1, l2))

Lazy

from itertools import tee

def is_sorted(l):
    l1, l2 = tee(l)
    next(l2, None)
    return all(a <= b for a, b in zip(l1, l2))

回答 15

的Python 3.6.8

from more_itertools import pairwise

class AssertionHelper:
    @classmethod
    def is_ascending(cls, data: iter) -> bool:
        for a, b in pairwise(data):
            if a > b:
                return False
        return True

    @classmethod
    def is_descending(cls, data: iter) -> bool:
        for a, b in pairwise(data):
            if a < b:
                return False
        return True

    @classmethod
    def is_sorted(cls, data: iter) -> bool:
        return cls.is_ascending(data) or cls.is_descending(data)

>>> AssertionHelper.is_descending((1, 2, 3, 4))
False
>>> AssertionHelper.is_ascending((1, 2, 3, 4))
True
>>> AssertionHelper.is_sorted((1, 2, 3, 4))
True

Python 3.6.8

from more_itertools import pairwise

class AssertionHelper:
    @classmethod
    def is_ascending(cls, data: iter) -> bool:
        for a, b in pairwise(data):
            if a > b:
                return False
        return True

    @classmethod
    def is_descending(cls, data: iter) -> bool:
        for a, b in pairwise(data):
            if a < b:
                return False
        return True

    @classmethod
    def is_sorted(cls, data: iter) -> bool:
        return cls.is_ascending(data) or cls.is_descending(data)

>>> AssertionHelper.is_descending((1, 2, 3, 4))
False
>>> AssertionHelper.is_ascending((1, 2, 3, 4))
True
>>> AssertionHelper.is_sorted((1, 2, 3, 4))
True
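One caveat: is_sorted walks data twice, which is fine for tuples and lists, but a one-shot iterator is exhausted by the first pass, so the second check sees no pairs and can report True for unsorted input. A possible single-pass variant is sketched below; the function name is hypothetical and it uses only the standard library.

```python
def is_sorted_single_pass(data):
    """Return True if data is entirely ascending or entirely descending."""
    it = iter(data)
    try:
        prev = next(it)
    except StopIteration:
        return True  # an empty input counts as sorted
    ascending = descending = True
    for cur in it:
        if cur < prev:
            ascending = False
        if cur > prev:
            descending = False
        if not (ascending or descending):
            return False  # neither direction is still possible
        prev = cur
    return True


print(is_sorted_single_pass(iter([3, 2, 1])))  # True
print(is_sorted_single_pass(iter([1, 3, 2])))  # False
```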


回答 16

最简单的方法:

def isSorted(arr):
  i = 1
  while i < len(arr):
    if arr[i] < arr[i - 1]:
      return False
    i += 1
  return True

Simplest way:

def isSorted(arr):
  i = 1
  while i < len(arr):
    if arr[i] < arr[i - 1]:
      return False
    i += 1
  return True

回答 17

from functools import reduce

# myiterable can be of any iterable type (including list)
isSorted = reduce(lambda r, e: (r[0] and (r[1] or r[2] <= e), False, e), myiterable, (True, True, None))[0]

派生的减少值是(sortedSoFarFlagfirstTimeFlaglastElementValue)的三部分元组。它最初开始与(TrueTrueNone),其也被用作结果为空列表(被视为排序,因为有外的顺序没有元素)。在处理每个元素时,它会计算元组的新值(使用先前的元组值和下一个elementValue):

[0] (sortedSoFarFlag) evaluates true if: prev_0 is true and (prev_1 is true or prev_2 <= elementValue)
[1] (firstTimeFlag): False
[2] (lastElementValue): elementValue

减少的最终结果是一个元组:

[0]: True/False depending on whether the entire list was in sorted order
[1]: True/False depending on whether the list was empty
[2]: the last element value

第一个值是我们感兴趣的值,因此我们通常[0]从reduce结果中获取该值。

from functools import reduce

# myiterable can be of any iterable type (including list)
isSorted = reduce(lambda r, e: (r[0] and (r[1] or r[2] <= e), False, e), myiterable, (True, True, None))[0]

The derived reduction value is a 3-part tuple of (sortedSoFarFlag, firstTimeFlag, lastElementValue). It initially starts with (True, True, None), which is also used as the result for an empty list (regarded as sorted because there are no out-of-order elements). As it processes each element it calculates new values for the tuple (using previous tuple values with the next elementValue):

[0] (sortedSoFarFlag) evaluates true if: prev_0 is true and (prev_1 is true or prev_2 <= elementValue)
[1] (firstTimeFlag): False
[2] (lastElementValue): elementValue

The final result of the reduction is a tuple of:

[0]: True/False depending on whether the entire list was in sorted order
[1]: True/False depending on whether the list was empty
[2]: the last element value

The first value is the one we’re interested in, so we use [0] to grab that from the reduce result.
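To watch the tuple evolve, the same reduction step can be run through itertools.accumulate, which returns every intermediate value instead of only the last one. This is an illustrative sketch: step and trace_reduction are hypothetical names, and the initial argument of accumulate requires Python 3.8 or later.

```python
from itertools import accumulate


def step(r, e):
    """One reduction step: r is the previous tuple, e is the next element."""
    return (r[0] and (r[1] or r[2] <= e), False, e)


def trace_reduction(iterable):
    """Return every intermediate tuple, starting with the initial value."""
    return list(accumulate(iterable, step, initial=(True, True, None)))


for state in trace_reduction([1, 3, 2]):
    print(state)
# (True, True, None)
# (True, False, 1)
# (True, False, 3)
# (False, False, 2)
```

Once the first flag turns False it stays False, because the and short-circuits before the comparison is evaluated.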


回答 18

由于我没有在上方看到此选项,因此将其添加到所有答案中。用表示列表l,然后:

import numpy as np

# Transform the list into a NumPy array
x = np.array(l)

# check if sorted in ascending order:
np.all(x[:-1] <= x[1:])

# check if sorted in descending order:
np.all(x[:-1] >= x[1:])

As I don’t see this option above, I will add it to the answers. Denote the list by l, then:

import numpy as np

# Transform the list into a NumPy array
x = np.array(l)

# check if sorted in ascending order:
np.all(x[:-1] <= x[1:])

# check if sorted in descending order:
np.all(x[:-1] >= x[1:])

回答 19

使用赋值表达式的解决方案(在Python 3.8中添加):

def is_sorted(seq):
    seq_iter = iter(seq)
    cur = next(seq_iter, None)
    return all((prev := cur) <= (cur := nxt) for nxt in seq_iter)

z = list(range(10))
print(z)
print(is_sorted(z))

import random
random.shuffle(z)
print(z)
print(is_sorted(z))

z = []
print(z)
print(is_sorted(z))

给出:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
True
[1, 7, 5, 9, 4, 0, 8, 3, 2, 6]
False
[]
True

A solution using assignment expressions (added in Python 3.8):

def is_sorted(seq):
    seq_iter = iter(seq)
    cur = next(seq_iter, None)
    return all((prev := cur) <= (cur := nxt) for nxt in seq_iter)

z = list(range(10))
print(z)
print(is_sorted(z))

import random
random.shuffle(z)
print(z)
print(is_sorted(z))

z = []
print(z)
print(is_sorted(z))

Gives:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
True
[1, 7, 5, 9, 4, 0, 8, 3, 2, 6]
False
[]
True

回答 20

实际上,这是使用递归实现的最短方法:

如果已排序将打印True,否则将打印False

def is_Sorted(lst):
    # Lists with zero or one element are sorted by definition.
    if len(lst) <= 1:
        return True
    return lst[0] <= lst[1] and is_Sorted(lst[1:])

any_list = [1, 2, 3, 4]
print(is_Sorted(any_list))

This is in fact the shortest way to do it using recursion:

It prints True if the list is sorted and False otherwise:

def is_Sorted(lst):
    # Lists with zero or one element are sorted by definition.
    if len(lst) <= 1:
        return True
    return lst[0] <= lst[1] and is_Sorted(lst[1:])

any_list = [1, 2, 3, 4]
print(is_Sorted(any_list))

回答 21

这个怎么样 ?简单明了。

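def is_list_sorted(al):
    llength = len(al)

    # Start at 1 so that al[i - 1] never wraps around to the last element.
    for i in range(1, llength):
        if al[i - 1] > al[i]:
            # Report the first out-of-order pair.
            print(al[i - 1])
            print(al[i])
            print('Not sorted')
            return False

    else:
        print('sorted')
        return True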

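How about this one? Simple and straightforward.

def is_list_sorted(al):
    llength = len(al)

    # Start at 1 so that al[i - 1] never wraps around to the last element.
    for i in range(1, llength):
        if al[i - 1] > al[i]:
            # Report the first out-of-order pair.
            print(al[i - 1])
            print(al[i])
            print('Not sorted')
            return False

    else:
        print('sorted')
        return True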

回答 22

绝对在Python 3及更高版本中适用于整数或字符串:

def tail(t):
    return t[:]

letters = ['a', 'b', 'c', 'd', 'e']
rest = tail(letters)
rest.sort()
if letters == rest:
    print ('Given list is SORTED.')
else:
    print ('List NOT Sorted.')

=====================================================================

查找给定列表是否已排序的另一种方法

trees1 = list ([1, 4, 5, 3, 2])
trees2 = list (trees1)
trees2.sort()
if trees1 == trees2:
    print ('trees1 is SORTED')
else:
    print ('Not sorted')

Definitely works in Python 3 and above for integers or strings:

def tail(t):
    return t[:]

letters = ['a', 'b', 'c', 'd', 'e']
rest = tail(letters)
rest.sort()
if letters == rest:
    print ('Given list is SORTED.')
else:
    print ('List NOT Sorted.')

=====================================================================

Another way of finding if the given list is sorted or not

trees1 = list ([1, 4, 5, 3, 2])
trees2 = list (trees1)
trees2.sort()
if trees1 == trees2:
    print ('trees1 is SORTED')
else:
    print ('Not sorted')

如何按值对Counter排序?-Python

问题:如何按值对Counter排序?-Python

除了执行反向列表理解的列表理解之外,还有一种Python方式可以按值对Counter进行排序吗?如果是这样,它比这更快:

>>> from collections import Counter
>>> x = Counter({'a':5, 'b':3, 'c':7})
>>> sorted(x)
['a', 'b', 'c']
>>> sorted(x.items())
[('a', 5), ('b', 3), ('c', 7)]
>>> [(l,k) for k,l in sorted([(j,i) for i,j in x.items()])]
[('b', 3), ('a', 5), ('c', 7)]
>>> [(l,k) for k,l in sorted([(j,i) for i,j in x.items()], reverse=True)]
[('c', 7), ('a', 5), ('b', 3)]

Other than wrapping one list comprehension around another reversed one, is there a Pythonic way to sort a Counter by value? If so, is it faster than this:

>>> from collections import Counter
>>> x = Counter({'a':5, 'b':3, 'c':7})
>>> sorted(x)
['a', 'b', 'c']
>>> sorted(x.items())
[('a', 5), ('b', 3), ('c', 7)]
>>> [(l,k) for k,l in sorted([(j,i) for i,j in x.items()])]
[('b', 3), ('a', 5), ('c', 7)]
>>> [(l,k) for k,l in sorted([(j,i) for i,j in x.items()], reverse=True)]
[('c', 7), ('a', 5), ('b', 3)]

回答 0

使用Counter.most_common()方法,它将为您排序项目:

>>> from collections import Counter
>>> x = Counter({'a':5, 'b':3, 'c':7})
>>> x.most_common()
[('c', 7), ('a', 5), ('b', 3)]

它将以最有效的方式进行;如果您要求前N个而不是所有值,heapq则使用a代替直接排序:

>>> x.most_common(1)
[('c', 7)]

在计数器外部,可以始终根据key功能调整排序;.sort()sorted()都接受赎回,让您指定要排序的输入序列的值; sorted(x, key=x.get, reverse=True)将为您提供与相同的排序x.most_common(),但仅返回键,例如:

>>> sorted(x, key=x.get, reverse=True)
['c', 'a', 'b']

或者您可以仅对给定的值(key, value)对进行排序:

>>> sorted(x.items(), key=lambda pair: pair[1], reverse=True)
[('c', 7), ('a', 5), ('b', 3)]

有关更多信息,请参见Python排序方法

Use the Counter.most_common() method; it’ll sort the items for you:

>>> from collections import Counter
>>> x = Counter({'a':5, 'b':3, 'c':7})
>>> x.most_common()
[('c', 7), ('a', 5), ('b', 3)]

It’ll do so in the most efficient manner possible; if you ask for a Top N instead of all values, a heapq is used instead of a straight sort:

>>> x.most_common(1)
[('c', 7)]

Outside of counters, sorting can always be adjusted with a key function; .sort() and sorted() both take a callable that lets you specify the value on which to sort the input sequence. For example, sorted(x, key=x.get, reverse=True) would give you the same ordering as x.most_common(), but return only the keys:

>>> sorted(x, key=x.get, reverse=True)
['c', 'a', 'b']

Or you can sort the (key, value) pairs on the value alone:

>>> sorted(x.items(), key=lambda pair: pair[1], reverse=True)
[('c', 7), ('a', 5), ('b', 3)]

See the Python sorting howto for more information.
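The heap-based Top N selection mentioned above can be written out with heapq.nlargest. This is a sketch of the idea rather than a copy of the library code; top_n is a hypothetical helper name.

```python
import heapq
from collections import Counter
from operator import itemgetter

x = Counter({'a': 5, 'b': 3, 'c': 7})


def top_n(counter, n):
    """Return the n (key, count) pairs with the highest counts."""
    # nlargest keeps only n candidates at a time instead of sorting everything.
    return heapq.nlargest(n, counter.items(), key=itemgetter(1))


print(top_n(x, 2))  # [('c', 7), ('a', 5)]
```

For small n and a large counter this avoids a full sort, which is the efficiency gain the answer refers to.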


回答 1

@MartijnPieters答案的一个相当不错的补充是,由于仅返回一个元组,因此可以按出现的顺序返回字典Collections.most_common。我经常将它与方便的日志文件的json输出结合起来:

from collections import Counter, OrderedDict

x = Counter({'a':5, 'b':3, 'c':7})
y = OrderedDict(x.most_common())

随着输出:

OrderedDict([('c', 7), ('a', 5), ('b', 3)])
{
  "c": 7, 
  "a": 5, 
  "b": 3
}

A rather nice addition to @MartijnPieters’ answer is to get back a dictionary sorted by occurrence, since Counter.most_common only returns a list of tuples. I often couple this with JSON output for handy log files:

from collections import Counter, OrderedDict

x = Counter({'a':5, 'b':3, 'c':7})
y = OrderedDict(x.most_common())

With the output:

OrderedDict([('c', 7), ('a', 5), ('b', 3)])
{
  "c": 7, 
  "a": 5, 
  "b": 3
}
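The snippet above does not show how the two outputs were produced. One plausible way, assuming the standard json module was used, is sketched here; the exact whitespace of the JSON text may differ between Python versions.

```python
import json
from collections import Counter, OrderedDict

x = Counter({'a': 5, 'b': 3, 'c': 7})
y = OrderedDict(x.most_common())  # keys ordered from most to least frequent

print(y)
# json.dumps writes keys in the dictionary's own order unless sort_keys=True is passed.
print(json.dumps(y, indent=2))
```

On Python 3.7 and later a plain dict also preserves insertion order, so dict(x.most_common()) would keep the same ordering.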

回答 2

是:

>>> from collections import Counter
>>> x = Counter({'a':5, 'b':3, 'c':7})

使用排序的关键字键和lambda函数:

>>> sorted(x.items(), key=lambda i: i[1])
[('b', 3), ('a', 5), ('c', 7)]
>>> sorted(x.items(), key=lambda i: i[1], reverse=True)
[('c', 7), ('a', 5), ('b', 3)]

这适用于所有词典。但是Counter具有特殊功能,可以为您提供已排序的项目(从最频繁到最不频繁)。叫做most_common()

>>> x.most_common()
[('c', 7), ('a', 5), ('b', 3)]
>>> list(reversed(x.most_common()))  # in order of least to most
[('b', 3), ('a', 5), ('c', 7)]

您还可以指定要查看的项目数:

>>> x.most_common(2)  # specify number you want
[('c', 7), ('a', 5)]

Yes:

>>> from collections import Counter
>>> x = Counter({'a':5, 'b':3, 'c':7})

Using the sorted keyword key and a lambda function:

>>> sorted(x.items(), key=lambda i: i[1])
[('b', 3), ('a', 5), ('c', 7)]
>>> sorted(x.items(), key=lambda i: i[1], reverse=True)
[('c', 7), ('a', 5), ('b', 3)]

This works for all dictionaries. However Counter has a special function which already gives you the sorted items (from most frequent, to least frequent). It’s called most_common():

>>> x.most_common()
[('c', 7), ('a', 5), ('b', 3)]
>>> list(reversed(x.most_common()))  # in order of least to most
[('b', 3), ('a', 5), ('c', 7)]

You can also specify how many items you want to see:

>>> x.most_common(2)  # specify number you want
[('c', 7), ('a', 5)]
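most_common() has no counterpart for the least frequent items, but a reversed slice of its result gives the same effect. A small sketch, with least_common as a hypothetical helper name:

```python
from collections import Counter

x = Counter({'a': 5, 'b': 3, 'c': 7})


def least_common(counter, n):
    """Return the n least frequent (key, count) pairs, least frequent first."""
    # Walk the most_common() list backwards and stop after n items.
    return counter.most_common()[:-n - 1:-1]


print(least_common(x, 2))  # [('b', 3), ('a', 5)]
```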

回答 3

更一般的排序方式,其中key关键字定义排序方式,在数字类型表示降序之前减去:

>>> x = Counter({'a':5, 'b':3, 'c':7})
>>> sorted(x.items(), key=lambda k: -k[1])  # Descending
[('c', 7), ('a', 5), ('b', 3)]

A more general approach uses sorted, where the key keyword defines the sort order; a minus sign in front of a numeric value gives descending order:

>>> x = Counter({'a':5, 'b':3, 'c':7})
>>> sorted(x.items(), key=lambda k: -k[1])  # Descending
[('c', 7), ('a', 5), ('b', 3)]
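The negation trick also combines well with a tuple key when equal counts should be ordered in a predictable way. The sketch below adds a hypothetical extra key 'd' with the same count as 'a' to show the tie being broken alphabetically.

```python
from collections import Counter

x = Counter({'a': 5, 'b': 3, 'c': 7, 'd': 5})

# Sort by count descending (negated), then by key ascending for equal counts.
by_count_then_key = sorted(x.items(), key=lambda kv: (-kv[1], kv[0]))

print(by_count_then_key)  # [('c', 7), ('a', 5), ('d', 5), ('b', 3)]
```

Negation only works for numeric values; for other types, reverse=True is the general way to get descending order.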