标签归档:sorting

按两个字段对Python列表进行排序

问题:按两个字段对Python列表进行排序

我有一个从排序的csv创建的以下列表

list1 = sorted(csv1, key=operator.itemgetter(1))

我实际上想按两个条件对列表进行排序:首先按字段1中的值,然后按字段2中的值。我该怎么做?

I have the following list created from a sorted csv

list1 = sorted(csv1, key=operator.itemgetter(1))

I would actually like to sort the list by two criteria: first by the value in field 1 and then by the value in field 2. How do I do this?


回答 0

像这样:

import operator
list1 = sorted(csv1, key=operator.itemgetter(1, 2))

like this:

import operator
list1 = sorted(csv1, key=operator.itemgetter(1, 2))

回答 1

使用lambda函数时无需导入任何内容。
以下list按第一个元素排序,然后按第二个元素排序。

sorted(list, key=lambda x: (x[0], -x[1]))

No need to import anything when using lambda functions.
The following sorts list by the first element, then by the second element.

sorted(list, key=lambda x: (x[0], -x[1]))

回答 2

Python具有稳定的排序方式,因此,只要性能不成问题,最简单的方法就是按字段2对其进行排序,然后再次按字段1对其进行排序。

这将为您提供所需的结果,唯一的陷阱是,如果列表很大(或者您希望经常对其进行排序),则两次调用sort可能是不可接受的开销。

list1 = sorted(csv1, key=operator.itemgetter(2))
list1 = sorted(list1, key=operator.itemgetter(1))

这样一来,还可以轻松处理需要对某些列进行反向排序的情况,只需在必要时添加’reverse = True’参数即可。

否则,您可以将多个参数传递给itemgetter或手动构建一个元组。这可能会更快一些,但是有一个问题,就是如果某些列想要反向排序,它不能很好地推广(数字列仍然可以通过取反来反转,但是这会使排序保持稳定)。

因此,如果您不需要对任何列进行反向排序,则可以向itemgetter输入多个参数(如果可能),并且这些列不是数字的,或者您希望保持排序稳定以进行多个连续排序。

编辑:对于在理解此答案的原始方式时遇到问题的评论者,以下示例准确显示了排序的稳定性,从而确保了我们可以对每个键进行单独的排序并最终对多个条件下的数据进行排序:

DATA = [
    ('Jones', 'Jane', 58),
    ('Smith', 'Anne', 30),
    ('Jones', 'Fred', 30),
    ('Smith', 'John', 60),
    ('Smith', 'Fred', 30),
    ('Jones', 'Anne', 30),
    ('Smith', 'Jane', 58),
    ('Smith', 'Twin2', 3),
    ('Jones', 'John', 60),
    ('Smith', 'Twin1', 3),
    ('Jones', 'Twin1', 3),
    ('Jones', 'Twin2', 3)
]

# Sort by Surname, Age DESCENDING, Firstname
print("Initial data in random order")
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred''')
DATA.sort(key=lambda row: row[1])

for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.''')
DATA.sort(key=lambda row: row[2], reverse=True)
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.
''')
DATA.sort(key=lambda row: row[0])
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

这是一个可运行的示例,但是为了节省运行它的人员,输出为:

Initial data in random order
Jones      Jane       58
Smith      Anne       30
Jones      Fred       30
Smith      John       60
Smith      Fred       30
Jones      Anne       30
Smith      Jane       58
Smith      Twin2      3
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Jones      Twin2      3

First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Jones      Jane       58
Smith      Jane       58
Smith      John       60
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.
Smith      John       60
Jones      John       60
Jones      Jane       58
Smith      Jane       58
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.

Jones      John       60
Jones      Jane       58
Jones      Anne       30
Jones      Fred       30
Jones      Twin1      3
Jones      Twin2      3
Smith      John       60
Smith      Jane       58
Smith      Anne       30
Smith      Fred       30
Smith      Twin1      3
Smith      Twin2      3

特别要注意的是,在第二步中,reverse=True参数如何按顺序保留名字,而仅对列表进行排序然后反转,则会丢失第三个排序键的期望顺序。

Python has a stable sort, so provided that performance isn’t an issue the simplest way is to sort it by field 2 and then sort it again by field 1.

That will give you the result you want, the only catch is that if it is a big list (or you want to sort it often) calling sort twice might be an unacceptable overhead.

list1 = sorted(csv1, key=operator.itemgetter(2))
list1 = sorted(list1, key=operator.itemgetter(1))

Doing it this way also makes it easy to handle the situation where you want some of the columns reverse sorted, just include the ‘reverse=True’ parameter when necessary.

Otherwise you can pass multiple parameters to itemgetter or manually build a tuple. That is probably going to be faster, but has the problem that it doesn’t generalise well if some of the columns want to be reverse sorted (numeric columns can still be reversed by negating them but that stops the sort being stable).

So if you don’t need any columns reverse sorted, go for multiple arguments to itemgetter, if you might, and the columns aren’t numeric or you want to keep the sort stable go for multiple consecutive sorts.

Edit: For the commenters who have problems understanding how this answers the original question, here is an example that shows exactly how the stable nature of the sorting ensures we can do separate sorts on each key and end up with data sorted on multiple criteria:

DATA = [
    ('Jones', 'Jane', 58),
    ('Smith', 'Anne', 30),
    ('Jones', 'Fred', 30),
    ('Smith', 'John', 60),
    ('Smith', 'Fred', 30),
    ('Jones', 'Anne', 30),
    ('Smith', 'Jane', 58),
    ('Smith', 'Twin2', 3),
    ('Jones', 'John', 60),
    ('Smith', 'Twin1', 3),
    ('Jones', 'Twin1', 3),
    ('Jones', 'Twin2', 3)
]

# Sort by Surname, Age DESCENDING, Firstname
print("Initial data in random order")
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred''')
DATA.sort(key=lambda row: row[1])

for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.''')
DATA.sort(key=lambda row: row[2], reverse=True)
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

print('''
Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.
''')
DATA.sort(key=lambda row: row[0])
for d in DATA:
    print("{:10s} {:10s} {}".format(*d))

This is a runnable example, but to save people running it the output is:

Initial data in random order
Jones      Jane       58
Smith      Anne       30
Jones      Fred       30
Smith      John       60
Smith      Fred       30
Jones      Anne       30
Smith      Jane       58
Smith      Twin2      3
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Jones      Twin2      3

First we sort by first name, after this pass all
Twin1 come before Twin2 and Anne comes before Fred
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Jones      Jane       58
Smith      Jane       58
Smith      John       60
Jones      John       60
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Second pass: sort by age in descending order.
Note that after this pass rows are sorted by age but
Twin1/Twin2 and Anne/Fred pairs are still in correct
firstname order.
Smith      John       60
Jones      John       60
Jones      Jane       58
Smith      Jane       58
Smith      Anne       30
Jones      Anne       30
Jones      Fred       30
Smith      Fred       30
Smith      Twin1      3
Jones      Twin1      3
Smith      Twin2      3
Jones      Twin2      3

Final pass sorts the Jones from the Smiths.
Within each family members are sorted by age but equal
age members are sorted by first name.

Jones      John       60
Jones      Jane       58
Jones      Anne       30
Jones      Fred       30
Jones      Twin1      3
Jones      Twin2      3
Smith      John       60
Smith      Jane       58
Smith      Anne       30
Smith      Fred       30
Smith      Twin1      3
Smith      Twin2      3

Note in particular how in the second step the reverse=True parameter keeps the firstnames in order whereas simply sorting then reversing the list would lose the desired order for the third sort key.


回答 3

list1 = sorted(csv1, key=lambda x: (x[1], x[2]) )
list1 = sorted(csv1, key=lambda x: (x[1], x[2]) )

回答 4

employees.sort(key = lambda x:x[1])
employees.sort(key = lambda x:x[0])

我们也可以将.sort与lambda一起使用2次,因为python sort到位且稳定。这将首先根据第二个元素x [1]对列表进行排序。然后,它将对第一个元素x [0](最高优先级)进行排序。

employees[0] = Employee's Name
employees[1] = Employee's Salary

这等效于执行以下操作:employee.sort(key = lambda x:(x [0],x [1]))

employees.sort(key = lambda x:x[1])
employees.sort(key = lambda x:x[0])

We can also use .sort with lambda 2 times because python sort is in place and stable. This will first sort the list according to the second element, x[1]. Then, it will sort the first element, x[0] (highest priority).

employees[0] = Employee's Name
employees[1] = Employee's Salary

This is equivalent to doing the following: employees.sort(key = lambda x:(x[0], x[1]))


回答 5

您可以按升序使用:

sorted_data= sorted(non_sorted_data, key=lambda k: (k[1],k[0]))

或按降序使用:

sorted_data= sorted(non_sorted_data, key=lambda k: (k[1],k[0]),reverse=True)

In ascending order you can use:

sorted_data= sorted(non_sorted_data, key=lambda k: (k[1],k[0]))

or in descending order you can use:

sorted_data= sorted(non_sorted_data, key=lambda k: (k[1],k[0]),reverse=True)

回答 6

使用下面的字典排序列表将以降序对列表进行排序,第一列为薪水,第二列为年龄

d=[{'salary':123,'age':23},{'salary':123,'age':25}]
d=sorted(d, key=lambda i: (i['salary'], i['age']),reverse=True)

输出:[{‘salary’:123,’age’:25},{‘salary’:123,’age’:23}]

Sorting list of dicts using below will sort list in descending order on first column as salary and second column as age

d=[{'salary':123,'age':23},{'salary':123,'age':25}]
d=sorted(d, key=lambda i: (i['salary'], i['age']),reverse=True)

Output: [{‘salary’: 123, ‘age’: 25}, {‘salary’: 123, ‘age’: 23}]


不区分大小写的列表排序,而不降低结果大小?

问题:不区分大小写的列表排序,而不降低结果大小?

我有一个这样的字符串列表:

['Aden', 'abel']

我要对项目排序,不区分大小写。所以我想得到:

['abel', 'Aden']

但与sorted()或相反list.sort(),因为大写字母先于小写字母。

我如何忽略这种情况?我已经看到了涉及降低所有列表项的解决方案,但是我不想更改列表项的大小写。

I have a list of strings like this:

['Aden', 'abel']

I want to sort the items, case-insensitive. So I want to get:

['abel', 'Aden']

But I get the opposite with sorted() or list.sort(), because uppercase appears before lowercase.

How can I ignore the case? I’ve seen solutions which involves lowercasing all list items, but I don’t want to change the case of the list items.


回答 0

在Python 3.3+中,有str.casefold一种专为无条件匹配而设计的方法:

sorted_list = sorted(unsorted_list, key=str.casefold)

在Python 2中使用lower()

sorted_list = sorted(unsorted_list, key=lambda s: s.lower())

它适用于普通字符串和unicode字符串,因为它们都有lower方法。

在Python 2中,它可以将普通字符串和unicode字符串混合使用,因为这两种类型的值可以相互比较。但是,Python 3并不是这样工作的:您无法比较字节字符串和unicode字符串,因此在Python 3中,您应该做明智的事情,并且只能对一种类型的字符串列表进行排序。

>>> lst = ['Aden', u'abe1']
>>> sorted(lst)
['Aden', u'abe1']
>>> sorted(lst, key=lambda s: s.lower())
[u'abe1', 'Aden']

In Python 3.3+ there is the str.casefold method that’s specifically designed for caseless matching:

sorted_list = sorted(unsorted_list, key=str.casefold)

In Python 2 use lower():

sorted_list = sorted(unsorted_list, key=lambda s: s.lower())

It works for both normal and unicode strings, since they both have a lower method.

In Python 2 it works for a mix of normal and unicode strings, since values of the two types can be compared with each other. Python 3 doesn’t work like that, though: you can’t compare a byte string and a unicode string, so in Python 3 you should do the sane thing and only sort lists of one type of string.

>>> lst = ['Aden', u'abe1']
>>> sorted(lst)
['Aden', u'abe1']
>>> sorted(lst, key=lambda s: s.lower())
[u'abe1', 'Aden']

回答 1

>>> x = ['Aden', 'abel']
>>> sorted(x, key=str.lower) # Or unicode.lower if all items are unicode
['abel', 'Aden']

在Python 3中str是unicode,但在Python 2中,您可以使用这种更通用的方法,该方法对str和都适用unicode

>>> sorted(x, key=lambda s: s.lower())
['abel', 'Aden']
>>> x = ['Aden', 'abel']
>>> sorted(x, key=str.lower) # Or unicode.lower if all items are unicode
['abel', 'Aden']

In Python 3 str is unicode but in Python 2 you can use this more general approach which works for both str and unicode:

>>> sorted(x, key=lambda s: s.lower())
['abel', 'Aden']

回答 2

您也可以尝试使用此方法对列表进行就地排序:

>>> x = ['Aden', 'abel']
>>> x.sort(key=lambda y: y.lower())
>>> x
['abel', 'Aden']

You can also try this to sort the list in-place:

>>> x = ['Aden', 'abel']
>>> x.sort(key=lambda y: y.lower())
>>> x
['abel', 'Aden']

回答 3

这在Python 3中有效,并且不涉及小写结果(!)。

values.sort(key=str.lower)

This works in Python 3 and does not involves lowercasing the result (!).

values.sort(key=str.lower)

回答 4

在python3中,您可以使用

list1.sort(key=lambda x: x.lower()) #Case In-sensitive             
list1.sort() #Case Sensitive

In python3 you can use

list1.sort(key=lambda x: x.lower()) #Case In-sensitive             
list1.sort() #Case Sensitive

回答 5

我是通过Python 3.3做到的:

 def sortCaseIns(lst):
    lst2 = [[x for x in range(0, 2)] for y in range(0, len(lst))]
    for i in range(0, len(lst)):
        lst2[i][0] = lst[i].lower()
        lst2[i][1] = lst[i]
    lst2.sort()
    for i in range(0, len(lst)):
        lst[i] = lst2[i][1]

然后,您可以调用此函数:

sortCaseIns(yourListToSort)

I did it this way for Python 3.3:

 def sortCaseIns(lst):
    lst2 = [[x for x in range(0, 2)] for y in range(0, len(lst))]
    for i in range(0, len(lst)):
        lst2[i][0] = lst[i].lower()
        lst2[i][1] = lst[i]
    lst2.sort()
    for i in range(0, len(lst)):
        lst[i] = lst2[i][1]

Then you just can call this function:

sortCaseIns(yourListToSort)

回答 6

不区分大小写的排序,在Python 2 OR 3中对字符串进行排序(在Python 2.7.17和Python 3.6.9中测试):

>>> x = ["aa", "A", "bb", "B", "cc", "C"]
>>> x.sort()
>>> x
['A', 'B', 'C', 'aa', 'bb', 'cc']
>>> x.sort(key=str.lower)           # <===== there it is!
>>> x
['A', 'aa', 'B', 'bb', 'C', 'cc']

关键是key=str.lower。这些命令只是这些命令的外观,以便于复制粘贴,因此您可以对其进行测试:

x = ["aa", "A", "bb", "B", "cc", "C"]
x.sort()
x
x.sort(key=str.lower)
x

请注意,但是,如果您的字符串是unicode字符串(如u'some string'),则仅在Python 2中(在这种情况下,在Python 3中不是),上述x.sort(key=str.lower)命令将失败并输出以下错误:

TypeError: descriptor 'lower' requires a 'str' object but received a 'unicode'

如果出现此错误,请升级到Python 3来处理unicode排序,或者先使用列表推导将unicode字符串转换为ASCII字符串,如下所示:

# for Python2, ensure all elements are ASCII (NOT unicode) strings first
x = [str(element) for element in x]  
# for Python2, this sort will only work on ASCII (NOT unicode) strings
x.sort(key=str.lower)

参考文献:

  1. https://docs.python.org/3/library/stdtypes.html#list.sort
  2. 将Unicode字符串转换为Python中的字符串(包含多余的符号)
  3. https://www.programiz.com/python-programming/list-comprehension

Case-insensitive sort, sorting the string in place, in Python 2 OR 3 (tested in Python 2.7.17 and Python 3.6.9):

>>> x = ["aa", "A", "bb", "B", "cc", "C"]
>>> x.sort()
>>> x
['A', 'B', 'C', 'aa', 'bb', 'cc']
>>> x.sort(key=str.lower)           # <===== there it is!
>>> x
['A', 'aa', 'B', 'bb', 'C', 'cc']

The key is key=str.lower. Here’s what those commands look like with just the commands, for easy copy-pasting so you can test them:

x = ["aa", "A", "bb", "B", "cc", "C"]
x.sort()
x
x.sort(key=str.lower)
x

Note that if your strings are unicode strings, however (like u'some string'), then in Python 2 only (NOT in Python 3 in this case) the above x.sort(key=str.lower) command will fail and output the following error:

TypeError: descriptor 'lower' requires a 'str' object but received a 'unicode'

If you get this error, then either upgrade to Python 3 where they handle unicode sorting, or convert your unicode strings to ASCII strings first, using a list comprehension, like this:

# for Python2, ensure all elements are ASCII (NOT unicode) strings first
x = [str(element) for element in x]  
# for Python2, this sort will only work on ASCII (NOT unicode) strings
x.sort(key=str.lower)

References:

  1. https://docs.python.org/3/library/stdtypes.html#list.sort
  2. Convert a Unicode string to a string in Python (containing extra symbols)
  3. https://www.programiz.com/python-programming/list-comprehension

回答 7

试试这个

def cSort(inlist, minisort=True):
    sortlist = []
    newlist = []
    sortdict = {}
    for entry in inlist:
        try:
            lentry = entry.lower()
        except AttributeError:
            sortlist.append(lentry)
        else:
            try:
                sortdict[lentry].append(entry)
            except KeyError:
                sortdict[lentry] = [entry]
                sortlist.append(lentry)

    sortlist.sort()
    for entry in sortlist:
        try:
            thislist = sortdict[entry]
            if minisort: thislist.sort()
            newlist = newlist + thislist
        except KeyError:
            newlist.append(entry)
    return newlist

lst = ['Aden', 'abel']
print cSort(lst)

输出量

['abel', 'Aden']

Try this

def cSort(inlist, minisort=True):
    sortlist = []
    newlist = []
    sortdict = {}
    for entry in inlist:
        try:
            lentry = entry.lower()
        except AttributeError:
            sortlist.append(lentry)
        else:
            try:
                sortdict[lentry].append(entry)
            except KeyError:
                sortdict[lentry] = [entry]
                sortlist.append(lentry)

    sortlist.sort()
    for entry in sortlist:
        try:
            thislist = sortdict[entry]
            if minisort: thislist.sort()
            newlist = newlist + thislist
        except KeyError:
            newlist.append(entry)
    return newlist

lst = ['Aden', 'abel']
print cSort(lst)

Output

['abel', 'Aden']


泡泡排序作业

问题:泡泡排序作业

在课堂上,我们正在做排序算法,尽管我在谈论它们并编写伪代码时理解得很好,但是我在为它们编写实际代码时遇到了问题。

这是我在Python中的尝试:

mylist = [12, 5, 13, 8, 9, 65]

def bubble(badList):
    length = len(badList) - 1
    unsorted = True

    while unsorted:
        for element in range(0,length):
            unsorted = False
            if badList[element] > badList[element + 1]:
                hold = badList[element + 1]
                badList[element + 1] = badList[element]
                badList[element] = hold
                print badList
            else:
                unsorted = True

print bubble(mylist)

现在,(据我所知)该排序正确,但是一旦完成,它就会无限循环。

如何修复此代码,以便函数正确完成并正确排序任何(合理)大小的列表?

PS:我知道我不应该在函数中真正打印输出,而应该得到返回值,但是我还没有这样做,因为我的代码还没有真正起作用。

In class we are doing sorting algorithms and, although I understand them fine when talking about them and writing pseudocode, I am having problems writing actual code for them.

This is my attempt in Python:

mylist = [12, 5, 13, 8, 9, 65]

def bubble(badList):
    length = len(badList) - 1
    unsorted = True

    while unsorted:
        for element in range(0,length):
            unsorted = False
            if badList[element] > badList[element + 1]:
                hold = badList[element + 1]
                badList[element + 1] = badList[element]
                badList[element] = hold
                print badList
            else:
                unsorted = True

print bubble(mylist)

Now, this (as far as I can tell) sorts correctly, but once it finishes it just loops indefinitely.

How can this code be fixed so the function finishes properly and correctly sorts a list of any (reasonable) size?

P.S. I know I should not really have prints in a function and I should have a return, but I just have not done that yet as my code does not really work yet.


回答 0

为了解释为什么您的脚本现在无法正常工作,我将变量重命名unsortedsorted

首先,您的列表尚未排序。当然,我们设置sortedFalse

一旦开始while循环,我们就假定该列表已经排序。想法是这样的:一旦找到两个顺序不正确的元素,便将其设置sortedFalse只有在没有顺序错误的元素时sorted才会保留。True

sorted = False  # We haven't started sorting yet

while not sorted:
    sorted = True  # Assume the list is now sorted
    for element in range(0, length):
        if badList[element] > badList[element + 1]:
            sorted = False  # We found two elements in the wrong order
            hold = badList[element + 1]
            badList[element + 1] = badList[element]
            badList[element] = hold
    # We went through the whole list. At this point, if there were no elements
    # in the wrong order, sorted is still True. Otherwise, it's false, and the
    # while loop executes again.

还有一些小问题,可以帮助提高代码的效率或可读性。

  • for循环中,使用变量element。从技术上讲,element不是元素;它是代表列表索引的数字。而且,它很长。在这些情况下,只需使用一个临时变量名即可,例如i“ index”。

    for i in range(0, length):
  • range命令还可以仅接受一个参数(名为stop)。在这种情况下,您将获得从0到该参数的所有整数的列表。

    for i in range(length):
  • Python样式指南》建议使用下划线将变量命名为小写。对于这样的小脚本,这是一个很小的缺点。它会让您习惯于Python代码最常类似于的东西。

    def bubble(bad_list):
  • 要交换两个变量的值,请将它们写为元组分配。右侧被评估为元组(例如(badList[i+1], badList[i])is (3, 5)),然后被分配给左侧的两个变量((badList[i], badList[i+1]))。

    bad_list[i], bad_list[i+1] = bad_list[i+1], bad_list[i]

放在一起,你会得到:

my_list = [12, 5, 13, 8, 9, 65]

def bubble(bad_list):
    length = len(bad_list) - 1
    sorted = False

    while not sorted:
        sorted = True
        for i in range(length):
            if bad_list[i] > bad_list[i+1]:
                sorted = False
                bad_list[i], bad_list[i+1] = bad_list[i+1], bad_list[i]

bubble(my_list)
print my_list

(顺便说一句,我也删除了您的打印声明。)

To explain why your script isn’t working right now, I’ll rename the variable unsorted to sorted.

At first, your list isn’t yet sorted. Of course, we set sorted to False.

As soon as we start the while loop, we assume that the list is already sorted. The idea is this: as soon as we find two elements that are not in the right order, we set sorted back to False. sorted will remain True only if there were no elements in the wrong order.

sorted = False  # We haven't started sorting yet

while not sorted:
    sorted = True  # Assume the list is now sorted
    for element in range(0, length):
        if badList[element] > badList[element + 1]:
            sorted = False  # We found two elements in the wrong order
            hold = badList[element + 1]
            badList[element + 1] = badList[element]
            badList[element] = hold
    # We went through the whole list. At this point, if there were no elements
    # in the wrong order, sorted is still True. Otherwise, it's false, and the
    # while loop executes again.

There are also minor little issues that would help the code be more efficient or readable.

  • In the for loop, you use the variable element. Technically, element is not an element; it’s a number representing a list index. Also, it’s quite long. In these cases, just use a temporary variable name, like i for “index”.

    for i in range(0, length):
    
  • The range command can also take just one argument (named stop). In that case, you get a list of all the integers from 0 to that argument.

    for i in range(length):
    
  • The Python Style Guide recommends that variables be named in lowercase with underscores. This is a very minor nitpick for a little script like this; it’s more to get you accustomed to what Python code most often resembles.

    def bubble(bad_list):
    
  • To swap the values of two variables, write them as a tuple assignment. The right hand side gets evaluated as a tuple (say, (badList[i+1], badList[i]) is (3, 5)) and then gets assigned to the two variables on the left hand side ((badList[i], badList[i+1])).

    bad_list[i], bad_list[i+1] = bad_list[i+1], bad_list[i]
    

Put it all together, and you get this:

my_list = [12, 5, 13, 8, 9, 65]

def bubble(bad_list):
    length = len(bad_list) - 1
    sorted = False

    while not sorted:
        sorted = True
        for i in range(length):
            if bad_list[i] > bad_list[i+1]:
                sorted = False
                bad_list[i], bad_list[i+1] = bad_list[i+1], bad_list[i]

bubble(my_list)
print my_list

(I removed your print statement too, by the way.)


回答 1

气泡分类的目的是在每轮底部移动较重的项目,同时向上移动较轻的项目。在内部循环中,您可以比较元素,而不必每次都迭代整个列表。将最重的已经排在最后。该交换变量是一个额外的检查,所以我们可以标记列表现在可以正确排序,并避免不必要的计算仍在继续。

def bubble(badList):
    length = len(badList)
    for i in range(0,length):
        swapped = False
        for element in range(0, length-i-1):
            if badList[element] > badList[element + 1]:
                hold = badList[element + 1]
                badList[element + 1] = badList[element]
                badList[element] = hold
                swapped = True
        if not swapped: break

    return badList

您的版本1已更正:

def bubble(badList):
    length = len(badList) - 1
    unsorted = True
    while unsorted:
        unsorted = False
        for element in range(0,length):
            #unsorted = False
            if badList[element] > badList[element + 1]:
                 hold = badList[element + 1]
                 badList[element + 1] = badList[element]
                 badList[element] = hold
                 unsorted = True
                 #print badList
             #else:
                 #unsorted = True

     return badList

The goal of bubble sort is to move the heavier items at the bottom in each round, while moving the lighter items up. In the inner loop, where you compare the elements, you don’t have to iterate the whole list in each turn. The heaviest is already placed last. The swapped variable is an extra check so we can mark that the list is now sorted and avoid continuing with unnecessary calculations.

def bubble(badList):
    length = len(badList)
    for i in range(0,length):
        swapped = False
        for element in range(0, length-i-1):
            if badList[element] > badList[element + 1]:
                hold = badList[element + 1]
                badList[element + 1] = badList[element]
                badList[element] = hold
                swapped = True
        if not swapped: break

    return badList

Your version 1, corrected:

def bubble(badList):
    length = len(badList) - 1
    unsorted = True
    while unsorted:
        unsorted = False
        for element in range(0,length):
            #unsorted = False
            if badList[element] > badList[element + 1]:
                 hold = badList[element + 1]
                 badList[element + 1] = badList[element]
                 badList[element] = hold
                 unsorted = True
                 #print badList
             #else:
                 #unsorted = True

     return badList

回答 2

当您使用负含义的变量名时,您需要反转它们的值,这就是发生的情况。以下内容将更容易理解:

sorted = False
while not sorted:
    ...

另一方面,算法的逻辑有点偏离。您需要检查在for循环期间是否交换了两个元素。这是我的写法:

def bubble(values):
    length = len(values) - 1
    sorted = False
    while not sorted:
        sorted = True
        for element in range(0,length):
            if values[element] > values[element + 1]:
                 hold = values[element + 1]
                 values[element + 1] = values[element]
                 values[element] = hold
                 sorted = False
    return values

This is what happens when you use variable name of negative meaning, you need to invert their values. The following would be easier to understand:

sorted = False
while not sorted:
    ...

On the other hand, the logic of the algorithm is a little bit off. You need to check whether two elements swapped during the for loop. Here’s how I would write it:

def bubble(values):
    length = len(values) - 1
    sorted = False
    while not sorted:
        sorted = True
        for element in range(0,length):
            if values[element] > values[element + 1]:
                 hold = values[element + 1]
                 values[element + 1] = values[element]
                 values[element] = hold
                 sorted = False
    return values

回答 3

您对Unsorted变量的使用是错误的;您想要一个变量来告诉您是否交换了两个元素;如果这样做,则可以退出循环,否则,需要再次循环。要解决您在此处遇到的问题,只需在if情况的正文中输入“ unsorted = false”;删除其他情况;并在for循环前添加“ unsorted = true” 。

Your use of the Unsorted variable is wrong; you want to have a variable that tells you if you have swapped two elements; if you have done that, you can exit your loop, otherwise, you need to loop again. To fix what you’ve got here, just put the “unsorted = false” in the body of your if case; remove your else case; and put “unsorted = true before your for loop.


回答 4

def bubble_sort(l):
    for passes_left in range(len(l)-1, 0, -1):
        for index in range(passes_left):
            if l[index] < l[index + 1]:
               l[index], l[index + 1] = l[index + 1], l[index]
    return l
def bubble_sort(l):
    for passes_left in range(len(l)-1, 0, -1):
        for index in range(passes_left):
            if l[index] < l[index + 1]:
               l[index], l[index + 1] = l[index + 1], l[index]
    return l

回答 5

#一个非常简单的函数,可以通过减少第二个数组的问题空间来优化(显然)。但是相同的O(n ^ 2)复杂度。

def bubble(arr):
    l = len(arr)        
    for a in range(l):
        for b in range(l-1):
            if (arr[a] < arr[b]):
            arr[a], arr[b] = arr[b], arr[a]
    return arr 

#A very simple function, can be optimized (obviously) by decreasing the problem space of the 2nd array. But same O(n^2) complexity.

def bubble(arr):
    l = len(arr)        
    for a in range(l):
        for b in range(l-1):
            if (arr[a] < arr[b]):
            arr[a], arr[b] = arr[b], arr[a]
    return arr 

回答 6

您那里有几个错误。第一个是长度,第二个是您未排序的使用(如McWafflestix所述)。如果要打印列表,您可能还想返回该列表:

mylist = [12, 5, 13, 8, 9, 65]

def bubble(badList):
    length = len(badList) - 2
    unsorted = True

    while unsorted:
        for element in range(0,length):
            unsorted = False

            if badList[element] > badList[element + 1]:
                hold = badList[element + 1]
                badList[element + 1] = badList[element]
                badList[element] = hold
                print badList
                unsorted = True

    return badList

print bubble(mylist)

埃塔:你是对的,上面的小车就像地狱一样。我的缺点是不通过更多示例进行测试。

def bubble2(badList):
    swapped = True
    length = len(badList) - 2

    while swapped:
        swapped = False
        for i in range(0, length):
            if badList[i] > badList[i + 1]:

                # swap
                hold = badList[i + 1]
                badList[i + 1] = badList[i]
                badList[i] = hold

                swapped = True

    return badList

You’ve got a couple of errors in there. The first is in length, and the second is in your use of unsorted (as stated by McWafflestix). You probably also want to return the list if you’re going to print it:

mylist = [12, 5, 13, 8, 9, 65]

def bubble(badList):
    length = len(badList) - 2
    unsorted = True

    while unsorted:
        for element in range(0,length):
            unsorted = False

            if badList[element] > badList[element + 1]:
                hold = badList[element + 1]
                badList[element + 1] = badList[element]
                badList[element] = hold
                print badList
                unsorted = True

    return badList

print bubble(mylist)

eta: You’re right, the above is buggy as hell. My bad for not testing through some more examples.

def bubble2(badList):
    swapped = True
    length = len(badList) - 2

    while swapped:
        swapped = False
        for i in range(0, length):
            if badList[i] > badList[i + 1]:

                # swap
                hold = badList[i + 1]
                badList[i + 1] = badList[i]
                badList[i] = hold

                swapped = True

    return badList

回答 7

我是一个新鲜的初学者,昨天开始阅读有关Python的文章。受到您的榜样的启发,我创作了一些也许具有80年代风格的作品,但仍然可以

lista1 = [12, 5, 13, 8, 9, 65]

i=0
while i < len(lista1)-1:
    if lista1[i] > lista1[i+1]:
        x = lista1[i]
        lista1[i] = lista1[i+1]
        lista1[i+1] = x
        i=0
        continue
    else:
        i+=1

print(lista1)

I am a fresh fresh beginner, started to read about Python yesterday. Inspired by your example I created something maybe more in the 80-ties style, but nevertheless it kinda works

lista1 = [12, 5, 13, 8, 9, 65]

i=0
while i < len(lista1)-1:
    if lista1[i] > lista1[i+1]:
        x = lista1[i]
        lista1[i] = lista1[i+1]
        lista1[i+1] = x
        i=0
        continue
    else:
        i+=1

print(lista1)

回答 8

原始算法的问题在于,如果列表中的数字更小,它将无法将其带到正确的排序位置。程序每次都需要从头开始,以确保数字始终排序。

我简化了代码,它现在适用于任何数字列表,而不管列表如何,即使有重复的数字也是如此。这是代码

mylist = [9, 8, 5, 4, 12, 1, 7, 5, 2]
print mylist

def bubble(badList):
    length = len(badList) - 1
    element = 0
    while element < length:
        if badList[element] > badList[element + 1]:
            hold = badList[element + 1]
            badList[element + 1] = badList[element]
            badList[element] = hold
            element = 0
            print badList
        else:
            element = element + 1

print bubble(mylist)

The problem with the original algorithm is that if you had a lower number further in the list, it would not bring it to the correct sorted position. The program needs to go back the the beginning each time to ensure that the numbers sort all the way through.

I simplified the code and it will now work for any list of numbers regardless of the list and even if there are repeating numbers. Here’s the code

mylist = [9, 8, 5, 4, 12, 1, 7, 5, 2]
print mylist

def bubble(badList):
    length = len(badList) - 1
    element = 0
    while element < length:
        if badList[element] > badList[element + 1]:
            hold = badList[element + 1]
            badList[element + 1] = badList[element]
            badList[element] = hold
            element = 0
            print badList
        else:
            element = element + 1

print bubble(mylist)

回答 9

def bubble_sort(l):
    exchanged = True
    iteration = 0
    n = len(l)

    while(exchanged):
        iteration += 1
        exchanged = False

        # Move the largest element to the end of the list
        for i in range(n-1):
            if l[i] > l[i+1]:
                exchanged = True
                l[i], l[i+1] = l[i+1], l[i]
        n -= 1   # Largest element already towards the end

    print 'Iterations: %s' %(iteration)
    return l
def bubble_sort(l):
    exchanged = True
    iteration = 0
    n = len(l)

    while(exchanged):
        iteration += 1
        exchanged = False

        # Move the largest element to the end of the list
        for i in range(n-1):
            if l[i] > l[i+1]:
                exchanged = True
                l[i], l[i+1] = l[i+1], l[i]
        n -= 1   # Largest element already towards the end

    print 'Iterations: %s' %(iteration)
    return l

回答 10

def bubbleSort(alist):
if len(alist) <= 1:
    return alist
for i in range(0,len(alist)):
   print "i is :%d",i
   for j in range(0,i):
      print "j is:%d",j
      print "alist[i] is :%d, alist[j] is :%d"%(alist[i],alist[j])
      if alist[i] > alist[j]:
         alist[i],alist[j] = alist[j],alist[i]
return alist

alist = [54,26,93,17,77,31,44,55,20,-23,-34,16,11,11,11]

打印bubbleSort(alist)

def bubbleSort(alist):
if len(alist) <= 1:
    return alist
for i in range(0,len(alist)):
   print "i is :%d",i
   for j in range(0,i):
      print "j is:%d",j
      print "alist[i] is :%d, alist[j] is :%d"%(alist[i],alist[j])
      if alist[i] > alist[j]:
         alist[i],alist[j] = alist[j],alist[i]
return alist

alist = [54,26,93,17,77,31,44,55,20,-23,-34,16,11,11,11]

print bubbleSort(alist)


回答 11

def bubble_sort(a):
    t = 0
    sorted = False # sorted = False because we have not began to sort
    while not sorted:
    sorted = True # Assume sorted = True first, it will switch only there is any change
        for key in range(1,len(a)):
            if a[key-1] > a[key]:
                sorted = False
                t = a[key-1]; a[key-1] = a[key]; a[key] = t;
    print a
def bubble_sort(a):
    t = 0
    sorted = False # sorted = False because we have not began to sort
    while not sorted:
    sorted = True # Assume sorted = True first, it will switch only there is any change
        for key in range(1,len(a)):
            if a[key-1] > a[key]:
                sorted = False
                t = a[key-1]; a[key-1] = a[key]; a[key] = t;
    print a

回答 12

一个简单的例子:

a = len(alist)-1
while a > 0:
    for b in range(0,a):
        #compare with the adjacent element
        if alist[b]>=alist[b+1]:
            #swap both elements
            alist[b], alist[b+1] = alist[b+1], alist[b]
    a-=1

这只是简单地将元素从0到a(基本上是该回合中所有未排序的元素),并将其与相邻元素进行比较,如果大于相邻元素,则进行交换。在回合结束时,对最后一个元素进行排序,然后在没有该元素的情况下再次运行该过程,直到对所有元素进行了排序。

不需要条件是否sort成立。

请注意,此算法仅在交换时才考虑数字的位置,因此重复的数字不会对其产生影响。

PS。我知道这个问题发布已经很长时间了,但是我只想分享这个想法。

A simpler example:

a = len(alist)-1
while a > 0:
    for b in range(0,a):
        #compare with the adjacent element
        if alist[b]>=alist[b+1]:
            #swap both elements
            alist[b], alist[b+1] = alist[b+1], alist[b]
    a-=1

This simply takes the elements from 0 to a(basically, all the unsorted elements in that round) and compares it with its adjacent element, and making a swap if it is greater than its adjacent element. At the end the round, the last element is sorted, and the process runs again without it, until all elements have been sorted.

There is no need for a condition whether sort is true or not.

Note that this algorithm takes into consideration the position of the numbers only when swapping, so repeated numbers will not affect it.

PS. I know it has been very long since this question was posted, but I just wanted to share this idea.


回答 13

arr = [5,4,3,1,6,8,10,9] # array not sorted

for i in range(len(arr)):
    for j in range(i, len(arr)):
        if(arr[i] > arr[j]):
            arr[i], arr[j] = arr[j], arr[i]

            print (arr)
arr = [5,4,3,1,6,8,10,9] # array not sorted

for i in range(len(arr)):
    for j in range(i, len(arr)):
        if(arr[i] > arr[j]):
            arr[i], arr[j] = arr[j], arr[i]

            print (arr)

回答 14

def bubble_sort(li):
    l = len(li)
    tmp = None
    sorted_l = sorted(li)
    while (li != sorted_l):
        for ele in range(0,l-1):
            if li[ele] > li[ele+1]:
                tmp = li[ele+1]
                li[ele+1] = li [ele]
                li[ele] = tmp
    return li
def bubble_sort(li):
    l = len(li)
    tmp = None
    sorted_l = sorted(li)
    while (li != sorted_l):
        for ele in range(0,l-1):
            if li[ele] > li[ele+1]:
                tmp = li[ele+1]
                li[ele+1] = li [ele]
                li[ele] = tmp
    return li

回答 15

def bubbleSort ( arr ):
    swapped = True 
    length = len ( arr )
    j = 0

    while swapped:
        swapped = False
        j += 1 
        for i in range ( length  - j ):
            if arr [ i ] > arr [ i + 1 ]:
                # swap
                tmp = arr [ i ]
                arr [ i ] = arr [ i + 1]
                arr [ i + 1 ] = tmp 

                swapped = True

if __name__ == '__main__':
    # test list
    a = [ 67, 45, 39, -1, -5, -44 ];

    print ( a )
    bubbleSort ( a )
    print ( a )
def bubbleSort ( arr ):
    swapped = True 
    length = len ( arr )
    j = 0

    while swapped:
        swapped = False
        j += 1 
        for i in range ( length  - j ):
            if arr [ i ] > arr [ i + 1 ]:
                # swap
                tmp = arr [ i ]
                arr [ i ] = arr [ i + 1]
                arr [ i + 1 ] = tmp 

                swapped = True

if __name__ == '__main__':
    # test list
    a = [ 67, 45, 39, -1, -5, -44 ];

    print ( a )
    bubbleSort ( a )
    print ( a )

回答 16

def bubblesort(array):
    for i in range(len(array)-1):
        for j in range(len(array)-1-i):
            if array[j] > array[j+1]:
                array[j], array[j+1] = array[j+1], array[j]
    return(array)

print(bubblesort([3,1,6,2,5,4]))
def bubblesort(array):
    for i in range(len(array)-1):
        for j in range(len(array)-1-i):
            if array[j] > array[j+1]:
                array[j], array[j+1] = array[j+1], array[j]
    return(array)

print(bubblesort([3,1,6,2,5,4]))

回答 17

我考虑添加我的解决方案,因为这里曾经有解决方案

  1. 更长的时间
  2. 更大的空间复杂度
  3. 或进行过多的操作

那应该是

所以,这是我的解决方案:


def countInversions(arr):
    count = 0
    n = len(arr)
    for i in range(n):
        _count = count
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                count += 1
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
        if _count == count:
            break
    return count

I consider adding my solution because ever solution here is having

  1. greater time
  2. greater space complexity
  3. or doing too much operations

then is should be

So, here is my solution:


def countInversions(arr):
    count = 0
    n = len(arr)
    for i in range(n):
        _count = count
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                count += 1
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
        if _count == count:
            break
    return count

回答 18

如果有人对使用列表理解的较短实现感兴趣:

def bubble_sort(lst: list) -> None:
    [swap_items(lst, i, i+1) for left in range(len(lst)-1, 0, -1) for i in range(left) if lst[i] > lst[i+1]]


def swap_items(lst: list, pos1: int, pos2: int) -> None:
    lst[pos1], lst[pos2] = lst[pos2], lst[pos1]

If anyone is interested in a shorter implementation using a list comprehension:

def bubble_sort(lst: list) -> None:
    [swap_items(lst, i, i+1) for left in range(len(lst)-1, 0, -1) for i in range(left) if lst[i] > lst[i+1]]


def swap_items(lst: list, pos1: int, pos2: int) -> None:
    lst[pos1], lst[pos2] = lst[pos2], lst[pos1]

回答 19

这是不带for循环的气泡排序的另一种形式。基本上,您正在考虑的lastIndexarray然后慢慢进行decrementing,直到它成为数组的第一个索引为止。

algorithm将继续通过这样的阵列移动,直到整个通而没有任何由swaps发生。

泡沫基本上Quadratic Time: O(n²)是关于性能的。

class BubbleSort: 
  def __init__(self, arr):
    self.arr = arr;

  def bubbleSort(self):
    count = 0;
    lastIndex = len(self.arr) - 1;
    
    while(count < lastIndex):
      if(self.arr[count] > self.arr[count + 1]):
        self.swap(count)  
      count = count + 1;

      if(count == lastIndex):
        count = 0;
        lastIndex = lastIndex - 1;   

  def swap(self, count):
    temp = self.arr[count];
    self.arr[count] = self.arr[count + 1];
    self.arr[count + 1] = temp;
    
arr = [9, 1, 5, 3, 8, 2]
p1 = BubbleSort(arr)

print(p1.bubbleSort())

Here is a different variation of bubble sort without for loop. Basically you are considering the lastIndex of the array and slowly decrementing it until it first index of the array.

The algorithm will continue to move through the array like this until an entire pass is made without any swaps occurring.

The bubble is sort is basically Quadratic Time: O(n²) when it comes to performance.

class BubbleSort: 
  def __init__(self, arr):
    self.arr = arr;

  def bubbleSort(self):
    count = 0;
    lastIndex = len(self.arr) - 1;
    
    while(count < lastIndex):
      if(self.arr[count] > self.arr[count + 1]):
        self.swap(count)  
      count = count + 1;

      if(count == lastIndex):
        count = 0;
        lastIndex = lastIndex - 1;   

  def swap(self, count):
    temp = self.arr[count];
    self.arr[count] = self.arr[count + 1];
    self.arr[count + 1] = temp;
    
arr = [9, 1, 5, 3, 8, 2]
p1 = BubbleSort(arr)

print(p1.bubbleSort())

回答 20

the-fury和Martin Cote提供的答案解决了无限循环的问题,但是我的代码仍然无法正常工作(对于较大的列表,它将无法正确排序)。我最终放弃了unsorted变量,而是使用了计数器。

def bubble(badList):
    length = len(badList) - 1
    n = 0
    while n < len(badList):
        for element in range(0,length):
            if badList[element] > badList[element + 1]:
                hold = badList[element + 1]
                badList[element + 1] = badList[element]
                badList[element] = hold
                n = 0
            else:
                n += 1
    return badList

if __name__ == '__main__':
    mylist = [90, 10, 2, 76, 17, 66, 57, 23, 57, 99]
    print bubble(mylist)

如果有人可以在注释中提供任何有关如何改进我的代码的指针,将不胜感激。

Answers provided by the-fury and Martin Cote fixed the problem of the infinite loop, but my code would still not work correctly (for a larger list, it would not sort correctly.). I ended up ditching the unsorted variable and used a counter instead.

def bubble(badList):
    length = len(badList) - 1
    n = 0
    while n < len(badList):
        for element in range(0,length):
            if badList[element] > badList[element + 1]:
                hold = badList[element + 1]
                badList[element + 1] = badList[element]
                badList[element] = hold
                n = 0
            else:
                n += 1
    return badList

if __name__ == '__main__':
    mylist = [90, 10, 2, 76, 17, 66, 57, 23, 57, 99]
    print bubble(mylist)

If anyone could provide any pointers on how to improve my code in the comments, it would be much appreciated.


回答 21

试试这个

a = int(input("Enter Limit"))


val = []

for z in range(0,a):
    b = int(input("Enter Number in List"))
    val.append(b)


for y in range(0,len(val)):
   for x in range(0,len(val)-1):
       if val[x]>val[x+1]:
           t = val[x]
           val[x] = val[x+1]
           val[x+1] = t

print(val)

Try this

a = int(input("Enter Limit"))


val = []

for z in range(0,a):
    b = int(input("Enter Number in List"))
    val.append(b)


for y in range(0,len(val)):
   for x in range(0,len(val)-1):
       if val[x]>val[x+1]:
           t = val[x]
           val[x] = val[x+1]
           val[x+1] = t

print(val)

回答 22

idk如果这可能会在9年后对您有所帮助…它是一个简单的气泡排序程序

    l=[1,6,3,7,5,9,8,2,4,10]

    for i in range(1,len(l)):
        for j in range (i+1,len(l)):
            if l[i]>l[j]:
                l[i],l[j]=l[j],l[i]

idk if this might help you after 9 years… its a simple bubble sort program

    l=[1,6,3,7,5,9,8,2,4,10]

    for i in range(1,len(l)):
        for j in range (i+1,len(l)):
            if l[i]>l[j]:
                l[i],l[j]=l[j],l[i]

回答 23

def merge_bubble(arr):
    k = len(arr)
    while k>2:
        for i in range(0,k-1):
            for j in range(0,k-1):
                if arr[j] > arr[j+1]:
                    arr[j],arr[j+1] = arr[j+1],arr[j]

        return arr
        break
    else:
        if arr[0] > arr[1]:
            arr[0],arr[1] = arr[1],arr[0]
        return arr 
def merge_bubble(arr):
    k = len(arr)
    while k>2:
        for i in range(0,k-1):
            for j in range(0,k-1):
                if arr[j] > arr[j+1]:
                    arr[j],arr[j+1] = arr[j+1],arr[j]

        return arr
        break
    else:
        if arr[0] > arr[1]:
            arr[0],arr[1] = arr[1],arr[0]
        return arr 

回答 24

def bubble_sort(l):
    for i in range(len(l) -1):
        for j in range(len(l)-i-1):
            if l[j] > l[j+1]:
                l[j],l[j+1] = l[j+1], l[j]
    return l
def bubble_sort(l):
    for i in range(len(l) -1):
        for j in range(len(l)-i-1):
            if l[j] > l[j+1]:
                l[j],l[j+1] = l[j+1], l[j]
    return l

回答 25

def bubble_sorted(arr:list):
    while True:
        for i in range(0,len(arr)-1):
            count = 0
            if arr[i] > arr[i+1]:
                count += 1
                arr[i], arr[i+1] = arr[i+1], arr[i]
        if count == 0:
            break
    return arr
arr = [30,20,80,40,50,10,60,70,90]
print(bubble_sorted(arr))
#[20, 30, 40, 50, 10, 60, 70, 80, 90]
def bubble_sorted(arr:list):
    while True:
        for i in range(0,len(arr)-1):
            count = 0
            if arr[i] > arr[i+1]:
                count += 1
                arr[i], arr[i+1] = arr[i+1], arr[i]
        if count == 0:
            break
    return arr
arr = [30,20,80,40,50,10,60,70,90]
print(bubble_sorted(arr))
#[20, 30, 40, 50, 10, 60, 70, 80, 90]

回答 26

def bubbleSort(a): def swap(x, y): temp = a[x] a[x] = a[y] a[y] = temp #outer loop for j in range(len(a)): #slicing to the center, inner loop, python style for i in range(j, len(a) - j):
#find the min index and swap if a[i] < a[j]: swap(j, i) #find the max index and swap if a[i] > a[len(a) - j - 1]: swap(len(a) - j - 1, i) return a

def bubbleSort(a): def swap(x, y): temp = a[x] a[x] = a[y] a[y] = temp #outer loop for j in range(len(a)): #slicing to the center, inner loop, python style for i in range(j, len(a) - j):
#find the min index and swap if a[i] < a[j]: swap(j, i) #find the max index and swap if a[i] > a[len(a) - j - 1]: swap(len(a) - j - 1, i) return a


python是否有排序列表?

问题:python是否有排序列表?

我所说的结构是:

  • x.push()操作复杂度O(log n)
  • O(log n)查找元素的复杂度
  • O(n)复杂度进行计算list(x)将被排序

我也有一个有关性能的相关问题list(...).insert(...),现在在这里

By which I mean a structure with:

  • O(log n) complexity for x.push() operations
  • O(log n) complexity to find an element
  • O(n) complexity to compute list(x) which will be sorted

I also had a related question about performance of list(...).insert(...) which is now here.


回答 0

标准Python列表不以任何形式排序。标准的heapq模块可用于将O(log n)追加到现有列表中,并删除O(log n)中最小的模块,但在定义中不是排序列表。

有许多符合您需求的Python平衡树实现,例如rbtreeRBTreepyavl

The standard Python list is not sorted in any form. The standard heapq module can be used to append in O(log n) to an existing list and remove the smallest one in O(log n), but isn’t a sorted list in your definition.

There are various implementations of balanced trees for Python that meet your requirements, e.g. rbtree, RBTree, or pyavl.


回答 1

您的big-O需求是否有特定原因?还是您只是想要它快?所述sortedcontainers模块是纯Python和快速(如在快速作为-C实现比如blist和rbtree)。

性能比较表明它与基准测试blist的排序列表类型更快或看齐。还要注意,rbtree,RBTree和PyAVL提供排序的dict和set类型,但没有排序的列表类型。

如果需要性能,请始终记住要进行基准测试。在它还显示基准比较之前,应该怀疑使用Big-O表示法证明快速的模块。

免责声明:我是Python sortedcontainers模块的作者。


安装:

pip install sortedcontainers

用法:

>>> from sortedcontainers import SortedList
>>> l = SortedList()
>>> l.update([0, 4, 1, 3, 2])
>>> l.index(3)
3
>>> l.add(5)
>>> l[-1]
5

Is there a particular reason for your big-O requirements? Or do you just want it to be fast? The sortedcontainers module is pure-Python and fast (as in fast-as-C implementations like blist and rbtree).

The performance comparison shows it benchmarks faster or on par with blist’s sorted list type. Note also that rbtree, RBTree, and PyAVL provide sorted dict and set types but don’t have a sorted list type.

If performance is a requirement, always remember to benchmark. A module that substantiates the claim of being fast with Big-O notation should be suspect until it also shows benchmark comparisons.

Disclaimer: I am the author of the Python sortedcontainers module.


Installation:

pip install sortedcontainers

Usage:

>>> from sortedcontainers import SortedList
>>> l = SortedList()
>>> l.update([0, 4, 1, 3, 2])
>>> l.index(3)
3
>>> l.add(5)
>>> l[-1]
5

回答 2

尽管我仍然从未检查过基本Python列表操作的“大O”速度,但bisect在这种情况下,标准模块可能也值得一提:

import bisect
L = [0, 100]

bisect.insort(L, 50)
bisect.insort(L, 20)
bisect.insort(L, 21)

print L
## [0, 20, 21, 50, 100]

i = bisect.bisect(L, 20)
print L[i-1], L[i]
## 20, 21

PS。啊,对不起,bisect在提到的问题中被提及。不过,我认为,如果此信息在这里,不会有太大危害)

PPS。而CPython的名单实际上是数组(不是,比方说,skiplists或等)。好吧,我想它们一定很简单,但就我而言,这个名称有点误导。


因此,如果我没记错的话,平分/列表速度可能是:

  • 对于push():在最坏的情况下为O(n);
  • 搜索:如果我们认为数组索引的速度为O(1),则搜索应为O(log(n))操作;
  • 用于创建列表:O(n)应该是列表复制的速度,否则为同一列表的O(1))

更新。在评论中进行讨论之后,让我在这里链接这些SO问题:如何实现Python的列表以及什么是Python列表函数的运行时复杂性

Though I have still never checked the “big O” speeds of basic Python list operations, the bisect standard module is probably also worth mentioning in this context:

import bisect
L = [0, 100]

bisect.insort(L, 50)
bisect.insort(L, 20)
bisect.insort(L, 21)

print L
## [0, 20, 21, 50, 100]

i = bisect.bisect(L, 20)
print L[i-1], L[i]
## 20, 21

PS. Ah, sorry, bisect is mentioned in the referenced question. Still, I think it won’t be much harm if this information will be here )

PPS. And CPython lists are actually arrays (not, say, skiplists or etc) . Well, I guess they have to be something simple, but as for me, the name is a little bit misleading.


So, if I am not mistaken, the bisect/list speeds would probably be:

  • for a push(): O(n) for the worst case ;
  • for a search: if we consider the speed of array indexing to be O(1), search should be an O(log(n)) operation ;
  • for the list creation: O(n) should be the speed of the list copying, otherwise it’s O(1) for the same list )

Upd. Following a discussion in the comments, let me link here these SO questions: How is Python’s List Implemented and What is the runtime complexity of python list functions


回答 3

import bisect

class sortedlist(list):
    '''just a list but with an insort (insert into sorted position)'''
    def insort(self, x):
        bisect.insort(self, x)
import bisect

class sortedlist(list):
    '''just a list but with an insort (insert into sorted position)'''
    def insort(self, x):
        bisect.insort(self, x)

回答 4

尽管(尚未)提供自定义搜索功能,但该heapq模块可能适合您的需求。它使用常规列表实现堆队列。您必须编写自己的有效成员资格测试,该测试利用队列的内部结构(可以在O(log n)中完成,我想说…)。有一个缺点:提取排序列表的复杂度为O(n log n)

Though it does not (yet) provide a custom search function, the heapq module may suit your needs. It implements a heap queue using a regular list. You’d have to write your own efficient membership test that makes use of the queue’s internal structure (that can be done in O(log n), I’d say…). There is one downside: extracting a sorted list has complexity O(n log n).


回答 5

我会使用biscectsortedcontainers模块。我确实没有经验,但是我认为该heapq模块有效。它包含一个Heap Queue

I would use the biscect or sortedcontainers modules. I don’t really am experienced, but I think the heapq module works. It contains a Heap Queue


回答 6

在Python上实现您自己的排序列表可能并不困难。以下是概念证明:

import bisect

class sortlist:
    def __init__(self, list):
        self.list = list
        self.sort()
    def sort(self):
        l = []
        for i in range(len(self.list)):
            bisect.insort(l, self.list[i])
        self.list = l
        self.len = i
    def insert(self, value):
        bisect.insort(self.list, value)
        self.len += 1
    def show(self):
        print self.list
    def search(self,value):
        left = bisect.bisect_left(self.list, value)
        if abs(self.list[min([left,self.len-1])] - value) >= abs(self.list[left-1] - value):
            return self.list[left-1]
        else:
            return self.list[left]

list = [101, 3, 10, 14, 23, 86, 44, 45, 45, 50, 66, 95, 17, 77, 79, 84, 85, 91, 73]
slist = sortlist(list)
slist.show()
slist.insert(99)
slist.show()
print slist.search(100000000)
print slist.search(0)
print slist.search(56.7)

=========结果============

[3、10、14、17、23、44、45、45、50、66、73、77、79、84、85、86、91、95、101]

[3、10、14、17、23、44、45、45、50、66、73、77、79、84、85、86、91、95、99、101]

101

3

50

It may not be hard to implement your own sortlist on Python. Below is a proof of concept:

import bisect

class sortlist:
    def __init__(self, list):
        self.list = list
        self.sort()
    def sort(self):
        l = []
        for i in range(len(self.list)):
            bisect.insort(l, self.list[i])
        self.list = l
        self.len = i
    def insert(self, value):
        bisect.insort(self.list, value)
        self.len += 1
    def show(self):
        print self.list
    def search(self,value):
        left = bisect.bisect_left(self.list, value)
        if abs(self.list[min([left,self.len-1])] - value) >= abs(self.list[left-1] - value):
            return self.list[left-1]
        else:
            return self.list[left]

list = [101, 3, 10, 14, 23, 86, 44, 45, 45, 50, 66, 95, 17, 77, 79, 84, 85, 91, 73]
slist = sortlist(list)
slist.show()
slist.insert(99)
slist.show()
print slist.search(100000000)
print slist.search(0)
print slist.search(56.7)

========= Results ============

[3, 10, 14, 17, 23, 44, 45, 45, 50, 66, 73, 77, 79, 84, 85, 86, 91, 95, 101]

[3, 10, 14, 17, 23, 44, 45, 45, 50, 66, 73, 77, 79, 84, 85, 86, 91, 95, 99, 101]

101

3

50


如何对字符串列表进行数字排序?

问题:如何对字符串列表进行数字排序?

我知道这听起来微不足道,但是我没有意识到sort()Python 的功能很奇怪。我有一个实际上是字符串形式的“数字”列表,因此我首先将它们转换为整数,然后尝试进行排序。

list1=["1","10","3","22","23","4","2","200"]
for item in list1:
    item=int(item)

list1.sort()
print list1

给我:

['1', '10', '2', '200', '22', '23', '3', '4']

我想要的是

['1','2','3','4','10','22','23','200']

我四处寻找与排序数字集相关的算法,但是我发现所有算法都涉及对字母数字集进行排序。

我知道这可能是个没有脑子的问题,但是google和我的教科书没有提供比该.sort()功能有用的功能。

I know that this sounds trivial but I did not realize that the sort() function of Python was weird. I have a list of “numbers” that are actually in string form, so I first convert them to ints, then attempt a sort.

list1=["1","10","3","22","23","4","2","200"]
for item in list1:
    item=int(item)

list1.sort()
print list1

Gives me:

['1', '10', '2', '200', '22', '23', '3', '4']

What I want is

['1','2','3','4','10','22','23','200']

I’ve looked around for some of the algorithms associated with sorting numeric sets, but the ones I found all involve sorting alphanumeric sets.

I know this is probably a no brainer problem but google and my textbook don’t offer anything more or less useful than the .sort() function.


回答 0

您实际上尚未将字符串转换为int。或更确切地说,您做了,但是随后您对结果什么也没做。您想要的是:

list1 = ["1","10","3","22","23","4","2","200"]
list1 = [int(x) for x in list1]
list1.sort()

如果由于某种原因需要保留字符串而不是整数(通常是一个坏主意,但是可能需要保留前导零或其他东西),则可以使用函数。sort接受一个命名参数,key该参数是在比较每个元素之前对其进行调用的函数。比较键函数的返回值,而不是直接比较列表元素:

list1 = ["1","10","3","22","23","4","2","200"]
# call int(x) on each element before comparing it
list1.sort(key=int)

You haven’t actually converted your strings to ints. Or rather, you did, but then you didn’t do anything with the results. What you want is:

list1 = ["1","10","3","22","23","4","2","200"]
list1 = [int(x) for x in list1]
list1.sort()

If for some reason you need to keep strings instead of ints (usually a bad idea, but maybe you need to preserve leading zeros or something), you can use a key function. sort takes a named parameter, key, which is a function that is called on each element before it is compared. The key function’s return values are compared instead of comparing the list elements directly:

list1 = ["1","10","3","22","23","4","2","200"]
# call int(x) on each element before comparing it
list1.sort(key=int)

回答 1

你可以传递一个函数的key参数.sort方法。这样,系统将按key(x)而不是x进行排序。

list1.sort(key=int)

顺便说一句,到列表中,以永久的整数转换,使用map功能

list1 = list(map(int, list1))   # you don't need to call list() in Python 2.x

或列表理解

list1 = [int(x) for x in list1]

You could pass a function to the key parameter to the .sort method. With this, the system will sort by key(x) instead of x.

list1.sort(key=int)

BTW, to convert the list to integers permanently, use the map function

list1 = list(map(int, list1))   # you don't need to call list() in Python 2.x

or list comprehension

list1 = [int(x) for x in list1]

回答 2

如果要使用sorted()功能:sorted(list1, key=int)

它返回一个新的排序列表。

In case you want to use sorted() function: sorted(list1, key=int)

It returns a new sorted list.


回答 3

Python的排序并不奇怪。只是这段代码:

for item in list1:
   item=int(item)

没有按照您的想法去做- item不会被替换回列表中,只是被丢弃了。

无论如何,正确的解决方案是使用key=int其他人向您展示的方法。

Python’s sort isn’t weird. It’s just that this code:

for item in list1:
   item=int(item)

isn’t doing what you think it is – item is not replaced back into the list, it is simply thrown away.

Anyway, the correct solution is to use key=int as others have shown you.


回答 4

您还可以使用:

import re

def sort_human(l):
    convert = lambda text: float(text) if text.isdigit() else text
    alphanum = lambda key: [convert(c) for c in re.split('([-+]?[0-9]*\.?[0-9]*)', key)]
    l.sort(key=alphanum)
    return l

这与您可以在互联网上找到的其他内容非常相似,但也适用于字母数字[abc0.1, abc0.2, ...]

You can also use:

import re

def sort_human(l):
    convert = lambda text: float(text) if text.isdigit() else text
    alphanum = lambda key: [convert(c) for c in re.split('([-+]?[0-9]*\.?[0-9]*)', key)]
    l.sort(key=alphanum)
    return l

This is very similar to other stuff that you can find on the internet but also works for alphanumericals like [abc0.1, abc0.2, ...].


回答 5

Seamus Campbell的答案不适用于python2.x。
list1 = sorted(list1, key=lambda e: int(e))使用lambda功能效果很好。

Seamus Campbell‘s answer doesnot work on python2.x.
list1 = sorted(list1, key=lambda e: int(e)) using lambda function works well.


回答 6

昨天我也遇到了同样的问题,并找到了一个名为[natsort] [1]的模块,它可以解决您的问题。用:

from natsort import natsorted # pip install natsort

# Example list of strings
a = ['1', '10', '2', '3', '11']

[In]  sorted(a)
[Out] ['1', '10', '11', '2', '3']

[In]  natsorted(a)
[Out] ['1', '2', '3', '10', '11']

# Your array may contain strings
[In]  natsorted(['string11', 'string3', 'string1', 'string10', 'string100'])
[Out] ['string1', 'string3', 'string10', 'string11', 'string100']

它也适用于字典sorted。[1]:https//pypi.org/project/natsort/

I approached the same problem yesterday and found a module called [natsort][1], which solves your problem. Use:

from natsort import natsorted # pip install natsort

# Example list of strings
a = ['1', '10', '2', '3', '11']

[In]  sorted(a)
[Out] ['1', '10', '11', '2', '3']

[In]  natsorted(a)
[Out] ['1', '2', '3', '10', '11']

# Your array may contain strings
[In]  natsorted(['string11', 'string3', 'string1', 'string10', 'string100'])
[Out] ['string1', 'string3', 'string10', 'string11', 'string100']

It also works for dictionaries as an equivalent of sorted. [1]: https://pypi.org/project/natsort/


回答 7

尝试此操作,它将按降序对列表进行排序(在这种情况下,无需指定键):

处理

listB = [24, 13, -15, -36, 8, 22, 48, 25, 46, -9]
listC = sorted(listB, reverse=True) # listB remains untouched
print listC

输出:

 [48, 46, 25, 24, 22, 13, 8, -9, -15, -36]

Try this, it’ll sort the list in-place in descending order (there’s no need to specify a key in this case):

Process

listB = [24, 13, -15, -36, 8, 22, 48, 25, 46, -9]
listC = sorted(listB, reverse=True) # listB remains untouched
print listC

output:

 [48, 46, 25, 24, 22, 13, 8, -9, -15, -36]

回答 8

最新的解决方案是正确的。您正在以字符串形式读取解决方案,在这种情况下,顺序为1、100、104、2、21、20010010010、3,依此类推。

您必须将输入的内容转换为int:

排序的字符串:

stringList = (1, 10, 2, 21, 3)

排序的整数:

intList = (1, 2, 3, 10, 21)

要进行转换,只需将stringList放入int(blahblah)中。

再次:

stringList = (1, 10, 2, 21, 3)

newList = int (stringList)

print newList

=> returns (1, 2, 3, 10, 21) 

The most recent solution is right. You are reading solutions as a string, in which case the order is 1, then 100, then 104 followed by 2 then 21, then 2001001010, 3 and so forth.

You have to CAST your input as an int instead:

sorted strings:

stringList = (1, 10, 2, 21, 3)

sorted ints:

intList = (1, 2, 3, 10, 21)

To cast, just put the stringList inside int ( blahblah ).

Again:

stringList = (1, 10, 2, 21, 3)

newList = int (stringList)

print newList

=> returns (1, 2, 3, 10, 21) 

回答 9

如果您想使用数字字符串更好地采用我的代码中所示的另一个列表,它将很好地工作。

list1=["1","10","3","22","23","4","2","200"]

k=[]    
for item in list1:    
    k.append(int(item))

k.sort()
print(k)
# [1, 2, 3, 4, 10, 22, 23, 200]

If you want to use strings of the numbers better take another list as shown in my code it will work fine.

list1=["1","10","3","22","23","4","2","200"]

k=[]    
for item in list1:    
    k.append(int(item))

k.sort()
print(k)
# [1, 2, 3, 4, 10, 22, 23, 200]

回答 10

排序数字列表的简单方法

numlists = ["5","50","7","51","87","97","53"]
results = list(map(int, numlists))
results.sort(reverse=False)
print(results)

Simple way to sort a numerical list

numlists = ["5","50","7","51","87","97","53"]
results = list(map(int, numlists))
results.sort(reverse=False)
print(results)

回答 11

真正的问题是按字母数字排序。因此,如果您有一个列表[‘1’,’2’,’10’,’19’]并进行排序,则您会得到[‘1’,’10’。’19’,’2’]。即10排在2之前,因为它着眼于第一个字符并以此排序。似乎python中的大多数方法都按此顺序返回事物。例如,如果您有一个名为abc的目录,且文件标记为1.jpg,2.jpg等,最多说15.jpg,并且您执行file_list = os.listdir(abc),则file_list的顺序不是您期望的那样,而是file_list = [‘1.jpg’,’11 .jpg’—’15.jpg’,’2.jpg]。如果处理文件的顺序很重要(大概是您用数字命名它们的原因),那么该顺序就不是您认为的那样。您可以通过使用“零”填充来避免这种情况。例如,如果您有一个列表alist = [’01’,’03’,’05’,’10’,’02’,’04’,’06],然后对其进行排序,则会得到所需的顺序。alist = [’01’,’02’等],因为第一个字符是1之前的0。您需要填充的零个数由列表中的最大值确定。例如,如果最大的值介于100和1000,您需要填充001、002 — 010,011–100、101等个位数。

real problem is that sort sorts things alphanumerically. So if you have a list [‘1’, ‘2’, ’10’, ’19’] and run sort you get [‘1′, ’10’. ’19’, ‘2’]. ie 10 comes before 2 because it looks at the first character and sorts starting from that. It seems most methods in python return things in that order. For example if you have a directory named abc with the files labelled as 1.jpg, 2.jpg etc say up to 15.jpg and you do file_list=os.listdir(abc) the file_list is not ordered as you expect but rather as file_list=[‘1.jpg’, ’11.jpg’—’15.jpg’, ‘2.jpg]. If the order in which files are processed is important (presumably that’s why you named them numerically) the order is not what you think it will be. You can avoid this by using “zeros” padding. For example if you have a list alist=[’01’, ’03’, ’05’, ’10’, ’02’,’04’, ’06] and you run sort on it you get the order you wanted. alist=[’01’, ’02’ etc] because the first character is 0 which comes before 1. The amount of zeros padding you need is determined by the largest value in the list.For example if the largest is say between 100 and 1000 you need to pad single digits as 001, 002 —010,011–100, 101 etc.


回答 12

scores = ['91','89','87','86','85']
scores.sort()
print (scores)

这在python版本3中对我有用,尽管在版本2中没有。

scores = ['91','89','87','86','85']
scores.sort()
print (scores)

This worked for me using python version 3, though it didn’t in version 2.


使用其构造函数初始化OrderedDict的正确方法,使其保留初始数据的顺序?

问题:使用其构造函数初始化OrderedDict的正确方法,使其保留初始数据的顺序?

初始化有序词典(OD)以便保留初始数据顺序的正确方法是什么?

from collections import OrderedDict

# Obviously wrong because regular dict loses order
d = OrderedDict({'b':2, 'a':1}) 

# An OD is represented by a list of tuples, so would this work?
d = OrderedDict([('b',2), ('a', 1)])

# What about using a list comprehension, will 'd' preserve the order of 'l'
l = ['b', 'a', 'c', 'aa']
d = OrderedDict([(i,i) for i in l])

题:

  • OrderedDict在初始化时是否会保留元组列表的顺序,元组的元组或列表的元组或列表的列表等的顺序(上述第二和第三示例)?

  • 如何验证是否OrderedDict实际维持订单?由于a dict具有不可预测的顺序,如果我的测试向量幸运地具有与dict不可预测的顺序相同的初始顺序,该怎么办?例如,如果不是d = OrderedDict({'b':2, 'a':1})我写d = OrderedDict({'a':1, 'b':2}),我可能会错误地得出结论认为该顺序已保留。在这种情况下,我发现a dict是按字母顺序排列的,但这可能并不总是正确的。什么是使用反例来验证数据结构是否保留顺序的可靠方法,而无需反复尝试测试向量,直到一个中断为止?

PS:我将在此留出参考:“ OrderedDict构造函数和update()方法都接受关键字参数,但是它们的顺序丢失了,因为Python的函数使用常规无序字典来调用语义传递关键字参数”

PPS:希望将来,OrderedDict也将保留kwarg的顺序(示例1):http : //bugs.python.org/issue16991

What’s the correct way to initialize an ordered dictionary (OD) so that it retains the order of initial data?

from collections import OrderedDict

# Obviously wrong because regular dict loses order
d = OrderedDict({'b':2, 'a':1}) 

# An OD is represented by a list of tuples, so would this work?
d = OrderedDict([('b',2), ('a', 1)])

# What about using a list comprehension, will 'd' preserve the order of 'l'
l = ['b', 'a', 'c', 'aa']
d = OrderedDict([(i,i) for i in l])

Question:

  • Will an OrderedDict preserve the order of a list of tuples, or tuple of tuples or tuple of lists or list of lists etc. passed at the time of initialization (2nd & 3rd example above)?

  • How does one go about verifying if OrderedDict actually maintains an order? Since a dict has an unpredictable order, what if my test vectors luckily have the same initial order as the unpredictable order of a dict? For example, if instead of d = OrderedDict({'b':2, 'a':1}) I write d = OrderedDict({'a':1, 'b':2}), I can wrongly conclude that the order is preserved. In this case, I found out that a dict is ordered alphabetically, but that may not be always true. What’s a reliable way to use a counterexample to verify whether a data structure preserves order or not, short of trying test vectors repeatedly until one breaks?

P.S. I’ll just leave this here for reference: “The OrderedDict constructor and update() method both accept keyword arguments, but their order is lost because Python’s function call semantics pass-in keyword arguments using a regular unordered dictionary”

P.P.S : Hopefully, in future, OrderedDict will preserve the order of kwargs also (example 1): http://bugs.python.org/issue16991


回答 0

OrderedDict将保留其有权访问的任何订单。将有序数据传递给它进行初始化的唯一方法是传递键值对的列表(或更普遍地讲,是可迭代的),如最后两个示例中所示。正如您链接的文档所述,当您传入关键字参数或dict参数时,OrderedDict无法访问任何顺序,因为其中的任何顺序都在OrderedDict构造函数看到之前被删除。

请注意,在上一个示例中使用列表推导并没有什么改变。OrderedDict([(i,i) for i in l])和之间没有区别OrderedDict([('b', 'b'), ('a', 'a'), ('c', 'c'), ('aa', 'aa')])。评估列表理解并创建列表,并将其传入;OrderedDict对它的创建方式一无所知。

The OrderedDict will preserve any order that it has access to. The only way to pass ordered data to it to initialize is to pass a list (or, more generally, an iterable) of key-value pairs, as in your last two examples. As the documentation you linked to says, the OrderedDict does not have access to any order when you pass in keyword arguments or a dict argument, since any order there is removed before the OrderedDict constructor sees it.

Note that using a list comprehension in your last example doesn’t change anything. There’s no difference between OrderedDict([(i,i) for i in l]) and OrderedDict([('b', 'b'), ('a', 'a'), ('c', 'c'), ('aa', 'aa')]). The list comprehension is evaluated and creates the list and it is passed in; OrderedDict knows nothing about how it was created.


回答 1

# An OD is represented by a list of tuples, so would this work?
d = OrderedDict([('b', 2), ('a', 1)])

是的,那行得通。根据定义,列表总是按照其表示方式进行排序。这也适用于列表理解,生成的列表的提供方式与提供数据的方式相同(即,来自列表的来源将是确定性的,来源于setdict不那么多)。

如何验证是否OrderedDict实际维持订单。由于字典具有不可预测的顺序,如果我的测试向量幸运地具有与字典的不可预测顺序相同的初始顺序,该怎么办?例如,如果不是d = OrderedDict({'b':2, 'a':1})我写d = OrderedDict({'a':1, 'b':2}),我可能会错误地得出结论认为该顺序已保留。在这种情况下,我发现a dict是按字母顺序排列的,但这可能并不总是正确的。也就是说,使用反例来验证数据结构是否保留顺序还是一种可靠的方法是一种可靠的方法,可以反复尝试测试向量,直到一个中断。

您保留2元组的源列表作为参考,并在进行单元测试时将其用作测试用例的测试数据。遍历它们并确保维持订单。

# An OD is represented by a list of tuples, so would this work?
d = OrderedDict([('b', 2), ('a', 1)])

Yes, that will work. By definition, a list is always ordered the way it is represented. This goes for list-comprehension too, the list generated is in the same way the data was provided (i.e. source from a list it will be deterministic, sourced from a set or dict not so much).

How does one go about verifying if OrderedDict actually maintains an order. Since a dict has an unpredictable order, what if my test vectors luckily has the same initial order as the unpredictable order of a dict?. For example, if instead of d = OrderedDict({'b':2, 'a':1}) I write d = OrderedDict({'a':1, 'b':2}), I can wrongly conclude that the order is preserved. In this case, I found out that a dict is order alphabetically, but that may not be always true. i.e. what’s a reliable way to use a counter example to verify if a data structure preserves order or not short of trying test vectors repeatedly until one breaks.

You keep your source list of 2-tuple around for reference, and use that as your test data for your test cases when you do unit tests. Iterate through them and ensure the order is maintained.


有效地对numpy数组进行降序排序?

问题:有效地对numpy数组进行降序排序?

令我惊讶的是,之前没有提出过这个具体问题,但是我真的没有在SO或文档中找到它np.sort

假设我有一个包含整数的随机numpy数组,例如:

> temp = np.random.randint(1,10, 10)    
> temp
array([2, 4, 7, 4, 2, 2, 7, 6, 4, 4])

如果对它进行排序,则默认情况下我将获得升序:

> np.sort(temp)
array([2, 2, 2, 4, 4, 4, 4, 6, 7, 7])

但我希望解决方案按降序排序。

现在,我知道我可以永远做:

reverse_order = np.sort(temp)[::-1]

但这最后的陈述有效吗?它不是按升序创建副本,然后反转此副本以反转顺序获得结果吗?如果确实如此,是否有有效的选择?看起来好像不np.sort接受参数来更改排序操作中的比较符号以使事情相反。

I am surprised this specific question hasn’t been asked before, but I really didn’t find it on SO nor on the documentation of np.sort.

Say I have a random numpy array holding integers, e.g:

> temp = np.random.randint(1,10, 10)    
> temp
array([2, 4, 7, 4, 2, 2, 7, 6, 4, 4])

If I sort it, I get ascending order by default:

> np.sort(temp)
array([2, 2, 2, 4, 4, 4, 4, 6, 7, 7])

but I want the solution to be sorted in descending order.

Now, I know I can always do:

reverse_order = np.sort(temp)[::-1]

but is this last statement efficient? Doesn’t it create a copy in ascending order, and then reverses this copy to get the result in reversed order? If this is indeed the case, is there an efficient alternative? It doesn’t look like np.sort accepts parameters to change the sign of the comparisons in the sort operation to get things in reverse order.


回答 0

temp[::-1].sort()对数组进行排序,然后np.sort(temp)[::-1]创建一个新数组。

In [25]: temp = np.random.randint(1,10, 10)

In [26]: temp
Out[26]: array([5, 2, 7, 4, 4, 2, 8, 6, 4, 4])

In [27]: id(temp)
Out[27]: 139962713524944

In [28]: temp[::-1].sort()

In [29]: temp
Out[29]: array([8, 7, 6, 5, 4, 4, 4, 4, 2, 2])

In [30]: id(temp)
Out[30]: 139962713524944

temp[::-1].sort() sorts the array in place, whereas np.sort(temp)[::-1] creates a new array.

In [25]: temp = np.random.randint(1,10, 10)

In [26]: temp
Out[26]: array([5, 2, 7, 4, 4, 2, 8, 6, 4, 4])

In [27]: id(temp)
Out[27]: 139962713524944

In [28]: temp[::-1].sort()

In [29]: temp
Out[29]: array([8, 7, 6, 5, 4, 4, 4, 4, 2, 2])

In [30]: id(temp)
Out[30]: 139962713524944

回答 1

>>> a=np.array([5, 2, 7, 4, 4, 2, 8, 6, 4, 4])

>>> np.sort(a)
array([2, 2, 4, 4, 4, 4, 5, 6, 7, 8])

>>> -np.sort(-a)
array([8, 7, 6, 5, 4, 4, 4, 4, 2, 2])
>>> a=np.array([5, 2, 7, 4, 4, 2, 8, 6, 4, 4])

>>> np.sort(a)
array([2, 2, 4, 4, 4, 4, 5, 6, 7, 8])

>>> -np.sort(-a)
array([8, 7, 6, 5, 4, 4, 4, 4, 2, 2])

回答 2

对于短数组,我建议np.argsort()通过查找已排序的否定数组的索引来使用,这比反转已排序的数组要快一些:

In [37]: temp = np.random.randint(1,10, 10)

In [38]: %timeit np.sort(temp)[::-1]
100000 loops, best of 3: 4.65 µs per loop

In [39]: %timeit temp[np.argsort(-temp)]
100000 loops, best of 3: 3.91 µs per loop

For short arrays I suggest using np.argsort() by finding the indices of the sorted negatived array, which is slightly faster than reversing the sorted array:

In [37]: temp = np.random.randint(1,10, 10)

In [38]: %timeit np.sort(temp)[::-1]
100000 loops, best of 3: 4.65 µs per loop

In [39]: %timeit temp[np.argsort(-temp)]
100000 loops, best of 3: 3.91 µs per loop

回答 3

不幸的是,当您有一个复杂的数组时,只能np.sort(temp)[::-1]正常工作。这里提到的其他两种方法无效。

Unfortunately when you have a complex array, only np.sort(temp)[::-1] works properly. The two other methods mentioned here are not effective.


回答 4

注意尺寸。

x  # initial numpy array
I = np.argsort(x) or I = x.argsort() 
y = np.sort(x)    or y = x.sort()
z  # reverse sorted array

全反转

z = x[-I]
z = -np.sort(-x)
z = np.flip(y)
  • flip更改1.15需要以前的版本。解决方案:。1.14 axispip install --upgrade numpy

第一维反转

z = y[::-1]
z = np.flipud(y)
z = np.flip(y, axis=0)

逆向二维

z = y[::-1, :]
z = np.fliplr(y)
z = np.flip(y, axis=1)

测试中

在100×10×10阵列上测试1000次。

Method       | Time (ms)
-------------+----------
y[::-1]      | 0.126659  # only in first dimension
-np.sort(-x) | 0.133152
np.flip(y)   | 0.121711
x[-I]        | 4.611778

x.sort()     | 0.024961
x.argsort()  | 0.041830
np.flip(x)   | 0.002026

这主要是由于重新索引而不是argsort

# Timing code
import time
import numpy as np


def timeit(fun, xs):
    t = time.time()
    for i in range(len(xs)):  # inline and map gave much worse results for x[-I], 5*t
        fun(xs[i])
    t = time.time() - t
    print(np.round(t,6))

I, N = 1000, (100, 10, 10)
xs = np.random.rand(I,*N)
timeit(lambda x: np.sort(x)[::-1], xs)
timeit(lambda x: -np.sort(-x), xs)
timeit(lambda x: np.flip(x.sort()), xs)
timeit(lambda x: x[-x.argsort()], xs)
timeit(lambda x: x.sort(), xs)
timeit(lambda x: x.argsort(), xs)
timeit(lambda x: np.flip(x), xs)

Be careful with dimensions.

Let

x  # initial numpy array
I = np.argsort(x) or I = x.argsort() 
y = np.sort(x)    or y = x.sort()
z  # reverse sorted array

Full Reverse

z = x[I[::-1]]
z = -np.sort(-x)
z = np.flip(y)
  • flip changed in 1.15, previous versions 1.14 required axis. Solution: pip install --upgrade numpy.

First Dimension Reversed

z = y[::-1]
z = np.flipud(y)
z = np.flip(y, axis=0)

Second Dimension Reversed

z = y[::-1, :]
z = np.fliplr(y)
z = np.flip(y, axis=1)

Testing

Testing on a 100×10×10 array 1000 times.

Method       | Time (ms)
-------------+----------
y[::-1]      | 0.126659  # only in first dimension
-np.sort(-x) | 0.133152
np.flip(y)   | 0.121711
x[I[::-1]]   | 4.611778

x.sort()     | 0.024961
x.argsort()  | 0.041830
np.flip(x)   | 0.002026

This is mainly due to reindexing rather than argsort.

# Timing code
import time
import numpy as np


def timeit(fun, xs):
    t = time.time()
    for i in range(len(xs)):  # inline and map gave much worse results for x[-I], 5*t
        fun(xs[i])
    t = time.time() - t
    print(np.round(t,6))

I, N = 1000, (100, 10, 10)
xs = np.random.rand(I,*N)
timeit(lambda x: np.sort(x)[::-1], xs)
timeit(lambda x: -np.sort(-x), xs)
timeit(lambda x: np.flip(x.sort()), xs)
timeit(lambda x: x[x.argsort()[::-1]], xs)
timeit(lambda x: x.sort(), xs)
timeit(lambda x: x.argsort(), xs)
timeit(lambda x: np.flip(x), xs)

回答 5

您好,我在寻找一种对二维numpy数组进行反向排序的解决方案,但找不到任何有效的方法,但是我想我偶然发现了一个我上载的解决方案,以防万一有人在同一条船上。

x=np.sort(array)
y=np.fliplr(x)

np.sort对升序进行排序,这不是您想要的,但是命令fliplr将行从左向右翻转!似乎可以工作!

希望它可以帮助您!

我猜这与上面关于-np.sort(-a)的建议相似,但是由于评论它并不总是有效而推迟了我的建议。也许我的解决方案也不总是可行,但是我已经用几个阵列对其进行了测试,似乎还可以。

Hello I was searching for a solution to reverse sorting a two dimensional numpy array, and I couldn’t find anything that worked, but I think I have stumbled on a solution which I am uploading just in case anyone is in the same boat.

x=np.sort(array)
y=np.fliplr(x)

np.sort sorts ascending which is not what you want, but the command fliplr flips the rows left to right! Seems to work!

Hope it helps you out!

I guess it’s similar to the suggest about -np.sort(-a) above but I was put off going for that by comment that it doesn’t always work. Perhaps my solution won’t always work either however I have tested it with a few arrays and seems to be OK.


回答 6

您可以先对数组进行排序(默认为升序),然后应用np.flip()https://docs.scipy.org/doc/numpy/reference/generated/numpy.flip.html

仅供参考,它也适用于日期时间对象。

例:

    x = np.array([2,3,1,0]) 
    x_sort_asc=np.sort(x) 
    print(x_sort_asc)

    >>> array([0, 1, 2, 3])

    x_sort_desc=np.flip(x_sort_asc) 
    print(x_sort_desc)

    >>> array([3,2,1,0])

You could sort the array first (Ascending by default) and then apply np.flip() (https://docs.scipy.org/doc/numpy/reference/generated/numpy.flip.html)

FYI It works with datetime objects as well.

Example:

    x = np.array([2,3,1,0]) 
    x_sort_asc=np.sort(x) 
    print(x_sort_asc)

    >>> array([0, 1, 2, 3])

    x_sort_desc=np.flip(x_sort_asc) 
    print(x_sort_desc)

    >>> array([3,2,1,0])

回答 7

这是一个快速窍门

In[3]: import numpy as np
In[4]: temp = np.random.randint(1,10, 10)
In[5]: temp
Out[5]: array([5, 4, 2, 9, 2, 3, 4, 7, 5, 8])

In[6]: sorted = np.sort(temp)
In[7]: rsorted = list(reversed(sorted))
In[8]: sorted
Out[8]: array([2, 2, 3, 4, 4, 5, 5, 7, 8, 9])

In[9]: rsorted
Out[9]: [9, 8, 7, 5, 5, 4, 4, 3, 2, 2]

Here is a quick trick

In[3]: import numpy as np
In[4]: temp = np.random.randint(1,10, 10)
In[5]: temp
Out[5]: array([5, 4, 2, 9, 2, 3, 4, 7, 5, 8])

In[6]: sorted = np.sort(temp)
In[7]: rsorted = list(reversed(sorted))
In[8]: sorted
Out[8]: array([2, 2, 3, 4, 4, 5, 5, 7, 8, 9])

In[9]: rsorted
Out[9]: [9, 8, 7, 5, 5, 4, 4, 3, 2, 2]

回答 8

我建议使用这个…

np.arange(start_index, end_index, intervals)[::-1]

例如:

np.arange(10, 20, 0.5)
np.arange(10, 20, 0.5)[::-1]

然后您的恢复:

[ 19.5,  19. ,  18.5,  18. ,  17.5,  17. ,  16.5,  16. ,  15.5,
    15. ,  14.5,  14. ,  13.5,  13. ,  12.5,  12. ,  11.5,  11. ,
    10.5,  10. ]

i suggest using this …

np.arange(start_index, end_index, intervals)[::-1]

for example:

np.arange(10, 20, 0.5)
np.arange(10, 20, 0.5)[::-1]

Then your resault:

[ 19.5,  19. ,  18.5,  18. ,  17.5,  17. ,  16.5,  16. ,  15.5,
    15. ,  14.5,  14. ,  13.5,  13. ,  12.5,  12. ,  11.5,  11. ,
    10.5,  10. ]

如何在Python中按字母顺序对unicode字符串排序?

问题:如何在Python中按字母顺序对unicode字符串排序?

Python默认情况下按字节值排序,这意味着é在z和其他同样有趣的事情之后。在Python中按字母顺序排序的最佳方法是什么?

有图书馆吗?我什么都找不到。最好是排序应该有语言支持,因此它理解åäö应该用瑞典语在z之后排序,但是ü应该用u进行排序,依此类推。因此,Unicode支持非常必要。

如果没有它的库,什么是最好的方法?只需将字母映射到整数值,然后将字符串映射到整数列表?

Python sorts by byte value by default, which means é comes after z and other equally funny things. What is the best way to sort alphabetically in Python?

Is there a library for this? I couldn’t find anything. Preferrably sorting should have language support so it understands that åäö should be sorted after z in Swedish, but that ü should be sorted by u, etc. Unicode support is thereby pretty much a requirement.

If there is no library for it, what is the best way to do this? Just make a mapping from letter to a integer value and map the string to a integer list with that?


回答 0

IBM的ICU库可以做到这一点(还有更多)。它具有Python绑定:PyICU

更新:在ICU之间进行排序的核心区别locale.strcoll在于,ICU使用完整的Unicode排序算法,strcoll使用ISO 14651

此处简要总结了这两种算法之间的区别:http : //unicode.org/faq/collat​​ion.html#13。这些是非常奇特的特殊情况,在实践中几乎没有关系。

>>> import icu # pip install PyICU
>>> sorted(['a','b','c','ä'])
['a', 'b', 'c', 'ä']
>>> collator = icu.Collator.createInstance(icu.Locale('de_DE.UTF-8'))
>>> sorted(['a','b','c','ä'], key=collator.getSortKey)
['a', 'ä', 'b', 'c']

IBM’s ICU library does that (and a lot more). It has Python bindings: PyICU.

Update: The core difference in sorting between ICU and locale.strcoll is that ICU uses the full Unicode Collation Algorithm while strcoll uses ISO 14651.

The differences between those two algorithms are briefly summarized here: http://unicode.org/faq/collation.html#13. These are rather exotic special cases, which should rarely matter in practice.

>>> import icu # pip install PyICU
>>> sorted(['a','b','c','ä'])
['a', 'b', 'c', 'ä']
>>> collator = icu.Collator.createInstance(icu.Locale('de_DE.UTF-8'))
>>> sorted(['a','b','c','ä'], key=collator.getSortKey)
['a', 'ä', 'b', 'c']

回答 1

我没有在答案中看到这一点。我的应用程序使用python的标准库根据语言环境进行排序。这很容易。

# python2.5 code below
# corpus is our unicode() strings collection as a list
corpus = [u"Art", u"Älg", u"Ved", u"Wasa"]

import locale
# this reads the environment and inits the right locale
locale.setlocale(locale.LC_ALL, "")
# alternatively, (but it's bad to hardcode)
# locale.setlocale(locale.LC_ALL, "sv_SE.UTF-8")

corpus.sort(cmp=locale.strcoll)

# in python2.x, locale.strxfrm is broken and does not work for unicode strings
# in python3.x however:
# corpus.sort(key=locale.strxfrm)

给Lennart和其他回答者的问题:没有人知道“语言环境”吗?还是不符合这项任务?

I don’t see this in the answers. My Application sorts according to the locale using python’s standard library. It is pretty easy.

# python2.5 code below
# corpus is our unicode() strings collection as a list
corpus = [u"Art", u"Älg", u"Ved", u"Wasa"]

import locale
# this reads the environment and inits the right locale
locale.setlocale(locale.LC_ALL, "")
# alternatively, (but it's bad to hardcode)
# locale.setlocale(locale.LC_ALL, "sv_SE.UTF-8")

corpus.sort(cmp=locale.strcoll)

# in python2.x, locale.strxfrm is broken and does not work for unicode strings
# in python3.x however:
# corpus.sort(key=locale.strxfrm)

Question to Lennart and other answerers: Doesn’t anyone know ‘locale’ or is it not up to this task?


回答 2

尝试使用James Tauber的Python Unicode排序规则算法。它可能无法完全满足您的要求,但似乎值得一看。有关这些问题的更多信息,请参阅Christopher Lenz的这篇文章

Try James Tauber’s Python Unicode Collation Algorithm. It may not do exactly as you want, but seems well worth a look. For a bit more information about the issues, see this post by Christopher Lenz.


回答 3

您可能也对pyuca感兴趣:

http://jtauber.com/blog/2006/01/27/python_unicode_collat​​ion_algorithm/

尽管肯定不是最精确的方法,但至少可以使它正确一些,这是一种非常简单的方法。由于语言环境不是线程安全的,因此它还会在Web应用程序中击败语言环境,并在整个过程中设置语言设置。它比依赖外部C库的PyICU设置起来容易。

我将脚本上传到github上,因为在撰写本文时原始脚本已关闭,因此我不得不依靠网络缓存来获取它:

https://github.com/href/Python-Unicode-Collat​​ion-Algorithm

我成功地使用此脚本对plone模块中的德语/法语/意大利语文本进行了合理排序。

You might also be interested in pyuca:

http://jtauber.com/blog/2006/01/27/python_unicode_collation_algorithm/

Though it is certainly not the most exact way, it is a very simple way to at least get it somewhat right. It also beats locale in a webapp as locale is not threadsafe and sets the language settings process-wide. It also easier to set up than PyICU which relies on an external C library.

I uploaded the script to github as the original was down at the time of this writing and I had to resort to web caches to get it:

https://github.com/href/Python-Unicode-Collation-Algorithm

I successfully used this script to sanely sort German/French/Italian text in a plone module.


回答 4

摘要和扩展答案:

locale.strcoll在Python 2下,并locale.strxfrm假设您已经安装了有问题的语言环境,实际上可以解决问题,并且表现出色。我也在Windows下进行了测试,这在令人困惑的语言环境名称方面是不同的,但另一方面,默认情况下似乎安装了所有受支持的语言环境。

ICU不一定在实践中会做得更好,但是会做得更多。最值得注意的是,它支持可将不同语言的文本拆分为单词的拆分器。这对于没有单词分隔符的语言非常有用。您需要有一个语料库来用作拆分的基础,因为虽然不包括在内。

它还具有较长的语言环境名称,因此您可以获得语言环境的漂亮显示名称,支持除Gregorian之外的其他日历(尽管我不确定Python接口是否支持该日历),以及成吨的其他或多或少晦涩的语言环境支持。

因此,总而言之:如果您希望按字母顺序和语言环境进行排序,则可以使用该locale模块,除非您有特殊要求,或者还需要更多的语言环境相关功能,例如单词拆分器。

A summary and extended answer:

locale.strcoll under Python 2, and locale.strxfrm will in fact solve the problem, and does a good job, assuming that you have the locale in question installed. I tested it under Windows too, where the locale names confusingly are different, but on the other hand it seems to have all locales that are supported installed by default.

ICU doesn’t necessarily do this better in practice, it however does way more. Most notably it has support for splitters that can split texts in different languages into words. This is very useful for languages that doesn’t have word separators. You’ll need to have a corpus of words to use as a base for the splitting, because that’s not included, though.

It also has long names for the locales so you can get pretty display names for the locale, support for other calendars than Gregorian (although I’m not sure the Python interface supports that) and tons and tons of other more or less obscure locale supports.

So all in all: If you want to sort alphabetically and locale-dependent, you can use the locale module, unless you have special requirements, or also need more locale dependent functionality, like words splitter.


回答 5

我看到答案已经做了出色的工作,只想指出“ 人类排序”中的编码效率低下。要将选择性逐字符转换应用于unicode字符串s,它使用以下代码:

spec_dict = {'Å':'A', 'Ä':'A'}

def spec_order(s):
    return ''.join([spec_dict.get(ch, ch) for ch in s])

Python有一种更好,更快和更简洁的方式来执行此辅助任务(在Unicode字符串上-字节字符串的类似方法具有不同的且不太有用的规范!-):

spec_dict = dict((ord(k), spec_dict[k]) for k in spec_dict)

def spec_order(s):
    return s.translate(spec_dict)

传递给该translate方法的dict将Unicode序号(不是字符串)作为键,这就是为什么我们需要从原始char-to-char进行重建的原因spec_dict。(传递给dict的dict中的值(相对于键,键必须为序数)可以是Unicode序数,任意Unicode字符串,也可以是None来删除相应字符作为翻译的一部分,因此很容易指定“忽略一个某些字符以进行排序”,“将ä映射到ae以进行排序”等)。

在Python 3中,您可以更简单地完成“重建”步骤,例如:

spec_dict = ''.maketrans(spec_dict)

请参阅的文档你可以使用这个其他方式maketrans在Python 3静态方法。

I see the answers have already done an excellent job, just wanted to point out one coding inefficiency in Human Sort. To apply a selective char-by-char translation to a unicode string s, it uses the code:

spec_dict = {'Å':'A', 'Ä':'A'}

def spec_order(s):
    return ''.join([spec_dict.get(ch, ch) for ch in s])

Python has a much better, faster and more concise way to perform this auxiliary task (on Unicode strings — the analogous method for byte strings has a different and somewhat less helpful specification!-):

spec_dict = dict((ord(k), spec_dict[k]) for k in spec_dict)

def spec_order(s):
    return s.translate(spec_dict)

The dict you pass to the translate method has Unicode ordinals (not strings) as keys, which is why we need that rebuilding step from the original char-to-char spec_dict. (Values in the dict you pass to translate [as opposed to keys, which must be ordinals] can be Unicode ordinals, arbitrary Unicode strings, or None to remove the corresponding character as part of the translation, so it’s easy to specify “ignore a certain character for sorting purposes”, “map ä to ae for sorting purposes”, and the like).

In Python 3, you can get the “rebuilding” step more simply, e.g.:

spec_dict = ''.maketrans(spec_dict)

See the docs for other ways you can use this maketrans static method in Python 3.


回答 6


回答 7

最近,我一直在使用zope.ucol(https://pypi.python.org/pypi/zope.ucol)执行此任务。例如,对德语ß进行排序:

>>> import zope.ucol
>>> collator = zope.ucol.Collator("de-de")
>>> mylist = [u"a", u'x', u'\u00DF']
>>> print mylist
[u'a', u'x', u'\xdf']
>>> print sorted(mylist, key=collator.key)
[u'a', u'\xdf', u'x']

zope.ucol还包装ICU,因此可以替代PyICU。

Lately I’ve been using zope.ucol (https://pypi.python.org/pypi/zope.ucol) for this task. For example, sorting the german ß:

>>> import zope.ucol
>>> collator = zope.ucol.Collator("de-de")
>>> mylist = [u"a", u'x', u'\u00DF']
>>> print mylist
[u'a', u'x', u'\xdf']
>>> print sorted(mylist, key=collator.key)
[u'a', u'\xdf', u'x']

zope.ucol also wraps ICU, so would be an alternative to PyICU.


回答 8

完整的UCA解决方案

执行此操作的最简单,最简单,最直接的方法是对Perl库模块Unicode :: Collat​​e :: Locale进行调出,它是标准Unicode :: Collat​​e模块的子类。您需要做的就是将"xv"瑞典的语言环境值传递给构造函数。

(对于瑞典语文本,您可能不一定会对此有所了解,但是由于Perl使用抽象字符,因此您可以使用任何Unicode代码点-无论是平台还是构建方式!很少有语言提供这种便利。我提到它是因为我一直在与最近在这个令人发指的问题上与Java失去了很多战斗。)

问题是,我不知道如何从Python访问Perl模块-除了通过使用shell标注或两侧管道之外。因此,为此,我为您提供了一个完整的工作脚本ucsort,您可以调用它来轻松轻松地完成您所要求的。

该脚本100%兼容完整的Unicode排序算法,并支持所有定制选项!而且,如果您安装了可选模块或运行Perl 5.13或更高版本,则可以完全访问易于使用的CLDR语言环境。见下文。

示范

想象一下这样设置的输入集:

b o i j n l m å y e v s k h d f g t ö r x p z a ä c u q

默认的按代码点排序将生成:

a b c d e f g h i j k l m n o p q r s t u v x y z ä å ö

这在每个人的书中都是不正确的。使用我的使用Unicode归类算法的脚本,您将获得以下命令:

% perl ucsort /tmp/swedish_alphabet | fmt
a å ä b c d e f g h i j k l m n o ö p q r s t u v x y z

这是默认的UCA排序。要获取瑞典语言环境,请以下方式调用ucsort

% perl ucsort --locale=sv /tmp/swedish_alphabet | fmt
a b c d e f g h i j k l m n o p q r s t u v x y z å ä ö

这是一个更好的输入演示。首先,输入集:

% fmt /tmp/swedish_set
cTD cDD Cöd Cbd cAD cCD cYD Cud cZD Cod cBD Cnd cQD cFD Ced Cfd cOD
cLD cXD Cid Cpd cID Cgd cVD cMD cÅD cGD Cqd Cäd cJD Cdd Ckd cÖD cÄD
Ctd Czd Cxd cHD cND cKD Cvd Chd Cyd cUD Cld Cmd cED Crd Cad Cåd Ccd
cRD cSD Csd Cjd cPD

按代码点,排序方式如下:

Cad Cbd Ccd Cdd Ced Cfd Cgd Chd Cid Cjd Ckd Cld Cmd Cnd Cod Cpd Cqd
Crd Csd Ctd Cud Cvd Cxd Cyd Czd Cäd Cåd Cöd cAD cBD cCD cDD cED cFD
cGD cHD cID cJD cKD cLD cMD cND cOD cPD cQD cRD cSD cTD cUD cVD cXD
cYD cZD cÄD cÅD cÖD

但是使用默认的UCA可以使这种方式排序:

% ucsort /tmp/swedish_set | fmt
cAD Cad cÅD Cåd cÄD Cäd cBD Cbd cCD Ccd cDD Cdd cED Ced cFD Cfd cGD
Cgd cHD Chd cID Cid cJD Cjd cKD Ckd cLD Cld cMD Cmd cND Cnd cOD Cod
cÖD Cöd cPD Cpd cQD Cqd cRD Crd cSD Csd cTD Ctd cUD Cud cVD Cvd cXD
Cxd cYD Cyd cZD Czd

但是在瑞典语言环境中,这种方式是:

% ucsort --locale=sv /tmp/swedish_set | fmt
cAD Cad cBD Cbd cCD Ccd cDD Cdd cED Ced cFD Cfd cGD Cgd cHD Chd cID
Cid cJD Cjd cKD Ckd cLD Cld cMD Cmd cND Cnd cOD Cod cPD Cpd cQD Cqd
cRD Crd cSD Csd cTD Ctd cUD Cud cVD Cvd cXD Cxd cYD Cyd cZD Czd cÅD
Cåd cÄD Cäd cÖD Cöd

如果您希望大写字母在小写字母之前排序,请执行以下操作:

% ucsort --upper-before-lower --locale=sv /tmp/swedish_set | fmt
Cad cAD Cbd cBD Ccd cCD Cdd cDD Ced cED Cfd cFD Cgd cGD Chd cHD Cid
cID Cjd cJD Ckd cKD Cld cLD Cmd cMD Cnd cND Cod cOD Cpd cPD Cqd cQD
Crd cRD Csd cSD Ctd cTD Cud cUD Cvd cVD Cxd cXD Cyd cYD Czd cZD Cåd
cÅD Cäd cÄD Cöd cÖD

定制排序

您可以使用ucsort做许多其他事情。例如,以下是英文标题的排序方法:

% ucsort --preprocess='s/^(an?|the)\s+//i' /tmp/titles
Anathem
The Book of Skulls
A Civil Campaign
The Claw of the Conciliator
The Demolished Man
Dune
An Early Dawn
The Faded Sun: Kesrith
The Fall of Hyperion
A Feast for Crows
Flowers for Algernon
The Forbidden Tower
Foundation and Empire
Foundations Edge
The Goblin Reservation
The High Crusade
Jack of Shadows
The Man in the High Castle
The Ringworld Engineers
The Robots of Dawn
A Storm of Swords
Stranger in a Strange Land
There Will Be Time
The White Dragon

通常,您将需要Perl 5.10.1或更高版本才能运行脚本。为了支持语言环境,您必须安装可选的CPAN模块Unicode::Collate::Locale。或者,您可以安装Perl 5.13+的开发版本,该版本标准包含该模块。

调用约定

这是一个快速的原型,因此ucsort大多没有记录。但这是它在命令行上接受的开关/选项的摘要:

    # standard options
    --help|?
    --man|m
    --debug|d

    # collator constructor options
    --backwards-levels=i
    --collation-level|level|l=i
    --katakana-before-hiragana
    --normalization|n=s
    --override-CJK=s
    --override-Hangul=s
    --preprocess|P=s
    --upper-before-lower|u
    --variable=s

    # program specific options
    --case-insensitive|insensitive|i
    --input-encoding|e=s
    --locale|L=s
    --paragraph|p
    --reverse-fields|last
    --reverse-output|r
    --right-to-left|reverse-input

是的,好的:那确实是我调用时使用的参数列表Getopt::Long,但是您明白了。:)

如果您可以弄清楚如何直接从Python调用Perl库模块而不调用Perl脚本,则一定要这样做。我只是不知道自己。我很想学习如何。

同时,我相信此脚本将完成您需要做的所有工作,甚至更多! 现在,我将其用于所有文本排序。它最后确实已经需要很长一段时间我。

唯一的缺点是,这种--locale说法会导致性能下降,尽管它对于常规,非语言环境但仍100%符合UCA的排序足够快。由于它将所有内容加载到内存中,因此您可能不想在千兆字节的文档上使用它。我每天使用它很多次,并且可以确保最后一次理智的文本排序非常好。

A Complete UCA Solution

The simplest, easiest, and most straightforward way to do this it to make a callout to the Perl library module, Unicode::Collate::Locale, which is a subclass of the standard Unicode::Collate module. All you need do is pass the constructor a locale value of "xv" for Sweden.

(You may not neccesarily appreciate this for Swedish text, but because Perl uses abstract characters, you can use any Unicode code point you please — no matter the platform or build! Few languages offer such convenience. I mention it because I’ve fighting a losing battle with Java a lot over this maddening problem lately.)

The problem is that I do not know how to access a Perl module from Python — apart, that is, from using a shell callout or two-sided pipe. To that end, I have therefore provided you with a complete working script called ucsort that you can call to do exactly what you have asked for with perfect ease.

This script is 100% compliant with the full Unicode Collation Algorithm, with all tailoring options supported!! And if you have an optional module installed or run Perl 5.13 or better, then you have full access to easy-to-use CLDR locales. See below.

Demonstration

Imagine an input set ordered this way:

b o i j n l m å y e v s k h d f g t ö r x p z a ä c u q

A default sort by code point yields:

a b c d e f g h i j k l m n o p q r s t u v x y z ä å ö

which is incorrect by everybody’s book. Using my script, which uses the Unicode Collation Algorithm, you get this order:

% perl ucsort /tmp/swedish_alphabet | fmt
a å ä b c d e f g h i j k l m n o ö p q r s t u v x y z

That is the default UCA sort. To get the Swedish locale, call ucsort this way:

% perl ucsort --locale=sv /tmp/swedish_alphabet | fmt
a b c d e f g h i j k l m n o p q r s t u v x y z å ä ö

Here is a better input demo. First, the input set:

% fmt /tmp/swedish_set
cTD cDD Cöd Cbd cAD cCD cYD Cud cZD Cod cBD Cnd cQD cFD Ced Cfd cOD
cLD cXD Cid Cpd cID Cgd cVD cMD cÅD cGD Cqd Cäd cJD Cdd Ckd cÖD cÄD
Ctd Czd Cxd cHD cND cKD Cvd Chd Cyd cUD Cld Cmd cED Crd Cad Cåd Ccd
cRD cSD Csd Cjd cPD

By code point, that sorts this way:

Cad Cbd Ccd Cdd Ced Cfd Cgd Chd Cid Cjd Ckd Cld Cmd Cnd Cod Cpd Cqd
Crd Csd Ctd Cud Cvd Cxd Cyd Czd Cäd Cåd Cöd cAD cBD cCD cDD cED cFD
cGD cHD cID cJD cKD cLD cMD cND cOD cPD cQD cRD cSD cTD cUD cVD cXD
cYD cZD cÄD cÅD cÖD

But using the default UCA makes it sort this way:

% ucsort /tmp/swedish_set | fmt
cAD Cad cÅD Cåd cÄD Cäd cBD Cbd cCD Ccd cDD Cdd cED Ced cFD Cfd cGD
Cgd cHD Chd cID Cid cJD Cjd cKD Ckd cLD Cld cMD Cmd cND Cnd cOD Cod
cÖD Cöd cPD Cpd cQD Cqd cRD Crd cSD Csd cTD Ctd cUD Cud cVD Cvd cXD
Cxd cYD Cyd cZD Czd

But in the Swedish locale, this way:

% ucsort --locale=sv /tmp/swedish_set | fmt
cAD Cad cBD Cbd cCD Ccd cDD Cdd cED Ced cFD Cfd cGD Cgd cHD Chd cID
Cid cJD Cjd cKD Ckd cLD Cld cMD Cmd cND Cnd cOD Cod cPD Cpd cQD Cqd
cRD Crd cSD Csd cTD Ctd cUD Cud cVD Cvd cXD Cxd cYD Cyd cZD Czd cÅD
Cåd cÄD Cäd cÖD Cöd

If you prefer uppercase to sort before lowercase, do this:

% ucsort --upper-before-lower --locale=sv /tmp/swedish_set | fmt
Cad cAD Cbd cBD Ccd cCD Cdd cDD Ced cED Cfd cFD Cgd cGD Chd cHD Cid
cID Cjd cJD Ckd cKD Cld cLD Cmd cMD Cnd cND Cod cOD Cpd cPD Cqd cQD
Crd cRD Csd cSD Ctd cTD Cud cUD Cvd cVD Cxd cXD Cyd cYD Czd cZD Cåd
cÅD Cäd cÄD Cöd cÖD

Customized Sorts

You can do many other things with ucsort. For example, here is how to sort titles in English:

% ucsort --preprocess='s/^(an?|the)\s+//i' /tmp/titles
Anathem
The Book of Skulls
A Civil Campaign
The Claw of the Conciliator
The Demolished Man
Dune
An Early Dawn
The Faded Sun: Kesrith
The Fall of Hyperion
A Feast for Crows
Flowers for Algernon
The Forbidden Tower
Foundation and Empire
Foundation’s Edge
The Goblin Reservation
The High Crusade
Jack of Shadows
The Man in the High Castle
The Ringworld Engineers
The Robots of Dawn
A Storm of Swords
Stranger in a Strange Land
There Will Be Time
The White Dragon

You will need Perl 5.10.1 or better to run the script in general. For locale support, you must either install the optional CPAN module Unicode::Collate::Locale. Alternately, you can install a development versions of Perl, 5.13+, which include that module standardly.

Calling Conventions

This is a rapid prototype, so ucsort is mostly un(der)documented. But this is its SYNOPSIS of what switches/options it accepts on the command line:

    # standard options
    --help|?
    --man|m
    --debug|d

    # collator constructor options
    --backwards-levels=i
    --collation-level|level|l=i
    --katakana-before-hiragana
    --normalization|n=s
    --override-CJK=s
    --override-Hangul=s
    --preprocess|P=s
    --upper-before-lower|u
    --variable=s

    # program specific options
    --case-insensitive|insensitive|i
    --input-encoding|e=s
    --locale|L=s
    --paragraph|p
    --reverse-fields|last
    --reverse-output|r
    --right-to-left|reverse-input

Yeah, ok: that’s really the argument list I use for the call to Getopt::Long, but you get the idea. :)

If you can figure out how to call Perl library modules from Python directly without calling a Perl script, by all means do so. I just don’t know how myself. I’d love to learn how.

In the meantime, I believe this script will do what you need done in all its particular — and more! I now use this for all of text sorting. It finally does what I’ve needed for a long, long time.

The only downside is that --locale argument causes performance to go down the tubes, although it’s plenty fast enough for regular, non-locale but still 100% UCA compliant sorting. Since it loads everything in memory, you probably don’t want to use this on gigabyte documents. I use it many times a day, and it sure it great having sane text sorting at last.


回答 9

这是远从你的使用情况的完整解决方案,但你可以看看的unaccent.py从effbot.org脚本。它的主要作用是删除文本中的所有重音。您可以使用“经过消毒的”文本按字母顺序排序。(有关详细说明,请参阅页面。)

It is far from a complete solution for your use case, but you could take a look at the unaccent.py script from effbot.org. What it basically does is remove all accents from a text. You can use that ‘sanitized’ text to sort alphabetically. (For a better description see this page.)


回答 10

杰夫阿特伍德(Jeff Atwood)在“ 自然排序顺序”上写了一篇不错的文章,他链接到一个脚本,该脚本几乎可以完成您的要求

无论如何,它都不是琐碎的脚本,但是可以解决问题。

Jeff Atwood wrote a good post on Natural Sort Order, in it he linked to a script which does pretty much what you ask.

It’s not a trivial script, by any means, but it does the trick.


根据字符串长度对Python列表进行排序

问题:根据字符串长度对Python列表进行排序

我想根据字符串长度对字符串列表进行排序。我尝试使用sort,如下所示,但似乎无法给我正确的结果。

xs = ['dddd','a','bb','ccc']
print xs
xs.sort(lambda x,y: len(x) < len(y))
print xs

['dddd', 'a', 'bb', 'ccc']
['dddd', 'a', 'bb', 'ccc']

可能是什么问题?

I want to sort a list of strings based on the string length. I tried to use sort as follows, but it doesn’t seem to give me correct result.

xs = ['dddd','a','bb','ccc']
print xs
xs.sort(lambda x,y: len(x) < len(y))
print xs

['dddd', 'a', 'bb', 'ccc']
['dddd', 'a', 'bb', 'ccc']

What might be wrong?


回答 0

将传递lambda给时sort,您需要返回一个整数,而不是布尔值。因此,您的代码应改为:

xs.sort(lambda x,y: cmp(len(x), len(y)))

请注意,cmp是一个内置函数,cmp(x, y)如果x小于则返回-1 yx等于则返回0 yx大于则返回1 y

当然,您可以改为使用key参数:

xs.sort(key=lambda s: len(s))

这告诉该sort方法根据键函数返回的值进行排序。

编辑:感谢下面的balpha和Ruslan指出您可以len直接将其作为关键参数传递给函数,从而消除了对a的需要lambda

xs.sort(key=len)

正如Ruslan在下面指出的那样,您还可以使用内置的排序函数而不是list.sort方法,该方法创建一个新列表,而不是就地对现有列表进行排序:

print(sorted(xs, key=len))

When you pass a lambda to sort, you need to return an integer, not a boolean. So your code should instead read as follows:

xs.sort(lambda x,y: cmp(len(x), len(y)))

Note that cmp is a builtin function such that cmp(x, y) returns -1 if x is less than y, 0 if x is equal to y, and 1 if x is greater than y.

Of course, you can instead use the key parameter:

xs.sort(key=lambda s: len(s))

This tells the sort method to order based on whatever the key function returns.

EDIT: Thanks to balpha and Ruslan below for pointing out that you can just pass len directly as the key parameter to the function, thus eliminating the need for a lambda:

xs.sort(key=len)

And as Ruslan points out below, you can also use the built-in sorted function rather than the list.sort method, which creates a new list rather than sorting the existing one in-place:

print(sorted(xs, key=len))

回答 1

与Eli的答案相同-只是使用较短的表格,因为您可以lambda在此处跳过一部分。

创建新列表:

>>> xs = ['dddd','a','bb','ccc']
>>> sorted(xs, key=len)
['a', 'bb', 'ccc', 'dddd']

就地排序:

>>> xs.sort(key=len)
>>> xs
['a', 'bb', 'ccc', 'dddd']

The same as in Eli’s answer – just using a shorter form, because you can skip a lambda part here.

Creating new list:

>>> xs = ['dddd','a','bb','ccc']
>>> sorted(xs, key=len)
['a', 'bb', 'ccc', 'dddd']

In-place sorting:

>>> xs.sort(key=len)
>>> xs
['a', 'bb', 'ccc', 'dddd']

回答 2

我想添加排序时pythonic键函数的工作方式:

装饰-排序-非装饰设计模式:

当使用所谓的decorate-sort-undecorate设计模式实现排序时,Python对关键功能的支持。

它分3个步骤进行:

  1. 列表中的每个元素都会临时替换为“修饰的”版本,其中包含应用于该元素的键函数的结果。

  2. 该列表是根据键的自然顺序排序的。

  3. 装饰元素将替换为原始元素。

用于在进行比较之前在每个列表元素上指定要调用的函数的关键参数。docs

I Would like to add how the pythonic key function works while sorting :

Decorate-Sort-Undecorate Design Pattern :

Python’s support for a key function when sorting is implemented using what is known as the decorate-sort-undecorate design pattern.

It proceeds in 3 steps:

  1. Each element of the list is temporarily replaced with a “decorated” version that includes the result of the key function applied to the element.

  2. The list is sorted based upon the natural order of the keys.

  3. The decorated elements are replaced by the original elements.

Key parameter to specify a function to be called on each list element prior to making comparisons. docs


回答 3

最简单的方法是:

list.sort(key = lambda x:len(x))

The easiest way to do this is:

list.sort(key = lambda x:len(x))


回答 4

编写一个功能Lensort以根据长度对字符串列表进行排序。

def lensort(a):
    n = len(a)
    for i in range(n):
        for j in range(i+1,n):
            if len(a[i]) > len(a[j]):
                temp = a[i]
                a[i] = a[j]
                a[j] = temp
    return a
print lensort(["hello","bye","good"])

Write a function lensort to sort a list of strings based on length.

def lensort(a):
    n = len(a)
    for i in range(n):
        for j in range(i+1,n):
            if len(a[i]) > len(a[j]):
                temp = a[i]
                a[i] = a[j]
                a[j] = temp
    return a
print lensort(["hello","bye","good"])

回答 5

def lensort(list_1):
    list_2=[];list_3=[]
for i in list_1:
    list_2.append([i,len(i)])
list_2.sort(key = lambda x : x[1])
for i in list_2:
    list_3.append(i[0])
return list_3

这对我有用!

def lensort(list_1):
    list_2=[];list_3=[]
for i in list_1:
    list_2.append([i,len(i)])
list_2.sort(key = lambda x : x[1])
for i in list_2:
    list_3.append(i[0])
return list_3

This works for me!


回答 6

我可以使用以下两种方法来实现

def lensort(x):
    list1 = []
    for i in x:
        list1.append([len(i),i])
    return sorted(list1)

lista = ['a', 'bb', 'ccc', 'dddd']
a=lensort(lista)
print([l[1] for l in a])

在使用Lambda的一个Liner中,如下所示,上面已经给出了答案。

 lista = ['a', 'bb', 'ccc', 'dddd']
 lista.sort(key = lambda x:len(x))
 print(lista)

I can do it using below two methods, using function

def lensort(x):
    list1 = []
    for i in x:
        list1.append([len(i),i])
    return sorted(list1)

lista = ['a', 'bb', 'ccc', 'dddd']
a=lensort(lista)
print([l[1] for l in a])

In one Liner using Lambda, as below, a already answered above.

 lista = ['a', 'bb', 'ccc', 'dddd']
 lista.sort(key = lambda x:len(x))
 print(lista)

关于Python的内置sort()方法

问题:关于Python的内置sort()方法

sort()Python 的内置方法使用什么算法?可以看看该方法的代码吗?

What algorithm is the built in sort() method in Python using? Is it possible to have a look at the code for that method?


回答 0

当然!代码在这里,从函数开始,islt进行相当一段时间;-)。就像克里斯的评论所暗示的那样,它是C代码。您还需要阅读此文本文件,以获得文本说明,结果等。

如果您喜欢阅读Java代码而不是C代码,则可以看看Joshua Bloch在Java中和Java中实现timsort(Joshua也是在1997年实现了仍在Java中使用的经过修改的mergesort的人,并且人们希望Java将最终改用他最近的timsort港口)。

timsort的Java端口的一些说明在这里,diff在这里(带有指向所有需要的文件的指针),关键文件在这里 -FWIW,尽管我是比Java程序员更好的C程序员,在这种情况下,我发现Joshua的Java代码比Tim的C代码更具可读性;-)。

Sure! The code’s here, starting with function islt and proceeding for QUITE a while;-). As Chris’s comment suggests, it’s C code. You’ll also want to read this text file for a textual explanation, results, etc etc.

If you prefer reading Java code than C code, you could look at Joshua Bloch’s implementation of timsort in and for Java (Joshua’s also the guy who implemented, in 1997, the modified mergesort that’s still used in Java, and one can hope that Java will eventually switch to his recent port of timsort).

Some explanation of the Java port of timsort is here, the diff is here (with pointers to all needed files), the key file is here — FWIW, while I’m a better C programmer than Java programmer, in this case I find Joshua’s Java code more readable overall than Tim’s C code;-).


回答 1

我只是想提供一个非常有用的链接,而我在Alex的其他全面答案中却没有找到它:对Python timsort的高级解释(带有图形可视化!)。

(是的,该算法现在基本上称为Timsort

I just wanted to supply a very helpful link that I missed in Alex’s otherwise comprehensive answer: A high-level explanation of Python’s timsort (with graph visualizations!).

(Yes, the algorithm is basically known as Timsort now)


回答 2

在早期的python版本中,sort函数实现了quicksort的修改版本。但是,它被认为是不稳定的,从2.3版本开始,他们改用自适应合并排序算法。

In early python-versions, the sort function implemented a modified version of quicksort. However, it was deemed unstable and as of 2.3 they switched to using an adaptive mergesort algorithm.