Question: How can I efficiently compare two unordered lists (not sets) in Python?

a = [1, 2, 3, 1, 2, 3]
b = [3, 2, 1, 3, 2, 1]

a & b should be considered equal, because they have exactly the same elements, only in different order.

The thing is, my actual lists will consist of objects (my class instances), not integers.


Answer 0

O(n): The Counter() method is best (if your objects are hashable):

from collections import Counter

def compare(s, t):
    return Counter(s) == Counter(t)

O(n log n): The sorted() method is next best (if your objects are orderable):

def compare(s, t):
    return sorted(s) == sorted(t)

O(n * n): If the objects are neither hashable, nor orderable, you can use equality:

def compare(s, t):
    t = list(t)   # make a mutable copy
    try:
        for elem in s:
            t.remove(elem)
    except ValueError:
        return False
    return not t
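
The three strategies above can be combined into one helper that tries the cheapest approach first. This is only a sketch: the name unordered_equal is hypothetical, and it assumes that a TypeError signals unhashable or unorderable elements. Types with only a partial order (such as set) sort without raising but not consistently, so the sorted() stage is not reliable for them.

```python
from collections import Counter


def unordered_equal(s, t):
    """Return True if s and t hold the same elements with the same counts."""
    s, t = list(s), list(t)
    if len(s) != len(t):
        return False
    try:
        # O(n): requires hashable elements
        return Counter(s) == Counter(t)
    except TypeError:
        pass
    try:
        # O(n log n): requires elements with a total order
        # (assumption: partially ordered types such as set are not used)
        return sorted(s) == sorted(t)
    except TypeError:
        pass
    # O(n * n): relies only on ==
    for elem in s:
        try:
            t.remove(elem)
        except ValueError:
            return False
    return not t


print(unordered_equal([1, 2, 3, 1, 2, 3], [3, 2, 1, 3, 2, 1]))  # True
print(unordered_equal([{"a": 1}, {"b": 2}], [{"b": 2}, {"a": 1}]))  # True
```

The early length check is cheap and lets all three stages skip obviously unequal inputs.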

Answer 1

You can sort both:

sorted(a) == sorted(b)

Counting the elements with Counter could also be more efficient (but it requires the objects to be hashable).

>>> from collections import Counter
>>> a = [1, 2, 3, 1, 2, 3]
>>> b = [3, 2, 1, 3, 2, 1]
>>> print(Counter(a) == Counter(b))
True

Answer 2

If you know the items are always hashable, you can use a Counter(), which is O(n).
If you know the items are always sortable, you can use sorted(), which is O(n log n).

In the general case you can’t rely on being able to sort or hash the elements, so you need a fallback like this, which is unfortunately O(n^2):

len(a)==len(b) and all(a.count(i)==b.count(i) for i in a)
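
As a rough illustration of that fallback, here it is wrapped in a function (the name same_elements is made up for this example) and applied to dicts, which are neither hashable nor orderable. Each count() call scans a whole list, which is where the O(n^2) cost comes from.

```python
def same_elements(a, b):
    # The length check guards against b holding extra elements that are not in a
    return len(a) == len(b) and all(a.count(i) == b.count(i) for i in a)


a = [{"x": 1}, {"y": 2}, {"x": 1}]
b = [{"y": 2}, {"x": 1}, {"x": 1}]

print(same_elements(a, b))       # True
print(same_elements(a, b[:-1]))  # False (lengths differ)
```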

Answer 3

The best way to do this is by sorting the lists and comparing them. (Using Counter won’t work with objects that aren’t hashable.) This is straightforward for integers:

sorted(a) == sorted(b)

It gets a little trickier with arbitrary objects. If you care about object identity, i.e., whether the same objects are in both lists, you can use the id() function as the sort key.

sorted(a, key=id) == sorted(b, key=id)

(In Python 2.x you don’t actually need the key= parameter, because you can compare any object to any object. The ordering is arbitrary but stable, so it works fine for this purpose; it doesn’t matter what order the objects are in, only that the ordering is the same for both lists. In Python 3, though, comparing objects of different types is disallowed in many circumstances — for example, you can’t compare strings to integers — so if you will have objects of various types, best to explicitly use the object’s ID.)

If you want to compare the objects in the list by value, on the other hand, first you need to define what “value” means for the objects. Then you will need some way to provide that as a key (and for Python 3, as a consistent type). One potential way that would work for a lot of arbitrary objects is to sort by their repr(). Of course, this could waste a lot of extra time and memory building repr() strings for large lists and so on.

sorted(a, key=repr) == sorted(b, key=repr)

If the objects are all your own types, you can define __lt__() on them so that the object knows how to compare itself to others. Then you can just sort them and not worry about the key= parameter. Of course you could also define __hash__() and use Counter, which will be faster.
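
A minimal sketch of that last suggestion, assuming a made-up Point class whose "value" is its (x, y) pair. Neither the class nor its attributes come from the question; they are only for illustration.

```python
from collections import Counter


class Point:
    """Hypothetical value class used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def _key(self):
        # The tuple that defines what "value" means for a Point
        return (self.x, self.y)

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        # Must agree with __eq__: equal points hash equally
        return hash(self._key())


a = [Point(1, 2), Point(3, 4), Point(1, 2)]
b = [Point(3, 4), Point(1, 2), Point(1, 2)]

print(sorted(a) == sorted(b))    # True, uses __lt__ and __eq__
print(Counter(a) == Counter(b))  # True, uses __hash__ and __eq__
```

Defining __hash__ like this is only safe if the attributes it depends on do not change while the object sits in a Counter, dict, or set.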


Answer 4

If you have to do this in tests: https://docs.python.org/3.5/library/unittest.html#unittest.TestCase.assertCountEqual

assertCountEqual(first, second, msg=None)

Test that sequence first contains the same elements as second, regardless of their order. When they don’t, an error message listing the differences between the sequences will be generated.

Duplicate elements are not ignored when comparing first and second. It verifies whether each element has the same count in both sequences. Equivalent to: assertEqual(Counter(list(first)), Counter(list(second))) but works with sequences of unhashable objects as well.

New in version 3.2.

or in 2.7: https://docs.python.org/2.7/library/unittest.html#unittest.TestCase.assertItemsEqual

Outside of tests I would recommend the Counter method.
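
For the testing case, a small sketch of how assertCountEqual might be used inside a test class. The class and test names are invented for this example; the dict elements are chosen because they are unhashable.

```python
import unittest


class UnorderedListTest(unittest.TestCase):
    def test_same_elements(self):
        a = [{"id": 1}, {"id": 2}, {"id": 1}]
        b = [{"id": 2}, {"id": 1}, {"id": 1}]
        # Passes: same elements with the same counts, order ignored
        self.assertCountEqual(a, b)

    def test_different_counts(self):
        # Duplicates are not ignored, so differing counts fail the assertion
        with self.assertRaises(AssertionError):
            self.assertCountEqual([1, 1, 2], [1, 2, 2])
```

Saved in a file, this can be run with python -m unittest.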


Answer 5

If the list contains items that are not hashable (such as a list of objects) you might be able to use the Counter Class and the id() function such as:

from collections import Counter
...
if Counter(map(id,a)) == Counter(map(id,b)):
    print("Lists a and b contain the same objects")
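
Note that this compares object identity, not value. A short sketch with unhashable list elements shows the difference: lists built from the very same objects match, while lists of equal-looking but distinct objects do not.

```python
from collections import Counter

x, y = [1], [2]          # unhashable elements
a = [x, y, x]
b = [y, x, x]            # the same objects in a different order
c = [[1], [2], [1]]      # equal values, but newly created objects

same_objects = Counter(map(id, a)) == Counter(map(id, b))
same_as_copies = Counter(map(id, a)) == Counter(map(id, c))

print(same_objects)    # True
print(same_as_copies)  # False
```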

Answer 6

The following code might work in your case:

if len(a) == len(b) and all(i in a for i in b):
    print('True')
else:
    print('False')

This checks that both lists have the same length and that every element of b also appears in a, regardless of order. Note that it does not compare how many times each element occurs, so lists such as [1, 1, 2] and [1, 2, 2] are also reported as equal.

For better understanding, refer to my answer in this question
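
To see where a length-plus-membership check can go wrong, here is the same test wrapped in a function (the name length_and_membership is made up for this example) and run on lists whose duplicates differ.

```python
def length_and_membership(a, b):
    # Same length, and every element of b appears somewhere in a
    return len(a) == len(b) and all(i in a for i in b)


print(length_and_membership([1, 2, 3], [3, 2, 1]))  # True, as expected
print(length_and_membership([1, 1, 2], [1, 2, 2]))  # True, although the counts differ
```

If duplicates matter, one of the count-based approaches from the other answers is the safer choice.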


Answer 7

If the comparison is to be performed in a testing context, use assertCountEqual(a, b) (py>=3.2) and assertItemsEqual(a, b) (2.7<=py<3.2).

Works on sequences of unhashable objects too.


Answer 8

Let a and b be lists:

def ass_equal(a, b):
    a = list(a)  # work on a copy so the caller's list is not modified
    try:
        for x in b:
            # remove each element of b from a; index() raises ValueError if it is missing
            a.pop(a.index(x))
    except ValueError:
        return False  # b contains an item that is not (or no longer) in a
    return len(a) == 0  # a is empty only if b has removed every element

No need to make them hashable or sort them.


Answer 9

Using the unittest module gives you a clean and standard approach.

import unittest

test_object = unittest.TestCase()
test_object.assertCountEqual(a, b)
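
assertCountEqual raises AssertionError on a mismatch instead of returning a value. If a boolean is wanted, one possible wrapper looks like the sketch below; the function name count_equal is hypothetical and not part of unittest.

```python
import unittest


def count_equal(a, b):
    """Hypothetical wrapper: True if a and b have the same elements and counts."""
    case = unittest.TestCase()
    try:
        case.assertCountEqual(a, b)
    except AssertionError:
        return False
    return True


print(count_equal([1, 2, 3, 1, 2, 3], [3, 2, 1, 3, 2, 1]))  # True
print(count_equal([1, 1, 2], [1, 2, 2]))                    # False
```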
