Checking if all elements in a list are unique

Question: Checking if all elements in a list are unique

What is the best way (best as in the conventional way) of checking whether all elements in a list are unique?

My current approach using a Counter is:

>>> from collections import Counter
>>> x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
>>> counter = Counter(x)
>>> for count in counter.values():  # itervalues() exists only in Python 2
...     if count > 1:
...         pass  # do something

Can I do better?


Answer 0

Not the most efficient, but straightforward and concise:

if len(x) > len(set(x)):
    pass  # do something

Probably won’t make much of a difference for short lists.
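
As a quick illustration of the length comparison, here is a minimal sketch that wraps it in a helper. The name has_duplicates is made up for this example and is not from the answer. It assumes every element is hashable; set() raises TypeError otherwise.

```python
def has_duplicates(items):
    """Return True if any value occurs more than once (elements must be hashable)."""
    # set() drops repeated values, so a shorter set means at least one duplicate.
    return len(items) > len(set(items))


x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
print(has_duplicates(x))          # True
print(has_duplicates([1, 2, 3]))  # False
```

The whole input is always converted to a set, so there is no early exit even when the first two elements are equal.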


Answer 1

Here is a two-liner that will also do early exit:

>>> def allUnique(x):
...     seen = set()
...     return not any(i in seen or seen.add(i) for i in x)
...
>>> allUnique("ABCDEF")
True
>>> allUnique("ABACDEF")
False

If the elements of x aren’t hashable, you’ll have to fall back to a list for seen, at the cost of a linear scan for every membership test:

>>> def allUnique(x):
...     seen = list()
...     return not any(i in seen or seen.append(i) for i in x)
...
>>> allUnique([list("ABC"), list("DEF")])
True
>>> allUnique([list("ABC"), list("DEF"), list("ABC")])
False
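
The two variants above can be combined into one function, sketched below under some assumptions: the name all_unique and the try/except fallback are my own, not part of the answer. Hashable elements go into a set; anything that raises TypeError when hashed is tracked in a list instead.

```python
def all_unique(items):
    """Early-exit uniqueness check that tolerates unhashable elements."""
    seen_hashable = set()
    seen_other = []  # linear membership tests, used only when hashing fails
    for item in items:
        try:
            if item in seen_hashable:
                return False
            seen_hashable.add(item)
        except TypeError:
            # item is unhashable (for example a list): fall back to an equality scan
            if item in seen_other:
                return False
            seen_other.append(item)
    return True


print(all_unique("ABACDEF"))                                    # False
print(all_unique([list("ABC"), list("DEF")]))                   # True
print(all_unique([list("ABC"), list("DEF"), list("ABC")]))      # False
```

One caveat of this sketch: a hashable element is never compared with an unhashable one, which only matters for custom types that define equality across both kinds.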

Answer 2

An early-exit solution could be:

def unique_values(g):
    s = set()
    for x in g:
        if x in s: return False
        s.add(x)
    return True

However, for small inputs, or when duplicates (and therefore early exits) are not the common case, I would expect len(x) != len(set(x)) to be the fastest method.
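
That expectation is easy to check on your own data. The following is a rough sketch using timeit from the standard library; the function names, input sizes, and repeat count are arbitrary choices for illustration, and the timings will vary by machine and Python version, so no results are shown.

```python
import timeit


def unique_early_exit(g):
    """Loop-based check that stops at the first repeated value."""
    s = set()
    for x in g:
        if x in s:
            return False
        s.add(x)
    return True


def unique_by_len(x):
    """Build the full set, then compare lengths."""
    return len(x) == len(set(x))


no_dupes = list(range(10_000))                # worst case for the early exit
early_dupe = [0, 0] + list(range(1, 10_000))  # duplicate in the first two items

for label, data in (("no dupes", no_dupes), ("early dupe", early_dupe)):
    for fn in (unique_early_exit, unique_by_len):
        seconds = timeit.timeit(lambda: fn(data), number=50)
        print(f"{label:12s} {fn.__name__:18s} {seconds:.4f}s")
```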


Answer 3

For speed:

import numpy as np
x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
np.unique(x).size == len(x)

Answer 4

How about adding all the entries to a set and checking its length?

len(set(x)) == len(x)

Answer 5

As an alternative to a set, you can use a dict:

len(dict.fromkeys(x)) == len(x)

Answer 6

Another approach entirely, using sorted and groupby:

from itertools import groupby
is_unique = lambda seq: all(sum(1 for _ in x[1])==1 for x in groupby(sorted(seq)))

It requires a sort, but exits on the first repeated value.
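
The same sort-based idea can be written without groupby by comparing neighbours, as in the sketch below. The function name is_unique_sorted is hypothetical. This variant assumes the elements can be ordered against each other, but it does not need them to be hashable.

```python
def is_unique_sorted(seq):
    """Sort a copy of seq, then look for two equal neighbours."""
    ordered = sorted(seq)
    # After sorting, equal values sit next to each other,
    # so checking adjacent pairs is enough. all() stops at the first match.
    return all(a != b for a, b in zip(ordered, ordered[1:]))


print(is_unique_sorted([3, 1, 2]))            # True
print(is_unique_sorted([3, 1, 3]))            # False
print(is_unique_sorted([[1, 2], [0], [1, 2]]))  # False
```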


Answer 7

Here is a recursive O(N^2) version for fun:

def is_unique(lst):
    if len(lst) > 1:
        # Check the tail first, then that the head does not reappear in it.
        return is_unique(lst[1:]) and (lst[0] not in lst[1:])
    return True

Answer 8

Here is a recursive early-exit function:

def distinct(L):
    # Lists with fewer than two elements cannot contain a duplicate.
    if len(L) < 2:
        return True
    H = L[0]
    T = L[1:]
    if H in T:
        return False
    return distinct(T)

It’s fast enough for me, avoids converting the list to another (possibly slower) structure, and keeps a functional style.


Answer 9

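How about this:

from collections import Counter

def is_unique(lst):
    if not lst:
        return True
    else:
        # most_common(1) returns [(value, count)] for the most frequent value.
        return Counter(lst).most_common(1)[0][1] == 1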
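
Counter is most useful when you also want to know which values repeat, which is roughly what the "do something" in the question suggests. A minimal sketch, with the helper name duplicated_items chosen for this example:

```python
from collections import Counter


def duplicated_items(lst):
    """Return the values that occur more than once, in first-seen order."""
    # Counter keeps keys in insertion order on Python 3.7+.
    return [value for value, count in Counter(lst).items() if count > 1]


x = [1, 1, 1, 2, 3, 4, 5, 6, 2]
print(duplicated_items(x))  # [1, 2]
```

An empty result means all elements are unique.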

Answer 10

You can use Yan’s syntax (len(x) > len(set(x))), but instead of set(x), define a function:

def f5(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        # in old Python versions:
        # if seen.has_key(marker)
        # but in new ones:
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result

Then check len(x) > len(f5(x)). This is fast and also preserves order.

The code is taken from: http://www.peterbe.com/plog/uniqifiers-benchmark
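
The idfun parameter is what sets this apart from a plain set(x): it lets you decide what counts as "the same" element. The sketch below restates f5 in condensed form so that it runs on its own (I swapped the dict for a set, which is my change, not the original author's) and uses str.lower as an example idfun for a case-insensitive check.

```python
def f5(seq, idfun=None):
    """Order-preserving de-duplication, condensed from the function above."""
    if idfun is None:
        idfun = lambda item: item
    seen = set()  # assumption: markers are hashable
    result = []
    for item in seq:
        marker = idfun(item)
        if marker not in seen:
            seen.add(marker)
            result.append(item)
    return result


names = ["Alice", "bob", "ALICE", "Carol"]
print(len(names) > len(f5(names)))             # False: all strings differ exactly
print(len(names) > len(f5(names, str.lower)))  # True: "Alice" and "ALICE" collide
print(f5(names, str.lower))                    # ['Alice', 'bob', 'Carol']
```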


Answer 11

Using a similar approach on a Pandas DataFrame to test whether a column contains only unique values:

if tempDF['var1'].size == tempDF['var1'].unique().size:
    print("Unique")
else:
    print("Not unique")

For me, this is instantaneous on an int column in a DataFrame containing over a million rows.


Answer 12

All the answers above are good, but I prefer the all_unique example from 30 seconds of python.

Use set() on the given list to remove duplicates, then compare its length with the length of the list.

def all_unique(lst):
    return len(lst) == len(set(lst))

It returns True if all the values in a flat list are unique, False otherwise.

x = [1,2,3,4,5,6]
y = [1,2,2,3,4,5]
all_unique(x) # True
all_unique(y) # False

Answer 13

For beginners:

def AllDifferent(s):
    for i in range(len(s)):
        for i2 in range(len(s)):
            if i != i2:
                if s[i] == s[i2]:
                    return False
    return True
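
The nested loop above compares every pair twice (once as i, i2 and once as i2, i). A slightly tighter sketch, still quadratic and still free of sets, starts the inner loop just after i. The snake_case name all_different is my own variation on the answer's function.

```python
def all_different(s):
    """Pairwise comparison; works for unhashable elements too."""
    for i in range(len(s)):
        # Start at i + 1: pairs with a smaller second index were already compared.
        for j in range(i + 1, len(s)):
            if s[i] == s[j]:
                return False
    return True


print(all_different([1, 2, 3, 4]))  # True
print(all_different([1, 2, 2, 3]))  # False
```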