Question: Converting a list to a set changes element order

Recently I noticed that when I convert a list to a set, the order of the elements changes and appears to be sorted by character.

Consider this example:

x=[1,2,20,6,210]
print x 
# [1, 2, 20, 6, 210] # the order is same as initial order

set(x)
# set([1, 2, 20, 210, 6]) # in the set(x) output order is sorted

My questions are:

  1. Why is this happening?
  2. How can I do set operations (especially set difference) without losing the initial order?

Answer 0

  1. A set is an unordered data structure, so it does not preserve the insertion order.

  2. This depends on your requirements. If you have a normal list and want to remove some set of elements while preserving the order of the list, you can do this with a list comprehension:

    >>> a = [1, 2, 20, 6, 210]
    >>> b = set([6, 20, 1])
    >>> [x for x in a if x not in b]
    [2, 210]
    

    If you need a data structure that supports both fast membership tests and preservation of insertion order, you can use the keys of a Python dictionary, which is guaranteed to preserve insertion order starting from Python 3.7:

    >>> a = dict.fromkeys([1, 2, 20, 6, 210])
    >>> b = dict.fromkeys([6, 20, 1])
    >>> dict.fromkeys(x for x in a if x not in b)
    {2: None, 210: None}
    

    b doesn't really need to be ordered here; you could use a set as well. Note that a.keys() - b.keys() returns the set difference as a set, so it won't preserve the insertion order.

    In older versions of Python, you can use collections.OrderedDict instead:

    >>> import collections
    >>> a = collections.OrderedDict.fromkeys([1, 2, 20, 6, 210])
    >>> b = collections.OrderedDict.fromkeys([6, 20, 1])
    >>> collections.OrderedDict.fromkeys(x for x in a if x not in b)
    OrderedDict([(2, None), (210, None)])
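The approach above can be wrapped in a small helper. This is only a sketch for Python 3.7+; the name ordered_difference is hypothetical and not part of any library. It also shows that the keys-view difference comes back as a plain set.

```python
def ordered_difference(items, excluded):
    """Return the unique items not in `excluded`, in first-seen order."""
    excluded = set(excluded)  # a set gives fast membership tests
    # dict.fromkeys drops duplicates while keeping insertion order (Python 3.7+)
    return list(dict.fromkeys(x for x in items if x not in excluded))


a = [1, 2, 20, 6, 210, 2]
b = [6, 20, 1]
result = ordered_difference(a, b)
print(result)  # [2, 210]

# By contrast, the keys-view difference is a plain set with no order guarantee
as_set = dict.fromkeys(a).keys() - dict.fromkeys(b).keys()
print(type(as_set))
```

The helper converts its second argument to a set, so it accepts any iterable of hashable elements.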
    

Answer 1

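Sets do not preserve insertion order in any Python version; it is dict that keeps insertion order (an implementation detail of CPython 3.6, guaranteed from Python 3.7). A solution that works in both Python 2 and 3 is to sort the set by each element's first position in the original list: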

>>> x = [1, 2, 20, 6, 210]
>>> sorted(set(x), key=x.index)
[1, 2, 20, 6, 210]
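One caveat: x.index scans the list from the start for every key, so the sort above does quadratic work on long lists. A possible variant, sketched here with a hypothetical helper name, records each element's first index once and sorts on that lookup instead.

```python
def dedupe_by_first_index(x):
    # Record the first index of each element in a single pass
    first_index = {}
    for i, item in enumerate(x):
        first_index.setdefault(item, i)
    # Sort the unique elements by where they first appeared
    return sorted(set(x), key=first_index.__getitem__)


x = [1, 2, 20, 6, 210, 2, 1]
deduped = dedupe_by_first_index(x)
print(deduped)  # [1, 2, 20, 6, 210]
```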

Answer 2

Answering your first question, a set is a data structure optimized for set operations. Like a mathematical set, it does not enforce or maintain any particular order of the elements. The abstract concept of a set does not enforce order, so the implementation is not required to. When you create a set from a list, Python has the liberty to change the order of the elements for the needs of the internal implementation it uses for a set, which is able to perform set operations efficiently.
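A short illustration of that point, assuming CPython: equality between sets ignores order, and the iteration order is whatever the internal hash table produces, so it should not be relied on.

```python
x = [1, 2, 20, 6, 210]
s = set(x)

# Equality ignores order entirely
same_elements = s == set(reversed(x))
print(same_elements)  # True

# Iteration order follows the internal hash table, not insertion order
print(list(s))

# String hashes are randomized per process by default,
# so this order may differ from one run to the next
words = set(["pear", "apple", "fig"])
print(list(words))
```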


Answer 3

Remove duplicates and preserve order with the function below:

def unique(sequence):
    seen = set()
    return [x for x in sequence if not (x in seen or seen.add(x))]
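The one-liner relies on seen.add(x) returning None, so the "or" expression is falsy the first time an element appears and truthy afterwards. For readers who find that too terse, here is a sketch of the same logic as a plain loop (unique_explicit is a hypothetical name), next to the dict.fromkeys shortcut available on Python 3.7+.

```python
def unique_explicit(sequence):
    """Same behavior as the one-line version, written as a plain loop."""
    seen = set()
    result = []
    for x in sequence:
        if x not in seen:      # first time we meet x
            seen.add(x)        # remember it
            result.append(x)   # keep it in first-seen order
    return result


data = [3, 1, 3, 2, 1]
print(unique_explicit(data))      # [3, 1, 2]
print(list(dict.fromkeys(data)))  # [3, 1, 2] on Python 3.7+
```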



Answer 4

In mathematics, there are sets and ordered sets (osets).

  • set: an unordered container of unique elements (Implemented)
  • oset: an ordered container of unique elements (NotImplemented)

In Python, only sets are directly implemented. We can emulate osets with regular dict keys (3.7+).

Given

a = [1, 2, 20, 6, 210, 2, 1]
b = {2, 5, 6}

Code

oset = dict.fromkeys(a).keys()
# dict_keys([1, 2, 20, 6, 210])

Demo

Duplicates are removed; insertion order is preserved.

list(oset)
# [1, 2, 20, 6, 210]

Set-like operations on dict keys.

oset - b
# {1, 20, 210}

oset | b
# {1, 2, 5, 6, 20, 210}

oset & b
# {2, 6}

oset ^ b
# {1, 5, 20, 210}
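Note that these operators return plain sets, so the results themselves carry no order guarantee. If ordered results are needed, one option is a few small helpers like the following. This is a sketch for Python 3.7+, and the function names are hypothetical.

```python
def ordered_union(first, second):
    # Keys keep first-seen order; later duplicates are ignored
    return list(dict.fromkeys([*first, *second]))


def ordered_intersection(first, second):
    keep = set(second)
    return [x for x in dict.fromkeys(first) if x in keep]


def ordered_symmetric_difference(first, second):
    in_first, in_second = set(first), set(second)
    # Keep the elements that appear in exactly one of the two inputs
    return [x for x in dict.fromkeys([*first, *second])
            if (x in in_first) != (x in in_second)]


a = [1, 2, 20, 6, 210, 2, 1]
b = [6, 5, 2]

print(ordered_union(a, b))                 # [1, 2, 20, 6, 210, 5]
print(ordered_intersection(a, b))          # [2, 6]
print(ordered_symmetric_difference(a, b))  # [1, 20, 210, 5]
```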

Details

Note: an unordered structure does not preclude ordered elements. Rather, the structure does not guarantee that any order is maintained. Example:

assert {1, 2, 3} == {2, 3, 1}                    # sets (order is ignored)

assert [1, 2, 3] != [2, 3, 1]                    # lists (order is guaranteed)

One may be pleased to discover that a list and multiset (mset) are two more fascinating, mathematical data structures:

  • list: an ordered container of elements that permits replicates (Implemented)
  • mset: an unordered container of elements that permits replicates (NotImplemented)*

Summary

Container | Ordered | Unique | Implemented
----------|---------|--------|------------
set       |    n    |    y   |     y
oset      |    y    |    y   |     n
list      |    y    |    n   |     y
mset      |    n    |    n   |     n*  

*A multiset can be indirectly emulated with collections.Counter(), a dict-like mapping of multiplicities (counts).
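As a brief sketch of that emulation, collections.Counter supports multiset-style arithmetic on counts:

```python
from collections import Counter

m1 = Counter([1, 1, 2, 3])
m2 = Counter([1, 2, 2])

total = m1 + m2    # multiset sum: counts are added
diff = m1 - m2     # difference: counts are subtracted, non-positive results dropped
common = m1 & m2   # intersection: minimum of each count
either = m1 | m2   # union: maximum of each count

print(total, diff, common, either)
print(sorted(diff.elements()))  # [1, 3]
```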


Answer 5

As noted in other answers, sets are data structures (and mathematical concepts) that do not preserve element order.

However, by combining sets and dictionaries you can achieve whatever you want. Try these snippets:

# Save each element's position in a dict (the last position wins for duplicates):
x_dict = {x: i for i, x in enumerate(my_list)}
x_set = set(my_list)
# Perform the desired set operations, producing new_set:
...
# Retrieve an ordered list from the resulting set:
new_list = sorted(new_set, key=x_dict.__getitem__)
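To make the idea concrete, here is a self-contained sketch of the same technique applied to a set difference. The input values and the name to_remove are assumptions chosen for illustration, and the approach only works when every element of the result comes from my_list.

```python
my_list = [1, 2, 20, 6, 210]
to_remove = {6, 20, 1}  # hypothetical second operand

# Map each element to its position in the original list
position = {x: i for i, x in enumerate(my_list)}

# Do the set operation, then restore the original order by sorting on position
new_set = set(my_list) - to_remove
new_list = sorted(new_set, key=position.__getitem__)
print(new_list)  # [2, 210]
```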

Answer 6

Building on Sven's answer, I found that using collections.OrderedDict like this accomplishes what you want and also lets me add more items to the dict:

import collections

x=[1,2,20,6,210]
z=collections.OrderedDict.fromkeys(x)
z
OrderedDict([(1, None), (2, None), (20, None), (6, None), (210, None)])

If you want to add items but still treat it like a set, you can just do:

z['nextitem']=None

You can then call z.keys() on the dict to get the elements in insertion order (a list in Python 2, a set-like view in Python 3):

z.keys()
[1, 2, 20, 6, 210]
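A slightly fuller sketch of treating an OrderedDict as an ordered set on Python 3: adding, discarding, membership tests, and explicit reordering. The values are illustrative only.

```python
from collections import OrderedDict

z = OrderedDict.fromkeys([1, 2, 20, 6, 210])

z["nextitem"] = None   # add an element
z.pop(20, None)        # discard an element (no error if it is missing)
z[1] = None            # re-adding an existing key keeps its original position
has_two = 2 in z       # membership test
print(has_two)         # True
after_edits = list(z)
print(after_edits)     # [1, 2, 6, 210, 'nextitem']

z.move_to_end(1)       # OrderedDict-specific: move a key to the end
after_move = list(z)
print(after_move)      # [2, 6, 210, 'nextitem', 1]
```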

Answer 7

An implementation of the highest-scored idea above that converts the result back to a list:

from collections import OrderedDict

def SetOfListInOrder(incominglist):
    # Insert items as keys so duplicates collapse while order is kept
    outtemp = OrderedDict()
    for item in incominglist:
        outtemp[item] = None
    return list(outtemp)

Tested (briefly) on Python 3.6 and Python 2.7.


Answer 8

If your two initial lists contain only a small number of elements and you want their set difference, then instead of using collections.OrderedDict, which complicates the implementation and makes it less readable, you can use:

# initial lists on which you want to do set difference
>>> nums = [1,2,2,3,3,4,4,5]
>>> evens = [2,4,4,6]
>>> evens_set = set(evens)
>>> result = []
>>> for n in nums:
...   if n not in evens_set and n not in result:
...     result.append(n)
... 
>>> result
[1, 3, 5]

Its time complexity is not great (each "n not in result" check scans the list), but it is neat and easy to read.
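If the lists grow, the repeated list scan can be avoided by tracking already-emitted values in a second set. This is a sketch of that variation, not the answer's original code; the names excluded and seen are introduced here for illustration.

```python
nums = [1, 2, 2, 3, 3, 4, 4, 5]
evens = [2, 4, 4, 6]

excluded = set(evens)  # values to drop
seen = set()           # values already added to the result
result = []
for n in nums:
    if n not in excluded and n not in seen:
        seen.add(n)
        result.append(n)

print(result)  # [1, 3, 5]
```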


Answer 9

It's interesting that people always use 'real-world problems' to poke fun at definitions from theoretical science.

If a set had an order, you would first need to answer the following questions. If your list has duplicate elements, what should the order be when you turn it into a set? What is the order of the union of two sets? What is the order of the intersection of two sets that hold the same elements in different orders?

Also, a set is much faster at looking up a particular key, which is what makes set operations efficient (and that is why you need a set rather than a list).

If you really care about the index, just keep the data as a list. If you still want to do set operations on the elements of several lists, the simplest way is to create, for each list, a dictionary whose keys are the same as the elements of the set and whose values are lists of all the indices at which each key appears in the original list.

def indx_dic(l):
    # Map each element to the list of indices where it occurs
    dic = {}
    for i, item in enumerate(l):
        dic.setdefault(item, []).append(i)
    return dic

a = [1,2,3,4,5,1,3,2]
set_a  = set(a)
dic_a = indx_dic(a)

print(dic_a)
# {1: [0, 5], 2: [1, 7], 3: [2, 6], 4: [3], 5: [4]}
print(set_a)
# {1, 2, 3, 4, 5}
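One way such an index mapping might be used: after a set operation, the caller picks which list's order the result should follow, which answers the "what order should an intersection have?" question explicitly. A sketch, with a hypothetical first_index helper that keeps only the first occurrence of each element:

```python
def first_index(seq):
    """Map each element to the index of its first occurrence."""
    index = {}
    for i, item in enumerate(seq):
        index.setdefault(item, i)
    return index


a = [1, 2, 3, 4, 5, 1, 3, 2]
b = [5, 3, 9, 1]

common = set(a) & set(b)

# The same intersection, ordered two different ways
ordered_like_a = sorted(common, key=first_index(a).__getitem__)
ordered_like_b = sorted(common, key=first_index(b).__getitem__)
print(ordered_like_a)  # [1, 3, 5]
print(ordered_like_b)  # [5, 3, 1]
```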

Answer 10

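Here's an easy way to do it, if sorted order (rather than the original list order) is what you need:

x = [1, 2, 20, 6, 210]
print(sorted(set(x)))  # [1, 2, 6, 20, 210]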
