Tag archive: list

How do I get the number of elements in a list?

Question: How do I get the number of elements in a list?

Consider the following:

items = []
items.append("apple")
items.append("orange")
items.append("banana")

# FAKE METHOD:
items.amount()  # Should return 3

How do I get the number of elements in the list items?


Answer 0

The len() function can be used with several different types in Python – both built-in types and library types. For example:

>>> len([1,2,3])
3

Official 2.x documentation is here: len()
Official 3.x documentation is here: len()
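
As a small illustration of the point above, here is a sketch that calls len() on several built-in types and on one standard-library type. The sample values and the name sizes are made up for the example.

```python
from collections import deque

# len() works the same way on built-in containers, strings and library types.
sizes = {
    "list": len(["apple", "orange", "banana"]),
    "tuple": len((1, 2)),
    "str": len("banana"),            # number of characters
    "dict": len({"a": 1, "b": 2}),   # number of keys
    "set": len({1, 1, 2}),           # duplicates collapse before counting
    "range": len(range(10)),
    "deque": len(deque([1, 2, 3])),  # a standard-library container
}
print(sizes)
```

Any object whose type defines __len__ can be passed to len() in the same way.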


Answer 1

How to get the size of a list?

To find the size of a list, use the builtin function, len:

items = []
items.append("apple")
items.append("orange")
items.append("banana")

And now:

len(items)

returns 3.

Explanation

Everything in Python is an object, including lists. All objects have a header of some sort in the C implementation.

Lists and other similar builtin objects with a “size” in Python, in particular, have an attribute called ob_size, where the number of elements in the object is cached. So checking the number of objects in a list is very fast.

But if you’re checking whether the list size is zero or not, don’t use len. Instead, put the list in a boolean context: it is treated as False if empty, True otherwise.

From the docs

len(s)

Return the length (the number of items) of an object. The argument may be a sequence (such as a string, bytes, tuple, list, or range) or a collection (such as a dictionary, set, or frozen set).

len is implemented with __len__, from the data model docs:

object.__len__(self)

Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn’t define a __nonzero__() [in Python 2 or __bool__() in Python 3] method and whose __len__() method returns zero is considered to be false in a Boolean context.

And we can also see that __len__ is a method of lists:

items.__len__()

returns 3.

Builtin types you can get the len (length) of

And in fact we see we can get this information for all of the described types:

>>> all(hasattr(cls, '__len__') for cls in (str, bytes, tuple, list, 
                                            range, dict, set, frozenset))
True

Do not use len to test for an empty or nonempty list

To test for a specific length, of course, simply test for equality:

if len(items) == required_length:
    ...

But there’s a special case for testing for a zero length list or the inverse. In that case, do not test for equality.

Also, do not do:

if len(items): 
    ...

Instead, simply do:

if items:     # Then we have some items, not empty!
    ...

or

if not items: # Then we have an empty list!
    ...

I explain why here but in short, if items or if not items is both more readable and more performant.
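
To make the truth-value rule concrete, here is a minimal sketch. Basket and describe are hypothetical names invented for this example; Basket defines __len__ but no __bool__, so Python falls back to the length when the object is used in a boolean context.

```python
class Basket:
    """Hypothetical container that defines __len__ but not __bool__."""

    def __init__(self, *items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)


def describe(container):
    # "if container" is the idiomatic emptiness test; no len() call needed.
    return "has items" if container else "empty"


print(describe([]))
print(describe(["apple"]))
print(describe(Basket()))
print(describe(Basket("apple")))
```

Note that this idiom assumes the object follows the usual container convention; a type that defines its own __bool__ decides its truth value itself.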


Answer 2

While this may not be useful, since it would make a lot more sense as “out of the box” functionality, a fairly simple hack would be to build a class with a length property:

class slist(list):
    @property
    def length(self):
        return len(self)

You can use it like so:

>>> l = slist(range(10))
>>> l.length
10
>>> print(l)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Essentially, it’s exactly identical to a list object, with the added benefit of having an OOP-friendly length property.

As always, your mileage may vary.


Answer 3

Besides len you can also use operator.length_hint (requires Python 3.4+). For a normal list both are equivalent, but length_hint makes it possible to get the length of a list-iterator, which could be useful in certain circumstances:

>>> from operator import length_hint
>>> l = ["apple", "orange", "banana"]
>>> len(l)
3
>>> length_hint(l)
3

>>> list_iterator = iter(l)
>>> len(list_iterator)
TypeError: object of type 'list_iterator' has no len()
>>> length_hint(list_iterator)
3

But length_hint is by definition only a “hint”, so most of the time len is better.

I’ve seen several answers suggesting accessing __len__. This is all right when dealing with built-in classes like list, but it could lead to problems with custom classes, because len (and length_hint) implement some safety checks. For example, both do not allow negative lengths or lengths that exceed a certain value (the sys.maxsize value). So it’s always safer to use the len function instead of the __len__ method!
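
Here is a small sketch of that difference. BadLength is a made-up class whose __len__ deliberately returns an invalid value; calling the method directly hands the value back unchecked, while len() rejects it.

```python
class BadLength:
    """Hypothetical class whose __len__ returns an invalid (negative) value."""

    def __len__(self):
        return -1


bad = BadLength()

# Calling the special method directly performs no validation.
direct = bad.__len__()

# The built-in validates the result and refuses a negative length.
try:
    len(bad)
    checked = "no error"
except ValueError:
    checked = "ValueError"

print(direct, checked)
```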


Answer 4

To answer your question using the same example given previously:

items = []
items.append("apple")
items.append("orange")
items.append("banana")

print(items.__len__())

Answer 5

And for completeness (primarily educational), it is possible without using the len() function. I would not condone this as a good option (DO NOT PROGRAM LIKE THIS IN PYTHON), but it serves a purpose for learning algorithms.

def count(list):
    item_count = 0
    for item in list[:]:
        item_count += 1
    return item_count

count([1,2,3,4,5])

(The slice list[:] only makes a shallow copy of the list, so it is optional: iterating over list directly gives the same count.)

The lesson here for new programmers is: you can’t get the number of items in a list without counting them at some point. The question becomes: when is a good time to count them? For example, high-performance code like the connect system call for sockets (written in C), connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);, does not calculate the length of elements (giving that responsibility to the calling code). Notice that the length of the address is passed along to save the step of counting the length first. Another option: computationally, it might make sense to keep track of the number of items as you add them within the object that you pass. Mind that this takes up more space in memory. See Naftuli Kay’s answer.

Example of keeping track of the length to improve performance while taking up more space in memory. Note that I never use the len() function because the length is tracked:

class MyList(object):
    def __init__(self):
        self._data = []
        self.length = 0  # length tracker that takes up memory but makes length op O(1) time

    # the implicit iterator in a list class
    def __iter__(self):
        for elem in self._data:
            yield elem

    def add(self, elem):
        self._data.append(elem)
        self.length += 1

    def remove(self, elem):
        self._data.remove(elem)
        self.length -= 1

mylist = MyList()
mylist.add(1)
mylist.add(2)
mylist.add(3)
print(mylist.length) # 3
mylist.remove(3)
print(mylist.length) # 2
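
If you do track the length yourself, one possible refinement is to expose the counter through __len__ so that the built-in len() and truth-value tests work on the object. The sketch below is a hypothetical variant, not part of the answer above; the class name TrackedList and the private _length attribute are assumptions made for the example.

```python
class TrackedList:
    """Hypothetical variant of a length-tracking list that supports len()."""

    def __init__(self):
        self._data = []
        self._length = 0  # updated on every add/remove

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        # len(obj) ends up here, so callers use the normal built-in.
        return self._length

    def add(self, elem):
        self._data.append(elem)
        self._length += 1

    def remove(self, elem):
        self._data.remove(elem)  # raises ValueError if elem is absent
        self._length -= 1        # only reached when the removal succeeded


tracked = TrackedList()
for value in (1, 2, 3):
    tracked.add(value)
tracked.remove(3)
print(len(tracked))
```

Keep in mind that a plain Python list already stores its size, so this only makes sense for custom containers.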

Answer 6

In terms of how len() actually works, this is its C implementation:

static PyObject *
builtin_len(PyObject *module, PyObject *obj)
/*[clinic end generated code: output=fa7a270d314dfb6c input=bc55598da9e9c9b5]*/
{
    Py_ssize_t res;

    res = PyObject_Size(obj);
    if (res < 0) {
        assert(PyErr_Occurred());
        return NULL;
    }
    return PyLong_FromSsize_t(res);
}

Py_ssize_t is the signed integer type that CPython uses for sizes and indexes; its maximum value is the largest length an object can have. PyObject_Size() is a function that returns the size of an object. If it cannot determine the size of an object, it returns -1. In that case, this code block will be executed:

if (res < 0) {
        assert(PyErr_Occurred());
        return NULL;
    }

And an exception is raised as a result. Otherwise, this code block will be executed:

return PyLong_FromSsize_t(res);

res, which is a C integer, is converted into a Python int and returned. Since Python 3, all Python integers are the arbitrary-precision type that Python 2 called long, which is why the C function is named PyLong_FromSsize_t.
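
For readers who do not read C, the same control flow can be approximated in Python. This is only a rough sketch of the behaviour described above, not the real implementation; the function name py_len and the exact error messages are assumptions made for the example.

```python
import operator
import sys


def py_len(obj):
    """Rough Python approximation of what the built-in len() does."""
    try:
        # Special methods are looked up on the type, not on the instance.
        method = type(obj).__len__
    except AttributeError:
        raise TypeError(
            f"object of type {type(obj).__name__!r} has no len()"
        ) from None
    res = operator.index(method(obj))  # the result must be integer-like
    if res < 0:
        raise ValueError("__len__() should return >= 0")
    if res > sys.maxsize:
        # sys.maxsize is the largest value a Py_ssize_t can hold.
        raise OverflowError("length does not fit in Py_ssize_t")
    return res


print(py_len(["apple", "orange", "banana"]))
```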


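How do I sort a list of dictionaries by a value of the dictionary?

Question: How do I sort a list of dictionaries by a value of the dictionary?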

I have a list of dictionaries and want the items sorted by the value of a specific key.

Consider the list below:

[{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

When sorted by name, it should become:

[{'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}]

Answer 0

It may look cleaner using a key instead of a cmp:

newlist = sorted(list_to_be_sorted, key=lambda k: k['name']) 

or as J.F.Sebastian and others suggested,

from operator import itemgetter
newlist = sorted(list_to_be_sorted, key=itemgetter('name')) 

For completeness (as pointed out in the comments by fitzgeraldsteele), add reverse=True to sort in descending order:

newlist = sorted(list_to_be_sorted, key=itemgetter('name'), reverse=True)
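
Putting the pieces together, here is a self-contained sketch using the sample data from the question. The variable names people, by_name and by_age_desc are made up for the example.

```python
from operator import itemgetter

people = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

# Ascending by name; sorted() returns a new list and leaves people untouched.
by_name = sorted(people, key=itemgetter('name'))

# Descending by age.
by_age_desc = sorted(people, key=itemgetter('age'), reverse=True)

print([p['name'] for p in by_name])
print([p['name'] for p in by_age_desc])
```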

Answer 1

import operator

To sort the list of dictionaries by key='name':

list_of_dicts.sort(key=operator.itemgetter('name'))

To sort the list of dictionaries by key='age':

list_of_dicts.sort(key=operator.itemgetter('age'))

Answer 2

my_list = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

my_list.sort(lambda x,y : cmp(x['name'], y['name']))

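my_list will now be what you want. (This form works only in Python 2; Python 3 removed both cmp and the comparison-function argument of sort.)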

(3 years later) Edited to add:

The new key argument is more efficient and neater. A better answer now looks like:

my_list = sorted(my_list, key=lambda k: k['name'])

…the lambda is, IMO, easier to understand than operator.itemgetter, but YMMV.


Answer 3

If you want to sort the list by multiple keys you can do the following:

my_list = [{'name':'Homer', 'age':39}, {'name':'Milhouse', 'age':10}, {'name':'Bart', 'age':10} ]
sortedlist = sorted(my_list , key=lambda elem: "%02d %s" % (elem['age'], elem['name']))

It is rather hackish, since it relies on converting the values into a single string representation for comparison. Numbers must be zero-padded to a fixed width for the string order to match the numeric order, and the trick breaks for negative numbers, whose string forms do not sort numerically.
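
A commonly used alternative that avoids string formatting altogether is to return a tuple from the key function, since tuples compare element by element. The sketch below is not the answer's method; the 'Ghost' entry is made up purely to exercise a negative number.

```python
from operator import itemgetter

people = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Milhouse', 'age': 10},
    {'name': 'Bart', 'age': 10},
    {'name': 'Ghost', 'age': -5},  # made-up entry with a negative age
]

# Sort by age first, then by name for equal ages.
by_age_then_name = sorted(people, key=lambda p: (p['age'], p['name']))

# itemgetter with several keys returns the same kind of tuple.
same_result = sorted(people, key=itemgetter('age', 'name'))

names = [p['name'] for p in by_age_then_name]
print(names)
```

Because the numbers are compared as numbers, no padding is needed and negative values sort where you would expect.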


Answer 4

import operator
a_list_of_dicts.sort(key=operator.itemgetter('name'))

'key' is used to sort by an arbitrary value, and 'itemgetter' sets that value to each item's 'name' entry.


Answer 5

a = [{'name':'Homer', 'age':39}, ...]

# This changes the list a
a.sort(key=lambda k : k['name'])

# This returns a new list (a is not modified)
sorted(a, key=lambda k : k['name']) 
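
To see the difference between the two calls, here is a short sketch with the placeholder list filled in. The second dictionary and the helper variable names are assumptions added for the example.

```python
a = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

# sorted() builds a new list; a keeps its original order.
new_list = sorted(a, key=lambda k: k['name'])
first_before = a[0]['name']

# list.sort() reorders a itself and returns None.
result = a.sort(key=lambda k: k['name'])
first_after = a[0]['name']

print(first_before, first_after, result)
```

A common mistake is writing a = a.sort(...), which replaces the list with None. Also note that sorted() copies only the list, not the dictionaries inside it.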

Answer 6

I guess you meant:

[{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

This would be sorted like this (Python 2 only; the cmp argument was removed in Python 3):

sorted(l,cmp=lambda x,y: cmp(x['name'],y['name']))
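
The cmp built-in and the cmp argument exist only in Python 2. If you are on Python 3 and really want a comparison function, functools.cmp_to_key can wrap it. The following is a sketch under that assumption; compare_names and people are names made up for the example.

```python
from functools import cmp_to_key


def compare_names(x, y):
    # Stand-in for Python 2's cmp(x['name'], y['name']):
    # negative if x sorts first, zero if equal, positive otherwise.
    return (x['name'] > y['name']) - (x['name'] < y['name'])


people = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
sorted_people = sorted(people, key=cmp_to_key(compare_names))
print(sorted_people)
```

For a simple case like this, a plain key function is usually shorter; cmp_to_key is mainly useful when porting existing comparison functions.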

Answer 7

You could use a custom comparison function, or you could pass in a function that calculates a custom sort key. That’s usually more efficient as the key is only calculated once per item, while the comparison function would be called many more times.

You could do it this way:

def mykey(adict): return adict['name']
x = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age':10}]
sorted(x, key=mykey)

But the standard library contains a generic routine for getting items of arbitrary objects: itemgetter. So try this instead:

from operator import itemgetter
x = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age':10}]
sorted(x, key=itemgetter('name'))

Answer 8

Using the Schwartzian transform from Perl,

py = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]

do

sort_on = "name"
decorated = [(dict_[sort_on], dict_) for dict_ in py]
decorated.sort()
result = [dict_ for (key, dict_) in decorated]

gives

>>> result
[{'age': 10, 'name': 'Bart'}, {'age': 39, 'name': 'Homer'}]

More on Perl Schwartzian transform

In computer science, the Schwartzian transform is a Perl programming idiom used to improve the efficiency of sorting a list of items. This idiom is appropriate for comparison-based sorting when the ordering is actually based on the ordering of a certain property (the key) of the elements, where computing that property is an intensive operation that should be performed a minimal number of times. The Schwartzian Transform is notable in that it does not use named temporary arrays.
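
One caveat with the decorate-sort-undecorate pattern in Python 3: when two keys are equal, tuple comparison moves on to the dictionaries themselves, and dictionaries cannot be ordered, so the sort raises TypeError. A possible workaround, sketched here as an assumption rather than as part of the original answer, is to put the original index between the key and the dict. The second 'Bart' entry is made up to create a tie.

```python
py = [
    {'name': 'Homer', 'age': 39},
    {'name': 'Bart', 'age': 10},
    {'name': 'Bart', 'age': 12},  # made-up duplicate name to force a tie
]

sort_on = "name"

# Decorate with (key, original index, dict); the index breaks ties,
# so the dicts themselves are never compared.
decorated = [(d[sort_on], i, d) for i, d in enumerate(py)]
decorated.sort()

# Undecorate: keep only the dicts.
result = [d for (_key, _index, d) in decorated]
print(result)
```

Using the index as a tiebreaker also keeps equal-key items in their original order, which matches what sorted(py, key=...) would give.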


Answer 9

You have to implement your own comparison function that will compare the dictionaries by values of name keys. See Sorting Mini-HOW TO from PythonInfo Wiki


Answer 10

Sometimes we need to use lower() to get a case-insensitive sort, for example:

lists = [{'name':'Homer', 'age':39},
  {'name':'Bart', 'age':10},
  {'name':'abby', 'age':9}]

lists = sorted(lists, key=lambda k: k['name'])
print(lists)
# [{'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}, {'name':'abby', 'age':9}]

lists = sorted(lists, key=lambda k: k['name'].lower())
print(lists)
# [ {'name':'abby', 'age':9}, {'name':'Bart', 'age':10}, {'name':'Homer', 'age':39}]

Answer 11

Here is an alternative general solution: it sorts the dicts by their keys and values. Its advantage is that there is no need to specify keys, and it still works if some keys are missing from some of the dictionaries.

def sort_key_func(item):
    """ helper function used to sort list of dicts

    :param item: dict
    :return: sorted list of tuples (k, v)
    """
    pairs = []
    for k, v in item.items():
        pairs.append((k, v))
    return sorted(pairs)
sorted(A, key=sort_key_func)
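
If the goal is mainly to tolerate missing keys while still sorting on one specific key, a simpler sketch is to use dict.get with a default. This is a different technique from the function above; the sample records and the choice of an empty string as the default (so that items without a name sort first) are assumptions made for the example.

```python
records = [
    {'name': 'Homer', 'age': 39},
    {'age': 10},          # no 'name' key at all
    {'name': 'Bart'},     # no 'age' key
]

# Missing names fall back to '' and therefore sort before every real name.
by_name = sorted(records, key=lambda d: d.get('name', ''))
print(by_name)
```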

Answer 12

Using the pandas package is another method, though its runtime at large scale is much slower than the more traditional methods proposed by others:

import pandas as pd

listOfDicts = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}]
df = pd.DataFrame(listOfDicts)
df = df.sort_values('name')
sorted_listOfDicts = df.T.to_dict().values()

Here are some benchmark values for a tiny list and a large (100k+) list of dicts:

setup_large = "listOfDicts = [];\
[listOfDicts.extend(({'name':'Homer', 'age':39}, {'name':'Bart', 'age':10})) for _ in range(50000)];\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listOfDicts);"

setup_small = "listOfDicts = [];\
listOfDicts.extend(({'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}));\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listOfDicts);"

method1 = "newlist = sorted(listOfDicts, key=lambda k: k['name'])"
method2 = "newlist = sorted(listOfDicts, key=itemgetter('name')) "
method3 = "df = df.sort_values('name');\
sorted_listOfDicts = df.T.to_dict().values()"

import timeit
t = timeit.Timer(method1, setup_small)
print('Small Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_small)
print('Small Method LC2: ' + str(t.timeit(100)))
t = timeit.Timer(method3, setup_small)
print('Small Method Pandas: ' + str(t.timeit(100)))

t = timeit.Timer(method1, setup_large)
print('Large Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_large)
print('Large Method LC2: ' + str(t.timeit(100)))
t = timeit.Timer(method3, setup_large)
print('Large Method Pandas: ' + str(t.timeit(1)))

#Small Method LC: 0.000163078308105
#Small Method LC2: 0.000134944915771
#Small Method Pandas: 0.0712950229645
#Large Method LC: 0.0321750640869
#Large Method LC2: 0.0206089019775
#Large Method Pandas: 5.81405615807

Answer 13

If you do not need the original list of dictionaries, you could modify it in-place with the sort() method using a custom key function.

Key function:

def get_name(d):
    """ Return the value of a key in a dictionary. """

    return d["name"]

The list to be sorted:

data_one = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]

Sorting it in-place:

data_one.sort(key=get_name)

If you need the original list, call the sorted() function passing it the list and the key function, then assign the returned sorted list to a new variable:

data_two = [{'name': 'Homer', 'age': 39}, {'name': 'Bart', 'age': 10}]
new_data = sorted(data_two, key=get_name)

Printing data_one and new_data:

>>> print(data_one)
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]
>>> print(new_data)
[{'name': 'Bart', 'age': 10}, {'name': 'Homer', 'age': 39}]

Answer 14

Let’s say I have a dictionary D with the elements below. To sort it, just use the key argument of sorted to pass a custom function, as below:

D = {'eggs': 3, 'ham': 1, 'spam': 2}
def get_count(tuple):
    return tuple[1]

sorted(D.items(), key = get_count, reverse=True)
# or
sorted(D.items(), key = lambda x: x[1], reverse=True)  # avoiding get_count function call

Check this out.
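
Note that sorted(D.items(), ...) returns a list of (key, value) tuples, not a dictionary. If you want a dictionary back, one option (assuming Python 3.7 or later, where dicts keep insertion order) is to pass the sorted pairs to dict(). The names pairs and ordered are made up for this sketch.

```python
D = {'eggs': 3, 'ham': 1, 'spam': 2}

# List of (key, value) tuples, largest value first.
pairs = sorted(D.items(), key=lambda kv: kv[1], reverse=True)

# Rebuild a dict; insertion order follows the sorted pairs.
ordered = dict(pairs)

print(pairs)
print(ordered)
```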


Answer 15

I have been a big fan of sorting with a lambda; however, it is not the best option if you consider execution time.

First option

sorted_list = sorted(list_to_sort, key= lambda x: x['name'])
# returns a new sorted list

Second option

import operator
list_to_sort.sort(key=operator.itemgetter('name'))
# sorts the list in place, does not return a new list

Fast comparison of exec times

# First option
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" "sorted_l = sorted(list_to_sort, key=lambda e: e['name'])"

1000000 loops, best of 3: 0.736 usec per loop

# Second option 
python3.6 -m timeit -s "list_to_sort = [{'name':'Homer', 'age':39}, {'name':'Bart', 'age':10}, {'name':'Faaa', 'age':57}, {'name':'Errr', 'age':20}]" -s "sorted_l=[]" -s "import operator" "list_to_sort.sort(key=operator.itemgetter('name'))"

1000000 loops, best of 3: 0.438 usec per loop


Answer 16

If performance is a concern, I would use operator.itemgetter instead of lambda as built-in functions perform faster than hand-crafted functions. The itemgetter function seems to perform approximately 20% faster than lambda based on my testing.

From https://wiki.python.org/moin/PythonSpeed:

Likewise, the builtin functions run faster than hand-built equivalents. For example, map(operator.add, v1, v2) is faster than map(lambda x,y: x+y, v1, v2).

Here is a comparison of sorting speed using lambda vs itemgetter.

import operator
import random
import string

# create a list of 100 dicts with random 8-letter names and random ages from 0 to 100.
l = [{'name': ''.join(random.choices(string.ascii_lowercase, k=8)), 'age': random.randint(0, 100)} for i in range(100)]

# Test the performance with a lambda function sorting on name
%timeit sorted(l, key=lambda x: x['name'])
13 µs ± 388 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Test the performance with itemgetter sorting on name
%timeit sorted(l, key=operator.itemgetter('name'))
10.7 µs ± 38.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

# Check that each technique produces same sort order
sorted(l, key=lambda x: x['name']) == sorted(l, key=operator.itemgetter('name'))
True

Both techniques sort the list in the same order (verified by execution of the final statement in the code block) but one is a little faster.
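
The %timeit lines above only work inside IPython or Jupyter. If you want to repeat the comparison in a plain script, a rough equivalent with the standard timeit module could look like the sketch below. The function names, the fixed random seed and the repeat count of 1000 are assumptions for the example, and the timings you get will depend on your machine and Python version.

```python
import operator
import random
import string
import timeit

random.seed(0)  # fixed seed so every run sorts the same data

# 100 dicts with random 8-letter names and random ages from 0 to 100.
rows = [
    {'name': ''.join(random.choices(string.ascii_lowercase, k=8)),
     'age': random.randint(0, 100)}
    for _ in range(100)
]


def with_lambda():
    return sorted(rows, key=lambda x: x['name'])


def with_itemgetter():
    return sorted(rows, key=operator.itemgetter('name'))


# Total seconds for 1000 calls of each variant.
lambda_time = timeit.timeit(with_lambda, number=1000)
itemgetter_time = timeit.timeit(with_itemgetter, number=1000)

print(f"lambda:     {lambda_time:.4f} s")
print(f"itemgetter: {itemgetter_time:.4f} s")
```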


Answer 17

You may use the following code

sorted_dct = sorted(dct_name.items(), key = lambda x : x[1])

Why is it string.join(list) instead of list.join(string)?

Question: Why is it string.join(list) instead of list.join(string)?

This has always confused me. It seems like this would be nicer:

my_list = ["Hello", "world"]
print(my_list.join("-"))
# Produce: "Hello-world"

Than this:

my_list = ["Hello", "world"]
print("-".join(my_list))
# Produce: "Hello-world"

Is there a specific reason it is like this?


Answer 0

It’s because any iterable can be joined (e.g., list, tuple, dict, set), but the result and the “joiner” must be strings.

For example:

>>> '_'.join(['welcome', 'to', 'stack', 'overflow'])
'welcome_to_stack_overflow'
>>> '_'.join(('welcome', 'to', 'stack', 'overflow'))
'welcome_to_stack_overflow'

Joining anything other than strings raises the following error:

TypeError: sequence item 0: expected str instance, int found
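
Here is a short sketch of both points: join() accepts any iterable of strings, and non-string items have to be converted first. The variable names are made up for the example.

```python
numbers = [1, 2, 3]

# Joining non-strings fails with TypeError ...
try:
    '-'.join(numbers)
    outcome = 'joined'
except TypeError:
    outcome = 'TypeError'

# ... so convert each item to str first.
joined_numbers = '-'.join(map(str, numbers))

# Any iterable of strings works: a dict yields its keys,
# and a generator expression is consumed lazily.
joined_keys = '-'.join({'a': 1, 'b': 2})
joined_gen = '-'.join(word.upper() for word in ('hello', 'world'))

print(outcome, joined_numbers, joined_keys, joined_gen)
```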


Answer 1

This was discussed in the String methods… finally thread in the Python-Dev archive, and was accepted by Guido. This thread began in Jun 1999, and str.join was included in Python 1.6, which was released in Sep 2000 (and supported Unicode). Python 2.0 (which supported str methods including join) was released in Oct 2000.

  • There were four options proposed in this thread:
    • str.join(seq)
    • seq.join(str)
    • seq.reduce(str)
    • join as a built-in function
  • Guido wanted to support not only lists, tuples, but all sequences/iterables.
  • seq.reduce(str) is difficult for new-comers.
  • seq.join(str) introduces unexpected dependency from sequences to str/unicode.
  • join() as a built-in function would support only specific data types, so using the built-in namespace is not good. If join() supported many data types, creating an optimized implementation would be difficult; if it were implemented using the __add__ method, it would be O(n²).
  • The separator string (sep) should not be omitted. Explicit is better than implicit.

There are no other reasons offered in this thread.
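
The complexity point in the list above can be sketched as follows. join_with_add is a hypothetical helper, not something from the thread: it builds the result by repeated concatenation, which may copy the accumulated string at every step, whereas str.join can size the result once.

```python
def join_with_add(sep, items):
    """Join strings by repeated +, as a generic __add__-based join would."""
    result = ""
    for index, item in enumerate(items):
        if index:
            result = result + sep
        result = result + item  # each + builds a new string
    return result

words = ["welcome", "to", "stack", "overflow"]
print(join_with_add("_", words))
print("_".join(words))
```

Both calls give the same string; the difference is only in how much copying happens on long inputs.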

Here are some additional thoughts (my own, and my friend’s):

  • Unicode support was coming, but it was not final. At that time UTF-8 was the most likely candidate to replace UCS2/4. To calculate the total buffer length of UTF-8 strings, it needs to know the character encoding rules.
  • At that time, Python had already decided on a common sequence interface rule where a user could create a sequence-like (iterable) class. But Python didn’t support extending built-in types until 2.2, so at that time it was difficult to provide a basic iterable class (which is mentioned in another comment).

Guido’s decision is recorded in a historical mail, deciding on str.join(seq):

Funny, but it does seem right! Barry, go for it…
–Guido van Rossum


Answer 2

Because the join() method is in the string class, instead of the list class?

I agree it looks funny.

See http://www.faqs.org/docs/diveintopython/odbchelper_join.html:

Historical note. When I first learned Python, I expected join to be a method of a list, which would take the delimiter as an argument. Lots of people feel the same way, and there’s a story behind the join method. Prior to Python 1.6, strings didn’t have all these useful methods. There was a separate string module which contained all the string functions; each function took a string as its first argument. The functions were deemed important enough to put onto the strings themselves, which made sense for functions like lower, upper, and split. But many hard-core Python programmers objected to the new join method, arguing that it should be a method of the list instead, or that it shouldn’t move at all but simply stay a part of the old string module (which still has lots of useful stuff in it). I use the new join method exclusively, but you will see code written either way, and if it really bothers you, you can use the old string.join function instead.

— Mark Pilgrim, Dive into Python


Answer 3

I agree that it’s counterintuitive at first, but there’s a good reason. Join can’t be a method of a list because:

  • it must work for different iterables too (tuples, generators, etc.)
  • it must have different behavior between different types of strings.

There are actually two join methods (Python 3.0):

>>> b"".join
<built-in method join of bytes object at 0x00A46800>
>>> "".join
<built-in method join of str object at 0x00A28D40>

If join was a method of a list, then it would have to inspect its arguments to decide which one of them to call. And you can’t join byte and str together, so the way they have it now makes sense.
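
A short sketch of the two join methods side by side, using made-up sample data (text_parts and byte_parts are illustrative names). The separator decides which join is used, and mixing the two types is rejected:

```python
text_parts = ["a", "b"]
byte_parts = [b"a", b"b"]

print("-".join(text_parts))    # str separator, str items
print(b"-".join(byte_parts))   # bytes separator, bytes items

try:
    "-".join(byte_parts)       # mixing str and bytes is rejected
except TypeError as exc:
    mixed_error = exc
```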


Answer 4

Why is it string.join(list) instead of list.join(string)?

This is because join is a “string” method! It creates a string from any iterable. If we stuck the method on lists, what about when we have iterables that aren’t lists?

What if you have a tuple of strings? If this were a list method, you would have to convert every such iterable of strings to a list before you could join the elements into a single string! For example:

some_strings = ('foo', 'bar', 'baz')

Let’s roll our own list join method:

class OurList(list): 
    def join(self, s):
        return s.join(self)

And to use it, note that we have to first create a list from each iterable to join the strings in that iterable, wasting both memory and processing power:

>>> l = OurList(some_strings) # step 1, create our list
>>> l.join(', ') # step 2, use our list join method!
'foo, bar, baz'

So we see we have to add an extra step to use our list method, instead of just using the builtin string method:

>>> ' | '.join(some_strings) # a single step!
'foo | bar | baz'

Performance Caveat for Generators

The algorithm Python uses to create the final string with str.join actually has to pass over the iterable twice, so if you provide it a generator expression, it has to materialize it into a list first before it can create the final string.

Thus, while passing around generators is usually better than list comprehensions, str.join is an exception:

>>> import timeit
>>> min(timeit.repeat(lambda: ''.join(str(i) for i in range(10) if i)))
3.839168446022086
>>> min(timeit.repeat(lambda: ''.join([str(i) for i in range(10) if i])))
3.339879313018173

Nevertheless, the str.join operation is still semantically a “string” operation, so it still makes sense to have it on the str object rather than on miscellaneous iterables.


Answer 5

Think of it as the natural orthogonal operation to split.

I understand why it is applicable to anything iterable and so can’t easily be implemented just on list.

For readability, I’d like to see it in the language but I don’t think that is actually feasible – if iterability were an interface then it could be added to the interface but it is just a convention and so there’s no central way to add it to the set of things which are iterable.
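
The relationship to split can be sketched like this, with a made-up sample string: splitting on a separator and joining with the same separator gets you back where you started.

```python
line = "2024-01-15"
parts = line.split("-")      # split on the separator
rebuilt = "-".join(parts)    # join with the same separator
print(parts, rebuilt)
```

Both operations are methods of the string that acts as the separator or source, which is part of why they read as a pair.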


Answer 6

Primarily because the result of a someString.join() is a string.

The sequence (list or tuple or whatever) doesn’t appear in the result, just a string. Because the result is a string, it makes sense as a method of a string.


Answer 7

The "-" in "-".join(my_list) declares that you are converting to a string by joining the elements of a list. It’s result-oriented (just for easy memorization and understanding).

I made an exhaustive cheat sheet of string methods for your reference.

string_methods_44 = {
    'convert': ['join','split', 'rsplit','splitlines', 'partition', 'rpartition'],
    'edit': ['replace', 'lstrip', 'rstrip', 'strip'],
    'search': ['endswith', 'startswith', 'count', 'index', 'find','rindex', 'rfind',],
    'condition': ['isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isnumeric','isidentifier',
                  'islower','istitle', 'isupper','isprintable', 'isspace', ],
    'text': ['lower', 'upper', 'capitalize', 'title', 'swapcase',
             'center', 'ljust', 'rjust', 'zfill', 'expandtabs','casefold'],
    'encode': ['translate', 'maketrans', 'encode'],
    'format': ['format', 'format_map']}

Answer 8

Neither is nice.

string.join(xs, delimit) means that the string module is aware of the existence of a list, which it has no business knowing about, since the string module only works with strings.

list.join(delimit) is a bit nicer because we’re so used to strings being a fundamental type (and, linguistically speaking, they are). However, this means that join needs to be dispatched dynamically, because in the arbitrary context of a.split("\n") the Python compiler might not know what a is and will need to look it up (analogous to a vtable lookup), which is expensive if you do it many times.

If the Python runtime compiler knows that list is a built-in type, it can skip the dynamic lookup and encode the intent into the bytecode directly; otherwise it needs to dynamically resolve “join” of “a”, which may be up several layers of inheritance per call (since between calls the meaning of join may have changed, because Python is a dynamic language).

Sadly, this is the ultimate flaw of abstraction: no matter what abstraction you choose, it will only make sense in the context of the problem you’re trying to solve, and as such you can never have a consistent abstraction that doesn’t become inconsistent with underlying ideologies as you start gluing them together without wrapping them in a view that is consistent with your ideology. Knowing this, Python’s approach is more flexible since it’s cheaper; it’s up to you to pay more to make it look “nicer”, either by making your own wrapper or your own preprocessor.


Answer 9

The variables my_list and "-" are both objects. Specifically, they’re instances of the classes list and str, respectively. The join function belongs to the class str. Therefore, the syntax "-".join(my_list) is used because the object "-" is taking my_list as an input.
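
To make the "join belongs to the class str" point concrete, here is a small sketch. The variable names via_instance and via_class are illustrative; calling the method through the class shows that "-" is simply the first argument:

```python
my_list = ["Hello", "world"]

via_instance = "-".join(my_list)      # method looked up on the "-" object
via_class = str.join("-", my_list)    # same method, called through the class
print(via_instance, via_class)
```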


How to randomly select an item from a list?

Question: How to randomly select an item from a list?

Assume I have the following list:

foo = ['a', 'b', 'c', 'd', 'e']

What is the simplest way to retrieve an item at random from this list?


Answer 0

Use random.choice()

import random

foo = ['a', 'b', 'c', 'd', 'e']
print(random.choice(foo))

For cryptographically secure random choices (e.g. for generating a passphrase from a wordlist) use secrets.choice()

import secrets

foo = ['battery', 'correct', 'horse', 'staple']
print(secrets.choice(foo))

secrets is new in Python 3.6. On older versions of Python you can use the random.SystemRandom class:

import random

secure_random = random.SystemRandom()
print(secure_random.choice(foo))

Answer 1

If you want to randomly select more than one item from a list, or select an item from a set, I’d recommend using random.sample instead. Note that since Python 3.11 random.sample requires a sequence, so a set has to be converted first (for example with sorted() or list()).

import random
group_of_items = [1, 2, 3, 4]               # any sequence works; convert a set with sorted(s) or list(s).
num_to_select = 2                           # set the number to select here.
list_of_random_items = random.sample(group_of_items, num_to_select)
first_random_item = list_of_random_items[0]
second_random_item = list_of_random_items[1]

If you’re only pulling a single item from a list though, choice is less clunky, as using sample would have the syntax random.sample(some_list, 1)[0] instead of random.choice(some_list).

Unfortunately though, choice only works for a single output from sequences (such as lists or tuples). Though random.choice(tuple(some_set)) may be an option for getting a single item from a set.
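
A small sketch of that limitation, using a made-up set (some_set and set_error are illustrative names). random.choice indexes into its argument, so a set fails until it is converted to a sequence:

```python
import random

some_set = {"a", "b", "c"}

try:
    random.choice(some_set)   # sets do not support indexing
except TypeError as exc:
    set_error = exc

item = random.choice(tuple(some_set))  # convert to a sequence first
print(item)
```

The conversion costs one pass over the set, so for repeated picks it is probably worth keeping the tuple around.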

EDIT: Using Secrets

As many have pointed out, if you require cryptographically secure random samples, you should use the secrets module:

import secrets                              # imports secure module.
secure_random = secrets.SystemRandom()      # creates a secure random object.
group_of_items = [1, 2, 3, 4]               # any sequence works; convert a set with sorted(s) or list(s).
num_to_select = 2                           # set the number to select here.
list_of_random_items = secure_random.sample(group_of_items, num_to_select)
first_random_item = list_of_random_items[0]
second_random_item = list_of_random_items[1]

EDIT: Pythonic One-Liner

If you want a more pythonic one-liner for selecting multiple items, you can use unpacking.

import random
first_random_item, second_random_item = random.sample(group_of_items, 2)

Answer 2

If you also need the index, use random.randrange

from random import randrange
random_index = randrange(len(foo))
print(foo[random_index])

Answer 3

As of Python 3.6 you can use the secrets module, which is preferable to the random module for cryptography or security uses.

To print a random element from a list:

import secrets
foo = ['a', 'b', 'c', 'd', 'e']
print(secrets.choice(foo))

To print a random index:

print(secrets.randbelow(len(foo)))

For details, see PEP 506.


Answer 4

I propose a script that removes randomly picked items from a set until it is empty:

Maintain a set and remove a randomly picked element (with choice) until the set is empty.

s=set(range(1,6))
import random

while len(s)>0:
  s.remove(random.choice(list(s)))
  print(s)

Three runs give three different answers:

>>> 
set([1, 3, 4, 5])
set([3, 4, 5])
set([3, 4])
set([4])
set([])
>>> 
set([1, 2, 3, 5])
set([2, 3, 5])
set([2, 3])
set([2])
set([])

>>> 
set([1, 2, 3, 5])
set([1, 2, 3])
set([1, 2])
set([1])
set([])

Answer 5

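import random

foo = ['a', 'b', 'c', 'd', 'e']
number_of_samples = 1

In Python 2 (this also works in Python 3 and samples without replacement):

random_items = random.sample(population=foo, k=number_of_samples)

In Python 3.6+ (this samples with replacement):

random_items = random.choices(population=foo, k=number_of_samples)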
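
The two calls are not interchangeable once more than one item is requested. A minimal sketch of the difference, with illustrative variable names: random.sample never picks the same position twice, while random.choices may repeat items and so can return more items than the list holds.

```python
import random

foo = ['a', 'b', 'c', 'd', 'e']

without_replacement = random.sample(foo, k=3)   # three distinct positions
with_replacement = random.choices(foo, k=8)     # k may exceed len(foo)

print(without_replacement)
print(with_replacement)
```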

Answer 6

numpy solution: numpy.random.choice

For this question, it works the same as the accepted answer (import random; random.choice()), but I added it because the programmer may have imported numpy already (like me) & also there are some differences between the two methods that may concern your actual use case.

import numpy as np    
np.random.choice(foo) # randomly selects a single item

For reproducibility, you can do:

np.random.seed(123)
np.random.choice(foo) # first call will always return 'c'

For samples of one or more items, returned as an array, pass the size argument:

np.random.choice(foo, 5)          # sample with replacement (default)
np.random.choice(foo, 5, False)   # sample without replacement

Answer 7

How to randomly select an item from a list?

Assume I have the following list:

foo = ['a', 'b', 'c', 'd', 'e']  

What is the simplest way to retrieve an item at random from this list?

If you want close to truly random, then I suggest secrets.choice from the standard library (New in Python 3.6.):

>>> from secrets import choice         # Python 3 only
>>> choice(list('abcde'))
'c'

The above is equivalent to my former recommendation, using a SystemRandom object from the random module with the choice method – available earlier in Python 2:

>>> import random                      # Python 2 compatible
>>> sr = random.SystemRandom()
>>> foo = list('abcde')
>>> foo
['a', 'b', 'c', 'd', 'e']

And now:

>>> sr.choice(foo)
'd'
>>> sr.choice(foo)
'e'
>>> sr.choice(foo)
'a'
>>> sr.choice(foo)
'b'
>>> sr.choice(foo)
'a'
>>> sr.choice(foo)
'c'
>>> sr.choice(foo)
'c'

If you want a deterministic pseudorandom selection, use the choice function (which is actually a bound method on a Random object):

>>> random.choice
<bound method Random.choice of <random.Random object at 0x800c1034>>

It seems random, but it’s actually not, which we can see if we reseed it repeatedly:

>>> random.seed(42); random.choice(foo), random.choice(foo), random.choice(foo)
('d', 'a', 'b')
>>> random.seed(42); random.choice(foo), random.choice(foo), random.choice(foo)
('d', 'a', 'b')
>>> random.seed(42); random.choice(foo), random.choice(foo), random.choice(foo)
('d', 'a', 'b')
>>> random.seed(42); random.choice(foo), random.choice(foo), random.choice(foo)
('d', 'a', 'b')
>>> random.seed(42); random.choice(foo), random.choice(foo), random.choice(foo)
('d', 'a', 'b')

A comment:

This is not about whether random.choice is truly random or not. If you fix the seed, you will get the reproducible results — and that’s what seed is designed for. You can pass a seed to SystemRandom, too. sr = random.SystemRandom(42)

Well, yes you can pass it a “seed” argument, but you’ll see that the SystemRandom object simply ignores it:

def seed(self, *args, **kwds):
    "Stub method.  Not used for a system random number generator."
    return None
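
A small sketch contrasting the two behaviours, with illustrative variable names. Two random.Random instances given the same seed produce identical picks, while calling seed on a SystemRandom object is accepted but has no effect:

```python
import random

foo = ['a', 'b', 'c', 'd', 'e']

# Two seeded generators produce the same sequence of picks.
first = random.Random(42)
second = random.Random(42)
picks_first = [first.choice(foo) for _ in range(10)]
picks_second = [second.choice(foo) for _ in range(10)]
print(picks_first == picks_second)

# SystemRandom draws from the operating system; the seed call is a no-op.
sr = random.SystemRandom()
sr.seed(42)
print(sr.choice(foo))
```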

Answer 8

If you need the index, just use:

import random
foo = ['a', 'b', 'c', 'd', 'e']
index = int(random.random() * len(foo))  # compute the index once so both prints agree
print(index)
print(foo[index])

random.choice does essentially the same thing :)


Answer 9

This is the code with a variable that defines the random index:

import random

foo = ['a', 'b', 'c', 'd', 'e']
randomindex = random.randint(0,len(foo)-1) 
print (foo[randomindex])
## print (randomindex)

This is the code without the variable:

import random

foo = ['a', 'b', 'c', 'd', 'e']
print (foo[random.randint(0,len(foo)-1)])

And this is the code in the shortest and smartest way to do it:

import random

foo = ['a', 'b', 'c', 'd', 'e']
print(random.choice(foo))

(python 2.7)


Answer 10

The following code shows how to get the same items on every run by seeding the generator. You can also specify how many samples you want to extract.
The sample method returns a new list containing elements from the population while leaving the original population unchanged. The resulting list is in selection order so that all sub-slices will also be valid random samples.

import random
foo = ['a', 'b', 'c', 'd', 'e']
random.seed(0)  # don't call seed() if you want different results in each run
print(random.sample(foo, 3))  # 3 is the number of samples you want to retrieve

Output:['d', 'e', 'a']

Answer 11

Random item selection:

import random

my_list = [1, 2, 3, 4, 5]
num_selections = 2

new_list = random.sample(my_list, num_selections)

To preserve the order of the list, you could do:

randIndex = random.sample(range(len(my_list)), num_selections)
randIndex.sort()
new_list = [my_list[i] for i in randIndex]

Duplicate of https://stackoverflow.com/a/49682832/4383027


Answer 12

We can also do this using randint.

from random import randint
l= ['a','b','c']

def get_rand_element(l):
    if l:
        return l[randint(0,len(l)-1)]
    else:
        return None

get_rand_element(l)

Answer 13

You could just:

from random import randint

foo = ["a", "b", "c", "d", "e"]

print(foo[randint(0, len(foo) - 1)])

How can I count the occurrences of a list item?

Question: How can I count the occurrences of a list item?

Given an item, how can I count its occurrences in a list in Python?


Answer 0

If you only want one item’s count, use the count method:

>>> [1, 2, 3, 4, 1, 4, 1].count(1)
3

Don’t use this if you want to count multiple items. Calling count in a loop requires a separate pass over the list for every count call, which can be catastrophic for performance. If you want to count all items, or even just multiple items, use Counter, as explained in the other answers.
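
A rough sketch of the two approaches on the same data, with illustrative variable names. The first rescans the list once per distinct value; the second tallies everything in a single pass:

```python
from collections import Counter

data = [1, 2, 3, 4, 1, 4, 1]

# One pass per distinct value: fine for a single item, slow for many.
counts_by_loop = {value: data.count(value) for value in set(data)}

# A single pass over the list.
counts_by_counter = Counter(data)

print(counts_by_loop)
print(counts_by_counter)
```

Both give the same tallies; the difference only starts to matter when the list is long and has many distinct values.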


Answer 1

Use Counter if you are using Python 2.7 or 3.x and you want the number of occurrences for each element:

>>> from collections import Counter
>>> z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']
>>> Counter(z)
Counter({'blue': 3, 'red': 2, 'yellow': 1})

Answer 2

Counting the occurrences of one item in a list

For counting the occurrences of just one list item you can use count()

>>> l = ["a","b","b"]
>>> l.count("a")
1
>>> l.count("b")
2

Counting the occurrences of all items in a list is also known as “tallying” a list, or creating a tally counter.

Counting all items with count()

To count the occurrences of items in l one can simply use a list comprehension and the count() method

[[x,l.count(x)] for x in set(l)]

(or similarly with a dictionary dict((x,l.count(x)) for x in set(l)))

Example:

>>> l = ["a","b","b"]
>>> [[x,l.count(x)] for x in set(l)]
[['a', 1], ['b', 2]]
>>> dict((x,l.count(x)) for x in set(l))
{'a': 1, 'b': 2}

Counting all items with Counter()

Alternatively, there’s the faster Counter class from the collections library

Counter(l)

Example:

>>> l = ["a","b","b"]
>>> from collections import Counter
>>> Counter(l)
Counter({'b': 2, 'a': 1})

How much faster is Counter?

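I checked how much faster Counter is for tallying lists. I tried both methods with a few values of n, and Counter was consistently faster: in the run shown below (n = 1000) it is roughly 20 times faster. The gap is not a fixed constant, because the count() approach rescans the whole list once per distinct item.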

Here is the script I used:

from __future__ import print_function
import timeit

t1=timeit.Timer('Counter(l)', \
                'import random;import string;from collections import Counter;n=1000;l=[random.choice(string.ascii_letters) for x in range(n)]'
                )

t2=timeit.Timer('[[x,l.count(x)] for x in set(l)]',
                'import random;import string;n=1000;l=[random.choice(string.ascii_letters) for x in range(n)]'
                )

print("Counter(): ", t1.repeat(repeat=3,number=10000))
print("count():   ", t2.repeat(repeat=3,number=10000))

And the output:

Counter():  [0.46062711701961234, 0.4022796869976446, 0.3974247490405105]
count():    [7.779430688009597, 7.962715800967999, 8.420845870045014]

Answer 3

Another way to get the number of occurrences of each item, in a dictionary:

dict((i, a.count(i)) for i in a)
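
A runnable version of the one-liner, assuming a is a list of hashable items (the sample data here is made up). Iterating over set(a) instead of a gives the same dictionary while calling count once per distinct item rather than once per element:

```python
a = ["x", "y", "x", "z", "x"]

counts = dict((i, a.count(i)) for i in a)        # as in the answer
counts_dedup = {i: a.count(i) for i in set(a)}   # fewer count() calls
print(counts)
```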

Answer 4

list.count(x) returns the number of times x appears in a list

see: http://docs.python.org/tutorial/datastructures.html#more-on-lists


Answer 5

给定一个项目,我如何计算它在Python列表中的出现次数?

这是一个示例列表:

>>> l = list('aaaaabbbbcccdde')
>>> l
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'e']

list.count

list.count方法

>>> l.count('b')
4

这适用于任何列表。元组也有这种方法:

>>> t = tuple('aabbbffffff')
>>> t
('a', 'a', 'b', 'b', 'b', 'f', 'f', 'f', 'f', 'f', 'f')
>>> t.count('f')
6

collections.Counter

然后是collections.Counter。您可以将任何可迭代的对象转储到Counter中,而不仅仅是列表,并且Counter将保留元素计数的数据结构。

用法:

>>> from collections import Counter
>>> c = Counter(l)
>>> c['b']
4

计数器基于Python字典,它们的键是元素,因此键必须是可哈希的。它们基本上就像允许多余元素进入的集合。

的进一步使用 collections.Counter

您可以从计数器中添加或减去可迭代项:

>>> c.update(list('bbb'))
>>> c['b']
7
>>> c.subtract(list('bbb'))
>>> c['b']
4

您还可以使用计数器进行多组操作:

>>> c2 = Counter(list('aabbxyz'))
>>> c - c2                   # set difference
Counter({'a': 3, 'c': 3, 'b': 2, 'd': 2, 'e': 1})
>>> c + c2                   # addition of all elements
Counter({'a': 7, 'b': 6, 'c': 3, 'd': 2, 'e': 1, 'y': 1, 'x': 1, 'z': 1})
>>> c | c2                   # set union
Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1, 'y': 1, 'x': 1, 'z': 1})
>>> c & c2                   # set intersection
Counter({'a': 2, 'b': 2})

为什么不用pandas呢?

另一个答案表明:

为什么不使用pandas?

pandas是一个常用库,但不在标准库中。将其添加为依赖并非易事。

在列表对象本身以及标准库中都有针对此用例的内置解决方案。

如果您的项目尚未依赖pandas,那么仅为此功能而引入它是愚蠢的。

Given an item, how can I count its occurrences in a list in Python?

Here’s an example list:

>>> l = list('aaaaabbbbcccdde')
>>> l
['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'd', 'd', 'e']

list.count

There’s the list.count method

>>> l.count('b')
4

This works fine for any list. Tuples have this method as well:

>>> t = tuple('aabbbffffff')
>>> t
('a', 'a', 'b', 'b', 'b', 'f', 'f', 'f', 'f', 'f', 'f')
>>> t.count('f')
6

collections.Counter

And then there’s collections.Counter. You can dump any iterable into a Counter, not just a list, and the Counter will retain a data structure of the counts of the elements.

Usage:

>>> from collections import Counter
>>> c = Counter(l)
>>> c['b']
4

Counters are based on Python dictionaries. Their keys are the elements, so the elements need to be hashable. They are basically like sets that allow redundant elements (multisets).
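
A short sketch of what the dictionary basis means in practice, reusing the example list from above. The names missing, top_two and unhashable_ok are only illustrative:

```python
from collections import Counter

c = Counter(list('aaaaabbbbcccdde'))

# A key that was never counted reads as 0 instead of raising KeyError.
missing = c['z']

# most_common(n) lists the n highest counts as (element, count) pairs.
top_two = c.most_common(2)

# Unhashable elements such as lists cannot be used as keys.
try:
    Counter([[1, 2], [1, 2]])
    unhashable_ok = True
except TypeError:
    unhashable_ok = False

print(missing, top_two, unhashable_ok)  # 0 [('a', 5), ('b', 4)] False
```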

Further usage of collections.Counter

You can add or subtract with iterables from your counter:

>>> c.update(list('bbb'))
>>> c['b']
7
>>> c.subtract(list('bbb'))
>>> c['b']
4

And you can do multi-set operations with the counter as well:

>>> c2 = Counter(list('aabbxyz'))
>>> c - c2                   # set difference
Counter({'a': 3, 'c': 3, 'b': 2, 'd': 2, 'e': 1})
>>> c + c2                   # addition of all elements
Counter({'a': 7, 'b': 6, 'c': 3, 'd': 2, 'e': 1, 'y': 1, 'x': 1, 'z': 1})
>>> c | c2                   # set union
Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1, 'y': 1, 'x': 1, 'z': 1})
>>> c & c2                   # set intersection
Counter({'a': 2, 'b': 2})

Why not pandas?

Another answer suggests:

Why not use pandas?

Pandas is a common library, but it’s not in the standard library. Adding it as a requirement is non-trivial.

There are builtin solutions for this use-case in the list object itself as well as in the standard library.

If your project does not already require pandas, it would be foolish to make it a requirement just for this functionality.


回答 6

我已经将所有建议的解决方案(以及一些新的解决方案)与perfplot(我的一个小项目)进行了比较。

计数一个项目

对于足够大的阵列,事实证明

numpy.sum(numpy.array(a) == 1) 

比其他解决方案快一点。

计算所有项目

由于之前建立的

numpy.bincount(a)

是你想要的。


复制代码的代码:

from collections import Counter
from collections import defaultdict
import numpy
import operator
import pandas
import perfplot


def counter(a):
    return Counter(a)


def count(a):
    return dict((i, a.count(i)) for i in set(a))


def bincount(a):
    return numpy.bincount(a)


def pandas_value_counts(a):
    return pandas.Series(a).value_counts()


def occur_dict(a):
    d = {}
    for i in a:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d


def count_unsorted_list_items(items):
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)


def operator_countof(a):
    return dict((i, operator.countOf(a, i)) for i in set(a))


perfplot.show(
    setup=lambda n: list(numpy.random.randint(0, 100, n)),
    n_range=[2**k for k in range(20)],
    kernels=[
        counter, count, bincount, pandas_value_counts, occur_dict,
        count_unsorted_list_items, operator_countof
        ],
    equality_check=None,
    logx=True,
    logy=True,
    )

I’ve compared all suggested solutions (and a few new ones) with perfplot (a small project of mine).

Counting one item

For large enough arrays, it turns out that

numpy.sum(numpy.array(a) == 1) 

is slightly faster than the other solutions.

Counting all items

As established before,

numpy.bincount(a)

is what you want.


Code to reproduce the plots:

from collections import Counter
from collections import defaultdict
import numpy
import operator
import pandas
import perfplot


def counter(a):
    return Counter(a)


def count(a):
    return dict((i, a.count(i)) for i in set(a))


def bincount(a):
    return numpy.bincount(a)


def pandas_value_counts(a):
    return pandas.Series(a).value_counts()


def occur_dict(a):
    d = {}
    for i in a:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d


def count_unsorted_list_items(items):
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)


def operator_countof(a):
    return dict((i, operator.countOf(a, i)) for i in set(a))


perfplot.show(
    setup=lambda n: list(numpy.random.randint(0, 100, n)),
    n_range=[2**k for k in range(20)],
    kernels=[
        counter, count, bincount, pandas_value_counts, occur_dict,
        count_unsorted_list_items, operator_countof
        ],
    equality_check=None,
    logx=True,
    logy=True,
    )

回答 7

如果您想一次计算所有值,则可以使用numpy数组快速完成bincount,如下所示

import numpy as np
a = np.array([1, 2, 3, 4, 1, 4, 1])
np.bincount(a)

这使

array([0, 3, 1, 1, 2])

If you want to count all values at once, you can do it very quickly using numpy arrays and bincount, as follows:

import numpy as np
a = np.array([1, 2, 3, 4, 1, 4, 1])
np.bincount(a)

which gives

array([0, 3, 1, 1, 2])
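
To read this result: position i of the output holds the number of times the integer i occurs, which is why it starts with a 0 (the value 0 never appears in the input). The following pure-Python sketch mimics that bookkeeping for illustration only; bincount_sketch is a hypothetical helper, not part of numpy, and it makes no attempt to match numpy's speed:

```python
def bincount_sketch(values):
    """Pure-Python sketch of counting non-negative integers by index."""
    counts = [0] * (max(values) + 1) if values else []
    for v in values:
        counts[v] += 1  # index v holds the number of times v occurs
    return counts


a = [1, 2, 3, 4, 1, 4, 1]
print(bincount_sketch(a))  # [0, 3, 1, 1, 2]
```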

回答 8

如果可以使用pandas,则value_counts可以在那里进行救援。

>>> import pandas as pd
>>> a = [1, 2, 3, 4, 1, 4, 1]
>>> pd.Series(a).value_counts()
1    3
4    2
3    1
2    1
dtype: int64

它还会根据频率自动对结果进行排序。

如果您希望结果在列表列表中,请执行以下操作

>>> pd.Series(a).value_counts().reset_index().values.tolist()
[[1, 3], [4, 2], [3, 1], [2, 1]]

If you can use pandas, then value_counts comes to the rescue.

>>> import pandas as pd
>>> a = [1, 2, 3, 4, 1, 4, 1]
>>> pd.Series(a).value_counts()
1    3
4    2
3    1
2    1
dtype: int64

It automatically sorts the result based on frequency as well.

If you want the result as a list of lists, do as below:

>>> pd.Series(a).value_counts().reset_index().values.tolist()
[[1, 3], [4, 2], [3, 1], [2, 1]]

回答 9

为什么不使用pandas呢?

import pandas as pd

l = ['a', 'b', 'c', 'd', 'a', 'd', 'a']

# converting the list to a Series and counting the values
my_count = pd.Series(l).value_counts()
my_count

输出:

a    3
d    2
b    1
c    1
dtype: int64

如果要查找特定元素的数量,请说a,请尝试:

my_count['a']

输出:

3

Why not use pandas?

import pandas as pd

l = ['a', 'b', 'c', 'd', 'a', 'd', 'a']

# converting the list to a Series and counting the values
my_count = pd.Series(l).value_counts()
my_count

Output:

a    3
d    2
b    1
c    1
dtype: int64

If you are looking for a count of a particular element, say a, try:

my_count['a']

Output:

3

回答 10

我今天遇到了这个问题,在考虑检查SO之前推出了自己的解决方案。这个:

dict((i,a.count(i)) for i in a)

对于大型列表,真的非常慢。我的解决方案

def occurDict(items):
    d = {}
    for i in items:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d

实际上比Counter解决方案要快一点,至少对于Python 2.7而言。

I had this problem today and rolled my own solution before I thought to check SO. This:

dict((i,a.count(i)) for i in a)

is really, really slow for large lists. My solution

def occurDict(items):
    d = {}
    for i in items:
        if i in d:
            d[i] = d[i]+1
        else:
            d[i] = 1
    return d

is actually a bit faster than the Counter solution, at least for Python 2.7.


回答 11

# Python >= 2.6 (defaultdict) && < 2.7 (Counter, OrderedDict)
from collections import defaultdict
def count_unsorted_list_items(items):
    """
    :param items: iterable of hashable items to count
    :type items: iterable

    :returns: dict of counts like Py2.7 Counter
    :rtype: dict
    """
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)


# Python >= 2.2 (generators)
def count_sorted_list_items(items):
    """
    :param items: sorted iterable of items to count
    :type items: sorted iterable

    :returns: generator of (item, count) tuples
    :rtype: generator
    """
    if not items:
        return
    elif len(items) == 1:
        yield (items[0], 1)
        return
    prev_item = items[0]
    count = 1
    for item in items[1:]:
        if prev_item == item:
            count += 1
        else:
            yield (prev_item, count)
            count = 1
            prev_item = item
    yield (item, count)
    return


import unittest
class TestListCounters(unittest.TestCase):
    def test_count_unsorted_list_items(self):
        D = (
            ([], []),
            ([2], [(2,1)]),
            ([2,2], [(2,2)]),
            ([2,2,2,2,3,3,5,5], [(2,4), (3,2), (5,2)]),
            )
        for inp, exp_outp in D:
            counts = count_unsorted_list_items(inp) 
            print(inp, exp_outp, counts)
            self.assertEqual(counts, dict( exp_outp ))

        inp, exp_outp = UNSORTED_WIN = ([2,2,4,2], [(2,3), (4,1)])
        self.assertEqual(dict( exp_outp ), count_unsorted_list_items(inp) )


    def test_count_sorted_list_items(self):
        D = (
            ([], []),
            ([2], [(2,1)]),
            ([2,2], [(2,2)]),
            ([2,2,2,2,3,3,5,5], [(2,4), (3,2), (5,2)]),
            )
        for inp, exp_outp in D:
            counts = list( count_sorted_list_items(inp) )
            print(inp, exp_outp, counts)
            self.assertEqual(counts, exp_outp)

        inp, exp_outp = UNSORTED_FAIL = ([2,2,4,2], [(2,3), (4,1)])
        self.assertEqual(exp_outp, list( count_sorted_list_items(inp) ))
        # ... [(2,2), (4,1), (2,1)]

回答 12

以下是三种解决方案:

最快的是使用for循环并将其存储在Dict中。

import time
from collections import Counter


def countElement(a):
    g = {}
    for i in a:
        if i in g: 
            g[i] +=1
        else: 
            g[i] =1
    return g


z = [1,1,1,1,2,2,2,2,3,3,4,5,5,234,23,3,12,3,123,12,31,23,13,2,4,23,42,42,34,234,23,42,34,23,423,42,34,23,423,4,234,23,42,34,23,4,23,423,4,23,4]


#Solution 1 - Faster
st = time.monotonic()
for i in range(1000000):
    b = countElement(z)
et = time.monotonic()
print(b)
print('Simple for loop and storing it in dict - Duration: {}'.format(et - st))

#Solution 2 - Fast
st = time.monotonic()
for i in range(1000000):
    a = Counter(z)
et = time.monotonic()
print (a)
print('Using collections.Counter - Duration: {}'.format(et - st))

#Solution 3 - Slow
st = time.monotonic()
for i in range(1000000):
    g = dict([(i, z.count(i)) for i in set(z)])
et = time.monotonic()
print(g)
print('Using list comprehension - Duration: {}'.format(et - st))

结果

#Solution 1 - Faster
{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 234: 3, 23: 10, 12: 2, 123: 1, 31: 1, 13: 1, 42: 5, 34: 4, 423: 3}
Simple for loop and storing it in dict - Duration: 12.032000000000153
#Solution 2 - Fast
Counter({23: 10, 4: 6, 2: 5, 42: 5, 1: 4, 3: 4, 34: 4, 234: 3, 423: 3, 5: 2, 12: 2, 123: 1, 31: 1, 13: 1})
Using collections.Counter - Duration: 15.889999999999418
#Solution 3 - Slow
{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 34: 4, 423: 3, 234: 3, 42: 5, 12: 2, 13: 1, 23: 10, 123: 1, 31: 1}
Using list comprehension - Duration: 33.0

Below are the three solutions:

Fastest is using a for loop and storing it in a Dict.

import time
from collections import Counter


def countElement(a):
    g = {}
    for i in a:
        if i in g: 
            g[i] +=1
        else: 
            g[i] =1
    return g


z = [1,1,1,1,2,2,2,2,3,3,4,5,5,234,23,3,12,3,123,12,31,23,13,2,4,23,42,42,34,234,23,42,34,23,423,42,34,23,423,4,234,23,42,34,23,4,23,423,4,23,4]


#Solution 1 - Faster
st = time.monotonic()
for i in range(1000000):
    b = countElement(z)
et = time.monotonic()
print(b)
print('Simple for loop and storing it in dict - Duration: {}'.format(et - st))

#Solution 2 - Fast
st = time.monotonic()
for i in range(1000000):
    a = Counter(z)
et = time.monotonic()
print (a)
print('Using collections.Counter - Duration: {}'.format(et - st))

#Solution 3 - Slow
st = time.monotonic()
for i in range(1000000):
    g = dict([(i, z.count(i)) for i in set(z)])
et = time.monotonic()
print(g)
print('Using list comprehension - Duration: {}'.format(et - st))

Result

#Solution 1 - Faster
{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 234: 3, 23: 10, 12: 2, 123: 1, 31: 1, 13: 1, 42: 5, 34: 4, 423: 3}
Simple for loop and storing it in dict - Duration: 12.032000000000153
#Solution 2 - Fast
Counter({23: 10, 4: 6, 2: 5, 42: 5, 1: 4, 3: 4, 34: 4, 234: 3, 423: 3, 5: 2, 12: 2, 123: 1, 31: 1, 13: 1})
Using collections.Counter - Duration: 15.889999999999418
#Solution 3 - Slow
{1: 4, 2: 5, 3: 4, 4: 6, 5: 2, 34: 4, 423: 3, 234: 3, 42: 5, 12: 2, 13: 1, 23: 10, 123: 1, 31: 1}
Using list comprehension - Duration: 33.0

回答 13

所有元素的计数 itertools.groupby()

获取列表中所有元素的计数的另一种可能性是使用itertools.groupby()

具有“重复”计数

from itertools import groupby

L = ['a', 'a', 'a', 't', 'q', 'a', 'd', 'a', 'd', 'c']  # Input list

counts = [(i, len(list(c))) for i,c in groupby(L)]      # Create value-count pairs as list of tuples 
print(counts)

退货

[('a', 3), ('t', 1), ('q', 1), ('a', 1), ('d', 1), ('a', 1), ('d', 1), ('c', 1)]

请注意,它是如何将前三个组合在一起a作为第一组的,而其他组合a则位于列表的下方。发生这种情况是因为未对输入列表L进行排序。如果组实际上应该分开,那么有时这可能是一个好处。

具有独特的计数

如果需要唯一的组计数,只需对输入列表进行排序:

counts = [(i, len(list(c))) for i,c in groupby(sorted(L))]
print(counts)

退货

[('a', 5), ('c', 1), ('d', 2), ('q', 1), ('t', 1)]

注意:为了创建唯一计数,与groupby解决方案相比,许多其他答案都提供了更轻松,更易读的代码。但是这里显示它与重复计数示例相似。

Count of all elements with itertools.groupby()

Another possibility for getting the count of all elements in the list is itertools.groupby().

With “duplicate” counts

from itertools import groupby

L = ['a', 'a', 'a', 't', 'q', 'a', 'd', 'a', 'd', 'c']  # Input list

counts = [(i, len(list(c))) for i,c in groupby(L)]      # Create value-count pairs as list of tuples 
print(counts)

Returns

[('a', 3), ('t', 1), ('q', 1), ('a', 1), ('d', 1), ('a', 1), ('d', 1), ('c', 1)]

Notice how it combined the first three a's into the first group, while other groups of a appear further down the list. This happens because the input list L was not sorted. This can sometimes be a benefit if the groups should in fact be kept separate.

With unique counts

If unique group counts are desired, just sort the input list:

counts = [(i, len(list(c))) for i,c in groupby(sorted(L))]
print(counts)

Returns

[('a', 5), ('c', 1), ('d', 2), ('q', 1), ('t', 1)]

Note: For creating unique counts, many of the other answers provide easier and more readable code compared to the groupby solution. But it is shown here to draw a parallel to the duplicate count example.
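
As a small variation on the run-counting example, the length of each run can be taken without building an intermediate list. This is only a sketch of an alternative spelling; the name runs is illustrative:

```python
from itertools import groupby

L = ['a', 'a', 'a', 't', 'q', 'a', 'd', 'a', 'd', 'c']

# sum(1 for _ in group) counts a run without materialising it as a list.
runs = [(key, sum(1 for _ in group)) for key, group in groupby(L)]
print(runs)
```

The result is the same list of (value, run length) pairs as in the "duplicate" counts example.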


回答 14

建议使用numpy的bincount,但是它仅适用于具有非负整数的一维数组。此外,结果数组可能会造成混淆(它包含原始列表的最小值到最大值的整数的出现,并将丢失的整数设置为0)。

使用numpy更好的方法是使用属性设置为True 的唯一函数return_counts。它返回一个元组,该元组具有唯一值的数组和每个唯一值的出现的数组。

import numpy as np

a = [1, 1, 0, 2, 1, 0, 3, 3]
a_uniq, counts = np.unique(a, return_counts=True)  # (array([0, 1, 2, 3]), array([2, 3, 1, 2]))

然后我们可以将它们配对为

dict(zip(a_uniq, counts))  # {0: 2, 1: 3, 2: 1, 3: 2}

它还可以与其他数据类型和“ 2d列表”一起使用,例如

>>> a = [['a', 'b', 'b', 'b'], ['a', 'c', 'c', 'a']]
>>> dict(zip(*np.unique(a, return_counts=True)))
{'a': 3, 'b': 3, 'c': 2}

It was suggested to use numpy’s bincount; however, it works only for 1d arrays of non-negative integers. Also, the resulting array might be confusing (it contains the number of occurrences of each integer from 0 to the maximum of the original list, with 0 for integers that do not occur).

A better way to do it with numpy is to use the unique function with the argument return_counts set to True. It returns a tuple with an array of the unique values and an array of the occurrences of each unique value.

import numpy as np

a = [1, 1, 0, 2, 1, 0, 3, 3]
a_uniq, counts = np.unique(a, return_counts=True)  # (array([0, 1, 2, 3]), array([2, 3, 1, 2]))

and then we can pair them as

dict(zip(a_uniq, counts))  # {0: 2, 1: 3, 2: 1, 3: 2}

It also works with other data types and “2d lists”, e.g.

>>> a = [['a', 'b', 'b', 'b'], ['a', 'c', 'c', 'a']]
>>> dict(zip(*np.unique(a, return_counts=True)))
{'a': 3, 'b': 3, 'c': 2}

回答 15

计算具有共同类型的各种元素的数量:

li = ['A0','c5','A8','A2','A5','c2','A3','A9']

print(sum(1 for el in li if el[0]=='A' and el[1] in '01234'))

3 ,而不是6

To count the elements that match a common pattern:

li = ['A0','c5','A8','A2','A5','c2','A3','A9']

print(sum(1 for el in li if el[0]=='A' and el[1] in '01234'))

gives

3 (A0, A2 and A3), not 6 (the number of elements starting with A)


回答 16

虽然这是一个非常古老的问题,但是由于我没有找到一支,所以我做了一支。

# original numbers in list
l = [1, 2, 2, 3, 3, 3, 4]

# empty dictionary to hold pair of number and its count
d = {}

# loop through all elements and store count
[ d.update( {i:d.get(i, 0)+1} ) for i in l ]

print(d)

Although this is a very old question, I didn’t find a one-liner, so I made one.

# original numbers in list
l = [1, 2, 2, 3, 3, 3, 4]

# empty dictionary to hold pair of number and its count
d = {}

# loop through all elements and store count
[ d.update( {i:d.get(i, 0)+1} ) for i in l ]

print(d)
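
Note that the comprehension above is used only for its side effect on d and builds a throwaway list of None values along the way. If a one-liner is the goal, a possible alternative (a sketch using collections.Counter, which is not what this answer used) is:

```python
from collections import Counter

l = [1, 2, 2, 3, 3, 3, 4]

# Counter does the tallying; dict() converts it to a plain dictionary.
d = dict(Counter(l))
print(d)  # {1: 1, 2: 2, 3: 3, 4: 1}
```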

回答 17

您也可以使用countOf内置模块的方法operator

>>> import operator
>>> operator.countOf([1, 2, 3, 4, 1, 4, 1], 1)
3

You can also use the countOf function from the standard-library operator module.

>>> import operator
>>> operator.countOf([1, 2, 3, 4, 1, 4, 1], 1)
3

回答 18

可能不是最有效的,需要额外的通行证才能删除重复项。

功能实现:

import numpy as np

arr = np.array(['a','a','b','b','b','c'])
print(set(map(lambda x  : (x , list(arr).count(x)) , arr)))

返回:

{('c', 1), ('b', 3), ('a', 2)}

或返回为dict

print(dict(map(lambda x  : (x , list(arr).count(x)) , arr)))

返回:

{'b': 3, 'c': 1, 'a': 2}

May not be the most efficient, requires an extra pass to remove duplicates.

Functional implementation :

import numpy as np

arr = np.array(['a','a','b','b','b','c'])
print(set(map(lambda x  : (x , list(arr).count(x)) , arr)))

returns :

{('c', 1), ('b', 3), ('a', 2)}

or return as dict :

print(dict(map(lambda x  : (x , list(arr).count(x)) , arr)))

returns :

{'b': 3, 'c': 1, 'a': 2}

回答 19

sum([1 for elem in <yourlist> if elem==<your_value>])

这将返回your_value的出现次数

sum([1 for elem in <yourlist> if elem==<your_value>])

This will return the number of occurrences of your_value
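
With the placeholders filled in, the same idea looks like this. The names my_list and my_value are stand-ins chosen for this sketch, and a generator expression is used so that no temporary list is built:

```python
my_list = [1, 2, 3, 4, 1, 4, 1]
my_value = 1

# The generator yields a 1 for every matching element; sum adds them up.
occurrences = sum(1 for elem in my_list if elem == my_value)
print(occurrences)  # 3
```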


回答 20

我将使用filter()Lukasz的示例:

>>> lst = [1, 2, 3, 4, 1, 4, 1]
>>> len(list(filter(lambda x: x==1, lst)))
3

I would use filter(). Taking Lukasz’s example:

>>> lst = [1, 2, 3, 4, 1, 4, 1]
>>> len(list(filter(lambda x: x==1, lst)))
3

回答 21

如果您希望特定元素出现多次:

>>> from collections import Counter
>>> z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']
>>> single_occurrences = Counter(z)
>>> print(single_occurrences.get("blue"))
3
>>> print(single_occurrences.values())
dict_values([3, 2, 1])

If you want the number of occurrences of a particular element:

>>> from collections import Counter
>>> z = ['blue', 'red', 'blue', 'yellow', 'blue', 'red']
>>> single_occurrences = Counter(z)
>>> print(single_occurrences.get("blue"))
3
>>> print(single_occurrences.values())
dict_values([3, 2, 1])

回答 22

def countfrequncyinarray(arr1):
    r=len(arr1)
    return {i:arr1.count(i) for i in range(1,r+1)}
arr1=[4,4,4,4]
a=countfrequncyinarray(arr1)
print(a)

回答 23

l2 = [1, "feto", ["feto", 1, ["feto"]], ['feto', [1, 2, 3, ['feto']]]]

def Test(l):
    # Count matches at this level, then recurse into any nested lists
    count = l.count("feto")
    for i in l:
        if type(i) is list:
            count += Test(i)
    return count

print(Test(l2))  # 5

这将递归计数或搜索列表中的项目,即使它在列表列表中

l2 = [1, "feto", ["feto", 1, ["feto"]], ['feto', [1, 2, 3, ['feto']]]]

def Test(l):
    # Count matches at this level, then recurse into any nested lists
    count = l.count("feto")
    for i in l:
        if type(i) is list:
            count += Test(i)
    return count

print(Test(l2))  # 5

This recursively counts occurrences of the item, even when it is nested inside lists of lists.


如何通过索引从列表中删除元素

问题:如何通过索引从列表中删除元素

如何在Python中按索引从列表中删除元素?

我找到了list.remove方法,但是说我要删除最后一个元素,该怎么做?似乎默认的remove搜索列表,但是我不希望执行任何搜索。

How do I remove an element from a list by index in Python?

I found the list.remove method, but say I want to remove the last element, how do I do this? It seems like the default remove searches the list, but I don’t want any search to be performed.


回答 0

使用del并指定要删除的元素的索引:

>>> a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> del a[-1]
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8]

还支持切片:

>>> del a[2:4]
>>> a
[0, 1, 4, 5, 6, 7, 8, 9]

是教程中的部分。

Use del and specify the index of the element you want to delete:

>>> a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> del a[-1]
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8]

Also supports slices:

>>> del a[2:4]
>>> a
[0, 1, 4, 5, 6, 7, 8, 9]

Here is the section from the tutorial.
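
Two further behaviours of del that may be worth knowing, shown as a brief sketch (the variable raised exists only for this illustration): it accepts extended slices with a step, and it raises IndexError for an index that does not exist.

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# An extended slice with a step deletes every second element.
del a[::2]
print(a)  # [1, 3, 5, 7, 9]

# Deleting an index that does not exist raises IndexError.
try:
    del a[10]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```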


回答 1

您可能想要pop

a = ['a', 'b', 'c', 'd']
a.pop(1)

# now a is ['a', 'c', 'd']

默认情况下,pop不带任何参数将删除最后一项:

a = ['a', 'b', 'c', 'd']
a.pop()

# now a is ['a', 'b', 'c']

You probably want pop:

a = ['a', 'b', 'c', 'd']
a.pop(1)

# now a is ['a', 'c', 'd']

By default, pop without any arguments removes the last item:

a = ['a', 'b', 'c', 'd']
a.pop()

# now a is ['a', 'b', 'c']
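
Unlike del, pop also hands back the item it removed. A minimal sketch (the names removed, last and raised are illustrative):

```python
a = ['a', 'b', 'c', 'd']

removed = a.pop(1)  # removes and returns the item at index 1
last = a.pop()      # no argument: removes and returns the last item
print(removed, last, a)  # b d ['a', 'c']

# Popping from an empty list raises IndexError.
try:
    [].pop()
    raised = False
except IndexError:
    raised = True
```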

回答 2

像其他提到的一样,pop和del是删除给定索引项有效方法。只是为了完成(因为可以通过Python中的许多方法来完成同一件事):

使用切片(这并不能代替从原始列表中删除项目):

(这也是使用Python列表时效率最低的方法,但是在使用不支持pop但却定义了a的用户定义对象时,这可能会很有用(但无效,我重申__getitem__)。

>>> a = [1, 2, 3, 4, 5, 6]
>>> index = 3 # Only positive index

>>> a = a[:index] + a[index+1 :]
# a is now [1, 2, 3, 5, 6]

注意:请注意,此方法不会像pop和那样修改列表del。相反,它制作了两个列表副本(一个从开始到索引,但没有索引(a[:index]),一个在索引后,直到最后一个元素(a[index+1:])),并通过添加两个副本来创建新的列表对象。然后将其重新分配给列表变量(a)。因此,旧列表对象被取消引用并因此被垃圾回收(前提是原始列表对象未被a以外的任何变量引用)。

这使该方法非常低效,并且还可能产生不良的副作用(尤其是当其他变量指向未修改的原始列表对象时)。

感谢@MarkDickinson指出这一点…

堆栈溢出答案说明了切片的概念。

另请注意,这仅适用于正指数。

与对象一起使用时,__getitem__必须已定义该__add__方法,更重要的是,必须已定义该方法以从两个操作数返回包含项的对象。

本质上,这适用于类定义如下的任何对象:

class foo(object):
    def __init__(self, items):
        self.items = items

    def __getitem__(self, index):
        return foo(self.items[index])

    def __add__(self, right):
        return foo( self.items + right.items )

这与list定义__getitem____add__方法一起使用。

三种方式的效率比较:

假设以下是预定义的:

a = list(range(10))
index = 3

del object[index]方法:

迄今为止最有效的方法。所有定义__delitem__方法的对象都可以使用。

拆卸如下:

码:

def del_method():
    global a
    global index
    del a[index]

拆卸:

 10    0 LOAD_GLOBAL     0 (a)
       3 LOAD_GLOBAL     1 (index)
       6 DELETE_SUBSCR   # This is the line that deletes the item
       7 LOAD_CONST      0 (None)
      10 RETURN_VALUE
None

pop 方法:

它比del方法效率低,在需要获取已删除项目时使用。

码:

def pop_method():
    global a
    global index
    a.pop(index)

拆卸:

 17     0 LOAD_GLOBAL     0 (a)
        3 LOAD_ATTR       1 (pop)
        6 LOAD_GLOBAL     2 (index)
        9 CALL_FUNCTION   1
       12 POP_TOP
       13 LOAD_CONST      0 (None)
       16 RETURN_VALUE

slice和add方法。

效率最低。

码:

def slice_method():
    global a
    global index
    a = a[:index] + a[index+1:]

拆卸:

 24     0 LOAD_GLOBAL    0 (a)
        3 LOAD_GLOBAL    1 (index)
        6 SLICE+2
        7 LOAD_GLOBAL    0 (a)
       10 LOAD_GLOBAL    1 (index)
       13 LOAD_CONST     1 (1)
       16 BINARY_ADD
       17 SLICE+1
       18 BINARY_ADD
       19 STORE_GLOBAL   0 (a)
       22 LOAD_CONST     0 (None)
       25 RETURN_VALUE
None

注意:在所有三个反汇编中,忽略最后两行,基本上是return None。同样,前两行正在加载全局值aindex

As others have mentioned, pop and del are the efficient ways to remove an item at a given index. For the sake of completeness (since the same thing can be done in many ways in Python):

Using slices (this does not remove the item from the original list in place):

(This is also the least efficient method when working with Python lists, but it can be useful (though, I reiterate, not efficient) when working with user-defined objects that do not support pop but do define __getitem__):

>>> a = [1, 2, 3, 4, 5, 6]
>>> index = 3 # Only positive index

>>> a = a[:index] + a[index+1 :]
# a is now [1, 2, 3, 5, 6]

Note: This method does not modify the list in place like pop and del do. Instead, it makes two partial copies of the list (one from the start up to but not including the index (a[:index]) and one from just after the index to the last element (a[index+1:])) and creates a new list object by concatenating them. This is then assigned back to the list variable (a). The old list object is thereby dereferenced and garbage collected (provided it is not referenced by any variable other than a).

This makes this method very inefficient and it can also produce undesirable side effects (especially when other variables point to the original list object which remains un-modified).

Thanks to @MarkDickinson for pointing this out …

This Stack Overflow answer explains the concept of slicing.

Also note that this works only with positive indices.
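
The clearest failure case is index -1, because index+1 then becomes 0 and the second slice is the whole list. A short sketch of the problem, together with one possible workaround (normalising the index with the modulo operator, which is an assumption of this sketch rather than part of the original answer):

```python
a = [1, 2, 3, 4, 5, 6]
index = -1  # intended to drop the last element

# a[index+1:] becomes a[0:], i.e. the whole list, so nothing is removed
# and the result is longer than the original.
broken = a[:index] + a[index + 1:]
print(broken)  # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6]

# Normalising the index first (a hypothetical fix) avoids the problem.
index = index % len(a)
fixed = a[:index] + a[index + 1:]
print(fixed)  # [1, 2, 3, 4, 5]
```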

When this is used with other objects, the __getitem__ method must be defined and, more importantly, the __add__ method must be defined to return an object containing the items from both operands.

In essence, this works with any object whose class definition is like:

class foo(object):
    def __init__(self, items):
        self.items = items

    def __getitem__(self, index):
        return foo(self.items[index])

    def __add__(self, right):
        return foo( self.items + right.items )

This works with list, which defines both the __getitem__ and __add__ methods.
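
For illustration, here is the foo class from above used with the slice-and-add technique. The instance f and the sample values are assumptions made for this sketch:

```python
class foo(object):
    def __init__(self, items):
        self.items = items

    def __getitem__(self, index):
        return foo(self.items[index])

    def __add__(self, right):
        return foo(self.items + right.items)


f = foo([1, 2, 3, 4, 5, 6])
index = 3

# Slicing calls __getitem__ with a slice object; + calls __add__.
f = f[:index] + f[index + 1:]
print(f.items)  # [1, 2, 3, 5, 6]
```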

Comparison of the three ways in terms of efficiency:

Assume the following is predefined:

a = list(range(10))
index = 3

The del object[index] method:

By far the most efficient method. It works with all objects that define a __delitem__ method.

The disassembly is as follows:

Code:

def del_method():
    global a
    global index
    del a[index]

Disassembly:

 10    0 LOAD_GLOBAL     0 (a)
       3 LOAD_GLOBAL     1 (index)
       6 DELETE_SUBSCR   # This is the line that deletes the item
       7 LOAD_CONST      0 (None)
      10 RETURN_VALUE
None

pop method:

It is less efficient than the del method and is used when you need to get the deleted item.

Code:

def pop_method():
    global a
    global index
    a.pop(index)

Disassembly:

 17     0 LOAD_GLOBAL     0 (a)
        3 LOAD_ATTR       1 (pop)
        6 LOAD_GLOBAL     2 (index)
        9 CALL_FUNCTION   1
       12 POP_TOP
       13 LOAD_CONST      0 (None)
       16 RETURN_VALUE

The slice and add method.

The least efficient.

Code:

def slice_method():
    global a
    global index
    a = a[:index] + a[index+1:]

Disassembly:

 24     0 LOAD_GLOBAL    0 (a)
        3 LOAD_GLOBAL    1 (index)
        6 SLICE+2
        7 LOAD_GLOBAL    0 (a)
       10 LOAD_GLOBAL    1 (index)
       13 LOAD_CONST     1 (1)
       16 BINARY_ADD
       17 SLICE+1
       18 BINARY_ADD
       19 STORE_GLOBAL   0 (a)
       22 LOAD_CONST     0 (None)
       25 RETURN_VALUE
None

Note: In all three disassemblies, ignore the last two lines, which are basically return None. Also, the first two lines load the global values a and index.
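
The listings above come from a Python 2 interpreter (the SLICE+N opcodes no longer exist in Python 3), so the exact bytecode will look different on a current version. A rough sketch for reproducing the comparison locally follows; the functions take the list and index as parameters instead of globals, which is a simplification made here, and no particular timings are claimed:

```python
import dis
import timeit


def del_method(a, index):
    del a[index]


def pop_method(a, index):
    a.pop(index)


def slice_method(a, index):
    return a[:index] + a[index + 1:]


# Opcode names differ between Python versions, so inspect them locally.
dis.dis(del_method)

# Each call gets a fresh list so the three timings are comparable.
for func in (del_method, pop_method, slice_method):
    elapsed = timeit.timeit(lambda: func(list(range(10)), 3), number=100000)
    print(func.__name__, elapsed)
```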


回答 3

pop从列表中删除并保留项目也很有用。del实际在哪里丢弃物品。

>>> x = [1, 2, 3, 4]

>>> p = x.pop(1)
>>> p
    2

pop is also useful when you want to remove an item from a list and keep it, whereas del simply discards the item.

>>> x = [1, 2, 3, 4]

>>> p = x.pop(1)
>>> p
    2

回答 4

如果要删除列表中的特定位置元素,例如2th,3th和7th。你不能使用

del my_list[2]
del my_list[3]
del my_list[7]

由于删除第二个元素后,实际上删除的第三个元素是原始列表中的第四个元素。您可以过滤原始列表中的2th,3th和7th元素并获得一个新列表,如下所示:

new_list = [j for i, j in enumerate(my_list) if i not in [2, 3, 7]]

If you want to remove the elements at several specific positions in a list, such as indices 2, 3 and 7, you can’t use

del my_list[2]
del my_list[3]
del my_list[7]

After you delete the element at index 2, the remaining elements shift left, so del my_list[3] actually removes the element that was at index 4 in the original list. Instead, you can filter out indices 2, 3 and 7 of the original list and build a new list, like below:

new_list = [j for i, j in enumerate(my_list) if i not in [2, 3, 7]]
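
If the list has to be modified in place instead of rebuilt, one option (a sketch, not part of the original answer; the sample data is made up) is to delete from the highest index down, so that the positions still waiting to be deleted never shift:

```python
my_list = list('abcdefghij')
indices = [2, 3, 7]

# Deleting from the highest index down means earlier deletions never
# shift the positions that are still waiting to be deleted.
for i in sorted(indices, reverse=True):
    del my_list[i]

print(my_list)  # ['a', 'b', 'e', 'f', 'g', 'i', 'j']
```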

Answer 5

This depends on what you want to do.

If you want to return the element you removed, use pop():

>>> l = [1, 2, 3, 4, 5]
>>> l.pop(2)
3
>>> l
[1, 2, 4, 5]

However, if you just want to delete an element, use del:

>>> l = [1, 2, 3, 4, 5]
>>> del l[2]
>>> l
[1, 2, 4, 5]

Additionally, del allows you to use slices (e.g. del l[2:]).


Answer 6

I generally use the following method:

>>> myList = [10,20,30,40,50]
>>> rmovIndxNo = 3
>>> del myList[rmovIndxNo]
>>> myList
[10, 20, 30, 50]

Answer 7

Yet another way to remove one or more elements from a list by index:

a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# remove the element at index 3
a[3:4] = []
# a is now [0, 1, 2, 4, 5, 6, 7, 8, 9]

# start again from the full list, then remove the elements from index 3 to index 6
a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
a[3:7] = []
# a is now [0, 1, 2, 7, 8, 9]

a[x:y] refers to the elements from index x through y-1. When we assign an empty list ([]) to that portion of the list, those elements are removed.
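As a quick check of the slice-assignment idea, the sketch below compares it with del applied to the same slice. The variable names are chosen only for illustration.

```python
a = list(range(10))
b = list(range(10))

a[3:7] = []   # slice assignment with an empty list
del b[3:7]    # del on the same slice

print(a)       # [0, 1, 2, 7, 8, 9]
print(a == b)  # True
```

Both forms modify the list in place, so any other reference to the same list sees the change.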


Answer 8

You could just look up the item you want to delete and remove it by value. It is really simple. Example:

    letters = ["a", "b", "c", "d", "e"]
    letters.remove(letters[1])
    print(*letters)  # The * unpacks the list for printing; it is optional (Python 3.x or newer)

Output: a c d e

Note that remove() deletes the first matching value, so if the list contains duplicates it may remove an earlier element than the one at the given index.


Answer 9

Use the following code to remove an element from the list by value:

numbers = [1, 2, 3, 4]
numbers.remove(1)
print(numbers)

Output: [2, 3, 4]

If you want to remove the element stored at a given index, use:

numbers = [1, 2, 3, 4]
numbers.remove(numbers[2])
print(numbers)

Output: [1, 2, 4]
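One caveat with removing by value is worth illustrating: remove() deletes the first matching value, which is not necessarily the element at the index you looked up. The sketch below uses two hypothetical helper names to contrast that with del, which always removes the requested position.

```python
def remove_by_value_lookup(items, index):
    """Copy the list, then remove the first value equal to items[index]."""
    result = list(items)
    result.remove(result[index])
    return result


def remove_by_index(items, index):
    """Copy the list, then delete exactly the element at index."""
    result = list(items)
    del result[index]
    return result


letters = ["a", "b", "a", "c"]
print(remove_by_value_lookup(letters, 2))  # ['b', 'a', 'c'] (the first "a" went)
print(remove_by_index(letters, 2))         # ['a', 'b', 'c']
```

With no duplicate values the two helpers give the same result, which is why the difference is easy to overlook.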

Answer 10

As previously mentioned, best practice is del, or pop() if you need to know the value.

An alternate solution is to re-stack only those elements you want:

    a = ['a', 'b', 'c', 'd']

    def remove_element(list_, index_):
        clipboard = []
        for i in range(len(list_)):
            if i != index_:  # compare by value; "is not" only works by accident for small ints
                clipboard.append(list_[i])
        return clipboard

    print(remove_element(a, 2))

    >> ['a', 'b', 'd']

ETA: hmm… this will not work on negative index values; I will ponder and update.

I suppose

    if index_ < 0: index_ = len(list_) + index_

would patch it… but suddenly this idea seems very brittle. Interesting thought experiment though. It seems there should be a ‘proper’ way to do this with append() / a list comprehension.

Still pondering.


Answer 11

It doesn’t sound like you’re working with a list of lists, so I’ll keep this short. You want to use pop, since it removes individual elements; for elements that are themselves lists, you should use del. To refer to the last element in Python, use the index -1.

>>> test = ['item1', 'item2']
>>> test.pop(-1)
'item2'
>>> test
['item1']

Answer 12

Here l is the list of values, and we have to remove the elements at the indexes listed in inds2rem. Popping in descending index order keeps the remaining indexes valid:

l = list(range(20))
inds2rem = [2,5,1,7]
for i in sorted(inds2rem, reverse=True):
    l.pop(i)

>>> l
[0, 3, 4, 6, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

Answer 13

Use the del statement with a slice to remove the last N items:

del listName[-N:]

For example, if you want to remove the last 3 items, your code should be:

del listName[-3:]

And if you want to remove the last 8 items, your code should be:

del listName[-8:]

Answer 14

It has already been mentioned how to remove a single element from a list and what advantages the different methods have. Note, however, that removing multiple elements has some potential for errors:

>>> l = [0,1,2,3,4,5,6,7,8,9]
>>> indices=[3,7]
>>> for i in indices:
...     del l[i]
... 
>>> l
[0, 1, 2, 4, 5, 6, 7, 9]

Elements 3 and 8 (not 3 and 7) of the original list have been removed (as the list was shortened during the loop), which might not have been the intention. If you want to safely remove multiple indices, you should instead delete the elements with the highest index first, e.g. like this:

>>> l = [0,1,2,3,4,5,6,7,8,9]
>>> indices=[3,7]
>>> for i in sorted(indices, reverse=True):
...     del l[i]
... 
>>> l
[0, 1, 2, 4, 5, 6, 8, 9]
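The highest-index-first rule can be packaged as an in-place helper. This is a sketch: the name delete_indices is hypothetical, and passing the indices through set() is an added assumption so that a repeated index does not delete two different elements.

```python
def delete_indices(items, indices):
    """Delete the given positions from items in place, highest index first."""
    for i in sorted(set(indices), reverse=True):
        del items[i]


l = list(range(10))
delete_indices(l, [3, 7, 3])  # the duplicate 3 is ignored
print(l)  # [0, 1, 2, 4, 5, 6, 8, 9]
```

Unlike the list-comprehension approach, this mutates the list that was passed in, which matters if other code holds a reference to it.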

Answer 15

Or, if multiple indexes should be removed:

print([v for i,v in enumerate(your_list) if i not in list_of_unwanted_indexes])

Of course, for a single index you could then also do:

print([v for i,v in enumerate(your_list) if i != unwanted_index])

Answer 16

You can use either del or pop to remove an element from a list by index. pop returns the member it removes from the list, while del removes the member without returning it.

>>> a=[1,2,3,4,5]
>>> del a[1]
>>> a
[1, 3, 4, 5]
>>> a.pop(1)
3
>>> a
[1, 4, 5]

Answer 17

One can use either del or pop, but I prefer del, since you can specify both indexes and slices, giving the user more control over the data.

For example, starting with the list shown, one can remove its last element with del as a slice, and then one can remove the last element from the result using pop.

>>> l = [1,2,3,4,5]
>>> del l[-1:]
>>> l
[1, 2, 3, 4]
>>> l.pop(-1)
4
>>> l
[1, 2, 3]

Convert two lists into a dictionary

Question: Convert two lists into a dictionary

Imagine that you have:

keys = ['name', 'age', 'food']
values = ['Monty', 42, 'spam']

What is the simplest way to produce the following dictionary?

a_dict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}

Answer 0

Like this:

>>> keys = ['a', 'b', 'c']
>>> values = [1, 2, 3]
>>> dictionary = dict(zip(keys, values))
>>> print(dictionary)
{'a': 1, 'b': 2, 'c': 3}

Voila :-) The pairwise dict constructor and zip function are awesomely useful: https://docs.python.org/3/library/functions.html#func-dict


Answer 1

Imagine that you have:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

What is the simplest way to produce the following dictionary ?

dict = {'name' : 'Monty', 'age' : 42, 'food' : 'spam'}

Most performant, dict constructor with zip

new_dict = dict(zip(keys, values))

In Python 3, zip now returns a lazy iterator, and this is now the most performant approach.

dict(zip(keys, values)) does require the one-time global lookup each for dict and zip, but it doesn’t form any unnecessary intermediate data-structures or have to deal with local lookups in function application.

Runner-up, dict comprehension:

A close runner-up to using the dict constructor is to use the native syntax of a dict comprehension (not a list comprehension, as others have mistakenly put it):

new_dict = {k: v for k, v in zip(keys, values)}

Choose this when you need to map or filter based on the keys or value.

In Python 2, zip returns a list. To avoid creating an unnecessary list, use izip instead (aliasing it to zip can reduce code changes when you move to Python 3).

from itertools import izip as zip

So that is still (2.7):

new_dict = {k: v for k, v in zip(keys, values)}

Python 2, ideal for <= 2.6

izip from itertools becomes zip in Python 3. izip is better than zip for Python 2 (because it avoids the unnecessary list creation), and ideal for 2.6 or below:

from itertools import izip
new_dict = dict(izip(keys, values))

Result for all cases:

In all cases:

>>> new_dict
{'age': 42, 'name': 'Monty', 'food': 'spam'}

Explanation:

If we look at the help on dict we see that it takes a variety of forms of arguments:


>>> help(dict)

class dict(object)
 |  dict() -> new empty dictionary
 |  dict(mapping) -> new dictionary initialized from a mapping object's
 |      (key, value) pairs
 |  dict(iterable) -> new dictionary initialized as if via:
 |      d = {}
 |      for k, v in iterable:
 |          d[k] = v
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs
 |      in the keyword argument list.  For example:  dict(one=1, two=2)

The optimal approach is to use an iterable while avoiding creating unnecessary data structures. In Python 2, zip creates an unnecessary list:

>>> zip(keys, values)
[('name', 'Monty'), ('age', 42), ('food', 'spam')]

In Python 3, the equivalent would be:

>>> list(zip(keys, values))
[('name', 'Monty'), ('age', 42), ('food', 'spam')]

and Python 3’s zip merely creates an iterable object:

>>> zip(keys, values)
<zip object at 0x7f0e2ad029c8>

Since we want to avoid creating unnecessary data structures, we usually want to avoid Python 2’s zip (since it creates an unnecessary list).

Less performant alternatives:

This is a generator expression being passed to the dict constructor:

generator_expression = ((k, v) for k, v in zip(keys, values))
dict(generator_expression)

or equivalently:

dict((k, v) for k, v in zip(keys, values))

And this is a list comprehension being passed to the dict constructor:

dict([(k, v) for k, v in zip(keys, values)])

In the first two cases, an extra layer of non-operative (thus unnecessary) computation is placed over the zip iterable, and in the case of the list comprehension, an extra list is unnecessarily created. I would expect all of them to be less performant, and certainly not more-so.

Performance review:

In 64 bit Python 3.8.2 provided by Nix, on Ubuntu 16.04, ordered from fastest to slowest:

>>> min(timeit.repeat(lambda: dict(zip(keys, values))))
0.6695233230129816
>>> min(timeit.repeat(lambda: {k: v for k, v in zip(keys, values)}))
0.6941362579818815
>>> min(timeit.repeat(lambda: {keys[i]: values[i] for i in range(len(keys))}))
0.8782548159942962
>>> 
>>> min(timeit.repeat(lambda: dict([(k, v) for k, v in zip(keys, values)])))
1.077607496001292
>>> min(timeit.repeat(lambda: dict((k, v) for k, v in zip(keys, values))))
1.1840861019445583

dict(zip(keys, values)) wins even with small sets of keys and values, but for larger sets, the differences in performance will become greater.

A commenter said:

min seems like a bad way to compare performance. Surely mean and/or max would be much more useful indicators for real usage.

We use min because these algorithms are deterministic. We want to know the performance of the algorithms under the best conditions possible.

If the operating system hangs for any reason, it has nothing to do with what we’re trying to compare, so we need to exclude those kinds of results from our analysis.

If we used mean, those kinds of events would skew our results greatly, and if we used max we will only get the most extreme result – the one most likely affected by such an event.

A commenter also says:

In python 3.6.8, using mean values, the dict comprehension is indeed still faster, by about 30% for these small lists. For larger lists (10k random numbers), the dict call is about 10% faster.

I presume we mean dict(zip(... with 10k random numbers. That does sound like a fairly unusual use case. It does make sense that the most direct calls would dominate in large datasets, and I wouldn’t be surprised if OS hangs are dominating given how long it would take to run that test, further skewing your numbers. And if you use mean or max, I would consider your results meaningless.

Let’s use a more realistic size on our top examples:

import numpy
import timeit
l1 = list(numpy.random.random(100))
l2 = list(numpy.random.random(100))

And we see here that dict(zip(... does indeed run faster for larger datasets by about 20%.

>>> min(timeit.repeat(lambda: {k: v for k, v in zip(l1, l2)}))
9.698965263989521
>>> min(timeit.repeat(lambda: dict(zip(l1, l2))))
7.9965161079890095
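The timings above can be reproduced with a short script. The sketch below is not the exact setup used for the numbers quoted here: the function names and the number and repeat values are assumptions, picked so that the script finishes quickly. It follows the same idea of taking the minimum over several repeats.

```python
import timeit

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')


def with_constructor():
    """dict constructor fed directly by zip."""
    return dict(zip(keys, values))


def with_comprehension():
    """Equivalent dict comprehension."""
    return {k: v for k, v in zip(keys, values)}


for func in (with_constructor, with_comprehension):
    # repeat() returns one total per run; the minimum is the least disturbed run.
    best = min(timeit.repeat(func, number=100_000, repeat=5))
    print(f"{func.__name__}: {best:.4f} s")
```

Results vary between machines and Python versions, so treat the output as a local measurement rather than a general ranking.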

Answer 2

Try this (Python 2 only):

>>> import itertools
>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> adict = dict(itertools.izip(keys,values))
>>> adict
{'food': 'spam', 'age': 42, 'name': 'Monty'}

In Python 2, izip is also more economical in memory consumption than zip. In Python 3, itertools.izip no longer exists because the built-in zip is already lazy, so dict(zip(keys, values)) is the equivalent.


Answer 3

>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> dict(zip(keys, values))
{'food': 'spam', 'age': 42, 'name': 'Monty'}

Answer 4

You can also use dictionary comprehensions in Python ≥ 2.7:

>>> keys = ('name', 'age', 'food')
>>> values = ('Monty', 42, 'spam')
>>> {k: v for k, v in zip(keys, values)}
{'food': 'spam', 'age': 42, 'name': 'Monty'}

Answer 5

A more natural way is to use a dictionary comprehension:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')
new_dict = {keys[i]: values[i] for i in range(len(keys))}  # avoid naming it dict, which would shadow the built-in

Answer 6

If you need to transform keys or values before creating a dictionary, then a generator expression can be used. Example:

>>> adict = dict((str(k), v) for k, v in zip(['a', 1, 'b'], [2, 'c', 3]))

Take a look at Code Like a Pythonista: Idiomatic Python.


Answer 7

With Python 3.x, go for a dict comprehension:

keys = ('name', 'age', 'food')
values = ('Monty', 42, 'spam')

dic = {k:v for k,v in zip(keys, values)}

print(dic)

Here is another example of a dict comprehension:

>>> print({i: chr(65+i) for i in range(4)})
{0: 'A', 1: 'B', 2: 'C', 3: 'D'}

Answer 8

For those who need simple code and aren’t familiar with zip:

List1 = ['This', 'is', 'a', 'list']
List2 = ['Put', 'this', 'into', 'dictionary']

This can be done with one line of code:

d = {List1[n]: List2[n] for n in range(len(List1))}

Answer 9

  • 2018-04-18

The best solution is still:

In [92]: keys = ('name', 'age', 'food')
    ...: values = ('Monty', 42, 'spam')
    ...: 

In [93]: dt = dict(zip(keys, values))
In [94]: dt
Out[94]: {'age': 42, 'food': 'spam', 'name': 'Monty'}

Transpose it:

    lst = [('name', 'Monty'), ('age', 42), ('food', 'spam')]
    keys, values = zip(*lst)
    In [101]: keys
    Out[101]: ('name', 'age', 'food')
    In [102]: values
    Out[102]: ('Monty', 42, 'spam')

Answer 10

You can use the code below:

dict(zip(['name', 'age', 'food'], ['Monty', 42, 'spam']))

But make sure that the lists have the same length. If the lengths differ, zip stops at the end of the shorter list, so the extra items of the longer one are silently dropped.
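The truncation is easy to miss because no error is raised. The sketch below shows it, together with a small guard; the helper name strict_dict is hypothetical and the explicit length check is just one possible approach (on Python 3.10 and later, zip(keys, values, strict=True) raises ValueError for mismatched lengths without a helper).

```python
keys = ['name', 'age', 'food']
values = ['Monty', 42]  # one value short

# zip stops at the shorter input, so 'food' is silently dropped.
print(dict(zip(keys, values)))  # {'name': 'Monty', 'age': 42}


def strict_dict(keys, values):
    """Build a dict from two sequences, raising ValueError if the lengths differ."""
    if len(keys) != len(values):
        raise ValueError(f"{len(keys)} keys but {len(values)} values")
    return dict(zip(keys, values))


try:
    strict_dict(keys, values)
except ValueError as exc:
    print("refused:", exc)  # refused: 3 keys but 2 values
```

The guard relies on len(), so it only works for sized sequences such as lists and tuples, not for arbitrary iterators.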


Answer 11

I had this doubt while I was trying to solve a graph-related problem. I needed to define an empty adjacency list and initialize every node with an empty list, and I wondered whether a zip operation would be fast enough to be worth using instead of simple key-value assignment in a loop. After all, most of the time the time factor is an important tie breaker. So I ran timeit on both approaches.

import timeit
def dictionary_creation(n_nodes):
    dummy_dict = dict()
    for node in range(n_nodes):
        dummy_dict[node] = []
    return dummy_dict


def dictionary_creation_1(n_nodes):
    keys = list(range(n_nodes))
    values = [[] for i in range(n_nodes)]
    graph = dict(zip(keys, values))
    return graph


def wrapper(func, *args, **kwargs):
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

n_nodes = 10_000_000  # the size used for the timings below

iteration = wrapper(dictionary_creation, n_nodes)
shorthand = wrapper(dictionary_creation_1, n_nodes)

for trials in range(1, 8):
    print(f'Iteration: {timeit.timeit(iteration, number=trials)}\nShorthand: {timeit.timeit(shorthand, number=trials)}')

For n_nodes = 10,000,000 I get:

Iteration: 2.825081646999024 Shorthand: 3.535717916001886

Iteration: 5.051560923002398 Shorthand: 6.255070794999483

Iteration: 6.52859034499852 Shorthand: 8.221581164998497

Iteration: 8.683652416999394 Shorthand: 12.599181543999293

Iteration: 11.587241565001023 Shorthand: 15.27298851100204

Iteration: 14.816342867001367 Shorthand: 17.162912737003353

Iteration: 16.645022411001264 Shorthand: 19.976680120998935

You can clearly see that, after a certain point, the time taken by the iteration approach at the n-th step overtakes the time taken by the shorthand approach at the (n-1)-th step.


Answer 12

Here is also an example of using lists as the values in your dictionary:

list1 = ["Name", "Surname", "Age"]
list2 = [["Cyd", "JEDD", "JESS"], ["DEY", "AUDIJE", "PONGARON"], [21, 32, 47]]
dic = dict(zip(list1, list2))
print(dic)

Always make sure that your keys (list1) are the first argument to zip. The output is:

{'Name': ['Cyd', 'JEDD', 'JESS'], 'Surname': ['DEY', 'AUDIJE', 'PONGARON'], 'Age': [21, 32, 47]}

Answer 13

Solution as a dictionary comprehension with enumerate:

result = {item : values[index] for index, item in enumerate(keys)}

Solution as a for loop with enumerate:

result = {}
for index, item in enumerate(keys):
    result[item] = values[index]

Answer 14

You may also try it with one list that is a combination of the two lists ;)

a = [1,2,3,4]
n = [5,6,7,8]

x = []
for i in a,n:
    x.append(i)

print(dict(zip(x[0], x[1])))

Answer 15

A method without the zip function (note that it empties l2 as a side effect):

l1 = [1,2,3,4,5]
l2 = ['a','b','c','d','e']
d1 = {}
for l1_ in l1:
    for l2_ in l2:
        d1[l1_] = l2_
        l2.remove(l2_)
        break  

print (d1)


{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}

How do I reverse a list in Python?

Question: How do I reverse a list in Python?

How can I do the following in Python?

array = [0, 10, 20, 40]
for (i = array.length() - 1; i >= 0; i--)

I need to have the elements of an array, but from the end to the beginning.


Answer 0

You can make use of the reversed function for this:

>>> array=[0,10,20,40]
>>> for i in reversed(array):
...     print(i)

Note that reversed(...) does not return a list. You can get a reversed list using list(reversed(array)).
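The point that reversed(...) gives back an iterator rather than a list can be seen in a short session. This is a minimal sketch with illustrative variable names.

```python
array = [0, 10, 20, 40]

rev = reversed(array)   # a one-shot iterator, not a list
first_pass = list(rev)
second_pass = list(rev)

print(first_pass)   # [40, 20, 10, 0]
print(second_pass)  # [] because the iterator is already exhausted
print(array)        # [0, 10, 20, 40], the original list is untouched
```

If the reversed sequence is needed more than once, store list(reversed(array)) instead of the iterator itself.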


Answer 1

>>> L = [0,10,20,40]
>>> L[::-1]
[40, 20, 10, 0]

Extended slice syntax is explained well in the Python “What’s New” entry for release 2.3.5.

By special request in a comment, this is the most current slice documentation.


Answer 2

>>> L = [0,10,20,40]
>>> L.reverse()
>>> L
[40, 20, 10, 0]

Or

>>> L = [0,10,20,40]
>>> L[::-1]
[40, 20, 10, 0]

Answer 3

This creates a reversed copy of the list:

L = [0,10,20,40]
p = L[::-1]  #  Here p holds the reversed list; L is unchanged

This reverses the list in-place:

L.reverse() # Here L will be reversed in-place (no new list made)
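The difference between the copy and the in-place reversal becomes visible when a second name refers to the same list. The sketch below uses an extra name, alias, which is an assumption added purely for illustration.

```python
L = [0, 10, 20, 40]
alias = L            # a second name for the same list object

p = L[::-1]          # new, reversed list
print(p is L)        # False
print(alias)         # [0, 10, 20, 40], slicing did not touch the original

L.reverse()          # reverses the shared object itself
print(alias is L)    # True
print(alias)         # [40, 20, 10, 0]
```

This is why L.reverse() is the one to use when other code must see the change, and slicing when the original order must be preserved.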

Answer 4

I think that the best way to reverse a list in Python is to do:

a = [1,2,3,4]
a = a[::-1]
print(a)
# [4, 3, 2, 1]

The job is done, and now you have a reversed list.


Answer 5

To reverse the list itself, use:

array.reverse()

To assign the reversed list to another variable, use:

newArray = array[::-1]

回答 6

使用切片,例如array = array [::-1]是一个巧妙的技巧,非常具有Python风格,但是对于新手来说可能有些晦涩。使用reverse()方法是日常编码的好方法,因为它易于阅读。

但是,如果像面试问题中那样需要在适当的位置反转列表,则可能无法使用此类内置方法。面试官将着眼于您如何解决问题,而不是深入了解Python知识,这需要一种算法方法。下面的示例使用经典交换,可能是实现此目的的一种方法:-

def reverse_in_place(lst):      # Declare a function
    size = len(lst)             # Get the length of the sequence
    hiindex = size - 1
    its = size/2                # Number of iterations required
    for i in xrange(0, its):    # i is the low index pointer
        temp = lst[hiindex]     # Perform a classic swap
        lst[hiindex] = lst[i]
        lst[i] = temp
        hiindex -= 1            # Decrement the high index pointer
    print "Done!"

# Now test it!!
array = [2, 5, 8, 9, 12, 19, 25, 27, 32, 60, 65, 1, 7, 24, 124, 654]

print array                    # Print the original sequence
reverse_in_place(array)        # Call the function passing the list
print array                    # Print reversed list


**The result:**
[2, 5, 8, 9, 12, 19, 25, 27, 32, 60, 65, 1, 7, 24, 124, 654]
Done!
[654, 124, 24, 7, 1, 65, 60, 32, 27, 25, 19, 12, 9, 8, 5, 2]

请注意,这不适用于元组或字符串序列,因为字符串和元组是不可变的,即,您无法写入它们来更改元素。

Using slicing, e.g. array = array[::-1], is a neat trick and very Pythonic, but a little obscure for newbies maybe. Using the reverse() method is a good way to go in day to day coding because it is easily readable.

However, if you need to reverse a list in place as in an interview question, you will likely not be able to use built in methods like these. The interviewer will be looking at how you approach the problem rather than the depth of Python knowledge, an algorithmic approach is required. The following example, using a classic swap, might be one way to do it:-

def reverse_in_place(lst):      # Declare a function
    size = len(lst)             # Get the length of the sequence
    hiindex = size - 1
    its = size/2                # Number of iterations required
    for i in xrange(0, its):    # i is the low index pointer
        temp = lst[hiindex]     # Perform a classic swap
        lst[hiindex] = lst[i]
        lst[i] = temp
        hiindex -= 1            # Decrement the high index pointer
    print "Done!"

# Now test it!!
array = [2, 5, 8, 9, 12, 19, 25, 27, 32, 60, 65, 1, 7, 24, 124, 654]

print array                    # Print the original sequence
reverse_in_place(array)        # Call the function passing the list
print array                    # Print reversed list


**The result:**
[2, 5, 8, 9, 12, 19, 25, 27, 32, 60, 65, 1, 7, 24, 124, 654]
Done!
[654, 124, 24, 7, 1, 65, 60, 32, 27, 25, 19, 12, 9, 8, 5, 2]

Note that this will not work on Tuples or string sequences, because strings and tuples are immutable, i.e., you cannot write into them to change elements.
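For comparison, the same in-place idea can be written with two index pointers and tuple unpacking instead of a temporary variable. This is a sketch of one possible variant, not the answer's own code; the name reverse_in_place_two_pointers is hypothetical.

```python
def reverse_in_place_two_pointers(lst):
    """Reverse lst in place by swapping the ends and moving inward."""
    lo, hi = 0, len(lst) - 1
    while lo < hi:
        lst[lo], lst[hi] = lst[hi], lst[lo]  # swap without a temp variable
        lo += 1
        hi -= 1


data = [2, 5, 8, 9, 12]
reverse_in_place_two_pointers(data)
print(data)  # [12, 9, 8, 5, 2]
```

The loop performs len(lst) // 2 swaps, and with an odd length the middle element is left where it is.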


Answer 7

I find (contrary to some other suggestions) that l.reverse() is by far the fastest way to reverse a long list in Python 3 and 2. I’d be interested to know if others can replicate these timings.

l[::-1] is probably slower because it copies the list prior to reversing it. Adding the list() call around the iterator made by reversed(l) must add some overhead. Of course if you want a copy of the list or an iterator then use those respective methods, but if you want to just reverse the list then l.reverse() seems to be the fastest way.

Functions

from timeit import timeit

def rev_list1(l):
    return l[::-1]

def rev_list2(l):
    return list(reversed(l))

def rev_list3(l):
    l.reverse()
    return l

List

l = list(range(1000000))

Python 3.5 timings

timeit(lambda: rev_list1(l), number=1000)
# 6.48
timeit(lambda: rev_list2(l), number=1000)
# 7.13
timeit(lambda: rev_list3(l), number=1000)
# 0.44

Python 2.7 timings

timeit(lambda: rev_list1(l), number=1000)
# 6.76
timeit(lambda: rev_list2(l), number=1000)
# 9.18
timeit(lambda: rev_list3(l), number=1000)
# 0.46

Answer 8

for x in array[::-1]:
    pass  # do stuff with x

Answer 9

With reversed and list:

>>> list1 = [1,2,3]
>>> reversed_list = list(reversed(list1))
>>> reversed_list
[3, 2, 1]

Answer 10

array = [0, 10, 20, 40]
for e in reversed(array):
    print(e)

Answer 11

Using reversed(array) is likely the best route.

>>> array = [1,2,3,4]
>>> for item in reversed(array):
...     print(item)

Should you need to understand how to implement this without using the built-in reversed:

def reverse(a):
    midpoint = len(a) // 2              # integer division, also valid on Python 3
    for i in range(midpoint):           # walk indices, so duplicate values are safe
        otherside = len(a) - i - 1
        temp = a[otherside]
        a[otherside] = a[i]
        a[i] = temp
    return a

This takes O(N) time.


Answer 12

If you want to store the elements of reversed list in some other variable, then you can use revArray = array[::-1] or revArray = list(reversed(array)).

But the first variant is slightly faster:

import time

z = list(range(1000000))  # a real list; on Python 3 a bare range would not be copied by slicing
startTimeTic = time.time()
y = z[::-1]
print("Time: %s s" % (time.time() - startTimeTic))

f = list(range(1000000))
startTimeTic = time.time()
g = list(reversed(f))
print("Time: %s s" % (time.time() - startTimeTic))

Output:

Time: 0.00489711761475 s
Time: 0.00609302520752 s

Answer 13

ORGANIZING VALUES:

In Python, a list’s order can also be manipulated with sort, which arranges the values in numerical or alphabetical order:

Temporarily:

print(sorted(my_list))

Permanently:

my_list.sort()
print(my_list)

You can sort with the flag “reverse=True”:

print(sorted(my_list, reverse=True))

or

my_list.sort(reverse=True)
print(my_list)

WITHOUT ORGANIZING

Maybe you do not want to sort values, but only reverse the values. Then we can do it like this:

print(list(reversed(my_list)))

Note: when sorting strings, digits sort before letters. On Python 3, a list that mixes numbers and strings cannot be sorted at all; it raises TypeError.
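
To make the difference between sorting and plain reversing concrete, here is a small sketch (my_list holds made-up sample data):

```python
my_list = [3, 1, 2]  # hypothetical sample data

ascending = sorted(my_list)                  # new sorted list; my_list is unchanged
flipped = list(reversed(my_list))            # order flipped, no sorting involved
descending = sorted(my_list, reverse=True)   # sorted, then largest first

my_list.sort()                               # sorts in place and returns None
```

Note that reversed() only flips the existing order, so it matches sorted(..., reverse=True) only when the list was already in ascending order.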


Answer 14

Summary of Methods with Explanation and Timing Results

There are several good answers, but they are spread out, and most don’t indicate the fundamental differences between the approaches.

Overall, it is better to use built-in functions/methods to reverse, as with just about any task. In this case, they are roughly 2 to 8 times faster on short lists (10 items), and up to ~300+ times faster on long lists, compared to manually written indexing. This makes sense, as they are created, scrutinized, and optimized by experts. They are also less prone to defects and more likely to handle edge and corner cases.

Also consider whether you want to:

  1. Reverse an existing list in-place
    • Best solution is the object.reverse() method
  2. Create an iterator of the reverse of the list (because you are going to feed it to a for-loop, a generator, etc.)
    • Best solution is reversed(object), which creates the iterator
  3. or create a complete copy that is in the reverse order
    • Best solution is using slices with a -1 step size: object[::-1]
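
As a rough side-by-side sketch of these three choices (the variable names are only illustrative):

```python
original = [1, 2, 3]

# 3. Reversed copy: a new list, original untouched
copy_reversed = original[::-1]

# 2. Reverse iterator: nothing is copied until it is consumed
collected = [item for item in reversed(original)]

# 1. In place: the same list object is modified; reverse() returns None
alias = original
original.reverse()
```

After the last step, alias and original still refer to the same (now reversed) list, while copy_reversed is a separate object.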

Test Script

Here is the start of my test script for the methods covered. Put all the code snippets in this answer together to make a script that will run all the different ways of reversing a list and time each one (output shown in the last section).

from timeit import timeit
from copy import copy

def time_str_ms(t):
    return '{0:8.2f} ms'.format(t * 1000)

Method 1: Reverse in place with obj.reverse()

If the goal is just to reverse the order of the items in an existing list, without looping over them or getting a copy to work with, use the <list>.reverse() method. Run this directly on a list object, and the order of all items will be reversed.

Note that the following function reverses the original list it is given, even though it also returns the reversed list. The returned value is the same list object, not a copy. Typically, you wouldn’t make a function for this, but I did so to use the timing code at the end.

We’ll test the performance of this two ways – first just reversing a list in-place (changes the original list), and then copying the list and reversing it afterward.

def rev_in_place(mylist):
    mylist.reverse()
    return mylist

def rev_copy_reverse(mylist):
    a = copy(mylist)
    a.reverse()
    return a

Method 2: Reverse a list using slices obj[::-1]

The built-in index slicing method allows you to make a copy of part of any indexed object.

  • It does not affect the original object
  • It builds a full list, not an iterator

The generic syntax is: <object>[first_index:last_index:step]. To exploit slicing to create a simple reversed list, use: <list>[::-1]. When an option is left empty, it defaults to the first or last element of the object (swapped if the step size is negative).

Indexing allows negative numbers, which count backward from the end of the object (i.e. -2 is the second-to-last item). When the step size is negative, slicing starts with the last item and steps backward by that amount. There is some start-end logic associated with this that has been optimized.

def rev_slice(mylist):
    a = mylist[::-1]
    return a
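
To see which defaults a slice actually resolves to, one option is slice.indices(), which reports the concrete start, stop, and step for a given length. A small sketch with made-up data:

```python
letters = ['a', 'b', 'c', 'd', 'e']

# slice.indices(length) reports the concrete (start, stop, step) a slice resolves to
resolved = slice(None, None, -1).indices(len(letters))

every_other_backwards = letters[::-2]   # start at the last item, step back by 2
last_two_backwards = letters[:-3:-1]    # start at the last item, stop before index -3
```

With a negative step, the empty start resolves to the last index and the empty stop resolves to "one before index 0", which is why [::-1] walks the whole list backward.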

Method 3: Reverse a list with the reversed(obj) iterator function

There is a reversed(indexed_object) function:

  • This creates a reverse index iterator, not a list. Great if you are feeding it to a loop for better performance on large lists
  • This does not copy or modify the original object; the iterator reads items from it on demand

Test with both a raw iterator, and creating a list from the iterator.

def reversed_iterator(mylist):
    a = reversed(mylist)
    return a

def reversed_with_list(mylist):
    a = list(reversed(mylist))
    return a
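
One consequence of reversed() being an iterator rather than a copy is easy to miss: it reads from the live list, and it can only be consumed once. A short sketch (sample values are arbitrary):

```python
numbers = [1, 2, 3]
rev_iter = reversed(numbers)   # no elements are copied here

numbers[0] = 99                # change the list before consuming the iterator
seen = list(rev_iter)          # the iterator reads the list as it is now

leftover = list(rev_iter)      # iterators are single-use, so this is empty
```

If you need a stable snapshot, wrap the iterator in list() right away or use a slice.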

Method 4: Reverse list with Custom/Manual indexing

As the timing will show, creating your own methods of indexing is a bad idea. Use the built-in methods unless you need to do something really custom.

That said, there is not a huge penalty with smaller list sizes, but when you scale up the penalty becomes tremendous. I’m sure my code below could be optimized, but I’ll stick with the built-in methods.

def rev_manual_pos_gen(mylist):
    max_index = len(mylist) - 1
    return [ mylist[max_index - index] for index in range(len(mylist)) ]

def rev_manual_neg_gen(mylist):
    ## index is 0 to 9, but we need -1 to -10
    return [ mylist[-index-1] for index in range(len(mylist)) ]

def rev_manual_index_loop(mylist):
    a = []
    reverse_index = len(mylist) - 1
    for index in range(len(mylist)):
        a.append(mylist[reverse_index - index])
    return a

def rev_manual_loop(mylist):
    a = []
    reverse_index = len(mylist)
    for index, _ in enumerate(mylist):
        reverse_index -= 1
        a.append(mylist[reverse_index])
    return a

Timing each method

Following is the rest of the script to time each method of reversing. It shows reversing in place with obj.reverse() and creating the reversed(obj) iterator are always the fastest, while using slices is the fastest way to create a copy.

It also shows that you should not try to write your own way of doing it unless you have to!

loops_to_test = 100000
number_of_items = 10
list_to_reverse = list(range(number_of_items))
if number_of_items < 15:
    print("a: {}".format(list_to_reverse))
print('Loops: {:,}'.format(loops_to_test))
# List of the functions we want to test with the timer, in print order
fcns = [rev_in_place, reversed_iterator, rev_slice, rev_copy_reverse,
        reversed_with_list, rev_manual_pos_gen, rev_manual_neg_gen,
        rev_manual_index_loop, rev_manual_loop]
max_name_string = max([ len(fcn.__name__) for fcn in fcns ])
for fcn in fcns:
    a = copy(list_to_reverse) # copy to start fresh each loop
    out_str = ' | out = {}'.format(fcn(a)) if number_of_items < 15 else ''
    # Time in ms for the given # of loops on this fcn
    time_str = time_str_ms(timeit(lambda: fcn(a), number=loops_to_test))
    # Get the output string for this function
    fcn_str = '{}(a):'.format(fcn.__name__)
    # Add the correct string length to accommodate the maximum fcn name
    format_str = '{{fx:{}s}} {{time}}{{rev}}'.format(max_name_string + 4)
    print(format_str.format(fx=fcn_str, time=time_str, rev=out_str))

Timing Results

The results show that scaling works best with the built-in methods best suited for a given task. In other words, as the object element count increases, the built-in methods begin to have far superior performance results.

You are also better off using the built-in method that directly achieves what you need rather than stringing things together. For example, slicing is best if you need a copy of the reversed list: it’s faster than creating a list from the reversed() function, and faster than making a copy of the list and then doing an in-place obj.reverse(). If an in-place reversal or a bare iterator is really all you need, those are faster still. Meanwhile, custom manual methods can take orders of magnitude longer, especially with very large lists.

For scaling, with a 1000-item list, the reversed(<list>) function call takes ~30 ms to set up the iterator, reversing in place takes just ~55 ms, using the slice method takes ~210 ms to create a copy of the full reversed list, but the quickest manual method I made took ~8400 ms!

With 2 items in the list:

a: [0, 1]
Loops: 100,000
rev_in_place(a):             24.70 ms | out = [1, 0]
reversed_iterator(a):        30.48 ms | out = <list_reverseiterator object at 0x0000020242580408>
rev_slice(a):                31.65 ms | out = [1, 0]
rev_copy_reverse(a):         63.42 ms | out = [1, 0]
reversed_with_list(a):       48.65 ms | out = [1, 0]
rev_manual_pos_gen(a):       98.94 ms | out = [1, 0]
rev_manual_neg_gen(a):       88.11 ms | out = [1, 0]
rev_manual_index_loop(a):    87.23 ms | out = [1, 0]
rev_manual_loop(a):          79.24 ms | out = [1, 0]

With 10 items in the list:

rev_in_place(a):             23.39 ms | out = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
reversed_iterator(a):        30.23 ms | out = <list_reverseiterator object at 0x00000290A3CB0388>
rev_slice(a):                36.01 ms | out = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
rev_copy_reverse(a):         64.67 ms | out = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
reversed_with_list(a):       50.77 ms | out = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
rev_manual_pos_gen(a):      162.83 ms | out = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
rev_manual_neg_gen(a):      167.43 ms | out = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
rev_manual_index_loop(a):   152.04 ms | out = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
rev_manual_loop(a):         183.01 ms | out = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

And with 1000 items in the list:

rev_in_place(a):             56.37 ms
reversed_iterator(a):        30.47 ms
rev_slice(a):               211.42 ms
rev_copy_reverse(a):        295.74 ms
reversed_with_list(a):      418.45 ms
rev_manual_pos_gen(a):     8410.01 ms
rev_manual_neg_gen(a):    11054.84 ms
rev_manual_index_loop(a): 10543.11 ms
rev_manual_loop(a):       15472.66 ms

Answer 15

Using some logic

Using some old school logic to practice for interviews.

Swap elements from front to back, using two pointers at index 0 and the last index:

def reverse(array):
    n = array
    first = 0
    last = len(array) - 1
    while first < last:
        holder = n[first]
        n[first] = n[last]
        n[last] = holder
        first += 1
        last -= 1
    return n

input -> [-1, 1, 2, 3, 4, 5, 6]
output -> [6, 5, 4, 3, 2, 1, -1]

Answer 16

You can also use the bitwise complement of the array index to step through the array in reverse:

>>> array = [0, 10, 20, 40]
>>> [array[~i] for i, _ in enumerate(array)]
[40, 20, 10, 0]

Whatever you do, don’t do it this way.
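
For the curious, the trick relies on the identity ~i == -i - 1 for Python integers, so ~0 is -1 (the last item), ~1 is -2, and so on. A quick sketch:

```python
array = [0, 10, 20, 40]

# For any int i, ~i == -i - 1, so ~0 is -1 (last item), ~1 is -2, and so on.
pairs = [(i, ~i) for i in range(len(array))]
via_complement = [array[~i] for i in range(len(array))]
```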


Answer 17

Use a list comprehension:

[array[n] for n in range(len(array)-1, -1, -1)]

Answer 18

Another solution would be to use numpy.flip for this:

import numpy as np
array = [0, 10, 20, 40]
np.flip(array).tolist()  # tolist() converts back to a list of plain Python ints
[40, 20, 10, 0]

Answer 19

Strictly speaking, the question is not how to return a list in reverse but rather how to reverse a list with an example list name array.

To reverse a list named "array" use array.reverse().

The incredibly useful slice method described above can also be used here: array = array[::-1] rebinds the name array to a new, reversed list. To reverse the existing list object in place with a slice, assign into it instead: array[:] = array[::-1].
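
One caveat worth checking for yourself: plain assignment from a slice creates a new list, so any other name that refers to the old list will not see the change. A sketch of the difference (alias is just an illustrative second name):

```python
array = [1, 2, 3]
alias = array               # second name for the same list object

array = array[::-1]         # rebinds `array` to a new list; `alias` is unchanged
rebound_alias = list(alias) # snapshot of what `alias` sees at this point

array = alias               # point both names at the same list again
array[:] = array[::-1]      # slice assignment replaces the contents in place
```

Only the slice assignment (or array.reverse()) changes the list that other references point to.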


Answer 20

def reverse(text):
    output = []
    for i in range(len(text)-1, -1, -1):
        output.append(text[i])
    return output

Answer 21

Using a minimal number of built-in functions, assuming an interview setting:

array = [1, 2, 3, 4, 5, 6,7, 8]
inverse = [] #create container for inverse array
length = len(array)  #to iterate later, returns 8 
counter = length - 1  #because the 8th element is on position 7 (as python starts from 0)

for i in range(length): 
   inverse.append(array[counter])
   counter -= 1
print(inverse)

Answer 22

The most direct translation of your requirement into Python is this for statement:

for i in range(len(array) - 1, -1, -1):  # use xrange on Python 2
    print(i, array[i])

This is rather cryptic but may be useful.


Answer 23

def reverse(my_list):
    L = len(my_list)
    for i in range(L // 2):  # integer division, so this also runs on Python 3
        my_list[i], my_list[L - i - 1] = my_list[L - i - 1], my_list[i]
    return my_list

Answer 24

You could always treat the list like a stack, popping elements off the back end of the list (the top of the stack). That way you take advantage of the last-in, first-out behavior of a stack. Of course, this consumes the original list. I do like this method because it’s pretty intuitive: you see one list being consumed from the back end while the other is being built from the front end.

>>> l = [1,2,3,4,5,6]; nl=[]
>>> while l:
...     nl.append(l.pop())
...
>>> print(nl)
[6, 5, 4, 3, 2, 1]

Answer 25

list_data = [1,2,3,4,5]
l = len(list_data)
i=l+1
rev_data = []
while l>0:
  j=i-l
  l-=1
  rev_data.append(list_data[-j])
print("After Rev:- %s" % rev_data)

Answer 26

Use

print(list(reversed(list_name)))  # reversed() returns an iterator, so wrap it in list() to print the items

Answer 27

Here’s a way to lazily evaluate the reverse using a generator:

def reverse(seq):
    for x in range(len(seq) - 1, -1, -1):  # Count down from the last index to 0
        yield seq[x]  # Yield one item at a time

Now iterate through like this:

for x in reverse([1, 2, 3]):
    print(x)

If you need a list:

l = list(reverse([1, 2, 3]))

Answer 28

There are 3 methods to get the reversed list:

  1. Slicing Method 1: reversed_array = array[-1::-1]

  2. Slicing Method 2: reversed_array2 = array[::-1]

  3. Using the built-in method: array.reverse()

The third method reverses the list object in place and returns None, so there is nothing useful to assign from it. No copy of the pristine data is kept. This is a good approach if you don’t want to keep the old version, but it is not a solution if you want both the pristine and the reversed version.
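
A quick way to convince yourself of the difference, as a minimal sketch with sample data:

```python
array = [1, 2, 3]

result = array.reverse()      # reverses in place; the method returns None
keep_both = array[::-1]       # a reversed copy, leaving `array` as it is
```

Here result is None, array has been reversed, and keep_both holds the original order again because it is a reversed copy of the already reversed list.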


Answer 29

>>> from functools import reduce  # needed on Python 3
>>> l = [1, 2, 3, 4, 5]
>>> print(reduce(lambda acc, x: [x] + acc, l, []))
[5, 4, 3, 2, 1]

What’s the difference between lists and tuples?

Question: What’s the difference between lists and tuples?

What’s the difference?

What are the advantages / disadvantages of tuples / lists?


Answer 0

Apart from tuples being immutable there is also a semantic distinction that should guide their usage. Tuples are heterogeneous data structures (i.e., their entries have different meanings), while lists are homogeneous sequences. Tuples have structure, lists have order.

Using this distinction makes code more explicit and understandable.

One example would be pairs of page and line number to reference locations in a book, e.g.:

my_location = (42, 11)  # page number, line number

You can then use this as a key in a dictionary to store notes on locations. A list on the other hand could be used to store multiple locations. Naturally one might want to add or remove locations from the list, so it makes sense that lists are mutable. On the other hand it doesn’t make sense to add or remove items from an existing location – hence tuples are immutable.
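
A small sketch of that idea, where notes is a hypothetical dictionary of comments keyed by (page, line) tuples:

```python
# notes is a hypothetical mapping from (page, line) locations to comments
notes = {}
notes[(42, 11)] = "Key definition"
notes[(42, 17)] = "Worked example"

page, line = (42, 11)                 # unpack the fixed-structure tuple
locations = sorted(notes)             # a list of locations can grow and shrink
locations.append((43, 2))

list_key_rejected = False
try:
    notes[[42, 11]] = "lists are unhashable"
except TypeError:
    list_key_rejected = True          # a list cannot be used as a dict key
```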

There might be situations where you want to change items within an existing location tuple, for example when iterating through the lines of a page. But tuple immutability forces you to create a new location tuple for each new value. This seems inconvenient on the face of it, but using immutable data like this is a cornerstone of value types and functional programming techniques, which can have substantial advantages.

There are some interesting articles on this issue, e.g. “Python Tuples are Not Just Constant Lists” or “Understanding tuples vs. lists in Python”. The official Python documentation also mentions this:

“Tuples are immutable, and usually contain an heterogeneous sequence …”.

In a statically typed language like Haskell the values in a tuple generally have different types and the length of the tuple must be fixed. In a list the values all have the same type and the length is not fixed. So the difference is very obvious.

Finally there is the namedtuple in Python, which makes sense because a tuple is already supposed to have structure. This underlines the idea that tuples are a light-weight alternative to classes and instances.
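
As an illustration, the page/line location from earlier could be written with collections.namedtuple. The Location type and its field names are made up for this sketch:

```python
from collections import namedtuple

# Location is a hypothetical record type with named fields
Location = namedtuple("Location", ["page", "line"])

loc = Location(page=42, line=11)
page, line = loc                      # still unpacks like a plain tuple
next_line = loc._replace(line=12)     # "changing" a field returns a new tuple
```

The original loc is left untouched by _replace, which is the same immutability described above, just with field names attached.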


Answer 1

Difference between list and tuple

  1. Literal

    someTuple = (1,2)
    someList  = [1,2] 
    
  2. Size

    a = tuple(range(1000))
    b = list(range(1000))
    
    a.__sizeof__() # 8024
    b.__sizeof__() # 9088
    

    Because a tuple is smaller, operations on it are a bit faster, but the difference is not worth mentioning until you have a huge number of elements.

  3. Permitted operations

    b    = [1,2]   
    b[0] = 3       # [3, 2]
    
    a    = (1,2)
    a[0] = 3       # Error
    

    That also means that you can’t delete an element from a tuple or sort it in place. However, you can add a new element to both a list and a tuple. The only difference is that, since the tuple is immutable, you are not really adding an element but creating a new tuple, so its id will change:

    a     = (1,2)
    b     = [1,2]  
    
    id(a)          # 140230916716520
    id(b)          # 748527696
    
    a   += (3,)    # (1, 2, 3)
    b   += [3]     # [1, 2, 3]
    
    id(a)          # 140230916878160
    id(b)          # 748527696
    
  4. Usage

    As a list is mutable, it can’t be used as a key in a dictionary, whereas a tuple can be used.

    a    = (1,2)
    b    = [1,2] 
    
    c = {a: 1}     # OK
    c = {b: 1}     # Error
    

Answer 2

If you went for a walk, you could note your coordinates at any instant in an (x,y) tuple.

If you wanted to record your journey, you could append your location every few seconds to a list.

But you couldn’t do it the other way around.


Answer 3

The key difference is that tuples are immutable. This means that you cannot change the values in a tuple once you have created it.

So if you’re going to need to change the values, use a list.

Benefits to tuples:

  1. Slight performance improvement.
  2. As a tuple is immutable it can be used as a key in a dictionary.
  3. If you can’t change it neither can anyone else, which is to say you don’t need to worry about any API functions etc. changing your tuple without being asked.

Answer 4

Lists are mutable; tuples are not.

From docs.python.org/2/tutorial/datastructures.html

Tuples are immutable, and usually contain an heterogeneous sequence of elements that are accessed via unpacking (see later in this section) or indexing (or even by attribute in the case of namedtuples). Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.


Answer 5

It’s been mentioned that the difference is largely semantic: people expect a tuple and list to represent different information. But this goes further than a guideline; some libraries actually behave differently based on what they are passed. Take NumPy for example (copied from another post where I ask for more examples):

>>> import numpy as np
>>> a = np.arange(9).reshape(3,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> idx = (1,1)
>>> a[idx]
4
>>> idx = [1,1]
>>> a[idx]
array([[3, 4, 5],
       [3, 4, 5]])

The point is, while NumPy may not be part of the standard library, it’s a major Python library, and within NumPy lists and tuples are completely different things.


回答 6

列表用于循环,元组用于结构,即"%s %s" %tuple

列表通常是同质的,元组通常是异类的。

列表用于可变长度,元组用于固定长度。

Lists are for looping, tuples are for structures i.e. "%s %s" %tuple.

Lists are usually homogeneous, tuples are usually heterogeneous.

Lists are for variable length, tuples are for fixed length.
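A short sketch of the "%s %s" % tuple point, with hypothetical names: the % operator unpacks a tuple into the placeholders, but treats a list as a single value.

```python
person = ("John", "Wayne")            # fixed length, position carries meaning
greeting = "%s %s" % person           # the tuple fills the two placeholders

try:
    "%s %s" % ["John", "Wayne"]       # a list counts as ONE argument here
    list_worked = True
except TypeError:                     # not enough arguments for format string
    list_worked = False

names = ["Ann", "Bob", "Cy"]          # variable length, same kind of item
shouted = []
for n in names:                       # lists are the natural thing to loop over
    shouted.append(n.upper())
names.append("Dee")                   # a list can grow
```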


回答 7

这是Python列表的示例:

my_list = [0,1,2,3,4]
top_rock_list = ["Bohemian Rhapsody","Kashmir","Sweet Emotion", "Fortunate Son"]

这是Python元组的示例:

my_tuple = ("a","b","c","d","e")
celebrity_tuple = ("John", "Wayne", 90210, "Actor", "Male", "Dead")

Python列表和元组的相似之处在于它们都是值的有序集合。除了使用括号“ […,…]”创建列表的浅层差异以及使用括号“(…,…)”创建的元组之外,它们之间的核心技术“用Python语法进行硬编码”之间的差异是特定元组的元素是不可变的,而列表是可变的(…因此,只有元组是可哈希的,并且可以用作字典/哈希键!)。这就导致了它们的使用方式或不使用方式的差异(通过语法先验地实现)以及人们选择使用它们的方式上的差异(鼓励作为“最佳实践”,后验,这就是智能程序员所做的事情)。 人们赋予元素顺序。

对于元组,“顺序”仅表示存储信息的特定“结构”。在第一个字段中找到的值可以很容易地切换到第二个字段,因为每个值都提供跨两个不同维度或比例的值。它们为不同类型的问题提供答案,并且通常采用以下形式:对于给定的对象/对象,其属性是什么?对象/对象保持不变,属性不同。

对于列表,“顺序”表示顺序或方向。第二个元素必须位于第一个元素之后,因为它基于特定且通用的比例或维度位于第二位。这些元素是一个整体,并且通常针对一个给定属性的形式单个问题提供答案,对于给定的属性,这些对象/对象如何比较?属性保持不变,对象/主题不同。

有无数流行文化的人和不符合这些差异的程序员的例子,有无数人可能在主菜上使用色叉。一天结束后,一切都很好,通常都可以完成工作。

总结一些更好的细节

相似之处:

  1. 重复项 -元组和列表都允许重复项
  2. 索引,选择和切片 -元组和列表都使用括号内的整数值进行索引。因此,如果要给定列表或元组的前三个值,语法将是相同的:

    >>> my_list[0:3]
    [0, 1, 2]
    >>> my_tuple[0:3]
    ('a', 'b', 'c')
  3. 比较和排序 -两个元组或两个列表都通过它们的第一个元素进行比较,如果有平局,则通过第二个元素进行比较,依此类推。在较早的元素显示出不同之后,不再关注后续元素。

    >>> [0,2,0,0,0,0]>[0,0,0,0,0,500]
    True
    >>> (0,2,0,0,0,0)>(0,0,0,0,0,500)
    True

区别: -先验,根据定义

  1. 语法 -列表使用[],元组使用()

  2. 可变性 -给定列表中的元素是可变的,给定元组中的元素不是可变的。

    # Lists are mutable:
    >>> top_rock_list
    ['Bohemian Rhapsody', 'Kashmir', 'Sweet Emotion', 'Fortunate Son']
    >>> top_rock_list[1]
    'Kashmir'
    >>> top_rock_list[1] = "Stairway to Heaven"
    >>> top_rock_list
    ['Bohemian Rhapsody', 'Stairway to Heaven', 'Sweet Emotion', 'Fortunate Son']
    
    # Tuples are NOT mutable:       
    >>> celebrity_tuple
    ('John', 'Wayne', 90210, 'Actor', 'Male', 'Dead')
    >>> celebrity_tuple[5]
    'Dead'
    >>> celebrity_tuple[5]="Alive"
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: 'tuple' object does not support item assignment
  3. 哈希表(字典) -由于哈希表(字典)要求其键是可哈希的,因此是不可变的,因此只有元组可以用作字典键,而不能用作列表。

    #Lists CAN'T act as keys for hashtables(dictionaries)
    >>> my_dict = {["a","b","c"]:"some value"}
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'list'
    
    #Tuples CAN act as keys for hashtables(dictionaries)
    >>> my_dict = {("John","Wayne"): 90210}
    >>> my_dict
    {('John', 'Wayne'): 90210}

差异-后验用法

  1. 元素的均质性与异质性-通常,列表对象是同质的,而元组对象是异质的。也就是说,列表用于相同类型的对象/对象(例如所有总统候选人,所有歌曲或所有跑步者),而虽然不是强制的,但元组更多地用于异构对象。

  2. 循环与结构-尽管两者都允许循环(对于my_list中的x,…),但实际上对于列表而言才有意义。元组更适合于结构化和呈现信息(驻留在%s中的%s%s是%s,当前是%s%(“ John”,“ Wayne”,90210,“ Actor”,“ Dead”))

This is an example of Python lists:

my_list = [0,1,2,3,4]
top_rock_list = ["Bohemian Rhapsody","Kashmir","Sweet Emotion", "Fortunate Son"]

This is an example of Python tuple:

my_tuple = ("a","b","c","d","e")
celebrity_tuple = ("John", "Wayne", 90210, "Actor", "Male", "Dead")

Python lists and tuples are similar in that they both are ordered collections of values. Besides the shallow difference that lists are created using brackets “[ … , … ]” and tuples using parentheses “( … , … )”, the core technical difference between them, the one built into the language, is that a tuple is immutable whereas a list is mutable (so tuples, as long as their elements are hashable, can be used as dictionary keys, and lists cannot). This gives rise to differences in how they can or can’t be used (enforced a priori by the language) and differences in how people choose to use them (encouraged as ‘best practices’ a posteriori; this is what smart programmers do). The main a posteriori difference between when tuples are used and when lists are used lies in what meaning people give to the order of elements.

For tuples, ‘order’ signifies nothing more than just a specific ‘structure’ for holding information. What values are found in the first field can easily be switched into the second field as each provides values across two different dimensions or scales. They provide answers to different types of questions and are typically of the form: for a given object/subject, what are its attributes? The object/subject stays constant, the attributes differ.

For lists, ‘order’ signifies a sequence or a directionality. The second element MUST come after the first element because it’s positioned in the 2nd place based on a particular and common scale or dimension. The elements are taken as a whole and mostly provide answers to a single question typically of the form, for a given attribute, how do these objects/subjects compare? The attribute stays constant, the object/subject differs.

There are countless examples of people in popular culture and programmers who don’t conform to these differences and there are countless people who might use a salad fork for their main course. At the end of the day, it’s fine and both can usually get the job done.

To summarize some of the finer details

Similarities:

  1. Duplicates – Both tuples and lists allow for duplicates
  2. Indexing, Selecting, & Slicing – Both tuples and lists index using integer values found within brackets. So, if you want the first 3 values of a given list or tuple, the syntax would be the same:

    >>> my_list[0:3]
    [0, 1, 2]
    >>> my_tuple[0:3]
    ('a', 'b', 'c')
    
  3. Comparing & Sorting – Two tuples or two lists are both compared by their first element, and if there is a tie, then by the second element, and so on. No further attention is paid to subsequent elements after earlier elements show a difference.

    >>> [0,2,0,0,0,0]>[0,0,0,0,0,500]
    True
    >>> (0,2,0,0,0,0)>(0,0,0,0,0,500)
    True
    

Differences: – A priori, by definition

  1. Syntax – Lists use [], tuples use ()

  2. Mutability – The elements of a given list can be reassigned; the elements of a given tuple can NOT.

    # Lists are mutable:
    >>> top_rock_list
    ['Bohemian Rhapsody', 'Kashmir', 'Sweet Emotion', 'Fortunate Son']
    >>> top_rock_list[1]
    'Kashmir'
    >>> top_rock_list[1] = "Stairway to Heaven"
    >>> top_rock_list
    ['Bohemian Rhapsody', 'Stairway to Heaven', 'Sweet Emotion', 'Fortunate Son']
    
    # Tuples are NOT mutable:       
    >>> celebrity_tuple
    ('John', 'Wayne', 90210, 'Actor', 'Male', 'Dead')
    >>> celebrity_tuple[5]
    'Dead'
    >>> celebrity_tuple[5]="Alive"
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: 'tuple' object does not support item assignment
    
  3. Hashtables (Dictionaries) – Because hashtables (dictionaries) require their keys to be hashable, and the built-in mutable containers are not, tuples can act as dictionary keys but lists cannot.

    #Lists CAN'T act as keys for hashtables(dictionaries)
    >>> my_dict = {["a","b","c"]:"some value"}
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'list'
    
    #Tuples CAN act as keys for hashtables(dictionaries)
    >>> my_dict = {("John","Wayne"): 90210}
    >>> my_dict
    {('John', 'Wayne'): 90210}
    

Differences – A posteriori, in usage

  1. Homogeneity vs. Heterogeneity of Elements – Generally list objects are homogeneous and tuple objects are heterogeneous. That is, lists are used for objects/subjects of the same type (like all presidential candidates, or all songs, or all runners), whereas tuples are more for heterogeneous objects, although the language does not enforce this.

  2. Looping vs. Structures – Although both allow for looping (for x in my_list: …), it only really makes sense to do it for a list. Tuples are more appropriate for structuring and presenting information ("%s %s residing in %s is an %s and presently %s" % ("John", "Wayne", 90210, "Actor", "Dead"))


回答 8

list的值可以随时更改,但是元组的值不能更改。

优点和缺点取决于使用。如果您拥有从未更改过的数据,则必须使用元组,否则list是最佳选择。

The values of a list can be changed at any time, but the values of a tuple cannot be changed.

The advantages and disadvantages depend on the use. If you have data that you never want to change, use a tuple; otherwise a list is the better option.


回答 9

列表和元组之间的区别

元组和列表在Python中似乎都是相似的序列类型。

  1. 文字语法

    我们使用括号()构造元组和方括号[ ]以获取新列表。另外,我们可以使用适当类型的调用来获取所需的结构-元组或列表。

    someTuple = (4,6)
    someList  = [2,6] 
  2. 变异性

    元组是不可变的,而列表是可变的。这是以下几点的基础。

  3. 内存使用情况

    由于可变性,您需要更多的内存用于列表,而更少的内存用于元组。

  4. 延伸

    您可以将新元素添加到元组和列表中,唯一的区别是将更改元组的ID(即,我们将有一个新的对象)。

  5. 散列

    元组可散列,而列表则不可。这意味着您可以将元组用作字典中的键。该列表不能用作字典中的键,而可以使用元组

    tup      = (1,2)
    list_    = [1,2] 
    
    c = {tup   : 1}     # ok
    c = {list_ : 1}     # error
  6. 语义学

    这一点是关于最佳实践的。您应该将元组用作异构数据结构,而列表则是同质序列。

Difference between list and tuple

Tuples and lists are both seemingly similar sequence types in Python.

  1. Literal syntax

    We use parentheses () to construct tuples and square brackets [ ] to get a new list. We can also call the appropriate type, tuple() or list(), to get the required structure.

    someTuple = (4,6)
    someList  = [2,6] 
    
  2. Mutability

    Tuples are immutable, while lists are mutable. This point is the basis for the following ones.

  3. Memory usage

    Because a list must be able to grow, it needs more memory than a tuple holding the same elements.

  4. Extending

    You can add a new element to both tuples and lists; the only difference is that a list is extended in place, whereas extending a tuple (for example with +) produces a new object, so its id changes.

  5. Hashing

    Tuples are hashable and lists are not. This means that you can use a tuple as a key in a dictionary, whereas a list cannot be used as a key:

    tup      = (1,2)
    list_    = [1,2] 
    
    c = {tup   : 1}     # ok
    c = {list_ : 1}     # error
    
  6. Semantics

    This point is more about best practice. You should use tuples as heterogeneous data structures, while lists are homogenous sequences.
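The "Extending" point in the list above can be checked with a small sketch. The variable names are hypothetical; a second reference to the old tuple is kept so the comparison does not rely on id values being reused.

```python
some_list = [2, 6]
list_id_before = id(some_list)
some_list += [7]                 # in place: same object, now [2, 6, 7]
list_same_object = id(some_list) == list_id_before

some_tuple = (4, 6)
original = some_tuple            # keep a second reference to the old object
some_tuple += (7,)               # builds a new tuple and rebinds the name
tuple_same_object = some_tuple is original
```

After this runs, `original` is still (4, 6): the old tuple was never modified, only the name `some_tuple` was pointed at a new object.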


回答 10

列表旨在为同质序列,而元组为异构数据结构。

Lists are intended to be homogeneous sequences, while tuples are heterogeneous data structures.


回答 11

正如人们已经在这里回答的那样tuples,虽然lists可变但可变是不变的,但是使用元组有一个重要方面,我们必须记住

如果中tuple包含一个listdictionary内部,则即使它们tuple本身是不可变的,也可以更改它们。

例如,假设我们有一个元组,其中包含一个列表和一个字典,如下所示

my_tuple = (10,20,30,[40,50],{ 'a' : 10})

我们可以将列表的内容更改为

my_tuple[3][0] = 400
my_tuple[3][1] = 500

这使得新的元组看起来像

(10, 20, 30, [400, 500], {'a': 10})

我们也可以将元组中的字典更改为

my_tuple[4]['a'] = 500

这将使整个元组看起来像

(10, 20, 30, [400, 500], {'a': 500})

这是因为 listdictionary是对象,而这些对象并没有改变,而是其指向的内容。

因此,这些tuple遗物毫无exceptions地保持不变

As people have already answered here, tuples are immutable while lists are mutable, but there is one important aspect of using tuples which we must remember:

If the tuple contains a list or a dictionary inside it, those can be changed even if the tuple itself is immutable.

For example, let’s assume we have a tuple which contains a list and a dictionary as

my_tuple = (10,20,30,[40,50],{ 'a' : 10})

we can change the contents of the list as

my_tuple[3][0] = 400
my_tuple[3][1] = 500

which makes the tuple look like

(10, 20, 30, [400, 500], {'a': 10})

we can also change the dictionary inside tuple as

my_tuple[4]['a'] = 500

which will make the overall tuple look like

(10, 20, 30, [400, 500], {'a': 500})

This happens because the list and the dictionary are objects in their own right: the tuple still refers to the same objects, and only the contents of those objects change.

So the tuple itself remains immutable, without exception.
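One practical consequence, sketched here with hypothetical names: a tuple that contains a list cannot be hashed, so it cannot serve as a dictionary key, even though the tuple object itself never changes.

```python
plain = (10, 20, 30)
nested = (10, 20, 30, [40, 50])

lookup = {plain: "ok"}           # fine: every element is hashable

try:
    lookup[nested] = "fails"     # hashing the tuple hashes its elements
    nested_is_hashable = True
except TypeError:                # unhashable type: 'list'
    nested_is_hashable = False

inner = nested[3]
nested[3].append(60)             # mutates the list, not the tuple
still_same_list = nested[3] is inner
```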


回答 12

PEP 484 -类型提示说,该类型的元素tuple可以单独输入; 这样你可以说Tuple[str, int, float]; 但是list,随着List键入类可以采取仅一种类型的参数:List[str],这提示了2的差异确实是,前者是异质的,而后者本质上是均匀的。

另外,标准库通常使用元组作为C会返回a的标准函数的返回值struct

PEP 484 (Type Hints) says that the types of the elements of a tuple can be given individually, so that you can write Tuple[str, int, float]; a list, however, with the List typing class, takes only one type parameter: List[str]. This hints that the real difference between the two is that the former is heterogeneous, whereas the latter is intrinsically homogeneous.

Also, the standard library mostly uses a tuple as the return value of standard functions where C would return a struct.
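A small sketch of both points. The function `summarize` is hypothetical; `os.stat` is used as one example of a standard-library call whose result behaves like a struct and like a tuple at once.

```python
import os
import stat
from typing import List, Tuple

def summarize(scores: List[float]) -> Tuple[str, int, float]:
    """Return (label, count, mean): one type per position in the tuple."""
    count = len(scores)
    mean = sum(scores) / count if count else 0.0
    return ("scores", count, mean)

label, count, mean = summarize([1.0, 2.0, 3.0])

# os.stat() returns a struct-like result usable by index or by attribute.
info = os.stat(".")
size_by_index = info[stat.ST_SIZE]
size_by_name = info.st_size
```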


回答 13

正如人们已经提到的差异一样,我将写有关元组的原因。

为什么首选元组?

小元组的分配优化

为了减少内存碎片并加快分配速度,Python重用了旧的元组。如果不再需要一个元组,并且元组少于20个,而不是将其永久删除,Python会将其移至空闲列表。

一个空闲列表分为20组,其中每个组代表长度为n的0至20之间的元组列表。每个组最多可以存储2000个元组。第一个(零)组仅包含一个元素,代表一个空的元组。

>>> a = (1,2,3)
>>> id(a)
4427578104
>>> del a
>>> b = (1,2,4)
>>> id(b)
4427578104

在上面的示例中,我们可以看到a和b具有相同的ID。那是因为我们立即占领了一个在空闲列表中的被破坏的元组。

列表分配优化

由于可以修改列表,因此Python不会使用与元组相同的优化。但是,Python列表也有一个空闲列表,但仅用于空对象。如果GC删除或收集了一个空列表,则以后可以重复使用。

>>> a = []
>>> id(a)
4465566792
>>> del a
>>> b = []
>>> id(b)
4465566792

资料来源:https : //rushter.com/blog/python-lists-and-tuples/

为什么元组比列表高效?-> https://stackoverflow.com/a/22140115

As people have already mentioned the differences, I will write about why to use tuples.

Why are tuples preferred?

Allocation optimization for small tuples

To reduce memory fragmentation and speed up allocations, Python reuses old tuples. If a tuple is no longer needed and has fewer than 20 items, then instead of deleting it permanently Python moves it to a free list.

A free list is divided into 20 groups, where each group represents a list of tuples of length n between 0 and 20. Each group can store up to 2,000 tuples. The first (zero) group contains only 1 element and represents an empty tuple.

>>> a = (1,2,3)
>>> id(a)
4427578104
>>> del a
>>> b = (1,2,4)
>>> id(b)
4427578104

In the example above we can see that a and b have the same id. That is because we immediately occupied a destroyed tuple which was on the free list.

Allocation optimization for lists

Since lists can be modified, Python does not use the same optimization as in tuples. However, Python lists also have a free list, but it is used only for empty objects. If an empty list is deleted or collected by GC, it can be reused later.

>>> a = []
>>> id(a)
4465566792
>>> del a
>>> b = []
>>> id(b)
4465566792

Source: https://rushter.com/blog/python-lists-and-tuples/

Why are tuples more efficient than lists? -> https://stackoverflow.com/a/22140115


回答 14

5.3文档中的方向引文元组和序列

尽管元组看起来类似于列表,但是它们通常用于不同的情况和不同的目的。元组是不可变的,并且通常包含异类元素序列,这些元素可以通过拆包(请参阅本节后面的内容)或索引(甚至在namedtuple的情况下通过属性)进行访问。列表是可变的,并且它们的元素通常是同质的,可以通过迭代列表来访问。

A direct quotation from the documentation, section 5.3, Tuples and Sequences:

Though tuples may seem similar to lists, they are often used in different situations and for different purposes. Tuples are immutable, and usually contain a heterogeneous sequence of elements that are accessed via unpacking (see later in this section) or indexing (or even by attribute in the case of namedtuples). Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.


回答 15

首先,它们都是Python中的非标量对象(也称为复合对象)。

  • 元组,元素的有序序列(可以包含任何对象,而不会出现别名问题)
    • 不可变的(元组,整数,浮点数,str)
    • 使用串联+(当然会创建全新的元组)
    • 索引编制
    • 切片
    • 单例(3,) # -> (3)而不是(3) # -> 3
  • 列表(其他语言的数组),值的有序序列
    • 可变的
    • 辛格尔顿 [3]
    • 克隆 new_array = origin_array[:]
    • 列表理解[x**2 for x in range(1,7)]给您 [1,4,9,16,25,36](不可读)

使用列表可能还会导致混淆错误(指向同一对象的两个不同路径)。

First of all, they both are non-scalar objects (also known as compound objects) in Python.

  • Tuples, ordered sequence of elements (which can contain any object with no aliasing issue)
    • Immutable (tuple, int, float, str)
    • Concatenation using + (brand new tuple will be created of course)
    • Indexing
    • Slicing
    • Singleton (3,) # -> (3,) instead of (3) # -> 3
  • List (Array in other languages), ordered sequence of values
    • Mutable
    • Singleton [3]
    • Cloning new_array = origin_array[:]
    • List comprehension [x**2 for x in range(1,7)] gives you [1,4,9,16,25,36] (Not readable)

Using list may also cause an aliasing bug (two distinct paths pointing to the same object).
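The aliasing bug and the cloning idiom from the bullets above, as a minimal sketch (variable names other than `origin_array` are hypothetical):

```python
origin_array = [1, 2, 3]
alias = origin_array          # second name for the same list
clone = origin_array[:]       # shallow copy: a separate list

alias.append(4)               # also visible through origin_array

single = (3,)                 # a one-element tuple needs the trailing comma
not_a_tuple = (3)             # just the int 3 in parentheses
```

After the append, `origin_array` has four elements while `clone` still has three, which is the surprise that aliasing tends to cause.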


回答 16

列表是可变的,元组是不可变的。只要考虑这个例子。

a = ["1", "2", "ra", "sa"]    #list
b = ("1", "2", "ra", "sa")    #tuple

现在更改list和tuple的索引值。

a[2] = 1000
print a     #output : ['1', '2', 1000, 'sa']
b[2] = 1000
print b     #output : TypeError: 'tuple' object does not support item assignment.

因此证明了以下代码对元组无效,因为我们试图更新一个元组,这是不允许的。

Lists are mutable and tuples are immutable. Just consider this example.

a = ["1", "2", "ra", "sa"]    #list
b = ("1", "2", "ra", "sa")    #tuple

Now change index values of list and tuple.

a[2] = 1000
print(a)    # output: ['1', '2', 1000, 'sa']
b[2] = 1000 # raises TypeError: 'tuple' object does not support item assignment
print(b)    # never reached, because the line above raises

This shows that the assignment is invalid for a tuple: we attempted to update a tuple, which is not allowed.


回答 17

列表是可变的,元组是不可变的。可变项和不可变项之间的主要区别是在尝试附加项目时的内存使用情况。

创建变量时,会将一些固定内存分配给该变量。如果是列表,则分配的内存将大于实际使用的内存。例如,如果当前内存分配为100字节,则当您要追加第101个字节时,可能会另外分配100个字节(在这种情况下,总共为200个字节)。

但是,如果您知道不经常添加新元素,则应使用元组。元组精确分配所需的内存大小,从而节省了内存,尤其是在使用大容量内存块时。

A list is mutable and a tuple is immutable. A practical difference between the two is memory usage when you are trying to append an item.

When you create a variable, some fixed memory is assigned to it. If it is a list, more memory is assigned than is actually used. For example, if the current allocation is 100 bytes, then when you want to append the 101st byte, another 100 bytes may be assigned (200 bytes in total in this case).

However, if you know that you will not frequently add new elements, then you should use tuples. A tuple is assigned exactly the amount of memory it needs, and hence saves memory, especially when you use large blocks of memory.
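The over-allocation can be observed with sys.getsizeof. This is a sketch that assumes CPython; the exact byte counts and growth pattern are implementation details and differ between versions, so none are quoted here.

```python
import sys

items = []
sizes = []
for i in range(32):
    items.append(i)
    sizes.append(sys.getsizeof(items))   # bytes used by the list object itself

distinct_sizes = sorted(set(sizes))      # the size jumps only now and then
as_tuple = tuple(items)                  # sized for exactly 32 elements

print("list size changed", len(distinct_sizes), "times over 32 appends")
print("list:", sys.getsizeof(items), "bytes; tuple:", sys.getsizeof(as_tuple), "bytes")
```

The list's size stays flat across several appends and then jumps, which is the spare capacity described above.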


删除列表中的重复项

问题:删除列表中的重复项

我几乎需要编写一个程序来检查列表中是否有重复项,如果删除了重复项,则将其删除并返回一个新列表,其中包含未重复/删除的项。这就是我所拥有的,但老实说我不知道​​该怎么办。

def remove_duplicates():
    t = ['a', 'b', 'c', 'd']
    t2 = ['a', 'c', 'd']
    for t in t2:
        t.append(t.remove())
    return t

Pretty much I need to write a program to check if a list has any duplicates and if it does it removes them and returns a new list with the items that weren’t duplicated/removed. This is what I have but to be honest I do not know what to do.

def remove_duplicates():
    t = ['a', 'b', 'c', 'd']
    t2 = ['a', 'c', 'd']
    for t in t2:
        t.append(t.remove())
    return t

回答 0

获取唯一项目集合的常用方法是使用set。集是不同对象的无序集合。要从任何迭代创建集合,只需将其传递给内置函数即可。如果以后再次需要真实列表,则可以类似地将集合传递给set()list()函数。

以下示例应涵盖您尝试做的所有事情:

>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> list(set(t))
[1, 2, 3, 5, 6, 7, 8]
>>> s = [1, 2, 3]
>>> list(set(t) - set(s))
[8, 5, 6, 7]

从示例结果可以看出,原始订单未得到维护。如上所述,集合本身是无序集合,因此顺序丢失。将集合转换回列表时,将创建任意顺序。

维持秩序

如果订单对您很重要,那么您将不得不使用其他机制。一个非常常见的解决方案是OrderedDict在插入期间依靠保持键的顺序:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

从Python 3.7开始,内置字典也保证可以保持插入顺序,因此,如果您使用的是Python 3.7或更高版本(或CPython 3.6),也可以直接使用它:

>>> list(dict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

请注意,这可能会产生一些开销,先创建字典,然后再从中创建列表。如果您实际上不需要保留订单,则通常最好使用一组,特别是因为它可以为您提供更多操作。请查看此问题,以获取更多详细信息以及删除重复项时保留订单的其他方法。


最后请注意,解决方案setOrderedDict/ dict解决方案都要求您的项目是可哈希的。这通常意味着它们必须是不变的。如果必须处理不可散列的项目(例如列表对象),则必须使用慢速方法,在这种方法中,您基本上必须将每个项目与嵌套循环中的所有其他项目进行比较。

The common approach to get a unique collection of items is to use a set. Sets are unordered collections of distinct objects. To create a set from any iterable, you can simply pass it to the built-in set() function. If you later need a real list again, you can similarly pass the set to the list() function.

The following example should cover whatever you are trying to do:

>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> list(set(t))
[1, 2, 3, 5, 6, 7, 8]
>>> s = [1, 2, 3]
>>> list(set(t) - set(s))
[8, 5, 6, 7]

As you can see from the example result, the original order is not maintained. As mentioned above, sets themselves are unordered collections, so the order is lost. When converting a set back to a list, an arbitrary order is created.

Maintaining order

If order is important to you, then you will have to use a different mechanism. A very common solution for this is to rely on OrderedDict to keep the order of keys during insertion:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

Starting with Python 3.7, the built-in dictionary is guaranteed to maintain the insertion order as well, so you can also use that directly if you are on Python 3.7 or later (or CPython 3.6):

>>> list(dict.fromkeys(t))
[1, 2, 3, 5, 6, 7, 8]

Note that this may have some overhead of creating a dictionary first, and then creating a list from it. If you don’t actually need to preserve the order, you’re often better off using a set, especially because it gives you a lot more operations to work with. Check out this question for more details and alternative ways to preserve the order when removing duplicates.


Finally note that both the set as well as the OrderedDict/dict solutions require your items to be hashable. This usually means that they have to be immutable. If you have to deal with items that are not hashable (e.g. list objects), then you will have to use a slow approach in which you will basically have to compare every item with every other item in a nested loop.
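The slow approach for unhashable items might look like the sketch below. The function name `dedupe_unhashable` is hypothetical. It relies only on ==, so it costs on the order of n² comparisons.

```python
def dedupe_unhashable(items):
    """Order-preserving de-duplication using only ==, so O(n**2) comparisons."""
    result = []
    for item in items:
        if item not in result:      # linear scan of what has been kept so far
            result.append(item)
    return result

rows = [[1, 2], [3], [1, 2], [], [3]]
unique_rows = dedupe_unhashable(rows)
print(unique_rows)                  # [[1, 2], [3], []]
```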


回答 1

在Python 2.7中,从迭代器中删除重复项并同时保持其原始顺序的新方法是:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

在Python 3.5中,OrderedDict具有C实现。我的时间表明,这是Python 3.5各种方法中最快也是最短的。

在Python 3.6中,常规字典变得有序且紧凑。(此功能适用于CPython和PyPy,但在其他实现中可能不存在)。这为我们提供了一种在保留订单的同时进行重复数据删除的最快方法:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

在Python 3.7中,保证常规dict在所有实现中都排序。 因此,最短,最快的解决方案是:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

In Python 2.7, the new way of removing duplicates from an iterable while keeping it in the original order is:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

In Python 3.5, the OrderedDict has a C implementation. My timings show that this is now both the fastest and shortest of the various approaches for Python 3.5.

In Python 3.6, the regular dict became both ordered and compact. (This feature holds for CPython and PyPy but may not be present in other implementations.) That gives us a new fastest way of deduping while retaining order:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

In Python 3.7, the regular dict is guaranteed to be ordered across all implementations. So, the shortest and fastest solution is:

>>> list(dict.fromkeys('abracadabra'))
['a', 'b', 'r', 'c', 'd']

回答 2

这是单线的:list(set(source_list))会成功的。

A set是不可能重复的东西。

更新:保留订单的方法有两行:

from collections import OrderedDict
OrderedDict((x, True) for x in source_list).keys()

在这里,我们使用一个事实,即OrderedDict记住键的插入顺序,并且在更新特定键的值时不会更改它。我们插入True作为值,但是我们可以插入任何东西,只是不使用值。(也set很像dict带有忽略值的a 。)

It’s a one-liner: list(set(source_list)) will do the trick.

A set is something that can’t possibly have duplicates.

Update: an order-preserving approach is two lines:

from collections import OrderedDict
list(OrderedDict((x, True) for x in source_list))

Here we use the fact that OrderedDict remembers the insertion order of keys, and does not change it when a value at a particular key is updated. We insert True as values, but we could insert anything, values are just not used. (set works a lot like a dict with ignored values, too.)


回答 3

>>> t = [1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> t
[1, 2, 3, 1, 2, 5, 6, 7, 8]
>>> s = []
>>> for i in t:
...     if i not in s:
...         s.append(i)
...
>>> s
[1, 2, 3, 5, 6, 7, 8]

回答 4

如果您不关心订单,请执行以下操作:

def remove_duplicates(l):
    return list(set(l))

set保证A 没有重复项。

If you don’t care about the order, just do this:

def remove_duplicates(l):
    return list(set(l))

A set is guaranteed to not have duplicates.


回答 5

制作一个新列表,其中保留重复项中第一个元素的顺序 L

newlist=[ii for n,ii in enumerate(L) if ii not in L[:n]]

例如,if L=[1, 2, 2, 3, 4, 2, 4, 3, 5]那么newlist将是[1,2,3,4,5]

这会在添加每个新元素之前检查它是否没有出现在列表中。而且它不需要进口。

To make a new list retaining the order of the first occurrence of each element in L:

newlist=[ii for n,ii in enumerate(L) if ii not in L[:n]]

For example, if L=[1, 2, 2, 3, 4, 2, 4, 3, 5] then newlist will be [1,2,3,4,5].

This checks that each new element has not appeared previously in the list before adding it. It also needs no imports.


回答 6

一位同事已将接受的答案作为他的代码的一部分发送给我,以供今天进行代码审查。尽管我当然很欣赏所提问题的优雅之处,但我对这种表现并不满意。我已经尝试过此解决方案(我使用set来减少查找时间)

def ordered_set(in_list):
    out_list = []
    added = set()
    for val in in_list:
        if not val in added:
            out_list.append(val)
            added.add(val)
    return out_list

为了比较效率,我使用了100个整数的随机样本-62个是唯一的

from random import randint
x = [randint(0,100) for _ in xrange(100)]

In [131]: len(set(x))
Out[131]: 62

这是测量结果

In [129]: %timeit list(OrderedDict.fromkeys(x))
10000 loops, best of 3: 86.4 us per loop

In [130]: %timeit ordered_set(x)
100000 loops, best of 3: 15.1 us per loop

好吧,如果将集合从解决方案中删除,会发生什么?

def ordered_set(inlist):
    out_list = []
    for val in inlist:
        if not val in out_list:
            out_list.append(val)
    return out_list

结果不如OrderedDict差,但仍然是原始解决方案的3倍以上

In [136]: %timeit ordered_set(x)
10000 loops, best of 3: 52.6 us per loop

A colleague sent me the accepted answer as part of his code for a code review today. While I certainly admire the elegance of the answer in question, I am not happy with the performance. I have tried this solution (I use a set to reduce lookup time):

def ordered_set(in_list):
    out_list = []
    added = set()
    for val in in_list:
        if not val in added:
            out_list.append(val)
            added.add(val)
    return out_list

To compare efficiency, I used a random sample of 100 integers – 62 were unique

from random import randint
x = [randint(0,100) for _ in xrange(100)]

In [131]: len(set(x))
Out[131]: 62

Here are the results of the measurements

In [129]: %timeit list(OrderedDict.fromkeys(x))
10000 loops, best of 3: 86.4 us per loop

In [130]: %timeit ordered_set(x)
100000 loops, best of 3: 15.1 us per loop

Well, what happens if set is removed from the solution?

def ordered_set(inlist):
    out_list = []
    for val in inlist:
        if not val in out_list:
            out_list.append(val)
    return out_list

The result is not as bad as with the OrderedDict, but still more than 3 times slower than the original solution:

In [136]: %timeit ordered_set(x)
10000 loops, best of 3: 52.6 us per loop
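The timings above come from Python 2 and IPython's %timeit, and the picture may well differ on current interpreters (OrderedDict has had a C implementation since Python 3.5). A hedged sketch for re-running the comparison on Python 3 with the standard timeit module follows. The helper names `with_seen_set` and `with_list_scan` are hypothetical stand-ins for the two ordered_set variants.

```python
import timeit
from collections import OrderedDict
from random import randint

def with_seen_set(in_list):
    """Order-preserving dedupe with a set for O(1) membership tests."""
    out_list, seen = [], set()
    for val in in_list:
        if val not in seen:
            out_list.append(val)
            seen.add(val)
    return out_list

def with_list_scan(in_list):
    """Same result, but membership is a linear scan of the output list."""
    out_list = []
    for val in in_list:
        if val not in out_list:
            out_list.append(val)
    return out_list

x = [randint(0, 100) for _ in range(100)]

candidates = {
    "OrderedDict.fromkeys": lambda: list(OrderedDict.fromkeys(x)),
    "dict.fromkeys": lambda: list(dict.fromkeys(x)),
    "seen set": lambda: with_seen_set(x),
    "list scan": lambda: with_list_scan(x),
}

for name, func in candidates.items():
    seconds = timeit.timeit(func, number=2000)
    print(f"{name:22s} {seconds / 2000 * 1e6:8.2f} us per call")
```

No expected numbers are given here on purpose: they depend on the interpreter version, the machine and the random sample.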

回答 7

也有使用Pandas和Numpy的解决方案。它们都返回numpy数组,因此.tolist()如果需要列表,则必须使用该函数。

t=['a','a','b','b','b','c','c','c']
t2= ['c','c','b','b','b','a','a','a']

熊猫解决方案

使用熊猫功能unique()

import pandas as pd
pd.unique(t).tolist()
>>>['a','b','c']
pd.unique(t2).tolist()
>>>['c','b','a']

脾气暴躁的解决方案

使用numpy函数unique()

import numpy as np
np.unique(t).tolist()
>>>['a','b','c']
np.unique(t2).tolist()
>>>['a','b','c']

请注意,numpy.unique()也对值进行排序。因此,列表t2按排序返回。如果您想保留订单,请按照以下答案进行操作

_, idx = np.unique(t2, return_index=True)
np.array(t2)[np.sort(idx)].tolist()
>>>['c','b','a']

与其他解决方案相比,该解决方案并不那么优雅,但是与pandas.unique()相比,numpy.unique()还可让您检查嵌套数组在一个选定轴上是否唯一。

There are also solutions using Pandas and Numpy. They both return a numpy array, so you have to use the .tolist() method if you want a list.

t=['a','a','b','b','b','c','c','c']
t2= ['c','c','b','b','b','a','a','a']

Pandas solution

Using Pandas function unique():

import pandas as pd
pd.unique(t).tolist()
>>>['a','b','c']
pd.unique(t2).tolist()
>>>['c','b','a']

Numpy solution

Using numpy function unique().

import numpy as np
np.unique(t).tolist()
>>>['a','b','c']
np.unique(t2).tolist()
>>>['a','b','c']

Note that numpy.unique() also sorts the values, so the list t2 is returned sorted. If you want to preserve the order, use return_index as in this answer:

_, idx = np.unique(t2, return_index=True)
np.array(t2)[np.sort(idx)].tolist()
>>>['c','b','a']

The solution is not as elegant as the others; however, unlike pandas.unique(), numpy.unique() also lets you check whether nested arrays are unique along one selected axis.


回答 8

另一种方式:

>>> seq = [1,2,3,'a', 'a', 1,2]
>>> dict.fromkeys(seq).keys()
['a', 1, 2, 3]

Another way of doing:

>>> seq = [1,2,3,'a', 'a', 1,2]
>>> dict.fromkeys(seq).keys()
['a', 1, 2, 3]

回答 9

简单易行:

myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanlist = []
[cleanlist.append(x) for x in myList if x not in cleanlist]

输出:

>>> cleanlist 
[1, 2, 3, 5, 6, 7, 8]

Simple and easy:

myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanlist = []
[cleanlist.append(x) for x in myList if x not in cleanlist]

Output:

>>> cleanlist 
[1, 2, 3, 5, 6, 7, 8]

回答 10

在这个答案中,将分为两个部分:两个独特的解决方案,以及特定解决方案的速度图表。

删除重复项

这些答案大多数都只删除可哈希的重复项,但是这个问题并不意味着它不仅需要可哈希项,这意味着我将提供一些不需要哈希项的解决方案。

collections.Counter是标准库中的强大工具,可能对此非常理想。只有另一种解决方案甚至包含Counter。但是,该解决方案也仅限于可哈希键。

为了在Counter中允许不可散列的键,我制作了一个Container类,它将尝试获取对象的默认散列函数,但是如果失败,它将尝试其标识函数。它还定义了一个eq和一个哈希方法。这应该足以允许我们的解决方案中使用不可散列的项目。不可哈希对象将被视为可哈希对象。但是,此哈希函数对不可哈希对象使用标识,这意味着两个不可哈希的相等对象将不起作用。我建议您重写此方法,并将其更改为使用等效可变类型的哈希(例如使用hash(tuple(my_list))ifmy_list是列表)。

我还提出了两种解决方案。另一个解决方案是使用OrderedDict和Counter的子类(称为“ OrderedCounter”)来保持商品的顺序。现在,这里是功能:

from collections import OrderedDict, Counter

class Container:
    def __init__(self, obj):
        self.obj = obj
    def __eq__(self, obj):
        return self.obj == obj
    def __hash__(self):
        try:
            return hash(self.obj)
        except TypeError:
            return id(self.obj)

class OrderedCounter(Counter, OrderedDict):
     'Counter that remembers the order elements are first encountered'

     def __repr__(self):
         return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

     def __reduce__(self):
         return self.__class__, (OrderedDict(self),)

def remd(sequence):
    cnt = Counter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

def oremd(sequence):
    cnt = OrderedCounter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

remd是非排序排序,oremd是排序排序。您可以清楚地分辨出哪一个速度更快,但无论如何我都会解释。无序排序略快。由于不需要排序,因此它保留的数据较少。

现在,我还想显示每个答案的速度比较。所以,我现在就开始做。

哪个功能最快?

为了删除重复项,我从一些答案中收集了10个函数。我计算了每个函数的速度,并使用matplotlib.pyplot将其放入图表中。

我将其分为三轮。可哈希对象是可以被哈希处理的任何对象,不可哈希对象是不能被哈希处理的任何对象。有序序列是保留顺序的序列,无序序列不保留顺序。现在,这里还有一些术语:

“无序哈希”适用于任何删除重复项的方法,这些方法不一定必须保持顺序。它不必为无法哈希​​的文件工作,但是可以。

Ordered Hashable适用于将项目的顺序保留在列表中的任何方法,但是它不一定适用于unhashables,但是可以。

Ordered Unhashable是保留列表中项目顺序并适用于unhashable的任何方法。

在y轴上是花费的秒数。

在x轴上是应用该功能的编号。

我们通过以下理解为无序哈希和有序哈希生成序列: [list(range(x)) + list(range(x)) for x in range(0, 1000, 10)]

对于订购的不可哈希值: [[list(range(y)) + list(range(y)) for y in range(x)] for x in range(0, 1000, 10)]

请注意,该范围内有一个“台阶”,因为没有它,这将花费10倍的时间。另外,由于我个人的观点,我认为它看起来似乎更容易阅读。

另请注意,图例上的键是我试图猜测为功能最重要的部分。至于什么功能最差或最好?该图说明了一切。

解决之后,下面是图表。

无序哈希

[Figure: timing chart for the unordered hashable methods, with a zoomed-in view; images not preserved]

有序哈希

[Figure: timing chart for the ordered hashable methods, with a zoomed-in view; images not preserved]

有序的不可哈希

[Figure: timing chart for the ordered unhashable methods, with a zoomed-in view; images not preserved]

This answer has two sections: two unique solutions, and a graph of speed for specific solutions.

Removing Duplicate Items

Most of these answers only remove duplicate items which are hashable, but the question doesn’t say that the items are hashable, so I’ll offer some solutions which don’t require hashable items.

collections.Counter is a powerful tool in the standard library which could be perfect for this. There’s only one other solution which even has Counter in it. However, that solution is also limited to hashable keys.

To allow unhashable keys in Counter, I made a Container class, which first tries the object's default hash function and, if that fails, falls back to the object's identity (id). It defines both an __eq__ and a __hash__ method, which is enough to allow unhashable items in our solution: unhashable objects are treated as if they were hashable. However, because the fallback hash uses identity, two unhashable objects that are equal but distinct will not be detected as duplicates. I suggest overriding this and using the hash of an equivalent immutable type instead (for example hash(tuple(my_list)) if my_list is a list).

I made two solutions. The second one keeps the order of the items, using a subclass of both OrderedDict and Counter named 'OrderedCounter'. Here are the functions:

from collections import OrderedDict, Counter

class Container:
    def __init__(self, obj):
        self.obj = obj
    def __eq__(self, obj):
        return self.obj == obj
    def __hash__(self):
        try:
            return hash(self.obj)
        except TypeError:  # hash() raises TypeError for unhashable objects
            return id(self.obj)

class OrderedCounter(Counter, OrderedDict):
     'Counter that remembers the order elements are first encountered'

     def __repr__(self):
         return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

     def __reduce__(self):
         return self.__class__, (OrderedDict(self),)

def remd(sequence):
    cnt = Counter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

def oremd(sequence):
    cnt = OrderedCounter()
    for x in sequence:
        cnt[Container(x)] += 1
    return [item.obj for item in cnt]

remd does not promise to keep the original order; oremd does. You can probably guess which one is faster, but I'll explain anyway: the unordered version is slightly faster, because it keeps less data since it doesn't need to track order.
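As a rough sketch of the suggestion to hash an equivalent tuple rather than fall back to identity, the variant below treats two equal lists as duplicates. The names TupleHashContainer and remd_by_value are hypothetical (they are not part of the answer above), and the sketch assumes every unhashable item is a flat list of hashable values.

```python
from collections import Counter

class TupleHashContainer:
    """Hypothetical variant of Container that hashes lists by value."""
    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return isinstance(other, TupleHashContainer) and self.obj == other.obj

    def __hash__(self):
        try:
            return hash(self.obj)
        except TypeError:
            # Assumption: the unhashable object is a flat list of hashable items
            return hash(tuple(self.obj))

def remd_by_value(sequence):
    cnt = Counter()
    for x in sequence:
        cnt[TupleHashContainer(x)] += 1
    return [item.obj for item in cnt]

result = remd_by_value([[1, 2], [1, 2], [3], 'a', 'a'])
print(result)
```

With identity-based hashing the two [1, 2] lists would both survive; here they collapse into one entry because equal lists produce equal tuple hashes.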

Now, I also wanted to show the speed comparisons of each answer. So, I’ll do that now.

Which Function is the Fastest?

For removing duplicates, I gathered 10 functions from a few answers. I calculated the speed of each function and put it into a graph using matplotlib.pyplot.

I divided this into three rounds of graphing. A hashable is any object which can be hashed, an unhashable is any object which cannot be hashed. An ordered sequence is a sequence which preserves order, an unordered sequence does not preserve order. Now, here are a few more terms:

Unordered Hashable was for any method which removed duplicates, which didn’t necessarily have to keep the order. It didn’t have to work for unhashables, but it could.

Ordered Hashable was for any method which kept the order of the items in the list. It didn't have to work for unhashables, but it could.

Ordered Unhashable was any method which kept the order of the items in the list, and worked for unhashables.

On the y-axis is the number of seconds it took.

On the x-axis is the number the function was applied to (the x in the comprehensions below).

We generated sequences for unordered hashables and ordered hashables with the following comprehension: [list(range(x)) + list(range(x)) for x in range(0, 1000, 10)]

For ordered unhashables: [[list(range(y)) + list(range(y)) for y in range(x)] for x in range(0, 1000, 10)]

Note there is a 'step' in the range because without it, this would have taken 10x as long. I also thought the result would be a little easier to read.
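The original benchmark code is not shown, so the following is only a minimal sketch of how such timings could be collected with timeit. The two functions dedupe_set and dedupe_ordered are stand-ins chosen for illustration, and the range is shortened so that it runs quickly.

```python
import timeit

def dedupe_set(seq):
    # Unordered, hashable-only
    return list(set(seq))

def dedupe_ordered(seq):
    # Ordered, hashable-only
    return list(dict.fromkeys(seq))

# Same shape as the comprehension above, with a smaller upper bound
sequences = [list(range(x)) + list(range(x)) for x in range(0, 100, 10)]

timings = {}
for func in (dedupe_set, dedupe_ordered):
    timings[func.__name__] = [
        timeit.timeit(lambda: func(seq), number=10) for seq in sequences
    ]

for name, seconds in timings.items():
    print(name, ['%.6f' % s for s in seconds])
```

Each list in timings could then be passed to matplotlib.pyplot.plot to produce one line per function.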

Also note that the keys on the legend are my best guess at the most vital part of each function. As for which function does the worst or best, the graphs speak for themselves.

With that settled, here are the graphs.

Unordered Hashables

[Graph: unordered hashables; full view and zoomed-in view]

Ordered Hashables

[Graph: ordered hashables; full view and zoomed-in view]

Ordered Unhashables

[Graph: ordered unhashables; full view and zoomed-in view]


回答 11

我的清单上有一个字典,所以我不能使用上述方法。我得到了错误:

TypeError: unhashable type:

因此,如果您关心订单和/或某些项目无法散列。然后,您可能会发现这很有用:

def make_unique(original_list):
    unique_list = []
    [unique_list.append(obj) for obj in original_list if obj not in unique_list]
    return unique_list

有些人可能认为列表理解有副作用不是一个好的解决方案。这是一个替代方案:

def make_unique(original_list):
    unique_list = []
    # map() is lazy on Python 3, so wrap it in list() to force the appends to run
    list(map(lambda x: unique_list.append(x) if (x not in unique_list) else False, original_list))
    return unique_list

I had a dict in my list, so I could not use the above approach. I got the error:

TypeError: unhashable type:

So if you care about order and/or some items are unhashable, you might find this useful:

def make_unique(original_list):
    unique_list = []
    [unique_list.append(obj) for obj in original_list if obj not in unique_list]
    return unique_list

Some may consider a list comprehension with a side effect to be poor style. Here's an alternative:

def make_unique(original_list):
    unique_list = []
    # map() is lazy on Python 3, so wrap it in list() to force the appends to run
    list(map(lambda x: unique_list.append(x) if (x not in unique_list) else False, original_list))
    return unique_list
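Both versions above scan unique_list for every item, which is quadratic. For the specific case of a list of dicts, one possible alternative is to derive a hashable key from each dict and track the keys in a set. This is a sketch under the assumption that all dict values are themselves hashable; the function name make_unique_dicts is made up for illustration.

```python
def make_unique_dicts(dicts):
    seen = set()
    unique = []
    for d in dicts:
        # frozenset of (key, value) pairs is hashable if the values are hashable
        key = frozenset(d.items())
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique

records = [{'a': 1}, {'b': 2}, {'a': 1}, {'b': 2, 'c': 3}]
print(make_unique_dicts(records))  # [{'a': 1}, {'b': 2}, {'b': 2, 'c': 3}]
```

If the values can be lists or nested dicts, this key will raise TypeError and the plain loop above remains the safer choice.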

回答 12

所有的保持阶接近我在这里看到迄今要么使用比较幼稚(具有为O(n ^ 2)在最佳的时间复杂度)或重重量OrderedDicts/ set+ list的组合被限制于可哈希输入。这是独立于哈希的O(nlogn)解决方案:

更新添加了key参数,文档和Python 3兼容性。

# from functools import reduce <-- add this import on Python 3

def uniq(iterable, key=lambda x: x):
    """
    Remove duplicates from an iterable. Preserves order. 
    :type iterable: Iterable[Ord => A]
    :param iterable: an iterable of objects of any orderable type
    :type key: Callable[A] -> (Ord => B)
    :param key: optional argument; by default an item (A) is discarded 
    if another item (B), such that A == B, has already been encountered and taken. 
    If you provide a key, this condition changes to key(A) == key(B); the callable 
    must return orderable objects.
    """
    # Enumerate the list to restore order lately; reduce the sorted list; restore order
    def append_unique(acc, item):
        return acc if key(acc[-1][1]) == key(item[1]) else acc.append(item) or acc 
    srt_enum = sorted(enumerate(iterable), key=lambda item: key(item[1]))
    return [item[1] for item in sorted(reduce(append_unique, srt_enum, [srt_enum[0]]))] 

All the order-preserving approaches I've seen here so far either use naive comparison (with O(n^2) time complexity at best) or heavyweight OrderedDicts/set+list combinations that are limited to hashable inputs. Here is a hash-independent O(n log n) solution; it requires the items (or their keys) to be orderable:

Update: added the key argument, documentation and Python 3 compatibility.

# from functools import reduce <-- add this import on Python 3

def uniq(iterable, key=lambda x: x):
    """
    Remove duplicates from an iterable. Preserves order. 
    :type iterable: Iterable[Ord => A]
    :param iterable: an iterable of objects of any orderable type
    :type key: Callable[A] -> (Ord => B)
    :param key: optional argument; by default an item (A) is discarded 
    if another item (B), such that A == B, has already been encountered and taken. 
    If you provide a key, this condition changes to key(A) == key(B); the callable 
    must return orderable objects.
    """
    # Enumerate the items to restore order later; reduce the sorted list; restore order
    def append_unique(acc, item):
        return acc if key(acc[-1][1]) == key(item[1]) else acc.append(item) or acc
    srt_enum = sorted(enumerate(iterable), key=lambda item: key(item[1]))
    # srt_enum[:1] instead of [srt_enum[0]] so that an empty iterable returns []
    return [item[1] for item in sorted(reduce(append_unique, srt_enum, srt_enum[:1]))]
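The same sort, collapse, restore idea can be spelled out more explicitly with itertools.groupby. The following is an alternative sketch, not the author's code; the name uniq_sorted is hypothetical, and like the original it assumes the keys are orderable.

```python
from itertools import groupby

def uniq_sorted(iterable, key=lambda x: x):
    # 1. Remember each item's original position, then sort by key.
    #    sorted() is stable, so equal keys stay in original order.
    indexed = sorted(enumerate(iterable), key=lambda pair: key(pair[1]))
    # 2. Equal keys are now adjacent; keep the first of each run.
    firsts = [next(group)
              for _, group in groupby(indexed, key=lambda pair: key(pair[1]))]
    # 3. Sort the survivors back into their original positions.
    return [item for _, item in sorted(firsts, key=lambda pair: pair[0])]

print(uniq_sorted([[3], [1], [3], [2], [1]]))          # [[3], [1], [2]]
print(uniq_sorted(['b', 'A', 'a', 'B'], key=str.lower))  # ['b', 'A']
```

The inputs in the first call are lists, which are unhashable but orderable, so a set-based approach would fail where this one works.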

回答 13

如果您想保留订单,并且不使用任何外部模块,则可以通过以下简便方法进行操作:

>>> t = [1, 9, 2, 3, 4, 5, 3, 6, 7, 5, 8, 9]
>>> list(dict.fromkeys(t))
[1, 9, 2, 3, 4, 5, 6, 7, 8]

注意:此方法保留了外观顺序,因此,如前所述,因为它是第一次出现,所以后面将有九个。但是,这与您得到的结果相同

from collections import OrderedDict
ulist = list(OrderedDict.fromkeys(t))

但它更短,并且运行更快。

之所以fromkeys可行,是因为每次函数尝试创建一个新键时,如果该值已经存在,它将简单地覆盖它。但是,这根本不会影响字典,因为fromkeys会创建一个字典,其中所有键都具有value None,因此有效地它消除了所有重复项。

If you want to preserve the order and not use any external modules, here is an easy way to do this:

>>> t = [1, 9, 2, 3, 4, 5, 3, 6, 7, 5, 8, 9]
>>> list(dict.fromkeys(t))
[1, 9, 2, 3, 4, 5, 6, 7, 8]

Note: This method preserves the order of first appearance, so, as seen above, 9 comes right after 1 because that is where it first appeared. On Python 3.7 and later, where dict preserves insertion order, this is the same result as you would get with

from collections import OrderedDict
ulist = list(OrderedDict.fromkeys(t))

but it is shorter and runs faster.

This works because dictionary keys are unique: each time fromkeys meets a value that is already a key, it simply overwrites that entry. This doesn't change the dictionary at all, since fromkeys gives every key the value None, so all duplicates are effectively eliminated.
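To make the mechanism concrete, here is the intermediate dictionary that fromkeys builds before it is turned back into a list (assuming Python 3.7 or later for the ordering):

```python
t = [1, 9, 2, 3, 4, 5, 3, 6, 7, 5, 8, 9]

d = dict.fromkeys(t)
print(d)
# {1: None, 9: None, 2: None, 3: None, 4: None, 5: None, 6: None, 7: None, 8: None}

unique = list(d)  # iterating a dict yields its keys in insertion order
print(unique)     # [1, 9, 2, 3, 4, 5, 6, 7, 8]
```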


回答 14

您也可以这样做:

>>> t = [1, 2, 3, 3, 2, 4, 5, 6]
>>> s = [x for i, x in enumerate(t) if i == t.index(x)]
>>> s
[1, 2, 3, 4, 5, 6]

上面的工作原理是该index方法仅返回元素的第一个索引。重复元素具有更高的索引。请参考这里

list.index(x [,start [,end]])
在值为x的第一项列表中返回从零开始的索引。如果没有这样的项目,则引发ValueError。

You could also do this:

>>> t = [1, 2, 3, 3, 2, 4, 5, 6]
>>> s = [x for i, x in enumerate(t) if i == t.index(x)]
>>> s
[1, 2, 3, 4, 5, 6]

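This works because the index method returns only the first index of an element, and duplicate elements sit at higher indices, so they fail the i == t.index(x) test. Note that index scans the list on every call, so this takes quadratic time on long lists. From the documentation: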

list.index(x[, start[, end]])
Return zero-based index in the list of the first item whose value is x. Raises a ValueError if there is no such item.
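A small illustration of the filter condition: printing each position next to what index returns shows exactly which elements are dropped.

```python
t = [1, 2, 3, 3, 2, 4, 5, 6]

# (position, value, first index of that value)
rows = [(i, x, t.index(x)) for i, x in enumerate(t)]
for i, x, first in rows:
    print(i, x, first, 'keep' if i == first else 'drop')

kept = [x for i, x, first in rows if i == first]
print(kept)  # [1, 2, 3, 4, 5, 6]
```

Only the second 3 (position 3) and the second 2 (position 4) have a position that differs from their first index, so only those two are dropped.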


回答 15

尝试使用集合:

# The old "sets" module was removed in Python 3; use the built-in set type
t = set(['a', 'b', 'c', 'd'])
t1 = set(['a', 'b', 'c'])

print(t | t1)  # union
print(t - t1)  # difference

Try using sets:

# The old "sets" module was removed in Python 3; use the built-in set type
t = set(['a', 'b', 'c', 'd'])
t1 = set(['a', 'b', 'c'])

print(t | t1)  # union
print(t - t1)  # difference

回答 16

通过保留订单来减少变体:

假设我们有清单:

l = [5, 6, 6, 1, 1, 2, 2, 3, 4]

减少变体(无效):

>>> reduce(lambda r, v: v in r and r or r + [v], l, [])
[5, 6, 1, 2, 3, 4]

速度提高5倍,但功能更先进

>>> reduce(lambda r, v: v in r[1] and r or (r[0].append(v) or r[1].add(v)) or r, l, ([], set()))[0]
[5, 6, 1, 2, 3, 4]

说明:

default = (list(), set())
# user list to keep order
# use set to make lookup faster

def reducer(result, item):
    if item not in result[1]:
        result[0].append(item)
        result[1].add(item)
    return result

reduce(reducer, l, default)[0]

A reduce variant that preserves order:

Assume that we have this list:

l = [5, 6, 6, 1, 1, 2, 2, 3, 4]

Reduce variant (inefficient; on Python 3, reduce must be imported from functools):

>>> reduce(lambda r, v: v in r and r or r + [v], l, [])
[5, 6, 1, 2, 3, 4]

About 5x faster, but more sophisticated:

>>> reduce(lambda r, v: v in r[1] and r or (r[0].append(v) or r[1].add(v)) or r, l, ([], set()))[0]
[5, 6, 1, 2, 3, 4]

Explanation:

default = (list(), set())
# use list to keep order
# use set to make lookup faster

def reducer(result, item):
    if item not in result[1]:
        result[0].append(item)
        result[1].add(item)
    return result

reduce(reducer, l, default)[0]
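Put together as a script that should run on Python 3, the explanation above might look like the following sketch. The only additions are the functools import and the tuple unpacking inside reducer, which is a readability choice rather than part of the original.

```python
from functools import reduce  # reduce is no longer a builtin on Python 3

l = [5, 6, 6, 1, 1, 2, 2, 3, 4]

def reducer(result, item):
    ordered, seen = result      # list keeps order, set makes lookup fast
    if item not in seen:
        ordered.append(item)
        seen.add(item)
    return result

unique = reduce(reducer, l, ([], set()))[0]
print(unique)  # [5, 6, 1, 2, 3, 4]
```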

回答 17

从列表中删除重复项的最佳方法是使用python中可用的set()函数,再次将其转换为列表

In [2]: some_list = ['a','a','v','v','v','c','c','d']
In [3]: list(set(some_list))
Out[3]: ['a', 'c', 'd', 'v']

A simple way to remove duplicates from a list is to pass it to the built-in set() function and convert the result back into a list. Note that this only works for hashable items and that the order of the result is not guaranteed:

In [2]: some_list = ['a','a','v','v','v','c','c','d']
In [3]: list(set(some_list))
Out[3]: ['a', 'c', 'd', 'v']
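Since a set has no defined order, the output shown above should not be relied on. The sketch below contrasts two ways of getting a predictable result: sorting the set, or keeping first-appearance order with dict.fromkeys (which assumes Python 3.7 or later).

```python
some_list = ['a', 'a', 'v', 'v', 'v', 'c', 'c', 'd']

alphabetical = sorted(set(some_list))
first_seen = list(dict.fromkeys(some_list))

print(alphabetical)  # ['a', 'c', 'd', 'v']
print(first_seen)    # ['a', 'v', 'c', 'd']
```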

回答 18

您可以使用以下功能:

def rem_dupes(dup_list): 
    yooneeks = [] 
    for elem in dup_list: 
        if elem not in yooneeks: 
            yooneeks.append(elem) 
    return yooneeks

范例

my_list = ['this','is','a','list','with','dupicates','in', 'the', 'list']

用法:

rem_dupes(my_list)

[‘this’,’is’,’a’,’list’,’with’,’dupicates,’in’,’the’]

You can use the following function:

def rem_dupes(dup_list): 
    yooneeks = [] 
    for elem in dup_list: 
        if elem not in yooneeks: 
            yooneeks.append(elem) 
    return yooneeks

Example:

my_list = ['this','is','a','list','with','duplicates','in', 'the', 'list']

Usage:

rem_dupes(my_list)

['this', 'is', 'a', 'list', 'with', 'duplicates', 'in', 'the']


回答 19

还有许多其他答案建议使用不同的方法来执行此操作,但是它们都是批处理操作,其中一些会放弃原始订单。根据您的需要,这可能没问题,但是如果您要按每个值的第一个实例的顺序迭代这些值,并且想要即时删除所有重复项,而一次删除所有重复项,则可以使用此生成器:

def uniqify(iterable):
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

这将返回一个生成器/迭代器,因此您可以在可以使用迭代器的任何地方使用它。

for unique_item in uniqify([1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 6, 8, 8]):
    print(unique_item, end=' ')

print()

输出:

1 2 3 4 5 6 7 8

如果您确实想要a list,则可以执行以下操作:

unique_list = list(uniqify([1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 6, 8, 8]))

print(unique_list)

输出:

[1, 2, 3, 4, 5, 6, 7, 8]

There are many other answers suggesting different ways to do this, but they’re all batch operations, and some of them throw away the original order. That might be okay depending on what you need, but if you want to iterate over the values in the order of the first instance of each value, and you want to remove the duplicates on-the-fly versus all at once, you could use this generator:

def uniqify(iterable):
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

This returns a generator/iterator, so you can use it anywhere that you can use an iterator.

for unique_item in uniqify([1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 6, 8, 8]):
    print(unique_item, end=' ')

print()

Output:

1 2 3 4 5 6 7 8

If you do want a list, you can do this:

unique_list = list(uniqify([1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 6, 8, 8]))

print(unique_list)

Output:

[1, 2, 3, 4, 5, 6, 7, 8]
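One consequence of the on-the-fly behaviour is that the generator can be used on an endless stream, which a batch approach such as list(set(...)) cannot handle. A small sketch, where the endless input and the choice of five items are only for illustration:

```python
from itertools import count, islice

def uniqify(iterable):
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

# An infinite stream that cycles through 0, 1, 2, 3, 4 forever
endless = (n % 5 for n in count())

# islice stops after five unique items, so the stream is never exhausted
first_five = list(islice(uniqify(endless), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Asking for a sixth item here would never return, because the stream contains only five distinct values.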

回答 20

不使用设置

data=[1, 2, 3, 1, 2, 5, 6, 7, 8]
uni_data=[]
for dat in data:
    if dat not in uni_data:
        uni_data.append(dat)

print(uni_data) 

Without using set

data=[1, 2, 3, 1, 2, 5, 6, 7, 8]
uni_data=[]
for dat in data:
    if dat not in uni_data:
        uni_data.append(dat)

print(uni_data) 

回答 21

您可以使用set删除重复项:

mylist = list(set(mylist))

但是请注意,结果将是无序的。如果这是一个问题:

mylist.sort()

You can use set to remove duplicates:

mylist = list(set(mylist))

But note that the result will be unordered. If you need it in sorted order (this does not restore the original order):

mylist.sort()

回答 22

还有一种更好的方法是

import pandas as pd

myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanList = pd.Series(myList).drop_duplicates().tolist()
print(cleanList)

#> [1, 2, 3, 5, 6, 7, 8]

并且订单保持不变。

Another approach, if you already use pandas:

import pandas as pd

myList = [1, 2, 3, 1, 2, 5, 6, 7, 8]
cleanList = pd.Series(myList).drop_duplicates().tolist()
print(cleanList)

#> [1, 2, 3, 5, 6, 7, 8]

The original order is preserved.


回答 23

这个人关心订单的过程没有太多麻烦(OrderdDict等)。可能不是最Python的方式,也不是最短的方式,但是可以解决这个问题:

def remove_duplicates(list):
    ''' Removes duplicate items from a list '''
    singles_list = []
    for element in list:
        if element not in singles_list:
            singles_list.append(element)
    return singles_list

This one preserves the order without too much hassle (no OrderedDict and the like). Probably not the most Pythonic way, nor the shortest, but it does the trick:

def remove_duplicates(items):
    ''' Removes duplicate items from a list '''
    # Parameter renamed from "list" to avoid shadowing the built-in type
    singles_list = []
    for element in items:
        if element not in singles_list:
            singles_list.append(element)
    return singles_list

回答 24

下面的代码很容易删除列表中的重复项

def remove_duplicates(x):
    a = []
    for i in x:
        if i not in a:
            a.append(i)
    return a

print(remove_duplicates([1,2,2,3,3,4]))

它返回[1,2,3,4]

The code below is a simple way to remove duplicates from a list:

def remove_duplicates(x):
    a = []
    for i in x:
        if i not in a:
            a.append(i)
    return a

print(remove_duplicates([1,2,2,3,3,4]))

It returns [1, 2, 3, 4].


回答 25

这是最快的pythonic解决方案,适用于其他答复中列出的解决方案。

使用短路评估的实施细节可以使用列表理解,这足够快。visited.add(item)始终返回None结果,其结果为False,因此的右侧or始终是该表达式的结果。

自己计时

def deduplicate(sequence):
    visited = set()
    adder = visited.add  # get rid of qualification overhead
    out = [adder(item) or item for item in sequence if item not in visited]
    return out

Here's the fastest Pythonic solution compared to the others listed in the replies.

Relying on short-circuit evaluation makes it possible to use a list comprehension, which is fast enough. visited.add(item) always returns None, which evaluates as False, so the right-hand side of or is always the result of the expression.

Time it yourself:

def deduplicate(sequence):
    visited = set()
    adder = visited.add  # get rid of qualification overhead
    out = [adder(item) or item for item in sequence if item not in visited]
    return out
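The short-circuit trick can be checked in isolation before trusting it inside the comprehension. The snippet below repeats the function so that it runs on its own; the sample input is arbitrary.

```python
visited = set()
print(visited.add('x'))         # None: set.add has no return value
print(visited.add('y') or 'y')  # y: None is falsy, so "or" yields the right side

def deduplicate(sequence):
    visited = set()
    adder = visited.add  # bind the method once to skip repeated attribute lookup
    return [adder(item) or item for item in sequence if item not in visited]

result = deduplicate([3, 1, 3, 2, 1])
print(result)  # [3, 1, 2]
```

One caveat that follows from the expression itself: the value produced is always item, even when item is falsy (0, '' or None), because adder(item) is None in every case.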

回答 26

使用set

a = [0,1,2,3,4,3,3,4]
a = list(set(a))
print(a)

使用独特的

import numpy as np
a = [0,1,2,3,4,3,3,4]
a = np.unique(a).tolist()
print(a)

Using set:

a = [0,1,2,3,4,3,3,4]
a = list(set(a))
print(a)

Using numpy.unique (note that the result comes back sorted):

import numpy as np
a = [0,1,2,3,4,3,3,4]
a = np.unique(a).tolist()
print(a)

回答 27

不幸。此处的大多数答案要么不保留顺序,要么太长。这是一个简单的订单保留答案。

s = [1,2,3,4,5,2,5,6,7,1,3,9,3,5]
x=[]

[x.append(i) for i in s if i not in x]
print(x)

这将为您x删除重复项,但保留顺序。

Unfortunately, most answers here either do not preserve the order or are too long. Here is a simple, order-preserving answer.

s = [1,2,3,4,5,2,5,6,7,1,3,9,3,5]
x=[]

[x.append(i) for i in s if i not in x]
print(x)

This will give you x with duplicates removed but preserving the order.


回答 28

Python 3中非常简单的方法:

>>> n = [1, 2, 3, 4, 1, 1]
>>> n
[1, 2, 3, 4, 1, 1]
>>> m = sorted(list(set(n)))
>>> m
[1, 2, 3, 4]

Very simple way in Python 3:

>>> n = [1, 2, 3, 4, 1, 1]
>>> n
[1, 2, 3, 4, 1, 1]
>>> m = sorted(list(set(n)))
>>> m
[1, 2, 3, 4]

回答 29

Python内置类型的魔力

在python中,仅通过python的内置类型,即可轻松处理此类复杂情况。

让我告诉你怎么做!

方法1:一般情况

删除列表中重复元素并仍然保持排序顺序的方式(1行代码

line = [1, 2, 3, 1, 2, 5, 6, 7, 8]
new_line = sorted(set(line), key=line.index) # remove duplicated element
print(new_line)

您将得到结果

[1, 2, 3, 5, 6, 7, 8]

方法2:特例

TypeError: unhashable type: 'list'

处理不可散列的特殊情况(3行代码

line=[['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']]

tuple_line = [tuple(pt) for pt in line] # convert list of list into list of tuple
tuple_new_line = sorted(set(tuple_line),key=tuple_line.index) # remove duplicated element
new_line = [list(t) for t in tuple_new_line] # convert list of tuple into list of list

print (new_line)

您将得到结果:

[
  ['16.4966155686595', '-27.59776154691', '52.3786295521147'], 
  ['17.6508629295574', '-27.143305738671', '47.534955022564'], 
  ['18.8051102904552', '-26.688849930432', '42.6912804930134'], 
  ['19.5504702331098', '-26.205884452727', '37.7709192714727'], 
  ['20.2929416861422', '-25.722717575124', '32.8500163147157']
]

由于元组是可哈希的,因此您可以轻松地在列表和元组之间转换数据

The Magic of Python Built-in Types

In Python, cases like this are easy to handle using only the built-in types.

Let me show you how!

Method 1: General Case

A one-line way to remove duplicate elements from a list while keeping the original order:

line = [1, 2, 3, 1, 2, 5, 6, 7, 8]
new_line = sorted(set(line), key=line.index) # remove duplicated element
print(new_line)

You will get the result:

[1, 2, 3, 5, 6, 7, 8]

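Method 2: Special Case

Unhashable elements such as lists cannot be put into a set; trying to do so raises:

TypeError: unhashable type: 'list'

The special case for processing unhashable elements (3 lines of code):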

line=[['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['16.4966155686595', '-27.59776154691', '52.3786295521147']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['17.6508629295574', '-27.143305738671', '47.534955022564']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['18.8051102904552', '-26.688849930432', '42.6912804930134']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['19.5504702331098', '-26.205884452727', '37.7709192714727']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']
,['20.2929416861422', '-25.722717575124', '32.8500163147157']]

tuple_line = [tuple(pt) for pt in line] # convert list of list into list of tuple
tuple_new_line = sorted(set(tuple_line),key=tuple_line.index) # remove duplicated element
new_line = [list(t) for t in tuple_new_line] # convert list of tuple into list of list

print (new_line)

You will get the result:

[
  ['16.4966155686595', '-27.59776154691', '52.3786295521147'], 
  ['17.6508629295574', '-27.143305738671', '47.534955022564'], 
  ['18.8051102904552', '-26.688849930432', '42.6912804930134'], 
  ['19.5504702331098', '-26.205884452727', '37.7709192714727'], 
  ['20.2929416861422', '-25.722717575124', '32.8500163147157']
]

This works because tuples are hashable and you can easily convert data between lists and tuples.
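Note that sorted(..., key=tuple_line.index) searches the list once per element, which becomes slow on long inputs. As a possible alternative under the same tuple-conversion idea, dict.fromkeys keeps first-appearance order in a single pass (assuming Python 3.7 or later). The short sample data below is invented for illustration.

```python
line = [['16.49', '-27.59'],
        ['16.49', '-27.59'],
        ['17.65', '-27.14'],
        ['16.49', '-27.59'],
        ['18.80', '-26.68']]

# Lists become tuples (hashable), dict keys drop duplicates in order,
# then each tuple is converted back into a list
new_line = [list(t) for t in dict.fromkeys(map(tuple, line))]
print(new_line)
# [['16.49', '-27.59'], ['17.65', '-27.14'], ['18.80', '-26.68']]
```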