标签归档:list

Python中的字母范围

问题:Python中的字母范围

而不是像这样列出字母字符:

alpha = ['a', 'b', 'c', 'd'.........'z']

有什么办法可以将它分组到某个范围之内?例如,对于数字,可以使用进行分组range()

range(1, 10)

Instead of making a list of alphabet characters like this:

alpha = ['a', 'b', 'c', 'd'.........'z']

is there any way that we can group it to a range or something? For example, for numbers it can be grouped using range():

range(1, 10)

回答 0

>>> import string
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'

如果您确实需要列表:

>>> list(string.ascii_lowercase)
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

并做到这一点 range

>>> list(map(chr, range(97, 123))) #or list(map(chr, range(ord('a'), ord('z')+1)))
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

其他有用的string模块功能:

>>> help(string) # on Python 3
....
DATA
    ascii_letters = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
    ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    digits = '0123456789'
    hexdigits = '0123456789abcdefABCDEF'
    octdigits = '01234567'
    printable = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
    punctuation = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
    whitespace = ' \t\n\r\x0b\x0c'
>>> import string
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'

If you really need a list:

>>> list(string.ascii_lowercase)
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

And to do it with range

>>> list(map(chr, range(97, 123))) #or list(map(chr, range(ord('a'), ord('z')+1)))
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

Other helpful string module features:

>>> help(string) # on Python 3
....
DATA
    ascii_letters = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
    ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    digits = '0123456789'
    hexdigits = '0123456789abcdefABCDEF'
    octdigits = '01234567'
    printable = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
    punctuation = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
    whitespace = ' \t\n\r\x0b\x0c'

回答 1

[chr(i) for i in range(ord('a'),ord('z')+1)]
[chr(i) for i in range(ord('a'),ord('z')+1)]

回答 2

在Python 2.7和3中,您可以使用以下代码:

import string
string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'

string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

正如@Zaz所说: string.lowercase已弃用,不再在Python 3中string.ascii_lowercase工作,但在两者中都工作

In Python 2.7 and 3 you can use this:

import string
string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'

string.ascii_uppercase
'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

As @Zaz says: string.lowercase is deprecated and no longer works in Python 3 but string.ascii_lowercase works in both


回答 3

这是一个简单的字母范围实现:

def letter_range(start, stop="{", step=1):
    """Yield a range of lowercase letters.""" 
    for ord_ in range(ord(start.lower()), ord(stop.lower()), step):
        yield chr(ord_)

演示版

list(letter_range("a", "f"))
# ['a', 'b', 'c', 'd', 'e']

list(letter_range("a", "f", step=2))
# ['a', 'c', 'e']

Here is a simple letter-range implementation:

Code

def letter_range(start, stop="{", step=1):
    """Yield a range of lowercase letters.""" 
    for ord_ in range(ord(start.lower()), ord(stop.lower()), step):
        yield chr(ord_)

Demo

list(letter_range("a", "f"))
# ['a', 'b', 'c', 'd', 'e']

list(letter_range("a", "f", step=2))
# ['a', 'c', 'e']

回答 4

如果要查找letters[1:10]与R 相等的值,则可以使用:

 import string
 list(string.ascii_lowercase[0:10])

If you are looking to an equivalent of letters[1:10] from R, you can use:

 import string
 list(string.ascii_lowercase[0:10])

回答 5

使用内置的range函数在python中打印大写和小写字母

def upperCaseAlphabets():
    print("Upper Case Alphabets")
    for i in range(65, 91):
        print(chr(i), end=" ")
    print()

def lowerCaseAlphabets():
    print("Lower Case Alphabets")
    for i in range(97, 123):
        print(chr(i), end=" ")

upperCaseAlphabets();
lowerCaseAlphabets();

Print the Upper and Lower case alphabets in python using a built-in range function

def upperCaseAlphabets():
    print("Upper Case Alphabets")
    for i in range(65, 91):
        print(chr(i), end=" ")
    print()

def lowerCaseAlphabets():
    print("Lower Case Alphabets")
    for i in range(97, 123):
        print(chr(i), end=" ")

upperCaseAlphabets();
lowerCaseAlphabets();

回答 6

这是我能找出的最简单方法:

#!/usr/bin/python3 for i in range(97, 123): print("{:c}".format(i), end='')

因此,从97到122是与’a’和’z’等效的ASCII码。请注意,小写字母和放置123的需要,因为将不包括在内)。

在打印功能中,请确保设置{:c}(字符)格式,在这种情况下,我们希望它一起打印所有内容,甚至不让最后一行换行,这样end=''就可以了。

结果是这样的: abcdefghijklmnopqrstuvwxyz

This is the easiest way I can figure out:

#!/usr/bin/python3 for i in range(97, 123): print("{:c}".format(i), end='')

So, 97 to 122 are the ASCII number equivalent to ‘a’ to and ‘z’. Notice the lowercase and the need to put 123, since it will not be included).

In print function make sure to set the {:c} (character) format, and, in this case, we want it to print it all together not even letting a new line at the end, so end=''would do the job.

The result is this: abcdefghijklmnopqrstuvwxyz


有没有一种简单的方法可以按值删除列表元素?

问题:有没有一种简单的方法可以按值删除列表元素?

a = [1, 2, 3, 4]
b = a.index(6)

del a[b]
print(a)

上面显示了以下错误:

Traceback (most recent call last):
  File "D:\zjm_code\a.py", line 6, in <module>
    b = a.index(6)
ValueError: list.index(x): x not in list

所以我必须这样做:

a = [1, 2, 3, 4]

try:
    b = a.index(6)
    del a[b]
except:
    pass

print(a)

但是,没有简单的方法可以做到这一点吗?

a = [1, 2, 3, 4]
b = a.index(6)

del a[b]
print(a)

The above shows the following error:

Traceback (most recent call last):
  File "D:\zjm_code\a.py", line 6, in <module>
    b = a.index(6)
ValueError: list.index(x): x not in list

So I have to do this:

a = [1, 2, 3, 4]

try:
    b = a.index(6)
    del a[b]
except:
    pass

print(a)

But is there not a simpler way to do this?


回答 0

要删除列表中元素的首次出现,只需使用list.remove

>>> a = ['a', 'b', 'c', 'd']
>>> a.remove('b')
>>> print(a)
['a', 'c', 'd']

请注意,它不会删除所有出现的元素。为此使用列表理解。

>>> a = [10, 20, 30, 40, 20, 30, 40, 20, 70, 20]
>>> a = [x for x in a if x != 20]
>>> print(a)
[10, 30, 40, 30, 40, 70]

To remove an element’s first occurrence in a list, simply use list.remove:

>>> a = ['a', 'b', 'c', 'd']
>>> a.remove('b')
>>> print(a)
['a', 'c', 'd']

Mind that it does not remove all occurrences of your element. Use a list comprehension for that.

>>> a = [10, 20, 30, 40, 20, 30, 40, 20, 70, 20]
>>> a = [x for x in a if x != 20]
>>> print(a)
[10, 30, 40, 30, 40, 70]

回答 1

通常,如果您告诉Python做一些它无法做的事情,它将抛出一个Exception,因此您必须执行以下任一操作:

if c in a:
    a.remove(c)

要么:

try:
    a.remove(c)
except ValueError:
    pass

只要它是您期望的并且可以正确处理的异常,它就不一定是一件坏事。

Usually Python will throw an Exception if you tell it to do something it can’t so you’ll have to do either:

if c in a:
    a.remove(c)

or:

try:
    a.remove(c)
except ValueError:
    pass

An Exception isn’t necessarily a bad thing as long as it’s one you’re expecting and handle properly.


回答 2

你可以做

a=[1,2,3,4]
if 6 in a:
    a.remove(6)

但以上需要在列表中搜索6 2次,因此尝试使用除外会更快

try:
    a.remove(6)
except:
    pass

You can do

a=[1,2,3,4]
if 6 in a:
    a.remove(6)

but above need to search 6 in list a 2 times, so try except would be faster

try:
    a.remove(6)
except:
    pass

回答 3

考虑:

a = [1,2,2,3,4,5]

要排除所有情况,可以在python中使用filter函数。例如,它看起来像:

a = list(filter(lambda x: x!= 2, a))

因此,它将保留的所有元素a != 2

仅取出其中一项使用

a.remove(2)

Consider:

a = [1,2,2,3,4,5]

To take out all occurrences, you could use the filter function in python. For example, it would look like:

a = list(filter(lambda x: x!= 2, a))

So, it would keep all elements of a != 2.

To just take out one of the items use

a.remove(2)

回答 4

这是就地执行此操作的方法(无需列表理解):

def remove_all(seq, value):
    pos = 0
    for item in seq:
        if item != value:
           seq[pos] = item
           pos += 1
    del seq[pos:]

Here’s how to do it inplace (without list comprehension):

def remove_all(seq, value):
    pos = 0
    for item in seq:
        if item != value:
           seq[pos] = item
           pos += 1
    del seq[pos:]

回答 5

如果您知道要删除的值,这是一种简单的方法(无论如何,我仍然可以想到):

a = [0, 1, 1, 0, 1, 2, 1, 3, 1, 4]
while a.count(1) > 0:
    a.remove(1)

你会得到 [0, 0, 2, 3, 4]

If you know what value to delete, here’s a simple way (as simple as I can think of, anyway):

a = [0, 1, 1, 0, 1, 2, 1, 3, 1, 4]
while a.count(1) > 0:
    a.remove(1)

You’ll get [0, 0, 2, 3, 4]


回答 6

如果集合适用于您的应用程序,则另一种可能性是使用集合而不是列表。

IE,如果您的数据未排序,并且没有重复,则

my_set=set([3,4,2])
my_set.discard(1)

没有错误。

通常,列表只是一个方便存放实际未排序商品的容器。有一些问题询问如何从列表中删除所有出现的元素。如果您不想一开始就喜欢做傻瓜,那么再说一遍就很方便了。

my_set.add(3)

不会从上面改变my_set。

Another possibility is to use a set instead of a list, if a set is applicable in your application.

IE if your data is not ordered, and does not have duplicates, then

my_set=set([3,4,2])
my_set.discard(1)

is error-free.

Often a list is just a handy container for items that are actually unordered. There are questions asking how to remove all occurences of an element from a list. If you don’t want dupes in the first place, once again a set is handy.

my_set.add(3)

doesn’t change my_set from above.


回答 7

如许多其他答案所述,它list.remove()可以工作,但ValueError如果该项目不在列表中,则抛出a 。在python 3.4及更高版本中,有一种有趣的方法可以使用抑制上下文管理器来处理此问题:

from contextlib import suppress
with suppress(ValueError):
    a.remove('b')

As stated by numerous other answers, list.remove() will work, but throw a ValueError if the item wasn’t in the list. With python 3.4+, there’s an interesting approach to handling this, using the suppress contextmanager:

from contextlib import suppress
with suppress(ValueError):
    a.remove('b')

回答 8

通过使用列表的remove方法,可以更轻松地在列表中查找值,然后删除该索引(如果存在)。

>>> a = [1, 2, 3, 4]
>>> try:
...   a.remove(6)
... except ValueError:
...   pass
... 
>>> print a
[1, 2, 3, 4]
>>> try:
...   a.remove(3)
... except ValueError:
...   pass
... 
>>> print a
[1, 2, 4]

如果您经常这样做,则可以将其包装在一个函数中:

def remove_if_exists(L, value):
  try:
    L.remove(value)
  except ValueError:
    pass

Finding a value in a list and then deleting that index (if it exists) is easier done by just using list’s remove method:

>>> a = [1, 2, 3, 4]
>>> try:
...   a.remove(6)
... except ValueError:
...   pass
... 
>>> print a
[1, 2, 3, 4]
>>> try:
...   a.remove(3)
... except ValueError:
...   pass
... 
>>> print a
[1, 2, 4]

If you do this often, you can wrap it up in a function:

def remove_if_exists(L, value):
  try:
    L.remove(value)
  except ValueError:
    pass

回答 9

这个例子很快,并且将从列表中删除该值的所有实例:

a = [1,2,3,1,2,3,4]
while True:
    try:
        a.remove(3)
    except:
        break
print a
>>> [1, 2, 1, 2, 4]

This example is fast and will delete all instances of a value from the list:

a = [1,2,3,1,2,3,4]
while True:
    try:
        a.remove(3)
    except:
        break
print a
>>> [1, 2, 1, 2, 4]

回答 10

一行:

a.remove('b') if 'b' in a else None

有时很有用。

更简单:

if 'b' in a: a.remove('b')

In one line:

a.remove('b') if 'b' in a else None

sometimes it usefull.

Even easier:

if 'b' in a: a.remove('b')

回答 11

如果您的元素是不同的,那么简单的集合差异就可以了。

c = [1,2,3,4,'x',8,6,7,'x',9,'x']
z = list(set(c) - set(['x']))
print z
[1, 2, 3, 4, 6, 7, 8, 9]

If your elements are distinct, then a simple set difference will do.

c = [1,2,3,4,'x',8,6,7,'x',9,'x']
z = list(set(c) - set(['x']))
print z
[1, 2, 3, 4, 6, 7, 8, 9]

回答 12

我们还可以使用.pop:

>>> lst = [23,34,54,45]
>>> remove_element = 23
>>> if remove_element in lst:
...     lst.pop(lst.index(remove_element))
... 
23
>>> lst
[34, 54, 45]
>>> 

We can also use .pop:

>>> lst = [23,34,54,45]
>>> remove_element = 23
>>> if remove_element in lst:
...     lst.pop(lst.index(remove_element))
... 
23
>>> lst
[34, 54, 45]
>>> 

回答 13

通过索引除您要删除的元素以外的所有内容来覆盖列表

>>> s = [5,4,3,2,1]
>>> s[0:2] + s[3:]
[5, 4, 2, 1]

Overwrite the list by indexing everything except the elements you wish to remove

>>> s = [5,4,3,2,1]
>>> s[0:2] + s[3:]
[5, 4, 2, 1]

回答 14

有一个for循环和一个条件:

def cleaner(seq, value):    
    temp = []                      
    for number in seq:
        if number != value:
            temp.append(number)
    return temp

如果要删除一些但不是全部:

def cleaner(seq, value, occ):
    temp = []
    for number in seq:
        if number == value and occ:
            occ -= 1
            continue
        else:
            temp.append(number)
    return temp

With a for loop and a condition:

def cleaner(seq, value):    
    temp = []                      
    for number in seq:
        if number != value:
            temp.append(number)
    return temp

And if you want to remove some, but not all:

def cleaner(seq, value, occ):
    temp = []
    for number in seq:
        if number == value and occ:
            occ -= 1
            continue
        else:
            temp.append(number)
    return temp

回答 15

 list1=[1,2,3,3,4,5,6,1,3,4,5]
 n=int(input('enter  number'))
 while n in list1:
    list1.remove(n)
 print(list1)
 list1=[1,2,3,3,4,5,6,1,3,4,5]
 n=int(input('enter  number'))
 while n in list1:
    list1.remove(n)
 print(list1)

回答 16

举例来说,我们要从x中删除所有1。这就是我要做的:

x = [1, 2, 3, 1, 2, 3]

现在,这是我的方法的实际用法:

def Function(List, Unwanted):
    [List.remove(Unwanted) for Item in range(List.count(Unwanted))]
    return List
x = Function(x, 1)
print(x)

这是我的方法,只需一行:

[x.remove(1) for Item in range(x.count(1))]
print(x)

两者都将其作为输出:

[2, 3, 2, 3, 2, 3]

希望这可以帮助。PS,请注意,这是在3.6.2版中编写的,因此您可能需要针对旧版本进行调整。

Say for example, we want to remove all 1’s from x. This is how I would go about it:

x = [1, 2, 3, 1, 2, 3]

Now, this is a practical use of my method:

def Function(List, Unwanted):
    [List.remove(Unwanted) for Item in range(List.count(Unwanted))]
    return List
x = Function(x, 1)
print(x)

And this is my method in a single line:

[x.remove(1) for Item in range(x.count(1))]
print(x)

Both yield this as an output:

[2, 3, 2, 3, 2, 3]

Hope this helps. PS, pleas note that this was written in version 3.6.2, so you might need to adjust it for older versions.


回答 17

也许您的解决方案适用于int,但不适用于字典。

一方面,remove()对我不起作用。但也许它适用于基本类型。我猜下面的代码也是从对象列表中删除项目的方法。

另一方面,“ del”也无法正常工作。就我而言,使用python 3.6:当我尝试使用“ del”命令从“ for”气泡中的列表中删除元素时,python会更改进程中的索引,而bucle会在时间之前过早停止。仅当您以相反的顺序删除一个元素时,它才有效。这样,您在遍历时不会更改未决元素数组索引

然后,我用了:

c = len(list)-1
for element in (reversed(list)):
    if condition(element):
        del list[c]
    c -= 1
print(list)

其中“列表”类似于[{‘key1’:value1’},{‘key2’:value2},{‘key3’:value3},…]

另外,您可以使用enumerate做更多的pythonic操作:

for i, element in enumerate(reversed(list)):
    if condition(element):
        del list[(i+1)*-1]
print(list)

Maybe your solutions works with ints, but It Doesnt work for me with dictionarys.

In one hand, remove() has not worked for me. But maybe it works with basic Types. I guess the code bellow is also the way to remove items from objects list.

In the other hand, ‘del’ has not worked properly either. In my case, using python 3.6: when I try to delete an element from a list in a ‘for’ bucle with ‘del’ command, python changes the index in the process and bucle stops prematurely before time. It only works if You delete element by element in reversed order. In this way you dont change the pending elements array index when you are going through it

Then, Im used:

c = len(list)-1
for element in (reversed(list)):
    if condition(element):
        del list[c]
    c -= 1
print(list)

where ‘list’ is like [{‘key1′:value1’},{‘key2’:value2}, {‘key3’:value3}, …]

Also You can do more pythonic using enumerate:

for i, element in enumerate(reversed(list)):
    if condition(element):
        del list[(i+1)*-1]
print(list)

回答 18

arr = [1, 1, 3, 4, 5, 2, 4, 3]

# to remove first occurence of that element, suppose 3 in this example
arr.remove(3)

# to remove all occurences of that element, again suppose 3
# use something called list comprehension
new_arr = [element for element in arr if element!=3]

# if you want to delete a position use "pop" function, suppose 
# position 4 
# the pop function also returns a value
removed_element = arr.pop(4)

# u can also use "del" to delete a position
del arr[4]
arr = [1, 1, 3, 4, 5, 2, 4, 3]

# to remove first occurence of that element, suppose 3 in this example
arr.remove(3)

# to remove all occurences of that element, again suppose 3
# use something called list comprehension
new_arr = [element for element in arr if element!=3]

# if you want to delete a position use "pop" function, suppose 
# position 4 
# the pop function also returns a value
removed_element = arr.pop(4)

# u can also use "del" to delete a position
del arr[4]

回答 19

"-v"将从数组中删除所有实例sys.argv,并且如果未找到实例,则不会发出任何投诉:

while "-v" in sys.argv:
    sys.argv.remove('-v')

您可以在名为的文件中查看运行中的代码speechToText.py

$ python speechToText.py -v
['speechToText.py']

$ python speechToText.py -x
['speechToText.py', '-x']

$ python speechToText.py -v -v
['speechToText.py']

$ python speechToText.py -v -v -x
['speechToText.py', '-x']

This removes all instances of "-v" from the array sys.argv, and does not complain if no instances were found:

while "-v" in sys.argv:
    sys.argv.remove('-v')

You can see the code in action, in a file called speechToText.py:

$ python speechToText.py -v
['speechToText.py']

$ python speechToText.py -x
['speechToText.py', '-x']

$ python speechToText.py -v -v
['speechToText.py']

$ python speechToText.py -v -v -x
['speechToText.py', '-x']

回答 20

这就是我的回答,只是使用

def remove_all(data, value):
    i = j = 0
    while j < len(data):
        if data[j] == value:
            j += 1
            continue
        data[i] = data[j]
        i += 1
        j += 1
    for x in range(j - i):
        data.pop()

this is my answer, just use while and for

def remove_all(data, value):
    i = j = 0
    while j < len(data):
        if data[j] == value:
            j += 1
            continue
        data[i] = data[j]
        i += 1
        j += 1
    for x in range(j - i):
        data.pop()

删除,删除和弹出列表之间的区别

问题:删除,删除和弹出列表之间的区别

>>> a=[1,2,3]
>>> a.remove(2)
>>> a
[1, 3]
>>> a=[1,2,3]
>>> del a[1]
>>> a
[1, 3]
>>> a= [1,2,3]
>>> a.pop(1)
2
>>> a
[1, 3]
>>> 

以上三种从列表中删除元素的方法之间有什么区别吗?

>>> a=[1,2,3]
>>> a.remove(2)
>>> a
[1, 3]
>>> a=[1,2,3]
>>> del a[1]
>>> a
[1, 3]
>>> a= [1,2,3]
>>> a.pop(1)
2
>>> a
[1, 3]
>>> 

Is there any difference between the above three methods to remove an element from a list?


回答 0

是的,remove删除第一个匹配,而不是特定的索引:

>>> a = [0, 2, 3, 2]
>>> a.remove(2)
>>> a
[0, 3, 2]

del 删除特定索引处的项目:

>>> a = [9, 8, 7, 6]
>>> del a[1]
>>> a
[9, 7, 6]

pop从特定索引处删除该项目并返回。

>>> a = [4, 3, 5]
>>> a.pop(1)
3
>>> a
[4, 5]

它们的错误模式也不同:

>>> a = [4, 5, 6]
>>> a.remove(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list
>>> del a[7]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range
>>> a.pop(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: pop index out of range

Yes, remove removes the first matching value, not a specific index:

>>> a = [0, 2, 3, 2]
>>> a.remove(2)
>>> a
[0, 3, 2]

del removes the item at a specific index:

>>> a = [9, 8, 7, 6]
>>> del a[1]
>>> a
[9, 7, 6]

and pop removes the item at a specific index and returns it.

>>> a = [4, 3, 5]
>>> a.pop(1)
3
>>> a
[4, 5]

Their error modes are different too:

>>> a = [4, 5, 6]
>>> a.remove(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: list.remove(x): x not in list
>>> del a[7]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list assignment index out of range
>>> a.pop(7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: pop index out of range

回答 1

用于del按索引删除元素,pop()如果需要返回值,则按索引remove()删除元素,以及按值删除元素。后者需要搜索列表,并在列表中ValueError没有出现此类值时引发。

in元素列表中删除索引时,这些方法的计算复杂度为

del     O(n - i)
pop     O(n - i)
remove  O(n)

Use del to remove an element by index, pop() to remove it by index if you need the returned value, and remove() to delete an element by value. The latter requires searching the list, and raises ValueError if no such value occurs in the list.

When deleting index i from a list of n elements, the computational complexities of these methods are

del     O(n - i)
pop     O(n - i)
remove  O(n)

回答 2

由于没有其他人提到过它,因此请注意del(与一样pop)由于列表切片而允许删除一系列索引:

>>> lst = [3, 2, 2, 1]
>>> del lst[1:]
>>> lst
[3]

IndexError如果索引不在列表中,这还可以避免使用an :

>>> lst = [3, 2, 2, 1]
>>> del lst[10:]
>>> lst
[3, 2, 2, 1]

Since no-one else has mentioned it, note that del (unlike pop) allows the removal of a range of indexes because of list slicing:

>>> lst = [3, 2, 2, 1]
>>> del lst[1:]
>>> lst
[3]

This also allows avoidance of an IndexError if the index is not in the list:

>>> lst = [3, 2, 2, 1]
>>> del lst[10:]
>>> lst
[3, 2, 2, 1]

回答 3

已经被其他人很好地回答了。这是我的结尾:)

删除vs流行vs del

显然,pop是唯一返回值remove的对象,并且是唯一搜索对象的对象,同时del将自身限制为简单的删除。

Already answered quite well by others. This one from my end :)

remove vs pop vs del

Evidently, pop is the only one which returns the value, and remove is the only one which searches the object, while del limits itself to a simple deletion.


回答 4

这里有很多最佳的解释,但我会尽力简化一下。

在所有这些方法中,reverse和pop是后缀,而delete是prefix

remove():用于删除元素的第一次出现

remove(i) =>第一次出现i值

>>> a = [0, 2, 3, 2, 1, 4, 6, 5, 7]
>>> a.remove(2)   # where i = 2
>>> a
[0, 3, 2, 1, 4, 6, 5, 7]

pop():如果满足以下条件,则用于删除元素:

未指定

pop() =>从列表末尾

>>>a.pop()
>>>a
[0, 3, 2, 1, 4, 6, 5]

指定的

pop(index) =>索引

>>>a.pop(2)
>>>a
[0, 3, 1, 4, 6, 5]

警告:危险方法提前

delete():它是一个前缀方法。

注意同一方法的两种不同语法:[]和()。它具有以下功能:

1.删​​除索引

del a[index] =>用于删除索引及其关联值,就像pop。

>>>del a[1]
>>>a
[0, 1, 4, 6, 5]

2.删除[index 1:index N]范围内的值

del a[0:3] =>范围内的多个值

>>>del a[0:3]
>>>a
[6, 5]

3.最后但不是列表,一次删除整个列表

del (a) =>如上所述。

>>>del (a)
>>>a

希望这可以澄清混淆。

Many best explanations are here but I will try my best to simplify more.

Among all these methods, reverse & pop are postfix while delete is prefix.

remove(): It used to remove first occurrence of element

remove(i) => first occurrence of i value

>>> a = [0, 2, 3, 2, 1, 4, 6, 5, 7]
>>> a.remove(2)   # where i = 2
>>> a
[0, 3, 2, 1, 4, 6, 5, 7]

pop(): It used to remove element if:

unspecified

pop() => from end of list

>>>a.pop()
>>>a
[0, 3, 2, 1, 4, 6, 5]

specified

pop(index) => of index

>>>a.pop(2)
>>>a
[0, 3, 1, 4, 6, 5]

WARNING: Dangerous Method Ahead

delete(): Its a prefix method.

Keep an eye on two different syntax for same method: [] and (). It possesses power to:

1.Delete index

del a[index] => used to delete index and its associated value just like pop.

>>>del a[1]
>>>a
[0, 1, 4, 6, 5]

2.Delete values in range [index 1:index N]

del a[0:3] => multiple values in range

>>>del a[0:3]
>>>a
[6, 5]

3.Last but not list, to delete whole list in one shot

del (a) => as said above.

>>>del (a)
>>>a

Hope this clarifies the confusion if any.


回答 5

pop-获取索引并返回值

remove-取值,删除第一个匹配项,不返回任何内容

delete-获取索引,删除该索引处的值,并且不返回任何内容

pop – Takes Index and returns Value

remove – Takes value, removes first occurrence, and returns nothing

delete – Takes index, removes value at that index, and returns nothing


回答 6

针对特定动作定义了对不同数据结构的任何操作/功能。在您的情况下,即删除一个元素,然后删除,弹出和删除。(如果考虑设置,则添加另一个操作-丢弃)添加时其他令人困惑的情况。插入/追加。为了演示,让我们实现双端队列。deque是一种混合线性数据结构,您可以在其中添加元素/从两端删除元素。(后端和前端)

class Deque(object):

  def __init__(self):

    self.items=[]

  def addFront(self,item):

    return self.items.insert(0,item)
  def addRear(self,item):

    return self.items.append(item)
  def deleteFront(self):

    return self.items.pop(0)
  def deleteRear(self):
    return self.items.pop()
  def returnAll(self):

    return self.items[:]

在这里,查看操作:

def deleteFront(self):

    return self.items.pop(0)
def deleteRear(self):
    return self.items.pop()

操作必须返回一些东西。因此,弹出-有和没有索引。如果我不想返回该值:del self.items [0]

按值删除而不是索引:

  • 去掉 :

    list_ez=[1,2,3,4,5,6,7,8]
    for i in list_ez:
        if i%2==0:
            list_ez.remove(i)
    print list_ez

返回[1,3,5,7]

让我们考虑一下集合的情况。

set_ez=set_ez=set(range(10))

set_ez.remove(11)

# Gives Key Value Error. 
##KeyError: 11

set_ez.discard(11)

# Does Not return any errors.

Any operation/function on different data structures is defined for particular actions. Here in your case i.e. removing an element, delete, Pop and remove. (If you consider sets, Add another operation – discard) Other confusing case is while adding. Insert/Append. For Demonstration, Let us Implement deque. deque is a hybrid linear data structure, where you can add elements / remove elements from both ends.(Rear and front Ends)

class Deque(object):

  def __init__(self):

    self.items=[]

  def addFront(self,item):

    return self.items.insert(0,item)
  def addRear(self,item):

    return self.items.append(item)
  def deleteFront(self):

    return self.items.pop(0)
  def deleteRear(self):
    return self.items.pop()
  def returnAll(self):

    return self.items[:]

In here, see the operations:

def deleteFront(self):

    return self.items.pop(0)
def deleteRear(self):
    return self.items.pop()

Operations have to return something. So, pop – With and without an index. If I don’t want to return the value: del self.items[0]

Delete by value not Index:

  • remove :

    list_ez=[1,2,3,4,5,6,7,8]
    for i in list_ez:
        if i%2==0:
            list_ez.remove(i)
    print list_ez
    

Returns [1,3,5,7]

let us consider the case of sets.

set_ez=set_ez=set(range(10))

set_ez.remove(11)

# Gives Key Value Error. 
##KeyError: 11

set_ez.discard(11)

# Does Not return any errors.

回答 7

pop和delete都带有索引,以删除元素,如上面的注释所述。一个关键的区别是它们的时间复杂度。没有索引的pop()的时间复杂度为O(1),但删除最后一个元素的情况不同。

如果您的用例始终是删除最后一个元素,则始终首选使用pop()而不是delete()。有关时间复杂度的更多说明,请参阅https://www.ics.uci.edu/~pattis/ICS-33/lectures/complexitypython.txt

While pop and delete both take indices to remove an element as stated in above comments. A key difference is the time complexity for them. The time complexity for pop() with no index is O(1) but is not the same case for deletion of last element.

If your use case is always to delete the last element, it’s always preferable to use pop() over delete(). For more explanation on time complexities, you can refer to https://www.ics.uci.edu/~pattis/ICS-33/lectures/complexitypython.txt


回答 8

列表上的删除操作具有要删除的值。它搜索列表以查找具有该值的项目,然后删除找到的第一个匹配项目。如果没有匹配的项目,将引发ValueError,这是一个错误。

>>> x = [1, 0, 0, 0, 3, 4, 5]
>>> x.remove(4)
>>> x
[1, 0, 0, 0, 3, 5]
>>> del x[7]
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    del x[7]
IndexError: list assignment index out of range

德尔语句可以用来删除整个列表。如果您有一个特定的列表项作为del的参数(例如,listname [7]专门引用列表中的第8个项),则将其删除。甚至有可能从列表中删除“片段”。如果索引超出范围,则会引发IndexError,这是一个错误。

>>> x = [1, 2, 3, 4]
>>> del x[3]
>>> x
[1, 2, 3]
>>> del x[4]
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    del x[4]
IndexError: list assignment index out of range

pop的通常用法是在将列表用作堆栈时从列表中删除最后一项。与del不同,pop返回从列表中弹出的值。您可以选择提供一个索引值来从列表的末尾弹出和弹出(例如listname.pop(0)将从列表中删除第一项并返回该第一项作为结果)。您可以使用它使列表表现得像队列一样,但是有一些库例程可以提供比pop(0)更好的队列操作性能。如果索引超出范围,则会引发IndexError,这是一个错误。

>>> x = [1, 2, 3] 
>>> x.pop(2) 
3 
>>> x 
[1, 2]
>>> x.pop(4)
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    x.pop(4)
IndexError: pop index out of range

有关更多详细信息,请参见collections.deque

The remove operation on a list is given a value to remove. It searches the list to find an item with that value and deletes the first matching item it finds. It is an error if there is no matching item, raises a ValueError.

>>> x = [1, 0, 0, 0, 3, 4, 5]
>>> x.remove(4)
>>> x
[1, 0, 0, 0, 3, 5]
>>> del x[7]
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    del x[7]
IndexError: list assignment index out of range

The del statement can be used to delete an entire list. If you have a specific list item as your argument to del (e.g. listname[7] to specifically reference the 8th item in the list), it’ll just delete that item. It is even possible to delete a “slice” from a list. It is an error if there index out of range, raises a IndexError.

>>> x = [1, 2, 3, 4]
>>> del x[3]
>>> x
[1, 2, 3]
>>> del x[4]
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    del x[4]
IndexError: list assignment index out of range

The usual use of pop is to delete the last item from a list as you use the list as a stack. Unlike del, pop returns the value that it popped off the list. You can optionally give an index value to pop and pop from other than the end of the list (e.g listname.pop(0) will delete the first item from the list and return that first item as its result). You can use this to make the list behave like a queue, but there are library routines available that can provide queue operations with better performance than pop(0) does. It is an error if there index out of range, raises a IndexError.

>>> x = [1, 2, 3] 
>>> x.pop(2) 
3 
>>> x 
[1, 2]
>>> x.pop(4)
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    x.pop(4)
IndexError: pop index out of range

See collections.deque for more details.


回答 9

删除基本上对值起作用。删除并弹出索引上的工作

删除基本上删除第一个匹配值。Delete从特定索引中删除项目。Pop基本上会获取一个索引并返回该索引处的值。下次打印列表时,该值不会出现。

例:

Remove basically works on the value . Delete and pop work on the index

Remove basically removes the first matching value. Delete deletes the item from a specific index Pop basically takes an index and returns the value at that index. Next time you print the list the value doesnt appear.

Example:


回答 10

您也可以使用remove来删除索引值。

n = [1, 3, 5]

n.remove(n[1])

n然后将指代[1,5]

You can also use remove to remove a value by index as well.

n = [1, 3, 5]

n.remove(n[1])

n would then refer to [1, 5]


如何并行地遍历两个列表?

问题:如何并行地遍历两个列表?

我在Python中有两个可迭代的对象,我想成对地遍历它们:

foo = (1, 2, 3)
bar = (4, 5, 6)

for (f, b) in some_iterator(foo, bar):
    print "f: ", f, "; b: ", b

它应导致:

f: 1; b: 4
f: 2; b: 5
f: 3; b: 6

一种方法是遍历索引:

for i in xrange(len(foo)):
    print "f: ", foo[i], "; b: ", b[i]

但这对我来说似乎有些不可思议。有更好的方法吗?

I have two iterables in Python, and I want to go over them in pairs:

foo = (1, 2, 3)
bar = (4, 5, 6)

for (f, b) in some_iterator(foo, bar):
    print "f: ", f, "; b: ", b

It should result in:

f: 1; b: 4
f: 2; b: 5
f: 3; b: 6

One way to do it is to iterate over the indices:

for i in xrange(len(foo)):
    print "f: ", foo[i], "; b: ", b[i]

But that seems somewhat unpythonic to me. Is there a better way to do it?


回答 0

Python 3

for f, b in zip(foo, bar):
    print(f, b)

zipfoo或中的较短者bar停止。

Python 3zip 中,像itertools.izip在Python2中一样,返回元组的迭代器。要获取元组列表,请使用list(zip(foo, bar))。要压缩直到两个迭代器都用尽,可以使用 itertools.zip_longest

Python 2

Python 2中zip 返回一个元组列表。当foobar不是很大时,这很好。如果它们都是大量的,则形成zip(foo,bar)是不必要的大量临时变量,应将其替换为itertools.izipitertools.izip_longest,它返回迭代器而不是列表。

import itertools
for f,b in itertools.izip(foo,bar):
    print(f,b)
for f,b in itertools.izip_longest(foo,bar):
    print(f,b)

izipfoobar耗尽时停止。 izip_longest当两个停止foobar耗尽。当较短的迭代器用尽时,将izip_longest生成一个元组,None其位置与该迭代器相对应。您还可以设置不同fillvalue之外None,如果你想。看到这里的完整故事


还要注意zip,其zip类似brethen可以接受任意数量的Iterables作为参数。例如,

for num, cheese, color in zip([1,2,3], ['manchego', 'stilton', 'brie'], 
                              ['red', 'blue', 'green']):
    print('{} {} {}'.format(num, color, cheese))

版画

1 red manchego
2 blue stilton
3 green brie

Python 3

for f, b in zip(foo, bar):
    print(f, b)

zip stops when the shorter of foo or bar stops.

In Python 3, zip returns an iterator of tuples, like itertools.izip in Python2. To get a list of tuples, use list(zip(foo, bar)). And to zip until both iterators are exhausted, you would use itertools.zip_longest.

Python 2

In Python 2, zip returns a list of tuples. This is fine when foo and bar are not massive. If they are both massive then forming zip(foo,bar) is an unnecessarily massive temporary variable, and should be replaced by itertools.izip or itertools.izip_longest, which returns an iterator instead of a list.

import itertools
for f,b in itertools.izip(foo,bar):
    print(f,b)
for f,b in itertools.izip_longest(foo,bar):
    print(f,b)

izip stops when either foo or bar is exhausted. izip_longest stops when both foo and bar are exhausted. When the shorter iterator(s) are exhausted, izip_longest yields a tuple with None in the position corresponding to that iterator. You can also set a different fillvalue besides None if you wish. See here for the full story.


Note also that zip and its zip-like brethen can accept an arbitrary number of iterables as arguments. For example,

for num, cheese, color in zip([1,2,3], ['manchego', 'stilton', 'brie'], 
                              ['red', 'blue', 'green']):
    print('{} {} {}'.format(num, color, cheese))

prints

1 red manchego
2 blue stilton
3 green brie

回答 1

您需要该zip功能。

for (f,b) in zip(foo, bar):
    print "f: ", f ,"; b: ", b

You want the zip function.

for (f,b) in zip(foo, bar):
    print "f: ", f ,"; b: ", b

回答 2

您应该使用“ zip ”功能。这是您自己的zip函数的外观示例

def custom_zip(seq1, seq2):
    it1 = iter(seq1)
    it2 = iter(seq2)
    while True:
        yield next(it1), next(it2)

You should use ‘zip‘ function. Here is an example how your own zip function can look like

def custom_zip(seq1, seq2):
    it1 = iter(seq1)
    it2 = iter(seq2)
    while True:
        yield next(it1), next(it2)

回答 3

您可以使用理解将第n个元素捆绑到一个元组或列表中,然后使用生成器函数将其传递出去。

def iterate_multi(*lists):
    for i in range(min(map(len,lists))):
        yield tuple(l[i] for l in lists)

for l1, l2, l3 in iterate_multi([1,2,3],[4,5,6],[7,8,9]):
    print(str(l1)+","+str(l2)+","+str(l3))

You can bundle the nth elements into a tuple or list using comprehension, then pass them out with a generator function.

def iterate_multi(*lists):
    for i in range(min(map(len,lists))):
        yield tuple(l[i] for l in lists)

for l1, l2, l3 in iterate_multi([1,2,3],[4,5,6],[7,8,9]):
    print(str(l1)+","+str(l2)+","+str(l3))

回答 4

万一有人在寻找这样的东西,我发现它非常简单:

list_1 = ["Hello", "World"]
list_2 = [1, 2, 3]

for a,b in [(list_1, list_2)]:
    for element_a in a:
        ...
    for element_b in b:
        ...

>> Hello
World
1
2
3

列表将以其全部内容进行迭代,而zip()只会迭代最大内容长度。

In case someone is looking for something like this, I found it very simple and easy:

list_1 = ["Hello", "World"]
list_2 = [1, 2, 3]

for a,b in [(list_1, list_2)]:
    for element_a in a:
        ...
    for element_b in b:
        ...

>> Hello
World
1
2
3

The lists will be iterated with their full content, unlike zip() which only iterates up to the minimum content length.


回答 5

以下是使用列表理解的方法:

a = (1, 2, 3)
b = (4, 5, 6)
[print('f:', i, '; b', j) for i, j in zip(a, b)]

印刷品:

f: 1 ; b 4
f: 2 ; b 5
f: 3 ; b 6

Here’s how to do it with list comprehension:

a = (1, 2, 3)
b = (4, 5, 6)
[print('f:', i, '; b', j) for i, j in zip(a, b)]

prints:

f: 1 ; b 4
f: 2 ; b 5
f: 3 ; b 6

列表理解与lambda +过滤器

问题:列表理解与lambda +过滤器

我碰巧发现自己有一个基本的过滤需求:我有一个列表,并且必须按项目的属性对其进行过滤。

我的代码如下所示:

my_list = [x for x in my_list if x.attribute == value]

但是后来我想,这样写会更好吗?

my_list = filter(lambda x: x.attribute == value, my_list)

它更具可读性,并且如果需要性能,可以将lambda取出以获取收益。

问题是:使用第二种方法是否有警告?有任何性能差异吗?我是否完全想念Pythonic Way™,应该以另一种方式来做到这一点(例如,使用itemgetter而不是lambda)吗?

I happened to find myself having a basic filtering need: I have a list and I have to filter it by an attribute of the items.

My code looked like this:

my_list = [x for x in my_list if x.attribute == value]

But then I thought, wouldn’t it be better to write it like this?

my_list = filter(lambda x: x.attribute == value, my_list)

It’s more readable, and if needed for performance the lambda could be taken out to gain something.

Question is: are there any caveats in using the second way? Any performance difference? Am I missing the Pythonic Way™ entirely and should do it in yet another way (such as using itemgetter instead of the lambda)?


回答 0

奇怪的是,不同的人有多少美丽。我发现列表理解比filter+ 清晰得多lambda,但是请使用任何您更容易理解的列表。

有两件事可能会减慢您对的使用filter

第一个是函数调用开销:使用Python函数(无论是由def还是创建的lambda)后,过滤器的运行速度可能会比列表理解慢。几乎可以肯定,这还不够重要,并且在对代码进行计时并发现它是瓶颈之前,您不应该对性能进行太多的考虑,但是区别仍然存在。

可能适用的其他开销是,lambda被强制访问作用域变量(value)。这比访问局部变量要慢,并且在Python 2.x中,列表推导仅访问局部变量。如果您使用的是Python 3.x,则列表推导是在单独的函数中运行的,因此它也将value通过闭包进行访问,这种区别将不适用。

要考虑的另一个选项是使用生成器而不是列表推导:

def filterbyvalue(seq, value):
   for el in seq:
       if el.attribute==value: yield el

然后,在您的主要代码(这才是真正的可读性)中,您已经用有希望的有意义的函数名称替换了列表理解和过滤器。

It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter+lambda, but use whichever you find easier.

There are two things that may slow down your use of filter.

The first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn’t think much about performance until you’ve timed your code and found it to be a bottleneck, but the difference will be there.

The other overhead that might apply is that the lambda is being forced to access a scoped variable (value). That is slower than accessing a local variable and in Python 2.x the list comprehension only accesses local variables. If you are using Python 3.x the list comprehension runs in a separate function so it will also be accessing value through a closure and this difference won’t apply.

The other option to consider is to use a generator instead of a list comprehension:

def filterbyvalue(seq, value):
   for el in seq:
       if el.attribute==value: yield el

Then in your main code (which is where readability really matters) you’ve replaced both list comprehension and filter with a hopefully meaningful function name.


回答 1

在Python中,这是一个有点宗教性的问题。即使Guido考虑从Python 3中删除它mapfilter并且reduce存在足够的反弹,最终只reduce从内置函数转移到functools.reduce

我个人认为列表理解更容易阅读。[i for i in list if i.attribute == value]由于所有行为都在表面上而不是在过滤器函数内部,因此表达式中发生的事情更加明确。

我不会太担心这两种方法之间的性能差异,因为这是微不足道的。如果确实证明这是您应用程序中的瓶颈(不太可能),我真的只会对其进行优化。

另外,由于BDFL希望filter摆脱这种语言,因此可以肯定地自动使列表理解更具Pythonic ;-)

This is a somewhat religious issue in Python. Even though Guido considered removing map, filter and reduce from Python 3, there was enough of a backlash that in the end only reduce was moved from built-ins to functools.reduce.

Personally I find list comprehensions easier to read. It is more explicit what is happening from the expression [i for i in list if i.attribute == value] as all the behaviour is on the surface not inside the filter function.

I would not worry too much about the performance difference between the two approaches as it is marginal. I would really only optimise this if it proved to be the bottleneck in your application which is unlikely.

Also since the BDFL wanted filter gone from the language then surely that automatically makes list comprehensions more Pythonic ;-)


回答 2

由于任何速度差都将是微不足道的,因此使用过滤器还是列表理解都取决于品味。总的来说,我倾向于使用理解(这里似乎与大多数其他答案一致),但是在某些情况下,我更喜欢使用filter

一个非常常见的用例是抽取某些可迭代的X的值作为谓词P(x):

[x for x in X if P(x)]

但有时您想先将某些函数应用于这些值:

[f(x) for x in X if P(f(x))]


作为一个具体的例子,考虑

primes_cubed = [x*x*x for x in range(1000) if prime(x)]

我认为这看起来比使用略好filter。但是现在考虑

prime_cubes = [x*x*x for x in range(1000) if prime(x*x*x)]

在这种情况下,我们要filter反对后计算值。除了两次计算多维数据集的问题(想象一个更昂贵的计算)外,还有两次写入表达式的问题,这违背了DRY的审美观。在这种情况下,我倾向于使用

prime_cubes = filter(prime, [x*x*x for x in range(1000)])

Since any speed difference is bound to be miniscule, whether to use filters or list comprehensions comes down to a matter of taste. In general I’m inclined to use comprehensions (which seems to agree with most other answers here), but there is one case where I prefer filter.

A very frequent use case is pulling out the values of some iterable X subject to a predicate P(x):

[x for x in X if P(x)]

but sometimes you want to apply some function to the values first:

[f(x) for x in X if P(f(x))]


As a specific example, consider

primes_cubed = [x*x*x for x in range(1000) if prime(x)]

I think this looks slightly better than using filter. But now consider

prime_cubes = [x*x*x for x in range(1000) if prime(x*x*x)]

In this case we want to filter against the post-computed value. Besides the issue of computing the cube twice (imagine a more expensive calculation), there is the issue of writing the expression twice, violating the DRY aesthetic. In this case I’d be apt to use

prime_cubes = filter(prime, [x*x*x for x in range(1000)])

回答 3

尽管filter可能是“更快的方法”,但“ Python方式”将不在乎这些事情,除非性能绝对至关重要(在这种情况下,您将不会使用Python!)。

Although filter may be the “faster way”, the “Pythonic way” would be not to care about such things unless performance is absolutely critical (in which case you wouldn’t be using Python!).


回答 4

我以为我会在python 3中添加,filter()实际上是一个迭代器对象,因此您必须将filter方法调用传递给list()才能构建过滤后的列表。所以在python 2:

lst_a = range(25) #arbitrary list
lst_b = [num for num in lst_a if num % 2 == 0]
lst_c = filter(lambda num: num % 2 == 0, lst_a)

列表b和c具有相同的值,并且大约在相同的时间内完成,因为filter()等效[如果在z中,则x表示y中的x]。但是,在3中,相同的代码将使列表c包含过滤器对象,而不是过滤后的列表。要在3中产生相同的值:

lst_a = range(25) #arbitrary list
lst_b = [num for num in lst_a if num % 2 == 0]
lst_c = list(filter(lambda num: num %2 == 0, lst_a))

问题在于list()接受一个可迭代的参数,并从该参数创建一个新列表。结果是,在python 3中以这种方式使用filter所花费的时间是[x for x in y if z]中方法的两倍,因为您必须遍历filter()的输出以及原始列表。

I thought I’d just add that in python 3, filter() is actually an iterator object, so you’d have to pass your filter method call to list() in order to build the filtered list. So in python 2:

lst_a = range(25) #arbitrary list
lst_b = [num for num in lst_a if num % 2 == 0]
lst_c = filter(lambda num: num % 2 == 0, lst_a)

lists b and c have the same values, and were completed in about the same time as filter() was equivalent [x for x in y if z]. However, in 3, this same code would leave list c containing a filter object, not a filtered list. To produce the same values in 3:

lst_a = range(25) #arbitrary list
lst_b = [num for num in lst_a if num % 2 == 0]
lst_c = list(filter(lambda num: num %2 == 0, lst_a))

The problem is that list() takes an iterable as it’s argument, and creates a new list from that argument. The result is that using filter in this way in python 3 takes up to twice as long as the [x for x in y if z] method because you have to iterate over the output from filter() as well as the original list.


回答 5

一个重要的区别是列表理解将list在过滤器返回a 的同时返回filter,而您不能像a那样操作list(即:对其进行调用len,但不能与的返回一起使用filter)。

我自己的自学使我遇到了一些类似的问题。

话虽这么说,如果有一种方法可以list从a中获得结果filter,就像您在.NET中所做的那样lst.Where(i => i.something()).ToList(),我很想知道。

编辑:这是Python 3而不是2的情况(请参阅注释中的讨论)。

An important difference is that list comprehension will return a list while the filter returns a filter, which you cannot manipulate like a list (ie: call len on it, which does not work with the return of filter).

My own self-learning brought me to some similar issue.

That being said, if there is a way to have the resulting list from a filter, a bit like you would do in .NET when you do lst.Where(i => i.something()).ToList(), I am curious to know it.

EDIT: This is the case for Python 3, not 2 (see discussion in comments).


回答 6

我发现第二种方法更具可读性。它确切地告诉您意图是什么:过滤列表。
PS:请勿将“列表”用作变量名

I find the second way more readable. It tells you exactly what the intention is: filter the list.
PS: do not use ‘list’ as a variable name


回答 7

filter如果使用内置函数,通常会稍快一些。

我希望列表理解在您的情况下会更快

generally filter is slightly faster if using a builtin function.

I would expect the list comprehension to be slightly faster in your case


回答 8

过滤器就是这样。它过滤出列表的元素。您可以看到定义中提到的内容相同(在我之前提到的官方文档链接中)。然而,列表理解是什么,作用于后产生一个新的列表的东西前面的列表上。(两个过滤器和列表理解创造了新的名单,并取代旧的名单无法执行操作。这里一个新的名单是像一个列表(例如,一种全新的数据类型。例如将整数转换为字符串等)

在您的示例中,按照定义,使用过滤器比使用列表理解更好。但是,如果您想从列表元素中说other_attribute,在您的示例中要作为新列表进行检索,则可以使用列表理解。

return [item.other_attribute for item in my_list if item.attribute==value]

这就是我实际上记得有关过滤器和列表理解的方式。删除列表中的一些内容并保持其他元素不变,请使用过滤器。在元素上自行使用一些逻辑,并创建适合于某些目的的缩减列表,使用列表理解。

Filter is just that. It filters out the elements of a list. You can see the definition mentions the same(in the official docs link I mentioned before). Whereas, list comprehension is something that produces a new list after acting upon something on the previous list.(Both filter and list comprehension creates new list and not perform operation in place of the older list. A new list here is something like a list with, say, an entirely new data type. Like converting integers to string ,etc)

In your example, it is better to use filter than list comprehension, as per the definition. However, if you want, say other_attribute from the list elements, in your example is to be retrieved as a new list, then you can use list comprehension.

return [item.other_attribute for item in my_list if item.attribute==value]

This is how I actually remember about filter and list comprehension. Remove a few things within a list and keep the other elements intact, use filter. Use some logic on your own at the elements and create a watered down list suitable for some purpose, use list comprehension.


回答 9

这是我需要在列表理解进行筛选时使用的一小段内容。只是过滤器,lambda和列表(也称为猫的忠诚度和狗的清洁度)的组合。

在这种情况下,我正在读取文件,删除空白行,注释掉行,以及对行进行注释后的所有内容:

# Throw out blank lines and comments
with open('file.txt', 'r') as lines:        
    # From the inside out:
    #    [s.partition('#')[0].strip() for s in lines]... Throws out comments
    #   filter(lambda x: x!= '', [s.part... Filters out blank lines
    #  y for y in filter... Converts filter object to list
    file_contents = [y for y in filter(lambda x: x != '', [s.partition('#')[0].strip() for s in lines])]

Here’s a short piece I use when I need to filter on something after the list comprehension. Just a combination of filter, lambda, and lists (otherwise known as the loyalty of a cat and the cleanliness of a dog).

In this case I’m reading a file, stripping out blank lines, commented out lines, and anything after a comment on a line:

# Throw out blank lines and comments
with open('file.txt', 'r') as lines:        
    # From the inside out:
    #    [s.partition('#')[0].strip() for s in lines]... Throws out comments
    #   filter(lambda x: x!= '', [s.part... Filters out blank lines
    #  y for y in filter... Converts filter object to list
    file_contents = [y for y in filter(lambda x: x != '', [s.partition('#')[0].strip() for s in lines])]

回答 10

除了可接受的答案外,还有一个极端的情况,您应该使用过滤器而不是列表推导。如果列表不可散列,则无法直接使用列表推导处理它。一个真实的例子是,如果您用来pyodbc从数据库中读取结果。在fetchAll()从结果cursor是unhashable列表。在这种情况下,要直接处理返回的结果,应使用过滤器:

cursor.execute("SELECT * FROM TABLE1;")
data_from_db = cursor.fetchall()
processed_data = filter(lambda s: 'abc' in s.field1 or s.StartTime >= start_date_time, data_from_db) 

如果您在此处使用列表理解,则会出现错误:

TypeError:无法散列的类型:“列表”

In addition to the accepted answer, there is a corner case when you should use filter instead of a list comprehension. If the list is unhashable you cannot directly process it with a list comprehension. A real world example is if you use pyodbc to read results from a database. The fetchAll() results from cursor is an unhashable list. In this situation, to directly manipulating on the returned results, filter should be used:

cursor.execute("SELECT * FROM TABLE1;")
data_from_db = cursor.fetchall()
processed_data = filter(lambda s: 'abc' in s.field1 or s.StartTime >= start_date_time, data_from_db) 

If you use list comprehension here you will get the error:

TypeError: unhashable type: ‘list’


回答 11

我花了一些时间熟悉higher order functions filterand map。因此,我习惯了它们,实际上我很喜欢filter,因为很明显它通过保留真实内容来进行过滤,而且我知道一些functional programming术语也很酷。

然后,我读了这段文章(Fluent Python书):

map和filter函数仍是Python 3中的内置函数,但是由于引入了列表理解和生成器表达式,因此它们并不那么重要。listcomp或genexp可将地图和过滤器组合在一起,但可读性更高。

现在,我想,如果您可以使用已经很广泛的成语(例如列表理解)来实现filter/ 的概念,那又何必困扰 map呢?而且mapsfilters是种功能。在这种情况下,我更喜欢使用Anonymous functionslambda。

最后,仅出于测试目的,我对这两种方法(maplistComp)都进行了计时,但没有看到任何相关的速度差异来证明对此进行论证的合理性。

from timeit import Timer

timeMap = Timer(lambda: list(map(lambda x: x*x, range(10**7))))
print(timeMap.timeit(number=100))

timeListComp = Timer(lambda:[(lambda x: x*x) for x in range(10**7)])
print(timeListComp.timeit(number=100))

#Map:                 166.95695265199174
#List Comprehension   177.97208347299602

It took me some time to get familiarized with the higher order functions filter and map. So i got used to them and i actually liked filter as it was explicit that it filters by keeping whatever is truthy and I’ve felt cool that I knew some functional programming terms.

Then I read this passage (Fluent Python Book):

The map and filter functions are still builtins in Python 3, but since the introduction of list comprehensions and generator ex‐ pressions, they are not as important. A listcomp or a genexp does the job of map and filter combined, but is more readable.

And now I think, why bother with the concept of filter / map if you can achieve it with already widely spread idioms like list comprehensions. Furthermore maps and filters are kind of functions. In this case I prefer using Anonymous functions lambdas.

Finally, just for the sake of having it tested, I’ve timed both methods (map and listComp) and I didn’t see any relevant speed difference that would justify making arguments about it.

from timeit import Timer

timeMap = Timer(lambda: list(map(lambda x: x*x, range(10**7))))
print(timeMap.timeit(number=100))

timeListComp = Timer(lambda:[(lambda x: x*x) for x in range(10**7)])
print(timeListComp.timeit(number=100))

#Map:                 166.95695265199174
#List Comprehension   177.97208347299602

回答 12

奇怪的是,在Python 3上,我看到过滤器的执行速度快于列表推导。

我一直认为列表理解会更有效。类似于:[如果名称不是None,则在brand_names_db中使用名称命名]生成的字节码要好一些。

>>> def f1(seq):
...     return list(filter(None, seq))
>>> def f2(seq):
...     return [i for i in seq if i is not None]
>>> disassemble(f1.__code__)
2         0 LOAD_GLOBAL              0 (list)
          2 LOAD_GLOBAL              1 (filter)
          4 LOAD_CONST               0 (None)
          6 LOAD_FAST                0 (seq)
          8 CALL_FUNCTION            2
         10 CALL_FUNCTION            1
         12 RETURN_VALUE
>>> disassemble(f2.__code__)
2           0 LOAD_CONST               1 (<code object <listcomp> at 0x10cfcaa50, file "<stdin>", line 2>)
          2 LOAD_CONST               2 ('f2.<locals>.<listcomp>')
          4 MAKE_FUNCTION            0
          6 LOAD_FAST                0 (seq)
          8 GET_ITER
         10 CALL_FUNCTION            1
         12 RETURN_VALUE

但是它们实际上要慢一些:

   >>> timeit(stmt="f1(range(1000))", setup="from __main__ import f1,f2")
   21.177661532000116
   >>> timeit(stmt="f2(range(1000))", setup="from __main__ import f1,f2")
   42.233950221000214

Curiously on Python 3, I see filter performing faster than list comprehensions.

I always thought that the list comprehensions would be more performant. Something like: [name for name in brand_names_db if name is not None] The bytecode generated is a bit better.

>>> def f1(seq):
...     return list(filter(None, seq))
>>> def f2(seq):
...     return [i for i in seq if i is not None]
>>> disassemble(f1.__code__)
2         0 LOAD_GLOBAL              0 (list)
          2 LOAD_GLOBAL              1 (filter)
          4 LOAD_CONST               0 (None)
          6 LOAD_FAST                0 (seq)
          8 CALL_FUNCTION            2
         10 CALL_FUNCTION            1
         12 RETURN_VALUE
>>> disassemble(f2.__code__)
2           0 LOAD_CONST               1 (<code object <listcomp> at 0x10cfcaa50, file "<stdin>", line 2>)
          2 LOAD_CONST               2 ('f2.<locals>.<listcomp>')
          4 MAKE_FUNCTION            0
          6 LOAD_FAST                0 (seq)
          8 GET_ITER
         10 CALL_FUNCTION            1
         12 RETURN_VALUE

But they are actually slower:

   >>> timeit(stmt="f1(range(1000))", setup="from __main__ import f1,f2")
   21.177661532000116
   >>> timeit(stmt="f2(range(1000))", setup="from __main__ import f1,f2")
   42.233950221000214

回答 13

我拿

def filter_list(list, key, value, limit=None):
    return [i for i in list if i[key] == value][:limit]

My take

def filter_list(list, key, value, limit=None):
    return [i for i in list if i[key] == value][:limit]

检查列表中是否存在值的最快方法

问题:检查列表中是否存在值的最快方法

最快的方法是什么才能知道列表中是否存在值(列表中包含数百万个值)及其索引是什么?

我知道列表中的所有值都是唯一的,如本例所示。

我尝试的第一种方法是(在我的实际代码中为3.8秒):

a = [4,2,3,1,5,6]

if a.count(7) == 1:
    b=a.index(7)
    "Do something with variable b"

我尝试的第二种方法是(速度提高了2倍:实际代码为1.9秒):

a = [4,2,3,1,5,6]

try:
    b=a.index(7)
except ValueError:
    "Do nothing"
else:
    "Do something with variable b"

堆栈溢出用户建议的方法(我的实际代码为2.74秒):

a = [4,2,3,1,5,6]
if 7 in a:
    a.index(7)

在我的真实代码中,第一种方法耗时3.81秒,第二种方法耗时1.88秒。这是一个很好的改进,但是:

我是使用Python /脚本的初学者,有没有更快的方法来完成相同的事情并节省更多的处理时间?

我的应用程序更具体的说明:

在Blender API中,我可以访问粒子列表:

particles = [1, 2, 3, 4, etc.]

从那里,我可以访问粒子的位置:

particles[x].location = [x,y,z]

对于每个粒子,我通过搜索每个粒子位置来测试是否存在邻居:

if [x+1,y,z] in particles.location
    "Find the identity of this neighbour particle in x:the particle's index
    in the array"
    particles.index([x+1,y,z])

What is the fastest way to know if a value exists in a list (a list with millions of values in it) and what its index is?

I know that all values in the list are unique as in this example.

The first method I try is (3.8 sec in my real code):

a = [4,2,3,1,5,6]

if a.count(7) == 1:
    b=a.index(7)
    "Do something with variable b"

The second method I try is (2x faster: 1.9 sec for my real code):

a = [4,2,3,1,5,6]

try:
    b=a.index(7)
except ValueError:
    "Do nothing"
else:
    "Do something with variable b"

Proposed methods from Stack Overflow user (2.74 sec for my real code):

a = [4,2,3,1,5,6]
if 7 in a:
    a.index(7)

In my real code, the first method takes 3.81 sec and the second method takes 1.88 sec. It’s a good improvement, but:

I’m a beginner with Python/scripting, and is there a faster way to do the same things and save more processing time?

More specific explication for my application:

In the Blender API I can access a list of particles:

particles = [1, 2, 3, 4, etc.]

From there, I can access a particle’s location:

particles[x].location = [x,y,z]

And for each particle I test if a neighbour exists by searching each particle location like so:

if [x+1,y,z] in particles.location
    "Find the identity of this neighbour particle in x:the particle's index
    in the array"
    particles.index([x+1,y,z])

回答 0

7 in a

最清晰,最快的方法。

您也可以考虑使用set,但是从列表中构造该集合所花费的时间可能比更快的成员资格测试所节省的时间还要长。唯一可以确定的基准就是基准测试。(这还取决于您需要执行哪些操作)

7 in a

Clearest and fastest way to do it.

You can also consider using a set, but constructing that set from your list may take more time than faster membership testing will save. The only way to be certain is to benchmark well. (this also depends on what operations you require)


回答 1

正如其他人所述,in对于大型列表,它可能非常慢。这里是表演一些比较insetbisect。请注意时间(以秒为单位)是对数刻度。

在此处输入图片说明

测试代码:

import random
import bisect
import matplotlib.pyplot as plt
import math
import time

def method_in(a,b,c):
    start_time = time.time()
    for i,x in enumerate(a):
        if x in b:
            c[i] = 1
    return(time.time()-start_time)   

def method_set_in(a,b,c):
    start_time = time.time()
    s = set(b)
    for i,x in enumerate(a):
        if x in s:
            c[i] = 1
    return(time.time()-start_time)

def method_bisect(a,b,c):
    start_time = time.time()
    b.sort()
    for i,x in enumerate(a):
        index = bisect.bisect_left(b,x)
        if index < len(a):
            if x == b[index]:
                c[i] = 1
    return(time.time()-start_time)

def profile():
    time_method_in = []
    time_method_set_in = []
    time_method_bisect = []

    Nls = [x for x in range(1000,20000,1000)]
    for N in Nls:
        a = [x for x in range(0,N)]
        random.shuffle(a)
        b = [x for x in range(0,N)]
        random.shuffle(b)
        c = [0 for x in range(0,N)]

        time_method_in.append(math.log(method_in(a,b,c)))
        time_method_set_in.append(math.log(method_set_in(a,b,c)))
        time_method_bisect.append(math.log(method_bisect(a,b,c)))

    plt.plot(Nls,time_method_in,marker='o',color='r',linestyle='-',label='in')
    plt.plot(Nls,time_method_set_in,marker='o',color='b',linestyle='-',label='set')
    plt.plot(Nls,time_method_bisect,marker='o',color='g',linestyle='-',label='bisect')
    plt.xlabel('list size', fontsize=18)
    plt.ylabel('log(time)', fontsize=18)
    plt.legend(loc = 'upper left')
    plt.show()

As stated by others, in can be very slow for large lists. Here are some comparisons of the performances for in, set and bisect. Note the time (in second) is in log scale.

enter image description here

Code for testing:

import random
import bisect
import matplotlib.pyplot as plt
import math
import time

def method_in(a,b,c):
    start_time = time.time()
    for i,x in enumerate(a):
        if x in b:
            c[i] = 1
    return(time.time()-start_time)   

def method_set_in(a,b,c):
    start_time = time.time()
    s = set(b)
    for i,x in enumerate(a):
        if x in s:
            c[i] = 1
    return(time.time()-start_time)

def method_bisect(a,b,c):
    start_time = time.time()
    b.sort()
    for i,x in enumerate(a):
        index = bisect.bisect_left(b,x)
        if index < len(a):
            if x == b[index]:
                c[i] = 1
    return(time.time()-start_time)

def profile():
    time_method_in = []
    time_method_set_in = []
    time_method_bisect = []

    Nls = [x for x in range(1000,20000,1000)]
    for N in Nls:
        a = [x for x in range(0,N)]
        random.shuffle(a)
        b = [x for x in range(0,N)]
        random.shuffle(b)
        c = [0 for x in range(0,N)]

        time_method_in.append(math.log(method_in(a,b,c)))
        time_method_set_in.append(math.log(method_set_in(a,b,c)))
        time_method_bisect.append(math.log(method_bisect(a,b,c)))

    plt.plot(Nls,time_method_in,marker='o',color='r',linestyle='-',label='in')
    plt.plot(Nls,time_method_set_in,marker='o',color='b',linestyle='-',label='set')
    plt.plot(Nls,time_method_bisect,marker='o',color='g',linestyle='-',label='bisect')
    plt.xlabel('list size', fontsize=18)
    plt.ylabel('log(time)', fontsize=18)
    plt.legend(loc = 'upper left')
    plt.show()

回答 2

您可以将物品放入set。集合查找非常有效。

尝试:

s = set(a)
if 7 in s:
  # do stuff

编辑在注释中,您说您想获取元素的索引。不幸的是,集合没有元素位置的概念。另一种方法是对列表进行预排序,然后在每次需要查找元素时使用二进制搜索

You could put your items into a set. Set lookups are very efficient.

Try:

s = set(a)
if 7 in s:
  # do stuff

edit In a comment you say that you’d like to get the index of the element. Unfortunately, sets have no notion of element position. An alternative is to pre-sort your list and then use binary search every time you need to find an element.


回答 3

def check_availability(element, collection: iter):
    return element in collection

用法

check_availability('a', [1,2,3,4,'a','b','c'])

我相信这是知道所选值是否在数组中的最快方法。

def check_availability(element, collection: iter):
    return element in collection

Usage

check_availability('a', [1,2,3,4,'a','b','c'])

I believe this is the fastest way to know if a chosen value is in an array.


回答 4

a = [4,2,3,1,5,6]

index = dict((y,x) for x,y in enumerate(a))
try:
   a_index = index[7]
except KeyError:
   print "Not found"
else:
   print "found"

如果a不变,这将是一个好主意,因此我们可以做一次dict()部分,然后重复使用它。如果确实发生变化,请提供您正在做的更多详细信息。

a = [4,2,3,1,5,6]

index = dict((y,x) for x,y in enumerate(a))
try:
   a_index = index[7]
except KeyError:
   print "Not found"
else:
   print "found"

This will only be a good idea if a doesn’t change and thus we can do the dict() part once and then use it repeatedly. If a does change, please provide more detail on what you are doing.


回答 5

最初的问题是:

最快的方法是什么才能知道列表中是否存在值(列表中包含数百万个值)及其索引是什么?

因此,有两件事可以找到:

  1. 是列表中的一项,并且
  2. 什么是索引(如果在列表中)。

为此,我修改了@xslittlegrass代码以在所有情况下计算索引,并添加了其他方法。

结果

在此处输入图片说明

方法是:

  1. in-基本上如果b中的x:返回b.index(x)
  2. try–try / catch on b.index(x)(跳过必须检查b中的x)
  3. set-基本上,如果x在set(b)中:返回b.index(x)
  4. bisect-用索引对其b进行排序,对sorted(b)中的x进行二进制搜索。请注意@xslittlegrass的mod,它返回排序后的b中的索引,而不是原始b)
  5. 反向-为b形成反向查找字典d; 然后d [x]提供x的索引。

结果表明,方法5最快。

有趣的是,tryset方法在时间上是等效的。


测试代码

import random
import bisect
import matplotlib.pyplot as plt
import math
import timeit
import itertools

def wrapper(func, *args, **kwargs):
    " Use to produced 0 argument function for call it"
    # Reference https://www.pythoncentral.io/time-a-python-function/
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

def method_in(a,b,c):
    for i,x in enumerate(a):
        if x in b:
            c[i] = b.index(x)
        else:
            c[i] = -1
    return c

def method_try(a,b,c):
    for i, x in enumerate(a):
        try:
            c[i] = b.index(x)
        except ValueError:
            c[i] = -1

def method_set_in(a,b,c):
    s = set(b)
    for i,x in enumerate(a):
        if x in s:
            c[i] = b.index(x)
        else:
            c[i] = -1
    return c

def method_bisect(a,b,c):
    " Finds indexes using bisection "

    # Create a sorted b with its index
    bsorted = sorted([(x, i) for i, x in enumerate(b)], key = lambda t: t[0])

    for i,x in enumerate(a):
        index = bisect.bisect_left(bsorted,(x, ))
        c[i] = -1
        if index < len(a):
            if x == bsorted[index][0]:
                c[i] = bsorted[index][1]  # index in the b array

    return c

def method_reverse_lookup(a, b, c):
    reverse_lookup = {x:i for i, x in enumerate(b)}
    for i, x in enumerate(a):
        c[i] = reverse_lookup.get(x, -1)
    return c

def profile():
    Nls = [x for x in range(1000,20000,1000)]
    number_iterations = 10
    methods = [method_in, method_try, method_set_in, method_bisect, method_reverse_lookup]
    time_methods = [[] for _ in range(len(methods))]

    for N in Nls:
        a = [x for x in range(0,N)]
        random.shuffle(a)
        b = [x for x in range(0,N)]
        random.shuffle(b)
        c = [0 for x in range(0,N)]

        for i, func in enumerate(methods):
            wrapped = wrapper(func, a, b, c)
            time_methods[i].append(math.log(timeit.timeit(wrapped, number=number_iterations)))

    markers = itertools.cycle(('o', '+', '.', '>', '2'))
    colors = itertools.cycle(('r', 'b', 'g', 'y', 'c'))
    labels = itertools.cycle(('in', 'try', 'set', 'bisect', 'reverse'))

    for i in range(len(time_methods)):
        plt.plot(Nls,time_methods[i],marker = next(markers),color=next(colors),linestyle='-',label=next(labels))

    plt.xlabel('list size', fontsize=18)
    plt.ylabel('log(time)', fontsize=18)
    plt.legend(loc = 'upper left')
    plt.show()

profile()

The original question was:

What is the fastest way to know if a value exists in a list (a list with millions of values in it) and what its index is?

Thus there are two things to find:

  1. is an item in the list, and
  2. what is the index (if in the list).

Towards this, I modified @xslittlegrass code to compute indexes in all cases, and added an additional method.

Results

enter image description here

Methods are:

  1. in–basically if x in b: return b.index(x)
  2. try–try/catch on b.index(x) (skips having to check if x in b)
  3. set–basically if x in set(b): return b.index(x)
  4. bisect–sort b with its index, binary search for x in sorted(b). Note mod from @xslittlegrass who returns the index in the sorted b, rather than the original b)
  5. reverse–form a reverse lookup dictionary d for b; then d[x] provides the index of x.

Results show that method 5 is the fastest.

Interestingly the try and the set methods are equivalent in time.


Test Code

import random
import bisect
import matplotlib.pyplot as plt
import math
import timeit
import itertools

def wrapper(func, *args, **kwargs):
    " Use to produced 0 argument function for call it"
    # Reference https://www.pythoncentral.io/time-a-python-function/
    def wrapped():
        return func(*args, **kwargs)
    return wrapped

def method_in(a,b,c):
    for i,x in enumerate(a):
        if x in b:
            c[i] = b.index(x)
        else:
            c[i] = -1
    return c

def method_try(a,b,c):
    for i, x in enumerate(a):
        try:
            c[i] = b.index(x)
        except ValueError:
            c[i] = -1

def method_set_in(a,b,c):
    s = set(b)
    for i,x in enumerate(a):
        if x in s:
            c[i] = b.index(x)
        else:
            c[i] = -1
    return c

def method_bisect(a,b,c):
    " Finds indexes using bisection "

    # Create a sorted b with its index
    bsorted = sorted([(x, i) for i, x in enumerate(b)], key = lambda t: t[0])

    for i,x in enumerate(a):
        index = bisect.bisect_left(bsorted,(x, ))
        c[i] = -1
        if index < len(a):
            if x == bsorted[index][0]:
                c[i] = bsorted[index][1]  # index in the b array

    return c

def method_reverse_lookup(a, b, c):
    reverse_lookup = {x:i for i, x in enumerate(b)}
    for i, x in enumerate(a):
        c[i] = reverse_lookup.get(x, -1)
    return c

def profile():
    Nls = [x for x in range(1000,20000,1000)]
    number_iterations = 10
    methods = [method_in, method_try, method_set_in, method_bisect, method_reverse_lookup]
    time_methods = [[] for _ in range(len(methods))]

    for N in Nls:
        a = [x for x in range(0,N)]
        random.shuffle(a)
        b = [x for x in range(0,N)]
        random.shuffle(b)
        c = [0 for x in range(0,N)]

        for i, func in enumerate(methods):
            wrapped = wrapper(func, a, b, c)
            time_methods[i].append(math.log(timeit.timeit(wrapped, number=number_iterations)))

    markers = itertools.cycle(('o', '+', '.', '>', '2'))
    colors = itertools.cycle(('r', 'b', 'g', 'y', 'c'))
    labels = itertools.cycle(('in', 'try', 'set', 'bisect', 'reverse'))

    for i in range(len(time_methods)):
        plt.plot(Nls,time_methods[i],marker = next(markers),color=next(colors),linestyle='-',label=next(labels))

    plt.xlabel('list size', fontsize=18)
    plt.ylabel('log(time)', fontsize=18)
    plt.legend(loc = 'upper left')
    plt.show()

profile()

回答 6

听起来您的应用程序可能会受益于使用Bloom Filter数据结构的优势。

简而言之,布隆过滤器查询可以很快告诉您集合中是否绝对没有值。否则,您可以进行较慢的查找,以获取列表中可能存在的值的索引。因此,如果您的应用程序倾向于比“已找到”结果更频繁地获得“未找到”结果,则可以通过添加Bloom Filter来加快速度。

有关详细信息,Wikipedia很好地概述了布隆过滤器的工作方式,并且对“ python布隆过滤器库”的网络搜索将至少提供一些有用的实现。

It sounds like your application might gain advantage from the use of a Bloom Filter data structure.

In short, a bloom filter look-up can tell you very quickly if a value is DEFINITELY NOT present in a set. Otherwise, you can do a slower look-up to get the index of a value that POSSIBLY MIGHT BE in the list. So if your application tends to get the “not found” result much more often then the “found” result, you might see a speed up by adding a Bloom Filter.

For details, Wikipedia provides a good overview of how Bloom Filters work, and a web search for “python bloom filter library” will provide at least a couple useful implementations.


回答 7

请注意,in运算符不仅测试相等性(==),还测试身份(is),s 的in逻辑大致等同于以下内容(它实际上是用C编写的,但不是用Python编写的,至少是用CPython编写的):list

for element in s:
    if element is target:
        # fast check for identity implies equality
        return True
    if element == target:
        # slower check for actual equality
        return True
return False

在大多数情况下,这个细节是无关紧要的,但是在某些情况下,它可能会使Python新手感到惊讶,例如,numpy.NAN具有不等于自身的异常特性:

>>> import numpy
>>> numpy.NAN == numpy.NAN
False
>>> numpy.NAN is numpy.NAN
True
>>> numpy.NAN in [numpy.NAN]
True

要区分这些异常情况,可以使用any()

>>> lst = [numpy.NAN, 1 , 2]
>>> any(element == numpy.NAN for element in lst)
False
>>> any(element is numpy.NAN for element in lst)
True 

注意s 的in逻辑为:listany()

any(element is target or element == target for element in lst)

但是,我要强调的是,这是一个in极端的情况,在绝大多数情况下,运算符都是经过高度优化的,而这正是您想要的(当然是使用a list或使用a set)。

Be aware that the in operator tests not only equality (==) but also identity (is), the in logic for lists is roughly equivalent to the following (it’s actually written in C and not Python though, at least in CPython):

for element in s:
    if element is target:
        # fast check for identity implies equality
        return True
    if element == target:
        # slower check for actual equality
        return True
return False

In most circumstances this detail is irrelevant, but in some circumstances it might leave a Python novice surprised, for example, numpy.NAN has the unusual property of being not being equal to itself:

>>> import numpy
>>> numpy.NAN == numpy.NAN
False
>>> numpy.NAN is numpy.NAN
True
>>> numpy.NAN in [numpy.NAN]
True

To distinguish between these unusual cases you could use any() like:

>>> lst = [numpy.NAN, 1 , 2]
>>> any(element == numpy.NAN for element in lst)
False
>>> any(element is numpy.NAN for element in lst)
True 

Note the in logic for lists with any() would be:

any(element is target or element == target for element in lst)

However, I should emphasize that this is an edge case, and for the vast majority of cases the in operator is highly optimised and exactly what you want of course (either with a list or with a set).


回答 8

或使用__contains__

sequence.__contains__(value)

演示:

>>> l=[1,2,3]
>>> l.__contains__(3)
True
>>> 

Or use __contains__:

sequence.__contains__(value)

Demo:

>>> l=[1,2,3]
>>> l.__contains__(3)
True
>>> 

回答 9

@Winston Ewert的解决方案极大地提高了非常大的列表的速度,但是这个stackoverflow答案表明,如果经常到达除外分支,则try:/ except:/ else:构造将变慢。一种替代方法是利用该.get()方法使用dict:

a = [4,2,3,1,5,6]

index = dict((y, x) for x, y in enumerate(a))

b = index.get(7, None)
if b is not None:
    "Do something with variable b"

.get(key, default)方法仅适用于无法保证键将包含在dict中的情况。如果关键存在,它返回值(将dict[key]),但是,当它不是,.get()返回默认值(在这里None)。在这种情况下,您需要确保所选的默认值不会在中a

@Winston Ewert’s solution yields a big speed-up for very large lists, but this stackoverflow answer indicates that the the try:/except:/else: construct will be slowed down if the except branch is often reached. An alternative is to take advantage of the .get() method for the dict:

a = [4,2,3,1,5,6]

index = dict((y, x) for x, y in enumerate(a))

b = index.get(7, None)
if b is not None:
    "Do something with variable b"

The .get(key, default) method is just for the case when you can’t guarantee a key will be in the dict. If key is present, it returns the value (as would dict[key]), but when it is not, .get() returns your default value (here None). You need to make sure in this case that the chosen default will not be in a.


回答 10

这不是代码,而是用于快速搜索的算法。

如果您的列表和要查找的值都是数字,那么这很简单。如果是字符串:请看底部:

  • -让“ n”为列表的长度
  • -可选步骤:如果需要元素索引:将第二列添加到元素的当前索引(0到n-1)-稍后再说
  • 订购列表或列表的副本(.sort())
  • 依次通过:
    • 将您的数字与列表的第n / 2个元素进行比较
      • 如果更大,则在索引n / 2-n之间再次循环
      • 如果较小,则在索引0-n / 2之间再次循环
      • 如果相同:您找到了
  • 不断缩小列表的范围,直到找到它或只有2个数字(在您要查找的数字的下方和上方)
  • 这将在最多19个步骤中找到1.000.000列表中的任何元素(准确地说是log(2)n)

如果您还需要号码的原始位置,请在第二个索引列中查找。

如果您的列表不是由数字组成的,则该方法仍然有效并且将是最快的,但是您可能需要定义一个可以比较/排序字符串的函数。

当然,这需要sorted()方法的投资,但是如果您继续重复使用相同的列表进行检查,那可能是值得的。

This is not the code, but the algorithm for very fast searching.

If your list and the value you are looking for are all numbers, this is pretty straightforward. If strings: look at the bottom:

  • -Let “n” be the length of your list
  • -Optional step: if you need the index of the element: add a second column to the list with current index of elements (0 to n-1) – see later
  • Order your list or a copy of it (.sort())
  • Loop through:
    • Compare your number to the n/2th element of the list
      • If larger, loop again between indexes n/2-n
      • If smaller, loop again between indexes 0-n/2
      • If the same: you found it
  • Keep narrowing the list until you have found it or only have 2 numbers (below and above the one you are looking for)
  • This will find any element in at most 19 steps for a list of 1.000.000 (log(2)n to be precise)

If you also need the original position of your number, look for it in the second, index column.

If your list is not made of numbers, the method still works and will be fastest, but you may need to define a function which can compare/order strings.

Of course, this needs the investment of the sorted() method, but if you keep reusing the same list for checking, it may be worth it.


回答 11

因为问题不一定总是被理解为最快的技术方法-我总是建议理解/编写最直接的最快方法:列表理解,单线

[i for i in list_from_which_to_search if i in list_to_search_in]

我对list_to_search_in所有项目都拥有一个,并想返回中的项目索引list_from_which_to_search

这将在一个不错的列表中返回索引。

还有其他方法可以解决此问题-但是列表理解速度足够快,并且可以以足够快的速度编写它来解决问题。

Because the question is not always supposed to be understood as the fastest technical way – I always suggest the most straightforward fastest way to understand/write: a list comprehension, one-liner

[i for i in list_from_which_to_search if i in list_to_search_in]

I had a list_to_search_in with all the items, and wanted to return the indexes of the items in the list_from_which_to_search.

This returns the indexes in a nice list.

There are other ways to check this problem – however list comprehensions are quick enough, adding to the fact of writing it quick enough, to solve a problem.


回答 12

对我而言,这是0.030秒(实际),0.026秒(用户)和0.004秒(系统)。

try:
print("Started")
x = ["a", "b", "c", "d", "e", "f"]

i = 0

while i < len(x):
    i += 1
    if x[i] == "e":
        print("Found")
except IndexError:
    pass

For me it was 0.030 sec (real), 0.026 sec (user), and 0.004 sec (sys).

try:
print("Started")
x = ["a", "b", "c", "d", "e", "f"]

i = 0

while i < len(x):
    i += 1
    if x[i] == "e":
        print("Found")
except IndexError:
    pass

回答 13

检查乘积等于k的数组中是否存在两个元素的代码:

n = len(arr1)
for i in arr1:
    if k%i==0:
        print(i)

Code to check whether two elements exist in array whose product equals k:

n = len(arr1)
for i in arr1:
    if k%i==0:
        print(i)

获得两个列表之间的差异

问题:获得两个列表之间的差异

我在Python中有两个列表,如下所示:

temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

我需要用第一个列表中没有的项目创建第三个列表。从示例中,我必须得到:

temp3 = ['Three', 'Four']

有没有循环和检查的快速方法吗?

I have two lists in Python, like these:

temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

I need to create a third list with items from the first list which aren’t present in the second one. From the example I have to get:

temp3 = ['Three', 'Four']

Are there any fast ways without cycles and checking?


回答 0

In [5]: list(set(temp1) - set(temp2))
Out[5]: ['Four', 'Three']

当心

In [5]: set([1, 2]) - set([2, 3])
Out[5]: set([1]) 

您可能希望/希望它等于的位置set([1, 3])。如果确实要set([1, 3])作为答案,则需要使用set([1, 2]).symmetric_difference(set([2, 3]))

In [5]: list(set(temp1) - set(temp2))
Out[5]: ['Four', 'Three']

Beware that

In [5]: set([1, 2]) - set([2, 3])
Out[5]: set([1]) 

where you might expect/want it to equal set([1, 3]). If you do want set([1, 3]) as your answer, you’ll need to use set([1, 2]).symmetric_difference(set([2, 3])).


回答 1

现有解决方案均提供以下一项或多项:

  • 比O(n * m)性能快。
  • 保留输入列表的顺序。

但是到目前为止,还没有解决方案。如果两者都想要,请尝试以下操作:

s = set(temp2)
temp3 = [x for x in temp1 if x not in s]

性能测试

import timeit
init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print timeit.timeit('list(set(temp1) - set(temp2))', init, number = 100000)
print timeit.timeit('s = set(temp2);[x for x in temp1 if x not in s]', init, number = 100000)
print timeit.timeit('[item for item in temp1 if item not in temp2]', init, number = 100000)

结果:

4.34620224079 # ars' answer
4.2770634955  # This answer
30.7715615392 # matt b's answer

我介绍的方法以及保留顺序也比集合减法要快(略),因为它不需要构造不必要的集合。如果第一个列表比第二个列表长很多,并且散列很昂贵,则性能差异将更加明显。这是第二个测试,证明了这一点:

init = '''
temp1 = [str(i) for i in range(100000)]
temp2 = [str(i * 2) for i in range(50)]
'''

结果:

11.3836875916 # ars' answer
3.63890368748 # this answer (3 times faster!)
37.7445402279 # matt b's answer

The existing solutions all offer either one or the other of:

  • Faster than O(n*m) performance.
  • Preserve order of input list.

But so far no solution has both. If you want both, try this:

s = set(temp2)
temp3 = [x for x in temp1 if x not in s]

Performance test

import timeit
init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print timeit.timeit('list(set(temp1) - set(temp2))', init, number = 100000)
print timeit.timeit('s = set(temp2);[x for x in temp1 if x not in s]', init, number = 100000)
print timeit.timeit('[item for item in temp1 if item not in temp2]', init, number = 100000)

Results:

4.34620224079 # ars' answer
4.2770634955  # This answer
30.7715615392 # matt b's answer

The method I presented as well as preserving order is also (slightly) faster than the set subtraction because it doesn’t require construction of an unnecessary set. The performance difference would be more noticable if the first list is considerably longer than the second and if hashing is expensive. Here’s a second test demonstrating this:

init = '''
temp1 = [str(i) for i in range(100000)]
temp2 = [str(i * 2) for i in range(50)]
'''

Results:

11.3836875916 # ars' answer
3.63890368748 # this answer (3 times faster!)
37.7445402279 # matt b's answer

回答 2

temp3 = [item for item in temp1 if item not in temp2]
temp3 = [item for item in temp1 if item not in temp2]

回答 3

可以使用以下简单函数找到两个列表(例如list1和list2)之间的差异。

def diff(list1, list2):
    c = set(list1).union(set(list2))  # or c = set(list1) | set(list2)
    d = set(list1).intersection(set(list2))  # or d = set(list1) & set(list2)
    return list(c - d)

要么

def diff(list1, list2):
    return list(set(list1).symmetric_difference(set(list2)))  # or return list(set(list1) ^ set(list2))

通过使用上述功能,可以使用diff(temp2, temp1)或找到差异diff(temp1, temp2)。两者都会给出结果['Four', 'Three']。您不必担心列表的顺序或先给出哪个列表。

Python文档参考

The difference between two lists (say list1 and list2) can be found using the following simple function.

def diff(list1, list2):
    c = set(list1).union(set(list2))  # or c = set(list1) | set(list2)
    d = set(list1).intersection(set(list2))  # or d = set(list1) & set(list2)
    return list(c - d)

or

def diff(list1, list2):
    return list(set(list1).symmetric_difference(set(list2)))  # or return list(set(list1) ^ set(list2))

By Using the above function, the difference can be found using diff(temp2, temp1) or diff(temp1, temp2). Both will give the result ['Four', 'Three']. You don’t have to worry about the order of the list or which list is to be given first.

Python doc reference


回答 4

如果您需要递归的区别,我已经为python编写了一个软件包:https : //github.com/seperman/deepdiff

安装

从PyPi安装:

pip install deepdiff

用法示例

输入

>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2

同一对象返回空

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}

项目类型已更改

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}

物品的价值已经改变

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

添加和/或删除项目

>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

弦差异

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}

弦差异2

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}

>>> 
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
--- 
+++ 
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End

类型变更

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}

清单差异

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}

清单差异2:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}

列出差异忽略顺序或重复项:(具有与上述相同的字典)

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}

包含字典的列表:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}

套装:

>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}

命名元组:

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}

自定义对象:

>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
... 
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>> 
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

对象属性添加:

>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

In case you want the difference recursively, I have written a package for python: https://github.com/seperman/deepdiff

Installation

Install from PyPi:

pip install deepdiff

Example usage

Importing

>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2

Same object returns empty

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}

Type of an item has changed

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}

Value of an item has changed

>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

Item added and/or removed

>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

String difference

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}

String difference 2

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}

>>> 
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
--- 
+++ 
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End

Type change

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}

List difference

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}

List difference 2:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}

List difference ignoring order or duplicates: (with the same dictionaries as above)

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}

List that contains dictionary:

>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}

Sets:

>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}

Named Tuples:

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}

Custom objects:

>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
... 
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>> 
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

Object attribute added:

>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

回答 5

可以使用python XOR运算符完成。

  • 这将删除每个列表中的重复项
  • 这将显示temp1与temp2和temp2与temp1的差异。

set(temp1) ^ set(temp2)

Can be done using python XOR operator.

  • This will remove the duplicates in each list
  • This will show difference of temp1 from temp2 and temp2 from temp1.

set(temp1) ^ set(temp2)

回答 6

最简单的方法

使用set()。difference(set())

list_a = [1,2,3]
list_b = [2,3]
print set(list_a).difference(set(list_b))

答案是 set([1])

可以打印为列表,

print list(set(list_a).difference(set(list_b)))

most simple way,

use set().difference(set())

list_a = [1,2,3]
list_b = [2,3]
print set(list_a).difference(set(list_b))

answer is set([1])

can print as a list,

print list(set(list_a).difference(set(list_b)))

回答 7

如果您真的在考虑性能,请使用numpy!

这是完整的笔记本,是github上的要点,其中包括list,numpy和pandas之间的比较。

https://gist.github.com/denfromufa/2821ff59b02e9482be15d27f2bbd4451

在此处输入图片说明

If you are really looking into performance, then use numpy!

Here is the full notebook as a gist on github with comparison between list, numpy, and pandas.

https://gist.github.com/denfromufa/2821ff59b02e9482be15d27f2bbd4451

enter image description here


回答 8

因为目前的解决方案都无法产生元组,所以我会抛出:

temp3 = tuple(set(temp1) - set(temp2))

或者:

#edited using @Mark Byers idea. If you accept this one as answer, just accept his instead.
temp3 = tuple(x for x in temp1 if x not in set(temp2))

像其他非元组在该方向上产生答案一样,它保留了顺序

i’ll toss in since none of the present solutions yield a tuple:

temp3 = tuple(set(temp1) - set(temp2))

alternatively:

#edited using @Mark Byers idea. If you accept this one as answer, just accept his instead.
temp3 = tuple(x for x in temp1 if x not in set(temp2))

Like the other non-tuple yielding answers in this direction, it preserves order


回答 9

我想要的东西,将采取两个列表,并可以做什么diffbash呢。因为当您搜索“ python diff two list”时该问题首先弹出,并且不是很具体,所以我将发布我提出的内容。

使用SequenceMatherfrom difflib可以像比较两个列表diff。其他答案都不会告诉您差异发生的位置,但是这个答案确实可以。一些答案只能在一个方向上有所不同。一些重新排列元素。有些不处理重复项。但是此解决方案为您提供了两个列表之间的真正区别:

a = 'A quick fox jumps the lazy dog'.split()
b = 'A quick brown mouse jumps over the dog'.split()

from difflib import SequenceMatcher

for tag, i, j, k, l in SequenceMatcher(None, a, b).get_opcodes():
  if tag == 'equal': print('both have', a[i:j])
  if tag in ('delete', 'replace'): print('  1st has', a[i:j])
  if tag in ('insert', 'replace'): print('  2nd has', b[k:l])

输出:

both have ['A', 'quick']
  1st has ['fox']
  2nd has ['brown', 'mouse']
both have ['jumps']
  2nd has ['over']
both have ['the']
  1st has ['lazy']
both have ['dog']

当然,如果您的应用程序做出与其他答案相同的假设,则您将从中受益最大。但是,如果您正在寻找真正的diff功能,那么这是唯一的方法。

例如,其他答案都无法处理:

a = [1,2,3,4,5]
b = [5,4,3,2,1]

但这确实做到了:

  2nd has [5, 4, 3, 2]
both have [1]
  1st has [2, 3, 4, 5]

I wanted something that would take two lists and could do what diff in bash does. Since this question pops up first when you search for “python diff two lists” and is not very specific, I will post what I came up with.

Using SequenceMather from difflib you can compare two lists like diff does. None of the other answers will tell you the position where the difference occurs, but this one does. Some answers give the difference in only one direction. Some reorder the elements. Some don’t handle duplicates. But this solution gives you a true difference between two lists:

a = 'A quick fox jumps the lazy dog'.split()
b = 'A quick brown mouse jumps over the dog'.split()

from difflib import SequenceMatcher

for tag, i, j, k, l in SequenceMatcher(None, a, b).get_opcodes():
  if tag == 'equal': print('both have', a[i:j])
  if tag in ('delete', 'replace'): print('  1st has', a[i:j])
  if tag in ('insert', 'replace'): print('  2nd has', b[k:l])

This outputs:

both have ['A', 'quick']
  1st has ['fox']
  2nd has ['brown', 'mouse']
both have ['jumps']
  2nd has ['over']
both have ['the']
  1st has ['lazy']
both have ['dog']

Of course, if your application makes the same assumptions the other answers make, you will benefit from them the most. But if you are looking for a true diff functionality, then this is the only way to go.

For example, none of the other answers could handle:

a = [1,2,3,4,5]
b = [5,4,3,2,1]

But this one does:

  2nd has [5, 4, 3, 2]
both have [1]
  1st has [2, 3, 4, 5]

回答 10

尝试这个:

temp3 = set(temp1) - set(temp2)

Try this:

temp3 = set(temp1) - set(temp2)

回答 11

这可能比Mark的列表理解速度还要快:

list(itertools.filterfalse(set(temp2).__contains__, temp1))

this could be even faster than Mark’s list comprehension:

list(itertools.filterfalse(set(temp2).__contains__, temp1))

回答 12

这是Counter最简单情况的答案。

这比上面的双向差异短,因为它只完全满足问题的要求:生成第一个列表的列表,而不生成第二个列表。

from collections import Counter

lst1 = ['One', 'Two', 'Three', 'Four']
lst2 = ['One', 'Two']

c1 = Counter(lst1)
c2 = Counter(lst2)
diff = list((c1 - c2).elements())

另外,根据您对可读性的偏好,它可以提供不错的一线:

diff = list((Counter(lst1) - Counter(lst2)).elements())

输出:

['Three', 'Four']

请注意,list(...)如果仅在呼叫上进行迭代,则可以将其删除。

由于此解决方案使用计数器,因此与许多基于集合的答案相比,它可以正确处理数量。例如在此输入上:

lst1 = ['One', 'Two', 'Two', 'Two', 'Three', 'Three', 'Four']
lst2 = ['One', 'Two']

输出为:

['Two', 'Two', 'Three', 'Three', 'Four']

Here’s a Counter answer for the simplest case.

This is shorter than the one above that does two-way diffs because it only does exactly what the question asks: generate a list of what’s in the first list but not the second.

from collections import Counter

lst1 = ['One', 'Two', 'Three', 'Four']
lst2 = ['One', 'Two']

c1 = Counter(lst1)
c2 = Counter(lst2)
diff = list((c1 - c2).elements())

Alternatively, depending on your readability preferences, it makes for a decent one-liner:

diff = list((Counter(lst1) - Counter(lst2)).elements())

Output:

['Three', 'Four']

Note that you can remove the list(...) call if you are just iterating over it.

Because this solution uses counters, it handles quantities properly vs the many set-based answers. For example on this input:

lst1 = ['One', 'Two', 'Two', 'Two', 'Three', 'Three', 'Four']
lst2 = ['One', 'Two']

The output is:

['Two', 'Two', 'Three', 'Three', 'Four']

回答 13

如果对difflist的元素进行排序和设置,则可以使用幼稚的方法。

list1=[1,2,3,4,5]
list2=[1,2,3]

print list1[len(list2):]

或使用本机set方法:

subset=set(list1).difference(list2)

print subset

import timeit
init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print "Naive solution: ", timeit.timeit('temp1[len(temp2):]', init, number = 100000)
print "Native set solution: ", timeit.timeit('set(temp1).difference(temp2)', init, number = 100000)

天真的解决方案:0.0787101593292

本机设置解决方案:0.998837615564

You could use a naive method if the elements of the difflist are sorted and sets.

list1=[1,2,3,4,5]
list2=[1,2,3]

print list1[len(list2):]

or with native set methods:

subset=set(list1).difference(list2)

print subset

import timeit
init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print "Naive solution: ", timeit.timeit('temp1[len(temp2):]', init, number = 100000)
print "Native set solution: ", timeit.timeit('set(temp1).difference(temp2)', init, number = 100000)

Naive solution: 0.0787101593292

Native set solution: 0.998837615564


回答 14

为此我在游戏中为时不晚,但是您可以将上述某些代码的性能与此进行比较,其中两个最快的竞争者是:

list(set(x).symmetric_difference(set(y)))
list(set(x) ^ set(y))

对于基本的编码我深表歉意。

import time
import random
from itertools import filterfalse

# 1 - performance (time taken)
# 2 - correctness (answer - 1,4,5,6)
# set performance
performance = 1
numberoftests = 7

def answer(x,y,z):
    if z == 0:
        start = time.clock()
        lists = (str(list(set(x)-set(y))+list(set(y)-set(y))))
        times = ("1 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 1:
        start = time.clock()
        lists = (str(list(set(x).symmetric_difference(set(y)))))
        times = ("2 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 2:
        start = time.clock()
        lists = (str(list(set(x) ^ set(y))))
        times = ("3 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 3:
        start = time.clock()
        lists = (filterfalse(set(y).__contains__, x))
        times = ("4 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 4:
        start = time.clock()
        lists = (tuple(set(x) - set(y)))
        times = ("5 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 5:
        start = time.clock()
        lists = ([tt for tt in x if tt not in y])
        times = ("6 = " + str(time.clock() - start))
        return (lists,times)

    else:    
        start = time.clock()
        Xarray = [iDa for iDa in x if iDa not in y]
        Yarray = [iDb for iDb in y if iDb not in x]
        lists = (str(Xarray + Yarray))
        times = ("7 = " + str(time.clock() - start))
        return (lists,times)

n = numberoftests

if performance == 2:
    a = [1,2,3,4,5]
    b = [3,2,6]
    for c in range(0,n):
        d = answer(a,b,c)
        print(d[0])

elif performance == 1:
    for tests in range(0,10):
        print("Test Number" + str(tests + 1))
        a = random.sample(range(1, 900000), 9999)
        b = random.sample(range(1, 900000), 9999)
        for c in range(0,n):
            #if c not in (1,4,5,6):
            d = answer(a,b,c)
            print(d[1])

I am little too late in the game for this but you can do a comparison of performance of some of the above mentioned code with this, two of the fastest contenders are,

list(set(x).symmetric_difference(set(y)))
list(set(x) ^ set(y))

I apologize for the elementary level of coding.

import time
import random
from itertools import filterfalse

# 1 - performance (time taken)
# 2 - correctness (answer - 1,4,5,6)
# set performance
performance = 1
numberoftests = 7

def answer(x,y,z):
    if z == 0:
        start = time.clock()
        lists = (str(list(set(x)-set(y))+list(set(y)-set(y))))
        times = ("1 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 1:
        start = time.clock()
        lists = (str(list(set(x).symmetric_difference(set(y)))))
        times = ("2 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 2:
        start = time.clock()
        lists = (str(list(set(x) ^ set(y))))
        times = ("3 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 3:
        start = time.clock()
        lists = (filterfalse(set(y).__contains__, x))
        times = ("4 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 4:
        start = time.clock()
        lists = (tuple(set(x) - set(y)))
        times = ("5 = " + str(time.clock() - start))
        return (lists,times)

    elif z == 5:
        start = time.clock()
        lists = ([tt for tt in x if tt not in y])
        times = ("6 = " + str(time.clock() - start))
        return (lists,times)

    else:    
        start = time.clock()
        Xarray = [iDa for iDa in x if iDa not in y]
        Yarray = [iDb for iDb in y if iDb not in x]
        lists = (str(Xarray + Yarray))
        times = ("7 = " + str(time.clock() - start))
        return (lists,times)

n = numberoftests

if performance == 2:
    a = [1,2,3,4,5]
    b = [3,2,6]
    for c in range(0,n):
        d = answer(a,b,c)
        print(d[0])

elif performance == 1:
    for tests in range(0,10):
        print("Test Number" + str(tests + 1))
        a = random.sample(range(1, 900000), 9999)
        b = random.sample(range(1, 900000), 9999)
        for c in range(0,n):
            #if c not in (1,4,5,6):
            d = answer(a,b,c)
            print(d[1])

回答 15

这是一些比较两个字符串列表的简单的,保留顺序的方法。

一种不寻常的方法,使用pathlib

import pathlib


temp1 = ["One", "Two", "Three", "Four"]
temp2 = ["One", "Two"]

p = pathlib.Path(*temp1)
r = p.relative_to(*temp2)
list(r.parts)
# ['Three', 'Four']

假设两个列表都包含以相同的开头的字符串。有关更多详细信息,请参阅文档。注意,与设置操作相比,它并不是特别快。


使用以下方法的直接实现itertools.zip_longest

import itertools as it


[x for x, y in it.zip_longest(temp1, temp2) if x != y]
# ['Three', 'Four']

Here are a few simple, order-preserving ways of diffing two lists of strings.

Code

An unusual approach using pathlib:

import pathlib


temp1 = ["One", "Two", "Three", "Four"]
temp2 = ["One", "Two"]

p = pathlib.Path(*temp1)
r = p.relative_to(*temp2)
list(r.parts)
# ['Three', 'Four']

This assumes both lists contain strings with equivalent beginnings. See the docs for more details. Note, it is not particularly fast compared to set operations.


A straight-forward implementation using itertools.zip_longest:

import itertools as it


[x for x, y in it.zip_longest(temp1, temp2) if x != y]
# ['Three', 'Four']

回答 16

这是另一个解决方案:

def diff(a, b):
    xa = [i for i in set(a) if i not in b]
    xb = [i for i in set(b) if i not in a]
    return xa + xb

This is another solution:

def diff(a, b):
    xa = [i for i in set(a) if i not in b]
    xb = [i for i in set(b) if i not in a]
    return xa + xb

回答 17

如果遇到问题TypeError: unhashable type: 'list',则需要将列表或集合转换为元组,例如

set(map(tuple, list_of_lists1)).symmetric_difference(set(map(tuple, list_of_lists2)))

另请参阅如何在python中比较列表/集合的列表?

If you run into TypeError: unhashable type: 'list' you need to turn lists or sets into tuples, e.g.

set(map(tuple, list_of_lists1)).symmetric_difference(set(map(tuple, list_of_lists2)))

See also How to compare a list of lists/sets in python?


回答 18

假设我们有两个清单

list1 = [1, 3, 5, 7, 9]
list2 = [1, 2, 3, 4, 5]

从上面的两个列表中可以看出,列表2中存在项目1、3、5,而项目7、9中则不存在。另一方面,列表1中存在项目1、3、5,而项目2、4中不存在。

返回包含项目7、9和2、4的新列表的最佳解决方案是什么?

上面的所有答案都找到了解决方案,现在最最佳的是什么?

def difference(list1, list2):
    new_list = []
    for i in list1:
        if i not in list2:
            new_list.append(i)

    for j in list2:
        if j not in list1:
            new_list.append(j)
    return new_list

def sym_diff(list1, list2):
    return list(set(list1).symmetric_difference(set(list2)))

使用timeit我们可以看到结果

t1 = timeit.Timer("difference(list1, list2)", "from __main__ import difference, 
list1, list2")
t2 = timeit.Timer("sym_diff(list1, list2)", "from __main__ import sym_diff, 
list1, list2")

print('Using two for loops', t1.timeit(number=100000), 'Milliseconds')
print('Using two for loops', t2.timeit(number=100000), 'Milliseconds')

退货

[7, 9, 2, 4]
Using two for loops 0.11572412995155901 Milliseconds
Using symmetric_difference 0.11285737506113946 Milliseconds

Process finished with exit code 0

Let’s say we have two lists

list1 = [1, 3, 5, 7, 9]
list2 = [1, 2, 3, 4, 5]

we can see from the above two lists that items 1, 3, 5 exist in list2 and items 7, 9 do not. On the other hand, items 1, 3, 5 exist in list1 and items 2, 4 do not.

What is the best solution to return a new list containing items 7, 9 and 2, 4?

All answers above find the solution, now whats the most optimal?

def difference(list1, list2):
    new_list = []
    for i in list1:
        if i not in list2:
            new_list.append(i)

    for j in list2:
        if j not in list1:
            new_list.append(j)
    return new_list

versus

def sym_diff(list1, list2):
    return list(set(list1).symmetric_difference(set(list2)))

Using timeit we can see the results

t1 = timeit.Timer("difference(list1, list2)", "from __main__ import difference, 
list1, list2")
t2 = timeit.Timer("sym_diff(list1, list2)", "from __main__ import sym_diff, 
list1, list2")

print('Using two for loops', t1.timeit(number=100000), 'Milliseconds')
print('Using two for loops', t2.timeit(number=100000), 'Milliseconds')

returns

[7, 9, 2, 4]
Using two for loops 0.11572412995155901 Milliseconds
Using symmetric_difference 0.11285737506113946 Milliseconds

Process finished with exit code 0

回答 19

arulmr解决方案的单行版本

def diff(listA, listB):
    return set(listA) - set(listB) | set(listA) -set(listB)

single line version of arulmr solution

def diff(listA, listB):
    return set(listA) - set(listB) | set(listA) -set(listB)

回答 20

如果您想要更像变更集的东西…可以使用Counter

from collections import Counter

def diff(a, b):
  """ more verbose than needs to be, for clarity """
  ca, cb = Counter(a), Counter(b)
  to_add = cb - ca
  to_remove = ca - cb
  changes = Counter(to_add)
  changes.subtract(to_remove)
  return changes

lista = ['one', 'three', 'four', 'four', 'one']
listb = ['one', 'two', 'three']

In [127]: diff(lista, listb)
Out[127]: Counter({'two': 1, 'one': -1, 'four': -2})
# in order to go from lista to list b, you need to add a "two", remove a "one", and remove two "four"s

In [128]: diff(listb, lista)
Out[128]: Counter({'four': 2, 'one': 1, 'two': -1})
# in order to go from listb to lista, you must add two "four"s, add a "one", and remove a "two"

if you want something more like a changeset… could use Counter

from collections import Counter

def diff(a, b):
  """ more verbose than needs to be, for clarity """
  ca, cb = Counter(a), Counter(b)
  to_add = cb - ca
  to_remove = ca - cb
  changes = Counter(to_add)
  changes.subtract(to_remove)
  return changes

lista = ['one', 'three', 'four', 'four', 'one']
listb = ['one', 'two', 'three']

In [127]: diff(lista, listb)
Out[127]: Counter({'two': 1, 'one': -1, 'four': -2})
# in order to go from lista to list b, you need to add a "two", remove a "one", and remove two "four"s

In [128]: diff(listb, lista)
Out[128]: Counter({'four': 2, 'one': 1, 'two': -1})
# in order to go from listb to lista, you must add two "four"s, add a "one", and remove a "two"

回答 21

我们可以计算交集减去列表的并集:

temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two', 'Five']

set(temp1+temp2)-(set(temp1)&set(temp2))

Out: set(['Four', 'Five', 'Three']) 

We can calculate intersection minus union of lists:

temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two', 'Five']

set(temp1+temp2)-(set(temp1)&set(temp2))

Out: set(['Four', 'Five', 'Three']) 

回答 22

只需一行即可解决。给定的问题是两个列表(temp1和temp2)在第三个列表(temp3)中返​​回它们的差。

temp3 = list(set(temp1).difference(set(temp2)))

This can be solved with one line. The question is given two lists (temp1 and temp2) return their difference in a third list (temp3).

temp3 = list(set(temp1).difference(set(temp2)))

回答 23

这是区分两个列表(无论内容如何)的一种简单方法,您可以得到如下所示的结果:

>>> from sets import Set
>>>
>>> l1 = ['xvda', False, 'xvdbb', 12, 'xvdbc']
>>> l2 = ['xvda', 'xvdbb', 'xvdbc', 'xvdbd', None]
>>>
>>> Set(l1).symmetric_difference(Set(l2))
Set([False, 'xvdbd', None, 12])

希望这会有所帮助。

Here is an simple way to distinguish two lists (whatever the contents are), you can get the result as shown below :

>>> from sets import Set
>>>
>>> l1 = ['xvda', False, 'xvdbb', 12, 'xvdbc']
>>> l2 = ['xvda', 'xvdbb', 'xvdbc', 'xvdbd', None]
>>>
>>> Set(l1).symmetric_difference(Set(l2))
Set([False, 'xvdbd', None, 12])

Hope this will helpful.


回答 24

我更喜欢使用转换为集合,然后使用“ difference()”函数。完整的代码是:

temp1 = ['One', 'Two', 'Three', 'Four'  ]                   
temp2 = ['One', 'Two']
set1 = set(temp1)
set2 = set(temp2)
set3 = set1.difference(set2)
temp3 = list(set3)
print(temp3)

输出:

>>>print(temp3)
['Three', 'Four']

这是最容易理解的,如果将来使用大数据,将来会更容易,如果不需要重复数据,将其转换为数据集将删除重复数据。希望能帮助到你 ;-)

I prefer to use converting to sets and then using the “difference()” function. The full code is :

temp1 = ['One', 'Two', 'Three', 'Four'  ]                   
temp2 = ['One', 'Two']
set1 = set(temp1)
set2 = set(temp2)
set3 = set1.difference(set2)
temp3 = list(set3)
print(temp3)

Output:

>>>print(temp3)
['Three', 'Four']

It’s the easiest to undersand, and morover in future if you work with large data, converting it to sets will remove duplicates if duplicates are not required. Hope it helps ;-)


回答 25

(list(set(a)-set(b))+list(set(b)-set(a)))
(list(set(a)-set(b))+list(set(b)-set(a)))

回答 26

def diffList(list1, list2):     # returns the difference between two lists.
    if len(list1) > len(list2):
        return (list(set(list1) - set(list2)))
    else:
        return (list(set(list2) - set(list1)))

例如,如果list1 = [10, 15, 20, 25, 30, 35, 40]list2 = [25, 40, 35]则返回的列表将是output = [10, 20, 30, 15]

def diffList(list1, list2):     # returns the difference between two lists.
    if len(list1) > len(list2):
        return (list(set(list1) - set(list2)))
    else:
        return (list(set(list2) - set(list1)))

e.g. if list1 = [10, 15, 20, 25, 30, 35, 40] and list2 = [25, 40, 35] then the returned list will be output = [10, 20, 30, 15]


如何根据对象的属性对对象列表进行排序?

问题:如何根据对象的属性对对象列表进行排序?

我有一个Python对象列表,我想按对象本身的属性对其进行排序。该列表如下所示:

>>> ut
[<Tag: 128>, <Tag: 2008>, <Tag: <>, <Tag: actionscript>, <Tag: addresses>,
 <Tag: aes>, <Tag: ajax> ...]

每个对象都有一个计数:

>>> ut[1].count
1L

我需要按递减计数对列表进行排序。

我已经看到了几种方法,但是我正在寻找Python的最佳实践。

I’ve got a list of Python objects that I’d like to sort by an attribute of the objects themselves. The list looks like:

>>> ut
[<Tag: 128>, <Tag: 2008>, <Tag: <>, <Tag: actionscript>, <Tag: addresses>,
 <Tag: aes>, <Tag: ajax> ...]

Each object has a count:

>>> ut[1].count
1L

I need to sort the list by number of counts descending.

I’ve seen several methods for this, but I’m looking for best practice in Python.


回答 0

# To sort the list in place...
ut.sort(key=lambda x: x.count, reverse=True)

# To return a new list, use the sorted() built-in function...
newlist = sorted(ut, key=lambda x: x.count, reverse=True)

有关按键排序的更多信息。

# To sort the list in place...
ut.sort(key=lambda x: x.count, reverse=True)

# To return a new list, use the sorted() built-in function...
newlist = sorted(ut, key=lambda x: x.count, reverse=True)

More on sorting by keys.


回答 1

可以使用最快的方法,尤其是在您的列表中有很多记录的情况下operator.attrgetter("count")。但是,它可以在预操作者版本的Python上运行,因此具有后备机制会很好。然后,您可能需要执行以下操作:

try: import operator
except ImportError: keyfun= lambda x: x.count # use a lambda if no operator module
else: keyfun= operator.attrgetter("count") # use operator since it's faster than lambda

ut.sort(key=keyfun, reverse=True) # sort in-place

A way that can be fastest, especially if your list has a lot of records, is to use operator.attrgetter("count"). However, this might run on an pre-operator version of Python, so it would be nice to have a fallback mechanism. You might want to do the following, then:

try: import operator
except ImportError: keyfun= lambda x: x.count # use a lambda if no operator module
else: keyfun= operator.attrgetter("count") # use operator since it's faster than lambda

ut.sort(key=keyfun, reverse=True) # sort in-place

回答 2

读者应注意,key =方法:

ut.sort(key=lambda x: x.count, reverse=True)

比向对象添加丰富的比较运算符快许多倍。我很惊讶地阅读了这篇文章(“ Python in a Nutshell”的第485页)。您可以通过在这个小程序上运行测试来确认这一点:

#!/usr/bin/env python
import random

class C:
    def __init__(self,count):
        self.count = count

    def __cmp__(self,other):
        return cmp(self.count,other.count)

longList = [C(random.random()) for i in xrange(1000000)] #about 6.1 secs
longList2 = longList[:]

longList.sort() #about 52 - 6.1 = 46 secs
longList2.sort(key = lambda c: c.count) #about 9 - 6.1 = 3 secs

我的非常少的测试表明,第一种方法的运行速度要慢10倍以上,但书中说,一般而言,它仅慢5倍左右。他们说的原因是由于python(timsort)中使用了高度优化的排序算法。

仍然,.sort(lambda)比普通的旧.sort()快是很奇怪的。我希望他们能解决这个问题。

Readers should notice that the key= method:

ut.sort(key=lambda x: x.count, reverse=True)

is many times faster than adding rich comparison operators to the objects. I was surprised to read this (page 485 of “Python in a Nutshell”). You can confirm this by running tests on this little program:

#!/usr/bin/env python
import random

class C:
    def __init__(self,count):
        self.count = count

    def __cmp__(self,other):
        return cmp(self.count,other.count)

longList = [C(random.random()) for i in xrange(1000000)] #about 6.1 secs
longList2 = longList[:]

longList.sort() #about 52 - 6.1 = 46 secs
longList2.sort(key = lambda c: c.count) #about 9 - 6.1 = 3 secs

My, very minimal, tests show the first sort is more than 10 times slower, but the book says it is only about 5 times slower in general. The reason they say is due to the highly optimizes sort algorithm used in python (timsort).

Still, its very odd that .sort(lambda) is faster than plain old .sort(). I hope they fix that.


回答 3

面向对象的方法

最好将对象排序逻辑(如果适用)设置为类的属性,而不是在每个实例中都要求进行排序。

这样可以确保一致性,并且不需要样板代码。

至少,您应该指定__eq____lt__操作此功能。然后使用sorted(list_of_objects)

class Card(object):

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit

    def __lt__(self, other):
        return self.rank < other.rank

hand = [Card(10, 'H'), Card(2, 'h'), Card(12, 'h'), Card(13, 'h'), Card(14, 'h')]
hand_order = [c.rank for c in hand]  # [10, 2, 12, 13, 14]

hand_sorted = sorted(hand)
hand_sorted_order = [c.rank for c in hand_sorted]  # [2, 10, 12, 13, 14]

Object-oriented approach

It’s good practice to make object sorting logic, if applicable, a property of the class rather than incorporated in each instance the ordering is required.

This ensures consistency and removes the need for boilerplate code.

At a minimum, you should specify __eq__ and __lt__ operations for this to work. Then just use sorted(list_of_objects).

class Card(object):

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __eq__(self, other):
        return self.rank == other.rank and self.suit == other.suit

    def __lt__(self, other):
        return self.rank < other.rank

hand = [Card(10, 'H'), Card(2, 'h'), Card(12, 'h'), Card(13, 'h'), Card(14, 'h')]
hand_order = [c.rank for c in hand]  # [10, 2, 12, 13, 14]

hand_sorted = sorted(hand)
hand_sorted_order = [c.rank for c in hand_sorted]  # [2, 10, 12, 13, 14]

回答 4

from operator import attrgetter
ut.sort(key = attrgetter('count'), reverse = True)
from operator import attrgetter
ut.sort(key = attrgetter('count'), reverse = True)

回答 5

它看起来很像Django ORM模型实例的列表。

为什么不对这样的查询进行排序:

ut = Tag.objects.order_by('-count')

It looks much like a list of Django ORM model instances.

Why not sort them on query like this:

ut = Tag.objects.order_by('-count')

回答 6

将丰富的比较运算符添加到对象类,然后使用列表的sort()方法。
参见python中的丰富比较


更新:尽管此方法可行,但我认为Triptych的解决方案更简单,因此更适合您的情况。

Add rich comparison operators to the object class, then use sort() method of the list.
See rich comparison in python.


Update: Although this method would work, I think solution from Triptych is better suited to your case because way simpler.


回答 7

如果要排序的属性property,则可以避免导入,operator.attrgetter而可以使用属性的fget方法。

例如,对于Circle具有属性的类,radius我们可以circles按如下所示对半径列表进行排序:

result = sorted(circles, key=Circle.radius.fget)

这不是最知名的功能,但通常使我免于导入的麻烦。

If the attribute you want to sort by is a property, then you can avoid importing operator.attrgetter and use the property’s fget method instead.

For example, for a class Circle with a property radius we could sort a list of circles by radii as follows:

result = sorted(circles, key=Circle.radius.fget)

This is not the most well-known feature but often saves me a line with the import.


如何在Python中将字典键作为列表返回?

问题:如何在Python中将字典键作为列表返回?

Python 2.7中,我可以将字典作为列表获取:

>>> newdict = {1:0, 2:0, 3:0}
>>> newdict.keys()
[1, 2, 3]

现在,在Python> = 3.3中,我得到如下信息:

>>> newdict.keys()
dict_keys([1, 2, 3])

因此,我必须这样做以获得列表:

newlist = list()
for i in newdict.keys():
    newlist.append(i)

我想知道,是否有更好的方法在Python 3中返回列表?

In Python 2.7, I could get dictionary keys, values, or items as a list:

>>> newdict = {1:0, 2:0, 3:0}
>>> newdict.keys()
[1, 2, 3]

Now, in Python >= 3.3, I get something like this:

>>> newdict.keys()
dict_keys([1, 2, 3])

So, I have to do this to get a list:

newlist = list()
for i in newdict.keys():
    newlist.append(i)

I’m wondering, is there a better way to return a list in Python 3?


回答 0

尝试list(newdict.keys())

这会将dict_keys对象转换为列表。

另一方面,您应该问自己是否重要。Python的编码方式是假设鸭子输入(如果它看起来像鸭子,而像鸭子一样嘎嘎叫,那就是鸭子)。在dict_keys大多数情况下,该对象的作用类似于列表。例如:

for key in newdict.keys():
  print(key)

显然,插入运算符可能不起作用,但是对于字典关键字列表而言,这并没有多大意义。

Try list(newdict.keys()).

This will convert the dict_keys object to a list.

On the other hand, you should ask yourself whether or not it matters. The Pythonic way to code is to assume duck typing (if it looks like a duck and it quacks like a duck, it’s a duck). The dict_keys object will act like a list for most purposes. For instance:

for key in newdict.keys():
  print(key)

Obviously, insertion operators may not work, but that doesn’t make much sense for a list of dictionary keys anyway.


回答 1

Python> = 3.5替代方法:解压缩为列表文字 [*newdict]

Python 3.5引入了新的拆包概括(PEP 448),使您现在可以轻松进行以下操作:

>>> newdict = {1:0, 2:0, 3:0}
>>> [*newdict]
[1, 2, 3]

与解压缩的对象可*任何可迭代的对象一起使用,并且由于字典在迭代过程中会返回其键,因此您可以在列表文字中使用它轻松创建列表。

添加.keys()ie [*newdict.keys()]可能有助于使您的意图更加明确,尽管这将花费您函数查找和调用的费用。(实际上,这不是您真正应该担心的事情)。

*iterable语法类似于做list(iterable)其行为最初记录在呼叫部分 Python的参考手册。对于PEP 448,放宽了对*iterable可能出现的位置的限制,使其也可以放置在列表,集合和元组文字中,“ 表达式”列表上的参考手册也进行了更新以说明这一点。


尽管这等效于list(newdict)它更快(至少对于小型词典而言),因为实际上没有执行任何函数调用:

%timeit [*newdict]
1000000 loops, best of 3: 249 ns per loop

%timeit list(newdict)
1000000 loops, best of 3: 508 ns per loop

%timeit [k for k in newdict]
1000000 loops, best of 3: 574 ns per loop

对于较大的字典,速度几乎是相同的(遍历大量集合的开销胜过了函数调用的小开销)。


您可以用类似的方式创建元组和字典键集:

>>> *newdict,
(1, 2, 3)
>>> {*newdict}
{1, 2, 3}

在元组的情况下要小心尾随逗号!

Python >= 3.5 alternative: unpack into a list literal [*newdict]

New unpacking generalizations (PEP 448) were introduced with Python 3.5 allowing you to now easily do:

>>> newdict = {1:0, 2:0, 3:0}
>>> [*newdict]
[1, 2, 3]

Unpacking with * works with any object that is iterable and, since dictionaries return their keys when iterated through, you can easily create a list by using it within a list literal.

Adding .keys() i.e [*newdict.keys()] might help in making your intent a bit more explicit though it will cost you a function look-up and invocation. (which, in all honesty, isn’t something you should really be worried about).

The *iterable syntax is similar to doing list(iterable) and its behaviour was initially documented in the Calls section of the Python Reference manual. With PEP 448 the restriction on where *iterable could appear was loosened allowing it to also be placed in list, set and tuple literals, the reference manual on Expression lists was also updated to state this.


Though equivalent to list(newdict) with the difference that it’s faster (at least for small dictionaries) because no function call is actually performed:

%timeit [*newdict]
1000000 loops, best of 3: 249 ns per loop

%timeit list(newdict)
1000000 loops, best of 3: 508 ns per loop

%timeit [k for k in newdict]
1000000 loops, best of 3: 574 ns per loop

with larger dictionaries the speed is pretty much the same (the overhead of iterating through a large collection trumps the small cost of a function call).


In a similar fashion, you can create tuples and sets of dictionary keys:

>>> *newdict,
(1, 2, 3)
>>> {*newdict}
{1, 2, 3}

beware of the trailing comma in the tuple case!


回答 2

list(newdict)在Python 2和Python 3中均可使用,在中提供了键的简单列表newdictkeys()没必要 (:

list(newdict) works in both Python 2 and Python 3, providing a simple list of the keys in newdict. keys() isn’t necessary. (:


回答 3

在“鸭子类型”定义上有一点点偏离- dict.keys()返回一个可迭代的对象,而不是类似列表的对象。它可以在任何可迭代的地方都可以使用-列表不能在任何地方使用。列表也是可迭代的,但可迭代的不是列表(或序列…)

在实际的用例中,与字典中的键有关的最常见的事情是遍历它们,因此这很有意义。如果确实需要它们作为清单,则可以调用list()

非常相似zip()-在大多数情况下,它会被迭代-为什么创建一个新的元组列表只是为了对其进行迭代,然后又将其丢弃?

这是python中使用更多迭代器(和生成器),而不是到处都是列表副本的一种大趋势的一部分。

dict.keys() 不过,应该可以理解-仔细检查是否有错别字或其他内容…对我来说效果很好:

>>> d = dict(zip(['Sounder V Depth, F', 'Vessel Latitude, Degrees-Minutes'], [None, None]))
>>> [key.split(", ") for key in d.keys()]
[['Sounder V Depth', 'F'], ['Vessel Latitude', 'Degrees-Minutes']]

A bit off on the “duck typing” definition — dict.keys() returns an iterable object, not a list-like object. It will work anywhere an iterable will work — not any place a list will. a list is also an iterable, but an iterable is NOT a list (or sequence…)

In real use-cases, the most common thing to do with the keys in a dict is to iterate through them, so this makes sense. And if you do need them as a list you can call list().

Very similarly for zip() — in the vast majority of cases, it is iterated through — why create an entire new list of tuples just to iterate through it and then throw it away again?

This is part of a large trend in python to use more iterators (and generators), rather than copies of lists all over the place.

dict.keys() should work with comprehensions, though — check carefully for typos or something… it works fine for me:

>>> d = dict(zip(['Sounder V Depth, F', 'Vessel Latitude, Degrees-Minutes'], [None, None]))
>>> [key.split(", ") for key in d.keys()]
[['Sounder V Depth', 'F'], ['Vessel Latitude', 'Degrees-Minutes']]

回答 4

您还可以使用列表推导

>>> newdict = {1:0, 2:0, 3:0}
>>> [k  for  k in  newdict.keys()]
[1, 2, 3]

或更短一点

>>> [k  for  k in  newdict]
[1, 2, 3]

注意:在3.7版以下的版本中,不能保证订购(订购仍然只是CPython 3.6的实现细节)。

You can also use a list comprehension:

>>> newdict = {1:0, 2:0, 3:0}
>>> [k  for  k in  newdict.keys()]
[1, 2, 3]

Or, shorter,

>>> [k  for  k in  newdict]
[1, 2, 3]

Note: Order is not guaranteed on versions under 3.7 (ordering is still only an implementation detail with CPython 3.6).


回答 5

不使用该keys方法转换为列表使其更具可读性:

list(newdict)

并且,当遍历字典时,不需要keys()

for key in newdict:
    print key

除非您要在循环中进行修改,否则将需要预先创建的键列表:

for key in list(newdict):
    del newdict[key]

在Python 2上,使用会产生少量性能提升keys()

Converting to a list without using the keys method makes it more readable:

list(newdict)

and, when looping through dictionaries, there’s no need for keys():

for key in newdict:
    print key

unless you are modifying it within the loop which would require a list of keys created beforehand:

for key in list(newdict):
    del newdict[key]

On Python 2 there is a marginal performance gain using keys().


回答 6

如果您需要单独存储密钥,那么此解决方案使用扩展的可迭代拆包(python3.x +),与迄今为止提供的所有其他解决方案相比,它的键入次数更少。

newdict = {1: 0, 2: 0, 3: 0}
*k, = newdict

k
# [1, 2, 3]

            ╒═══════════════╤═════════════════════════════════════════╕
             k = list(d)      9 characters (excluding whitespace)   
            ├───────────────┼─────────────────────────────────────────┤
             k = [*d]         6 characters                          
            ├───────────────┼─────────────────────────────────────────┤
             *k, = d          5 characters                          
            ╘═══════════════╧═════════════════════════════════════════╛

If you need to store the keys separately, here’s a solution that requires less typing than every other solution presented thus far, using Extended Iterable Unpacking (python3.x+).

newdict = {1: 0, 2: 0, 3: 0}
*k, = newdict

k
# [1, 2, 3]

            ╒═══════════════╤═════════════════════════════════════════╕
            │ k = list(d)   │   9 characters (excluding whitespace)   │
            ├───────────────┼─────────────────────────────────────────┤
            │ k = [*d]      │   6 characters                          │
            ├───────────────┼─────────────────────────────────────────┤
            │ *k, = d       │   5 characters                          │
            ╘═══════════════╧═════════════════════════════════════════╛

回答 7

我可以想到两种从字典中提取键的方法。

方法1:- 使用.keys()方法获取密钥,然后将其转换为列表。

some_dict = {1: 'one', 2: 'two', 3: 'three'}
list_of_keys = list(some_dict.keys())
print(list_of_keys)
-->[1,2,3]

方法2:- 创建一个空列表,然后通过循环将键附加到列表中。您也可以通过此循环获取值(仅将.keys()用于键,将.items()用于键和值提取)

list_of_keys = []
list_of_values = []
for key,val in some_dict.items():
    list_of_keys.append(key)
    list_of_values.append(val)

print(list_of_keys)
-->[1,2,3]

print(list_of_values)
-->['one','two','three']

I can think of 2 ways in which we can extract the keys from the dictionary.

Method 1: – To get the keys using .keys() method and then convert it to list.

some_dict = {1: 'one', 2: 'two', 3: 'three'}
list_of_keys = list(some_dict.keys())
print(list_of_keys)
-->[1,2,3]

Method 2: – To create an empty list and then append keys to the list via a loop. You can get the values with this loop as well (use .keys() for just keys and .items() for both keys and values extraction)

list_of_keys = []
list_of_values = []
for key,val in some_dict.items():
    list_of_keys.append(key)
    list_of_values.append(val)

print(list_of_keys)
-->[1,2,3]

print(list_of_values)
-->['one','two','three']

如何从列表中删除第一个项目?

问题:如何从列表中删除第一个项目?

我有表[0, 1, 2, 3, 4],我想将它做成[1, 2, 3, 4]。我该怎么办?

I have the list [0, 1, 2, 3, 4] I’d like to make it into [1, 2, 3, 4]. How do I go about this?


回答 0

Python清单

list.pop(索引)

>>> l = ['a', 'b', 'c', 'd']
>>> l.pop(0)
'a'
>>> l
['b', 'c', 'd']
>>> 

删除列表[索引]

>>> l = ['a', 'b', 'c', 'd']
>>> del l[0]
>>> l
['b', 'c', 'd']
>>> 

这些都将修改您的原始列表。

其他人建议使用切片:

  • 复制清单
  • 可以返回一个子集

另外,如果要执行许多pop(0),则应查看collections.deque

from collections import deque
>>> l = deque(['a', 'b', 'c', 'd'])
>>> l.popleft()
'a'
>>> l
deque(['b', 'c', 'd'])
  • 从列表的左端提供更高的性能

Python List

list.pop(index)

>>> l = ['a', 'b', 'c', 'd']
>>> l.pop(0)
'a'
>>> l
['b', 'c', 'd']
>>> 

del list[index]

>>> l = ['a', 'b', 'c', 'd']
>>> del l[0]
>>> l
['b', 'c', 'd']
>>> 

These both modify your original list.

Others have suggested using slicing:

  • Copies the list
  • Can return a subset

Also, if you are performing many pop(0), you should look at collections.deque

from collections import deque
>>> l = deque(['a', 'b', 'c', 'd'])
>>> l.popleft()
'a'
>>> l
deque(['b', 'c', 'd'])
  • Provides higher performance popping from left end of the list

回答 1

切片:

x = [0,1,2,3,4]
x = x[1:]

这实际上将返回原始的子集,但不会对其进行修改。

Slicing:

x = [0,1,2,3,4]
x = x[1:]

Which would actually return a subset of the original but not modify it.


回答 2

>>> x = [0, 1, 2, 3, 4]
>>> x.pop(0)
0

更多关于此这里

>>> x = [0, 1, 2, 3, 4]
>>> x.pop(0)
0

More on this here.


回答 3

使用列表切片,请参阅有关列表的Python教程以获取更多详细信息:

>>> l = [0, 1, 2, 3, 4]
>>> l[1:]
[1, 2, 3, 4]

With list slicing, see the Python tutorial about lists for more details:

>>> l = [0, 1, 2, 3, 4]
>>> l[1:]
[1, 2, 3, 4]

回答 4

你会这样做

l = [0, 1, 2, 3, 4]
l.pop(0)

要么 l = l[1:]

利弊

使用pop可以获取值

x = l.pop(0) x0

you would just do this

l = [0, 1, 2, 3, 4]
l.pop(0)

or l = l[1:]

Pros and Cons

Using pop you can retrieve the value

say x = l.pop(0) x would be 0


回答 5

然后将其删除:

x = [0, 1, 2, 3, 4]
del x[0]
print x
# [1, 2, 3, 4]

Then just delete it:

x = [0, 1, 2, 3, 4]
del x[0]
print x
# [1, 2, 3, 4]

回答 6

您可以list.reverse()用来反转列表,然后list.pop()删除最后一个元素,例如:

l = [0, 1, 2, 3, 4]
l.reverse()
print l
[4, 3, 2, 1, 0]


l.pop()
0
l.pop()
1
l.pop()
2
l.pop()
3
l.pop()
4

You can use list.reverse() to reverse the list, then list.pop() to remove the last element, for example:

l = [0, 1, 2, 3, 4]
l.reverse()
print l
[4, 3, 2, 1, 0]


l.pop()
0
l.pop()
1
l.pop()
2
l.pop()
3
l.pop()
4

回答 7

您也可以使用list.remove(a[0])pop删除列表中的第一个元素。

>>>> a=[1,2,3,4,5]
>>>> a.remove(a[0])
>>>> print a
>>>> [2,3,4,5]

You can also use list.remove(a[0]) to pop out the first element in the list.

>>>> a=[1,2,3,4,5]
>>>> a.remove(a[0])
>>>> print a
>>>> [2,3,4,5]

回答 8

如果使用numpy,则需要使用delete方法:

import numpy as np

a = np.array([1, 2, 3, 4, 5])

a = np.delete(a, 0)

print(a) # [2 3 4 5]

If you are working with numpy you need to use the delete method:

import numpy as np

a = np.array([1, 2, 3, 4, 5])

a = np.delete(a, 0)

print(a) # [2 3 4 5]

回答 9

有一个称为“双端队列”或双头队列的数据结构,它比列表更快,更高效。您可以使用列表并将其转换为双端队列,并在其中进行所需的转换。您也可以将双端队列转换回列表。

import collections
mylist = [0, 1, 2, 3, 4]

#make a deque from your list
de = collections.deque(mylist)

#you can remove from a deque from either left side or right side
de.popleft()
print(de)

#you can covert the deque back to list
mylist = list(de)
print(mylist)

Deque还提供了非常有用的功能,例如将元素插入列表的任一侧或任何特定的索引。您也可以旋转或反转双端队列。试试看!!

There is a datastructure called “deque” or double ended queue which is faster and efficient than a list. You can use your list and convert it to deque and do the required transformations in it. You can also convert the deque back to list.

import collections
mylist = [0, 1, 2, 3, 4]

#make a deque from your list
de = collections.deque(mylist)

#you can remove from a deque from either left side or right side
de.popleft()
print(de)

#you can covert the deque back to list
mylist = list(de)
print(mylist)

Deque also provides very useful functions like inserting elements to either side of the list or to any specific index. You can also rotate or reverse a deque. Give it a try!!