How do I check if a list is empty?

For example, if passed the following:

a = []

How do I check to see if a is empty?


Answer 0

if not a:
  print("List is empty")

Using the implicit booleanness of the empty list is quite pythonic.


Answer 1

The pythonic way to do it is from the PEP 8 style guide (where Yes means “recommended” and No means “not recommended”):

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

Yes: if not seq:
     if seq:

No:  if len(seq):
     if not len(seq):

Answer 2

I prefer it explicitly:

if len(li) == 0:
    print('the list is empty')

This way it’s 100% clear that li is a sequence (list) and we want to test its size. My problem with if not li: ... is that it gives the false impression that li is a boolean variable.


Answer 3

This is the first Google hit for “python test empty array” and similar queries, and other people seem to be generalizing the question beyond just lists, so I thought I’d add a caveat for a different type of sequence that a lot of people might use.

Other methods don’t work for NumPy arrays

You need to be careful with NumPy arrays, because other methods that work fine for lists or other standard containers fail for NumPy arrays. I explain why below, but in short, the preferred method is to use size.

The “pythonic” way doesn’t work: Part 1

The “pythonic” way fails with NumPy arrays because NumPy tries to cast the array to an array of bools, and if x tries to evaluate all of those bools at once for some kind of aggregate truth value. But this doesn’t make any sense, so you get a ValueError:

>>> x = numpy.array([0,1])
>>> if x: print("x")
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The “pythonic” way doesn’t work: Part 2

But at least the case above tells you that it failed. If you happen to have a NumPy array with exactly one element, the if statement will “work”, in the sense that you don’t get an error. However, if that one element happens to be 0 (or 0.0, or False, …), the if statement will incorrectly result in False:

>>> x = numpy.array([0,])
>>> if x: print("x")
... else: print("No x")
No x

But clearly x exists and is not empty! This result is not what you wanted.

Using len can give unexpected results

For example,

len( numpy.zeros((1,0)) )

returns 1, even though the array has zero elements.

The numpythonic way

As explained in the SciPy FAQ, the correct method in all cases where you know you have a NumPy array is to use if x.size:

>>> x = numpy.array([0,1])
>>> if x.size: print("x")
x

>>> x = numpy.array([0,])
>>> if x.size: print("x")
... else: print("No x")
x

>>> x = numpy.zeros((1,0))
>>> if x.size: print("x")
... else: print("No x")
No x

If you’re not sure whether it might be a list, a NumPy array, or something else, you could combine this approach with the answer @dubiousjim gives to make sure the right test is used for each type. Not very “pythonic”, but it turns out that NumPy intentionally broke pythonicity in at least this sense.
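
A minimal sketch of how the two checks might be combined, assuming array-like inputs expose an integer size attribute the way NumPy arrays do. The helper name is_empty and the stand-in class FakeArray are hypothetical; FakeArray is only there so the example runs without NumPy installed:

```python
def is_empty(x):
    """Return True if x holds no elements (hypothetical helper)."""
    size = getattr(x, "size", None)
    if isinstance(size, int):
        # Array-like object: trust its element count, not its truth value.
        return size == 0
    # Standard containers: use the length, not the truthiness of elements.
    return len(x) == 0


class FakeArray:
    """Stand-in for an array type; mimics only the `size` attribute."""
    def __init__(self, size):
        self.size = size


print(is_empty([]))            # True
print(is_empty([0]))           # False
print(is_empty(FakeArray(0)))  # True
print(is_empty(FakeArray(2)))  # False
```

Values with neither a size attribute nor a length (None, for example) raise a TypeError here, which may or may not be what you want.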

If you need to do more than just check if the input is empty, and you’re using other NumPy features like indexing or math operations, it’s probably more efficient (and certainly more common) to force the input to be a NumPy array. There are a few nice functions for doing this quickly — most importantly numpy.asarray. This takes your input, does nothing if it’s already an array, or wraps your input into an array if it’s a list, tuple, etc., and optionally converts it to your chosen dtype. So it’s very quick whenever it can be, and it ensures that you just get to assume the input is a NumPy array. We usually even just use the same name, as the conversion to an array won’t make it back outside of the current scope:

x = numpy.asarray(x, dtype=numpy.double)

This will make the x.size check work in all cases I see on this page.


Answer 4

Best way to check if a list is empty

For example, if passed the following:

a = []

How do I check to see if a is empty?

Short Answer:

Place the list in a boolean context (for example, with an if or while statement). It will test False if it is empty, and True otherwise. For example:

if not a:                           # do this!
    print('a is an empty list')

PEP 8

PEP 8, the official Python style guide for Python code in Python’s standard library, asserts:

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

Yes: if not seq:
     if seq:

No: if len(seq):
    if not len(seq):

We should expect that standard library code should be as performant and correct as possible. But why is that the case, and why do we need this guidance?

Explanation

I frequently see code like this from experienced programmers new to Python:

if len(a) == 0:                     # Don't do this!
    print('a is an empty list')

And users of lazy languages may be tempted to do this:

if a == []:                         # Don't do this!
    print('a is an empty list')

These are correct in their respective other languages. And this is even semantically correct in Python.

But we consider it un-Pythonic because Python supports these semantics directly in the list object’s interface via boolean coercion.

From the docs (and note specifically the inclusion of the empty list, []):

By default, an object is considered true unless its class defines either a __bool__() method that returns False or a __len__() method that returns zero, when called with the object. Here are most of the built-in objects considered false:

  • constants defined to be false: None and False.
  • zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
  • empty sequences and collections: '', (), [], {}, set(), range(0)

And the datamodel documentation:

object.__bool__(self)

Called to implement truth value testing and the built-in operation bool(); should return False or True. When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.

and

object.__len__(self)

Called to implement the built-in function len(). Should return the length of the object, an integer >= 0. Also, an object that doesn’t define a __bool__() method and whose __len__() method returns zero is considered to be false in a Boolean context.
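
The two data-model rules quoted above can be seen in a short sketch. The class names Basket and AlwaysOpen are hypothetical, made up purely for illustration:

```python
class Basket:
    """Defines only __len__, so truthiness follows the length."""
    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class AlwaysOpen(Basket):
    """Also defines __bool__, which takes precedence over __len__."""
    def __bool__(self):
        return True


print(bool(Basket()))      # False: __len__() returns 0
print(bool(Basket([1])))   # True
print(bool(AlwaysOpen()))  # True: __bool__() wins even though len() is 0
```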

So instead of this:

if len(a) == 0:                     # Don't do this!
    print('a is an empty list')

or this:

if a == []:                     # Don't do this!
    print('a is an empty list')

Do this:

if not a:
    print('a is an empty list')

Doing what’s Pythonic usually pays off in performance:

Does it pay off? (Note that less time to perform an equivalent operation is better:)

>>> import timeit
>>> min(timeit.repeat(lambda: len([]) == 0, repeat=100))
0.13775854044661884
>>> min(timeit.repeat(lambda: [] == [], repeat=100))
0.0984637276455409
>>> min(timeit.repeat(lambda: not [], repeat=100))
0.07878462291455435

For scale, here’s the cost of calling the function and constructing and returning an empty list, which you might subtract from the costs of the emptiness checks used above:

>>> min(timeit.repeat(lambda: [], repeat=100))
0.07074015751817342

We see that both checking the length with the builtin function len against 0 and comparing against an empty list are much less performant than using the builtin syntax of the language as documented.
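
If you want to repeat the measurement yourself, something like the following sketch should do. Unlike the session above, it times the checks against an existing list a instead of building a new list inside each lambda; the number and repeat values are arbitrary choices, and the absolute timings will differ between machines and Python versions:

```python
import timeit

a = []  # the list under test

checks = {
    "len(a) == 0": lambda: len(a) == 0,
    "a == []": lambda: a == [],
    "not a": lambda: not a,
}

# Take the minimum of several repeats to reduce noise from other processes.
results = {
    label: min(timeit.repeat(fn, number=100_000, repeat=5))
    for label, fn in checks.items()
}

for label, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{label:12s} {seconds:.4f} s per 100,000 calls")
```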

Why?

For the len(a) == 0 check:

First Python has to check the globals to see if len is shadowed.

Then it must call the function, load 0, and do the equality comparison in Python (instead of with C):

>>> import dis
>>> dis.dis(lambda: len([]) == 0)
  1           0 LOAD_GLOBAL              0 (len)
              2 BUILD_LIST               0
              4 CALL_FUNCTION            1
              6 LOAD_CONST               1 (0)
              8 COMPARE_OP               2 (==)
             10 RETURN_VALUE

And for [] == [], it has to build an unnecessary list and then, again, do the comparison operation in Python’s virtual machine (as opposed to C):

>>> dis.dis(lambda: [] == [])
  1           0 BUILD_LIST               0
              2 BUILD_LIST               0
              4 COMPARE_OP               2 (==)
              6 RETURN_VALUE

The “Pythonic” way is a much simpler and faster check since the length of the list is cached in the object instance header:

>>> dis.dis(lambda: not [])
  1           0 BUILD_LIST               0
              2 UNARY_NOT
              4 RETURN_VALUE
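
The exact opcodes and offsets printed by dis change between CPython versions, so your output may not match the listings above. A small sketch that checks only the point being made here, namely that the len() version needs a global name lookup and not [] does not (opnames is a hypothetical helper, and the expected values assume CPython):

```python
import dis

def opnames(fn):
    """Return the list of opcode names in fn's bytecode."""
    return [ins.opname for ins in dis.get_instructions(fn)]

len_check = opnames(lambda: len([]) == 0)
not_check = opnames(lambda: not [])

# The len() version must look up the global name `len`; `not []` does not.
print("LOAD_GLOBAL" in len_check)  # True
print("LOAD_GLOBAL" in not_check)  # False
```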

Evidence from the C source and documentation

PyVarObject

This is an extension of PyObject that adds the ob_size field. This is only used for objects that have some notion of length. This type does not often appear in the Python/C API. It corresponds to the fields defined by the expansion of the PyObject_VAR_HEAD macro.

From the C source in Include/listobject.h:

typedef struct {
    PyObject_VAR_HEAD
    /* Vector of pointers to list elements.  list[0] is ob_item[0], etc. */
    PyObject **ob_item;

    /* ob_item contains space for 'allocated' elements.  The number
     * currently in use is ob_size.
     * Invariants:
     *     0 <= ob_size <= allocated
     *     len(list) == ob_size

Response to comments:

I would point out that this is also true for the non-empty case, though it’s pretty ugly: with l = [], %timeit len(l) != 0 gives 90.6 ns ± 8.3 ns, %timeit l != [] gives 55.6 ns ± 3.09 ns, and %timeit not not l gives 38.5 ns ± 0.372 ns. But there is no way anyone is going to enjoy not not l despite triple the speed. It looks ridiculous. But the speed wins out.
I suppose the problem is testing with timeit, since just if l: is sufficient, but surprisingly %timeit bool(l) yields 101 ns ± 2.64 ns. Interesting that there is no way to coerce to bool without this penalty. %timeit l is useless since no conversion would occur.

IPython magic, %timeit, is not entirely useless here:

In [1]: l = []                                                                  

In [2]: %timeit l                                                               
20 ns ± 0.155 ns per loop (mean ± std. dev. of 7 runs, 100000000 loops each)

In [3]: %timeit not l                                                           
24.4 ns ± 1.58 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [4]: %timeit not not l                                                       
30.1 ns ± 2.16 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

We can see there’s a bit of linear cost for each additional not here. We want to see the costs, ceteris paribus, that is, all else equal – where all else is minimized as far as possible:

In [5]: %timeit if l: pass                                                      
22.6 ns ± 0.963 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [6]: %timeit if not l: pass                                                  
24.4 ns ± 0.796 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [7]: %timeit if not not l: pass                                              
23.4 ns ± 0.793 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Now let’s look at the case for a non-empty list:

In [8]: l = [1]                                                                 

In [9]: %timeit if l: pass                                                      
23.7 ns ± 1.06 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [10]: %timeit if not l: pass                                                 
23.6 ns ± 1.64 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [11]: %timeit if not not l: pass                                             
26.3 ns ± 1 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

What we can see here is that it makes little difference whether you pass in an actual bool to the condition check or the list itself, and if anything, giving the list, as is, is faster.

Python is written in C; it uses its logic at the C level. Anything you write in Python will be slower. And it will likely be orders of magnitude slower unless you’re using the mechanisms built into Python directly.


Answer 5

An empty list is itself considered false in truth value testing (see the Python documentation):

a = []
if a:
    print("not empty")

@Daren Thomas

EDIT: Another point against testing the empty list as False: What about polymorphism? You shouldn’t depend on a list being a list. It should just quack like a duck. How are you going to get your duckCollection to quack “False” when it has no elements?

Your duckCollection should implement __nonzero__ (__bool__ in Python 3) or __len__ so that if a: works without problems.


Answer 6

Patrick’s (accepted) answer is right: if not a: is the right way to do it. Harley Holcombe’s answer is right that this is in the PEP 8 style guide. But what none of the answers explain is why it’s a good idea to follow the idiom—even if you personally find it’s not explicit enough or confusing to Ruby users or whatever.

Python code, and the Python community, have very strong idioms. Following those idioms makes your code easier to read for anyone experienced in Python. And when you violate those idioms, that’s a strong signal.

It’s true that if not a: doesn’t distinguish empty lists from None, or numeric 0, or empty tuples, or empty user-created collection types, or empty user-created not-quite-collection types, or single-element NumPy arrays acting as scalars with falsey values, etc. And sometimes it’s important to be explicit about that. And in that case, you know what you want to be explicit about, so you can test for exactly that. For example, if not a and a is not None: means “anything falsey except None”, while if len(a) != 0: means “only empty sequences, and anything besides a sequence is an error here”, and so on. Besides testing for exactly what you want to test, this also signals to the reader that this test is important.
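
To make the difference between these tests concrete, here is a small sketch that applies three of them to the same values. The function name describe is hypothetical, and it uses len(a) == 0 as the emptiness form of the length test:

```python
def describe(a):
    """Show how three emptiness tests treat the same value."""
    plain = not a                         # any falsy value passes
    not_none = (not a) and a is not None  # falsy, but None is excluded
    try:
        sized = len(a) == 0               # only objects that have a length
    except TypeError:
        sized = "TypeError"               # e.g. None or 0 have no len()
    return plain, not_none, sized

for value in ([], None, 0, (), [0]):
    print(repr(value), describe(value))
```

Only the empty list and the empty tuple pass all three tests; None and 0 pass the plain test but raise a TypeError under len().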

But when you don’t have anything to be explicit about, anything other than if not a: is misleading the reader. You’re signaling something as important when it isn’t. (You may also be making the code less flexible, or slower, or whatever, but that’s all less important.) And if you habitually mislead the reader like this, then when you do need to make a distinction, it’s going to pass unnoticed because you’ve been “crying wolf” all over your code.


Answer 7

Why check at all?

No one seems to have questioned your need to test the list in the first place. Because you provided no additional context, I can imagine that you may not need to do this check at all, but are unfamiliar with list processing in Python.

I would argue that the most pythonic way is to not check at all, but rather to just process the list. That way it will do the right thing whether empty or full.

a = []

for item in a:
    <do something with item>

<rest of code>

This has the benefit of handling any contents of a, while not requiring a specific check for emptiness. If a is empty, the dependent block will not execute and the interpreter will fall through to the next line.
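
As a runnable illustration of the same idea (the function total_length is a made-up example, not something from the question):

```python
def total_length(words):
    """Sum the lengths of the given strings; an empty list yields 0."""
    total = 0
    for word in words:  # the body simply never runs when words is empty
        total += len(word)
    return total

print(total_length([]))             # 0
print(total_length(["ab", "cde"]))  # 5
```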

If you do actually need to check the array for emptiness, the other answers are sufficient.


Answer 8

len() is an O(1) operation for Python lists, strings, dicts, and sets. Python internally keeps track of the number of elements in these containers.

JavaScript has a similar notion of truthy/falsy.


Answer 9

I had written:

if isinstance(a, (list, some, other, types, i, accept)) and not a:
    do_stuff

which was voted -1. I’m not sure if that’s because readers objected to the strategy or thought the answer wasn’t helpful as presented. I’ll pretend it was the latter, since—whatever counts as “pythonic”—this is the correct strategy. Unless you’ve already ruled out, or are prepared to handle cases where a is, for example, False, you need a test more restrictive than just if not a:. You could use something like this:

if isinstance(a, numpy.ndarray) and not a.size:
    do_stuff
elif isinstance(a, collections.abc.Sized) and not a:
    do_stuff

The first test is in response to @Mike’s answer, above. The third line could also be replaced with:

elif isinstance(a, (list, tuple)) and not a:

if you only want to accept instances of particular types (and their subtypes), or with:

elif isinstance(a, (list, tuple)) and not len(a):

You can get away without the explicit type check, but only if the surrounding context already assures you that a is a value of the types you’re prepared to handle, or if you’re sure that types you’re not prepared to handle are going to raise errors (e.g., a TypeError if you call len on a value for which it’s undefined) that you’re prepared to handle. In general, the “pythonic” conventions seem to go this last way. Squeeze it like a duck and let it raise a DuckError if it doesn’t know how to quack. You still have to think about what type assumptions you’re making, though, and whether the cases you’re not prepared to handle properly really are going to error out in the right places. NumPy arrays are a good example where just blindly relying on len or the boolean typecast may not do precisely what you’re expecting.
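
A minimal sketch of the type-restricted test without the NumPy branch, assuming Python 3, where the Sized abstract base class lives in collections.abc. The helper name is_empty_container is hypothetical:

```python
from collections.abc import Sized

def is_empty_container(a):
    """True only for sized containers with no elements (hypothetical helper)."""
    return isinstance(a, Sized) and len(a) == 0

print(is_empty_container([]))     # True
print(is_empty_container({}))     # True
print(is_empty_container(False))  # False: bool has no __len__
print(is_empty_container(None))   # False
print(is_empty_container([0]))    # False
```

Falsy values that are not containers, such as False, None or 0, are rejected by the isinstance check instead of being mistaken for an empty list.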


Answer 10

From documentation on truth value testing:

All values other than what is listed here are considered True

  • None
  • False
  • zero of any numeric type, for example, 0, 0.0, 0j.
  • any empty sequence, for example, '', (), [].
  • any empty mapping, for example, {}.
  • instances of user-defined classes, if the class defines a __bool__() or __len__() method, when that method returns the integer zero or bool value False.

As can be seen, an empty list [] is falsy, so doing what would be done to a boolean value sounds most efficient:

if not a:
    print('"a" is empty!')
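
The list from the documentation can be checked directly. A short sketch, where the variable names falsy and truthy are just illustrative:

```python
from decimal import Decimal
from fractions import Fraction

falsy = [None, False, 0, 0.0, 0j, Decimal(0), Fraction(0, 1),
         "", (), [], {}, set(), range(0)]
truthy = [[0], [[]], " ", (None,), {"k": None}]

print(all(not v for v in falsy))  # True: every value tests false
print(all(truthy))                # True: non-empty containers test true
```

Note that a container is truthy as soon as it has any element, even when that element is itself falsy, as with [0] or [[]].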

Answer 11

Here are a few ways you can check if a list is empty:

a = [] #the list

1) The pretty simple pythonic way:

if not a:
    print("a is empty")

In Python, empty containers such as lists, tuples, sets, and dicts are treated as False in a boolean context. One can simply use the list as a predicate (yielding a Boolean value): a True value indicates that it is non-empty.

2) A more explicit way: use len() to find the length and check whether it equals 0:

if len(a) == 0:
    print("a is empty")

3) Or comparing it to an anonymous empty list:

if a == []:
    print("a is empty")

4) Yet another, rather silly, way is to use an exception and iter():

try:
    next(iter(a))
    # list has elements
except StopIteration:
    print("Error: a is empty")

Answer 12

I prefer the following:

if a == []:
    print("The list is empty.")

Answer 13

Method 1 (Preferred):

if not a:
    print("Empty")

Method 2:

if len(a) == 0:
    print("Empty")

Method 3:

if a == []:
    print("Empty")

Answer 14

def list_test (L):
    if   L is None  : print('list is None')
    elif not L      : print('list is empty')
    else: print('list has %d elements' % len(L))

list_test(None)
list_test([])
list_test([1,2,3])

It is sometimes good to test for None and for emptiness separately as those are two different states. The code above produces the following output:

list is None 
list is empty 
list has 3 elements

It’s worth noting, though, that None is falsy. So if you don’t want a separate test for None-ness, you don’t have to do that.

def list_test2 (L):
    if not L      : print('list is empty')
    else: print('list has %d elements' % len(L))

list_test2(None)
list_test2([])
list_test2([1,2,3])

produces the expected output:

list is empty
list is empty
list has 3 elements

Answer 15

Many answers have been given, and a lot of them are pretty good. I just wanted to add that the check

not a

will also pass for None and other types of empty structures. If you truly want to check for an empty list, you can do this:

if isinstance(a, list) and len(a) == 0:
    print("Received an empty list")

Answer 16

We could use a simple if/else:

item_list = []
if len(item_list) == 0:
    print("list is empty")
else:
    print("list is not empty")

Answer 17

If you want to check if a list is empty:

l = []
if l:
    # do your stuff.

If you want to check whether all the values in the list are truthy (non-empty). Note, however, that this check is True for an empty list:

l = ["", False, 0, '', [], {}, ()]
if all(bool(x) for x in l):
    # do your stuff.
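
The empty-list caveat comes from how all() works: with no elements to inspect, it is vacuously True. A short sketch showing this alongside any(); the variable name values is illustrative:

```python
values = ["", False, 0, [], {}, ()]

print(all(bool(x) for x in values))   # False: every element is falsy
print(any(values))                    # False: no element is truthy
print(all(bool(x) for x in []))       # True: vacuously true, no elements
print(all(x for x in [1, "a", [0]]))  # True: every element is truthy
```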

If you want to use both cases together:

def empty_list(lst):
    if len(lst) == 0:
        return False
    else:
        return all(bool(x) for x in lst)

Now you can use:

if empty_list(lst):
    # do your stuff.

Answer 18

Inspired by @dubiousjim’s solution, I propose adding a general check of whether the object is iterable:

from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10; on Python 2 use collections.Iterable

def is_empty(a):
    return not a and isinstance(a, Iterable)

Note: a string is considered to be iterable. Add and not isinstance(a, str) (or isinstance(a, (str, unicode)) on Python 2) if you want the empty string to be excluded.

Test:

>>> is_empty('sss')
False
>>> is_empty(555)
False
>>> is_empty(0)
False
>>> is_empty('')
True
>>> is_empty([3])
False
>>> is_empty([])
True
>>> is_empty({})
True
>>> is_empty(())
True

Answer 19

print('not empty' if a else 'empty')

A little more practical:

a.pop() if a else None

And the shortest version:

if a: a.pop() 

Answer 20

From Python 3 onwards you can use

a == []

to check if the list is empty.

EDIT: This works with Python 2.7 too.

I am not sure why there are so many complicated answers. It’s pretty clear and straightforward.
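One caveat, shown as a small sketch: a == [] compares against a list specifically, so other empty containers do not match it, whereas the truthiness test accepts them. The variable names below are illustrative only:

```python
empty_list = []
empty_tuple = ()

print(empty_list == [])    # a list with no elements equals []
print(empty_tuple == [])   # a tuple never compares equal to a list
print(not empty_tuple)     # the truthiness test accepts any empty container
```

Whether that strictness is a benefit or a drawback depends on whether the caller may pass tuples or other sequences.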


Answer 21

You can even try using bool() like this:

    a = [1, 2, 3]
    print(bool(a))  # prints True
    a = []
    print(bool(a))  # prints False

I love this way of checking whether a list is empty or not.

Very handy and useful.


Answer 22

Simply write a helper function such as is_empty():

def is_empty(any_structure):
    if any_structure:
        print('Structure is not empty.')
        return False  # the original returned True here, inverting the result
    else:
        print('Structure is empty.')
        return True

It can be used for any data structure, such as a list, tuple, dictionary and many more. You can then call it as many times as you like with just is_empty(any_structure).


Answer 23

A simple way is to check whether the length equals zero.

if len(a) == 0:
    print("a is empty")

Answer 24

The truth value of an empty list is False whereas for a non-empty list it is True.


Answer 25

What brought me here is a special use-case: I actually wanted a function to tell me if a list is empty or not. I wanted to avoid writing my own function or using a lambda-expression here (because it seemed like it should be simple enough):

foo = itertools.takewhile(is_not_empty, (f(x) for x in itertools.count(1)))

And, of course, there is a very natural way to do it:

foo = itertools.takewhile(bool, (f(x) for x in itertools.count(1)))

Of course, do not use bool in if (i.e., if bool(L):) because it’s implied. But, for the cases when “is not empty” is explicitly needed as a function, bool is the best choice.
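As a runnable sketch of this use case: the function f below is a hypothetical stand-in (the answer does not say what f does), defined so that it returns ever shorter lists and eventually an empty one.

```python
import itertools

def f(x):
    # Hypothetical stand-in for f: a list that shrinks as x grows
    return list(range(3 - x))

# bool acts as the "is not empty" predicate; takewhile stops at the first empty list
foo = list(itertools.takewhile(bool, (f(x) for x in itertools.count(1))))
print(foo)
```

Because takewhile stops at the first falsy result, the infinite itertools.count(1) is never exhausted.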


Answer 26

To check whether a list is empty or not, you can use either of the following two ways. But remember, we should avoid the first, explicit check on the sequence (it’s the less Pythonic way):

def enquiry(list1): 
    if len(list1) == 0: 
        return 0
    else: 
        return 1

# ––––––––––––––––––––––––––––––––

list1 = [] 

if enquiry(list1): 
    print ("The list isn't empty") 
else: 
    print("The list is Empty") 

# Result: "The list is Empty".

The second way is the more Pythonic one. It checks emptiness implicitly and is much preferable to the previous one.

def enquiry(list1): 
    if not list1: 
        return True
    else: 
        return False

# ––––––––––––––––––––––––––––––––

list1 = [] 

if enquiry(list1): 
    print ("The list is Empty") 
else: 
    print ("The list isn't empty") 

# Result: "The list is Empty"

Hope this helps.


Finding the index of an item in a list

Question: Finding the index of an item in a list

Given a list ["foo", "bar", "baz"] and an item in the list "bar", how do I get its index (1) in Python?


Answer 0

>>> ["foo", "bar", "baz"].index("bar")
1

Reference: Data Structures > More on Lists

Caveats follow

Note that while this is perhaps the cleanest way to answer the question as asked, index is a rather weak component of the list API, and I can’t remember the last time I used it in anger. It’s been pointed out to me in the comments that because this answer is heavily referenced, it should be made more complete. Some caveats about list.index follow. It is probably worth initially taking a look at the documentation for it:

list.index(x[, start[, end]])

Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item.

The optional arguments start and end are interpreted as in the slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.

Linear time-complexity in list length

An index call checks every element of the list in order, until it finds a match. If your list is long, and you don’t know roughly where in the list it occurs, this search could become a bottleneck. In that case, you should consider a different data structure. Note that if you know roughly where to find the match, you can give index a hint. For instance, in this snippet, l.index(999_999, 999_990, 1_000_000) is roughly five orders of magnitude faster than straight l.index(999_999), because the former only has to search 10 entries, while the latter searches a million:

>>> import timeit
>>> timeit.timeit('l.index(999_999)', setup='l = list(range(0, 1_000_000))', number=1000)
9.356267921015387
>>> timeit.timeit('l.index(999_999, 999_990, 1_000_000)', setup='l = list(range(0, 1_000_000))', number=1000)
0.0004404920036904514
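As one possible "different data structure" for repeated lookups, here is a minimal sketch that precomputes a dictionary from value to first index. It assumes the items are hashable and that the list is not modified afterwards; the names items and first_index are illustrative, not from the answer.

```python
items = ["foo", "bar", "baz", "bar"]

# Build the mapping once: O(n) up front, then O(1) average time per lookup
first_index = {}
for i, value in enumerate(items):
    first_index.setdefault(value, i)  # keep only the first position seen

print(first_index["bar"])        # same result as items.index("bar")
print(first_index.get("quux"))   # None instead of a ValueError
```

This only pays off when the same list is searched many times; for a single lookup, list.index is simpler.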

Only returns the index of the first match to its argument

A call to index searches through the list in order until it finds a match, and stops there. If you expect to need indices of more matches, you should use a list comprehension, or generator expression.

>>> [1, 1].index(1)
0
>>> [i for i, e in enumerate([1, 2, 1]) if e == 1]
[0, 2]
>>> g = (i for i, e in enumerate([1, 2, 1]) if e == 1)
>>> next(g)
0
>>> next(g)
2

Most places where I once would have used index, I now use a list comprehension or generator expression because they’re more generalizable. So if you’re considering reaching for index, take a look at these excellent Python features.

Throws if element not present in list

A call to index results in a ValueError if the item’s not present.

>>> [1, 1].index(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 2 is not in list

If the item might not be present in the list, you should either

  1. Check for it first with item in my_list (clean, readable approach), or
  2. Wrap the index call in a try/except block which catches ValueError (probably faster, at least when the list to search is long, and the item is usually present.)

Answer 1

One thing that is really helpful in learning Python is to use the interactive help function:

>>> help(["foo", "bar", "baz"])
Help on list object:

class list(object)
 ...

 |
 |  index(...)
 |      L.index(value, [start, [stop]]) -> integer -- return first index of value
 |

which will often lead you to the method you are looking for.


Answer 2

The majority of answers explain how to find a single index, but their methods do not return multiple indexes if the item is in the list multiple times. Use enumerate():

for i, j in enumerate(['foo', 'bar', 'baz']):
    if j == 'bar':
        print(i)

The index() method only returns the first occurrence, while looping over enumerate() lets you collect all occurrences.

As a list comprehension:

[i for i, j in enumerate(['foo', 'bar', 'baz']) if j == 'bar']

Here’s also another small solution with itertools.count() (which is pretty much the same approach as enumerate):

from itertools import count  # Python 2 only: also add "from itertools import izip as zip" for a lazy zip
[i for i, j in zip(count(), ['foo', 'bar', 'baz']) if j == 'bar']

In the Python 2 timings below, this was slightly faster for larger lists than using enumerate():

$ python -m timeit -s "from itertools import izip as zip, count" "[i for i, j in zip(count(), ['foo', 'bar', 'baz']*500) if j == 'bar']"
10000 loops, best of 3: 174 usec per loop
$ python -m timeit "[i for i, j in enumerate(['foo', 'bar', 'baz']*500) if j == 'bar']"
10000 loops, best of 3: 196 usec per loop

Answer 3

To get all indexes:

indexes = [i for i,x in enumerate(xs) if x == 'foo']

Answer 4

index() returns the first index of value!

 |  index(...)
 |      L.index(value, [start, [stop]]) -> integer -- return first index of value

def all_indices(value, qlist):
    indices = []
    idx = -1
    while True:
        try:
            idx = qlist.index(value, idx+1)
            indices.append(idx)
        except ValueError:
            break
    return indices

all_indices("foo", ["foo","bar","baz","foo"])

Answer 5

A problem will arise if the element is not in the list. This function handles the issue:

# if element is found it returns index of element else returns None

def find_element_in_list(element, list_element):
    try:
        index_element = list_element.index(element)
        return index_element
    except ValueError:
        return None

Answer 6

a = ["foo","bar","baz",'bar','any','much']

indexes = [index for index in range(len(a)) if a[index] == 'bar']

Answer 7

You have to add a condition to check whether the element you’re searching for is in the list:

if 'your_element' in mylist:
    print(mylist.index('your_element'))
else:
    print(None)

Answer 8

All of the proposed functions here reproduce inherent language behavior but obscure what’s going on.

[i for i in range(len(mylist)) if mylist[i]==myterm]  # get the indices

[each for each in mylist if each==myterm]             # get the items

mylist.index(myterm) if myterm in mylist else None    # get the first index and fail quietly

Why write a function with exception handling if the language provides the methods to do what you want itself?


Answer 9

If you want all indexes, then you can use NumPy:

import numpy as np

array = [1, 2, 1, 3, 4, 5, 1]
item = 1
np_array = np.array(array)
item_index = np.where(np_array==item)
print(item_index)
# Out: (array([0, 2, 6], dtype=int64),)

It is a clear, readable solution.


Answer 10

Finding the index of an item given a list containing it in Python

For a list ["foo", "bar", "baz"] and an item in the list "bar", what’s the cleanest way to get its index (1) in Python?

Well, sure, there’s the index method, which returns the index of the first occurrence:

>>> l = ["foo", "bar", "baz"]
>>> l.index('bar')
1

There are a couple of issues with this method:

  • if the value isn’t in the list, you’ll get a ValueError
  • if more than one of the value is in the list, you only get the index for the first one

No values

If the value could be missing, you need to catch the ValueError.

You can do so with a reusable definition like this:

def index(a_list, value):
    try:
        return a_list.index(value)
    except ValueError:
        return None

And use it like this:

>>> print(index(l, 'quux'))
None
>>> print(index(l, 'bar'))
1

And the downside of this is that you will probably have a check for if the returned value is or is not None:

result = index(a_list, value)
if result is not None:
    do_something(result)

More than one value in the list

If you could have more occurrences, you’ll not get complete information with list.index:

>>> l.append('bar')
>>> l
['foo', 'bar', 'baz', 'bar']
>>> l.index('bar')              # nothing at index 3?
1

You might enumerate into a list comprehension the indexes:

>>> [index for index, v in enumerate(l) if v == 'bar']
[1, 3]
>>> [index for index, v in enumerate(l) if v == 'boink']
[]

If you have no occurrences, you can check for that with boolean check of the result, or just do nothing if you loop over the results:

indexes = [index for index, v in enumerate(l) if v == 'boink']
for index in indexes:
    do_something(index)

Better data munging with pandas

If you have pandas, you can easily get this information with a Series object:

>>> import pandas as pd
>>> series = pd.Series(l)
>>> series
0    foo
1    bar
2    baz
3    bar
dtype: object

A comparison check will return a series of booleans:

>>> series == 'bar'
0    False
1     True
2    False
3     True
dtype: bool

Pass that series of booleans to the series via subscript notation, and you get just the matching members:

>>> series[series == 'bar']
1    bar
3    bar
dtype: object

If you want just the indexes, the index attribute returns a series of integers:

>>> series[series == 'bar'].index
Int64Index([1, 3], dtype='int64')

And if you want them in a list or tuple, just pass them to the constructor:

>>> list(series[series == 'bar'].index)
[1, 3]

Yes, you could use a list comprehension with enumerate too, but that’s just not as elegant, in my opinion – you’re doing tests for equality in Python, instead of letting builtin code written in C handle it:

>>> [i for i, value in enumerate(l) if value == 'bar']
[1, 3]

Is this an XY problem?

The XY problem is asking about your attempted solution rather than your actual problem.

Why do you think you need the index given an element in a list?

If you already know the value, why do you care where it is in a list?

If the value isn’t there, catching the ValueError is rather verbose – and I prefer to avoid that.

I’m usually iterating over the list anyways, so I’ll usually keep a pointer to any interesting information, getting the index with enumerate.

If you’re munging data, you should probably be using pandas – which has far more elegant tools than the pure Python workarounds I’ve shown.

I do not recall needing list.index, myself. However, I have looked through the Python standard library, and I see some excellent uses for it.

There are many, many uses for it in idlelib, for GUI and text parsing.

The keyword module uses it to find comment markers in the module to automatically regenerate the list of keywords in it via metaprogramming.

In Lib/mailbox.py it seems to be using it like an ordered mapping:

key_list[key_list.index(old)] = new

and

del key_list[key_list.index(key)]

In Lib/http/cookiejar.py, seems to be used to get the next month:

mon = MONTHS_LOWER.index(mon.lower())+1

In Lib/tarfile.py similar to distutils to get a slice up to an item:

members = members[:members.index(tarinfo)]

In Lib/pickletools.py:

numtopop = before.index(markobject)

What these usages seem to have in common is that they seem to operate on lists of constrained sizes (important because of O(n) lookup time for list.index), and they’re mostly used in parsing (and UI in the case of Idle).

While there are use-cases for it, they are fairly uncommon. If you find yourself looking for this answer, ask yourself if what you’re doing is the most direct usage of the tools provided by the language for your use-case.


Answer 11

All indexes with the zip function:

get_indexes = lambda x, xs: [i for (y, i) in zip(xs, range(len(xs))) if x == y]

print(get_indexes(2, [1, 2, 3, 4, 5, 6, 3, 2, 3, 2]))
print(get_indexes('f', 'xsfhhttytffsafweef'))

Answer 12

Getting all the occurrences and the position of one or more (identical) items in a list

With enumerate(alist) you get (n, x) pairs; keep the first element of each pair, n (the index), whenever the element x equals what you are looking for.

>>> alist = ['foo', 'spam', 'egg', 'foo']
>>> foo_indexes = [n for n,x in enumerate(alist) if x=='foo']
>>> foo_indexes
[0, 3]
>>>

Let’s make our function indexlist

This function takes the item and the list as arguments and returns the positions of the item in the list, like we saw before.

def indexlist(item2find, list_or_string):
  "Returns all indexes of an item in a list or a string"
  return [n for n,item in enumerate(list_or_string) if item==item2find]

print(indexlist("1", "010101010"))

Output


[1, 3, 5, 7]

Simple

for n, i in enumerate([1, 2, 3, 4, 1]):
    if i == 1:
        print(n)

Output:

0
4

Answer 13

You can simply go with

a = [['hand', 'head'], ['phone', 'wallet'], ['lost', 'stock']]
b = ['phone', 'lost']

res = [[x[0] for x in a].index(y) for y in b]

Answer 14

Another option

>>> a = ['red', 'blue', 'green', 'red']
>>> b = 'red'
>>> offset = 0;
>>> indices = list()
>>> for i in range(a.count(b)):
...     indices.append(a.index(b,offset))
...     offset = indices[-1]+1
... 
>>> indices
[0, 3]
>>> 

Answer 15

And now, for something completely different…

… like confirming the existence of the item before getting the index. The nice thing about this approach is the function always returns a list of indices — even if it is an empty list. It works with strings as well.

def indices(l, val):
    """Always returns a list containing the indices of val in the_list"""
    retval = []
    last = 0
    while val in l[last:]:
            i = l[last:].index(val)
            retval.append(last + i)
            last += i + 1   
    return retval

l = ['bar','foo','bar','baz','bar','bar']
q = 'bar'
print indices(l,q)
print indices(l,'bat')
print indices('abcdaababb','a')

When pasted into an interactive python window:

Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def indices(the_list, val):
...     """Always returns a list containing the indices of val in the_list"""
...     retval = []
...     last = 0
...     while val in the_list[last:]:
...             i = the_list[last:].index(val)
...             retval.append(last + i)
...             last += i + 1   
...     return retval
... 
>>> l = ['bar','foo','bar','baz','bar','bar']
>>> q = 'bar'
>>> print indices(l,q)
[0, 2, 4, 5]
>>> print indices(l,'bat')
[]
>>> print indices('abcdaababb','a')
[0, 4, 5, 7]
>>> 

Update

After another year of heads-down python development, I’m a bit embarrassed by my original answer, so to set the record straight, one can certainly use the above code; however, the much more idiomatic way to get the same behavior would be to use list comprehension, along with the enumerate() function.

Something like this:

def indices(l, val):
    """Always returns a list containing the indices of val in the_list"""
    return [index for index, value in enumerate(l) if value == val]

l = ['bar','foo','bar','baz','bar','bar']
q = 'bar'
print indices(l,q)
print indices(l,'bat')
print indices('abcdaababb','a')

Which, when pasted into an interactive python window yields:

Python 2.7.14 |Anaconda, Inc.| (default, Dec  7 2017, 11:07:58) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def indices(l, val):
...     """Always returns a list containing the indices of val in the_list"""
...     return [index for index, value in enumerate(l) if value == val]
... 
>>> l = ['bar','foo','bar','baz','bar','bar']
>>> q = 'bar'
>>> print indices(l,q)
[0, 2, 4, 5]
>>> print indices(l,'bat')
[]
>>> print indices('abcdaababb','a')
[0, 4, 5, 7]
>>> 

And now, after reviewing this question and all the answers, I realize that this is exactly what FMc suggested in his earlier answer. At the time I originally answered this question, I didn’t even see that answer, because I didn’t understand it. I hope that my somewhat more verbose example will aid understanding.

If the single line of code above still doesn’t make sense to you, I highly recommend you Google ‘python list comprehension’ and take a few minutes to familiarize yourself. It’s just one of the many powerful features that make it a joy to use Python to develop code.


Answer 16

A variant on the answer from FMc and user7177 will give a dict that can return all indices for any entry:

>>> a = ['foo','bar','baz','bar','any', 'foo', 'much']
>>> l = dict(zip(set(a), map(lambda y: [i for i, z in enumerate(a) if z == y], set(a))))
>>> l['foo']
[0, 5]
>>> l ['much']
[6]
>>> l
{'baz': [2], 'foo': [0, 5], 'bar': [1, 3], 'any': [4], 'much': [6]}
>>> 

You could also use this as a one liner to get all indices for a single entry. There are no guarantees for efficiency, though I did use set(a) to reduce the number of times the lambda is called.
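If efficiency matters, the same dictionary can be built in a single pass over the list. This is only a sketch of an alternative, not the answer's own method; the name positions is illustrative.

```python
from collections import defaultdict

a = ['foo', 'bar', 'baz', 'bar', 'any', 'foo', 'much']

positions = defaultdict(list)
for i, value in enumerate(a):
    positions[value].append(i)  # one pass: each index is appended to its value's list

print(positions['foo'])
print(dict(positions))
```

The one-liner above rescans the whole list once per distinct value, whereas this loop visits each element exactly once.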


Answer 17

This solution is not as powerful as the others, but if you’re a beginner and only know about for loops, it’s still possible to find the first index of an item while avoiding the ValueError:

def find_element(p,t):
    i = 0
    for e in p:
        if e == t:
            return i
        else:
            i +=1
    return -1

Answer 18

Finding index of item x in list L:

idx = L.index(x) if (x in L) else -1
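
Note that the one-liner above may walk the list twice: once for x in L and once for L.index(x). A single-pass alternative could look like the following minimal sketch; the helper name index_or_default is made up for illustration and is not part of the original answer.

```python
def index_or_default(items, target, default=-1):
    """Return the first index of target in items, or default if absent."""
    # next() pulls the first matching index from the generator; its second
    # argument is returned when the generator yields nothing.
    return next((i for i, item in enumerate(items) if item == target), default)


L = ["foo", "bar", "baz", "bar"]
print(index_or_default(L, "bar"))   # 1
print(index_or_default(L, "spam"))  # -1
```

For short lists the difference is negligible; the sketch mainly shows how to avoid the separate membership test.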

Answer 19

Since Python lists are zero-based, we can use the zip built-in function as follows:

>>> [i for i,j in zip(range(len(haystack)), haystack) if j == 'needle' ]

where “haystack” is the list in question and “needle” is the item to look for.

(Note: Here we are iterating using i to get the indexes, but if we need rather to focus on the items we can switch to j.)


Answer 20

name = "bar"
pairs = [["foo", 1], ["bar", 2], ["baz", 3]]
# Collect the first field of every pair so that list.index can search it
names = [item[0] for item in pairs]
print(names)
try:
    location = names.index(name)
except ValueError:
    # list.index raises ValueError when the value is absent
    location = -1
print(location)

This also handles the case where the string is not in the list: location is then -1.


Answer 21

Python’s index() method raises a ValueError if the item is not found. You can instead make it behave like JavaScript’s indexOf() function, which returns -1 if the item is not found:

try:
    index = array.index('search_keyword')
except ValueError:
    index = -1

Answer 22

There is a more functional answer to this.

list(filter(lambda x: x[1]=="bar",enumerate(["foo", "bar", "baz", "bar", "baz", "bar", "a", "b", "c"])))

More generic form, returning only the indices:

def get_index_of(lst, element):
    return list(map(lambda x: x[0],
                    filter(lambda x: x[1] == element, enumerate(lst))))

Answer 23

Let’s call your list lst. You can convert lst to a NumPy array and then use numpy.where to get the index of the chosen item:

import numpy as np

lst = ["foo", "bar", "baz"]  # lst is a plain Python list
print(np.where(np.array(lst) == 'bar')[0][0])

Output:

1

Answer 24

For those coming from another language like me, maybe with a simple loop it’s easier to understand and use it:

mylist = ["foo", "bar", "baz", "bar"]
newlist = enumerate(mylist)
for index, item in newlist:
  if item == "bar":
    print(index, item)

I am thankful for the question “So what exactly does enumerate do?”. That helped me to understand.


Answer 25

If you only need to find an index once, the index method is fine. However, if you are going to search your data more than once, I recommend the bisect module. Keep in mind that bisect requires the data to be sorted, so you sort the data once and can then bisect as often as needed. On my machine, bisect is about 20 times faster than the index method for the example below.

Here is an example of code using Python 3.8 and above syntax:

import bisect
from timeit import timeit

def bisect_search(container, value):
    return (
      index 
      if (index := bisect.bisect_left(container, value)) < len(container) 
      and container[index] == value else -1
    )

data = list(range(1000))
# value to search
value = 666

# times to test
ttt = 1000

t1 = timeit(lambda: data.index(value), number=ttt)
t2 = timeit(lambda: bisect_search(data, value), number=ttt)

print(f"{t1=:.4f}, {t2=:.4f}, diffs {t1/t2=:.2f}")

Output:

t1=0.0400, t2=0.0020, diffs t1/t2=19.60
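
The benchmark above searches data that is already sorted, so the position found by bisect is also the position in the original list. For unsorted data, one possible adaptation is to sort (value, original_index) pairs once and bisect over those. This is only a sketch under the assumption that the values are mutually comparable; build_sorted_index and lookup are illustrative names, not part of the bisect module.

```python
import bisect


def build_sorted_index(items):
    """Sort (value, original_index) pairs once; O(n log n)."""
    return sorted((value, i) for i, value in enumerate(items))


def lookup(sorted_pairs, value):
    """Return the first original index of value, or -1; O(log n)."""
    # The 1-tuple (value,) sorts before every (value, i), so bisect_left
    # lands on the first pair whose value is >= the one searched for.
    pos = bisect.bisect_left(sorted_pairs, (value,))
    if pos < len(sorted_pairs) and sorted_pairs[pos][0] == value:
        return sorted_pairs[pos][1]
    return -1


words = ["foo", "bar", "baz", "bar"]
pairs = build_sorted_index(words)
print(lookup(pairs, "bar"))   # 1
print(lookup(pairs, "spam"))  # -1
```

The one-off sort only pays for itself when many lookups follow; for a single search, list.index remains the simpler choice.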

Answer 26

If performance is of concern:

As numerous answers mention, the built-in list.index(item) method is an O(n) algorithm. That is fine if you need to perform the lookup once. But if you need to look up the indices of elements many times, it makes more sense to first build a dictionary of item-index pairs (O(n), done once) and then get each index in O(1) whenever you need it.

If you are sure that the items in your list are never repeated, you can easily:

myList = ["foo", "bar", "baz"]

# Create the dictionary
myDict = dict((e,i) for i,e in enumerate(myList))

# Lookup
myDict["bar"] # Returns 1
# myDict.get("blah") if you don't want an error to be raised if element not found.

If you may have duplicate elements, and need to return all of their indices:

from collections import defaultdict as dd
myList = ["foo", "bar", "bar", "baz", "foo"]

# Create the dictionary
myDict = dd(list)
for i,e in enumerate(myList):
    myDict[e].append(i)

# Lookup
myDict["foo"] # Returns [0, 4]

Answer 27

As indicated by @TerryA, many answers discuss how to find one index.

more_itertools is a third-party library with tools to locate multiple indices within an iterable.

Given

import more_itertools as mit


iterable = ["foo", "bar", "baz", "ham", "foo", "bar", "baz"]

Code

Find indices of multiple observations:

list(mit.locate(iterable, lambda x: x == "bar"))
# [1, 5]

Test multiple items:

list(mit.locate(iterable, lambda x: x in {"bar", "ham"}))
# [1, 3, 5]

See also more options with more_itertools.locate. Install via > pip install more_itertools.


Answer 28

Use a dictionary: process the list once, recording every index at which each item occurs, and then look items up in it.

from collections import defaultdict

index_dict = defaultdict(list)    
word_list =  ['foo','bar','baz','bar','any', 'foo', 'much']

for word_index, word in enumerate(word_list):
    index_dict[word].append(word_index)

word_to_find = 'foo'
print(index_dict[word_to_find])

# output :  [0, 5]

Answer 29

In my opinion ["foo", "bar", "baz"].index("bar") is good, but it isn’t enough, because if “bar” isn’t in the list a ValueError is raised. So you can use this function:

def find_index(arr, name):
    try:
        return arr.index(name)
    except ValueError:
        return -1

if __name__ == '__main__':
    print(find_index(["foo", "bar", "baz"], "bar"))

The result is:

1

If name is not in arr, the function returns -1. For example:

print(find_index(["foo", "bar", "baz"], "fooo"))

-1


What is the difference between Python’s list methods append and extend?

Question: What is the difference between Python’s list methods append and extend?

What’s the difference between the list methods append() and extend()?


Answer 0

append: Appends object at the end.

x = [1, 2, 3]
x.append([4, 5])
print (x)

gives you: [1, 2, 3, [4, 5]]


extend: Extends list by appending elements from the iterable.

x = [1, 2, 3]
x.extend([4, 5])
print (x)

gives you: [1, 2, 3, 4, 5]


Answer 1

append adds an element to a list, and extend concatenates the first list with another list (or another iterable, not necessarily a list.)

>>> li = ['a', 'b', 'mpilgrim', 'z', 'example']
>>> li
['a', 'b', 'mpilgrim', 'z', 'example']

>>> li.append("new")
>>> li
['a', 'b', 'mpilgrim', 'z', 'example', 'new']

>>> li.append(["new", 2])
>>> li
['a', 'b', 'mpilgrim', 'z', 'example', 'new', ['new', 2]]

>>> li.insert(2, "new")
>>> li
['a', 'b', 'new', 'mpilgrim', 'z', 'example', 'new', ['new', 2]]

>>> li.extend(["two", "elements"])
>>> li
['a', 'b', 'new', 'mpilgrim', 'z', 'example', 'new', ['new', 2], 'two', 'elements']

Answer 2

What is the difference between the list methods append and extend?

  • append adds its argument as a single element to the end of a list. The length of the list itself will increase by one.
  • extend iterates over its argument adding each element to the list, extending the list. The length of the list will increase by however many elements were in the iterable argument.

append

The list.append method appends an object to the end of the list.

my_list.append(object) 

Whatever the object is, whether a number, a string, another list, or something else, it gets added onto the end of my_list as a single entry on the list.

>>> my_list
['foo', 'bar']
>>> my_list.append('baz')
>>> my_list
['foo', 'bar', 'baz']

So keep in mind that a list is an object. If you append another list onto a list, the first list will be a single object at the end of the list (which may not be what you want):

>>> another_list = [1, 2, 3]
>>> my_list.append(another_list)
>>> my_list
['foo', 'bar', 'baz', [1, 2, 3]]
                     #^^^^^^^^^--- single item at the end of the list.

extend

The list.extend method extends a list by appending elements from an iterable:

my_list.extend(iterable)

So with extend, each element of the iterable gets appended onto the list. For example:

>>> my_list
['foo', 'bar']
>>> another_list = [1, 2, 3]
>>> my_list.extend(another_list)
>>> my_list
['foo', 'bar', 1, 2, 3]

Keep in mind that a string is an iterable, so if you extend a list with a string, you’ll append each character as you iterate over the string (which may not be what you want):

>>> my_list.extend('baz')
>>> my_list
['foo', 'bar', 1, 2, 3, 'b', 'a', 'z']

Operator Overload, __add__ (+) and __iadd__ (+=)

Both + and += operators are defined for list. They are semantically similar to extend.

my_list + another_list creates a third list in memory, so you can return the result of it, but it requires that the second iterable be a list.

my_list += another_list modifies the list in-place (it is the in-place operator, and lists are mutable objects, as we’ve seen) so it does not create a new list. It also works like extend, in that the second iterable can be any kind of iterable.

Don’t get confused – my_list = my_list + another_list is not equivalent to += – it gives you a brand new list assigned to my_list.
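
The difference is easiest to see when a second name refers to the same list. The following is a small illustrative sketch (the name alias is introduced here only for the demonstration):

```python
my_list = [1, 2]
alias = my_list          # a second name for the same list object

my_list += [3]           # in place: the change is visible through alias
print(alias)             # [1, 2, 3]

my_list = my_list + [4]  # builds a new list and rebinds my_list to it
print(my_list)           # [1, 2, 3, 4]
print(alias)             # [1, 2, 3]

my_list += "ab"          # += accepts any iterable, like extend
print(my_list)           # [1, 2, 3, 4, 'a', 'b']
```

After the plain + assignment the two names no longer refer to the same object, which is why later changes to my_list do not show up in alias.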

Time Complexity

Append has (amortized) constant time complexity, O(1).

Extend has time complexity O(k), where k is the number of elements in the iterable.

Iterating through the multiple calls to append adds to the complexity, making it equivalent to that of extend, and since extend’s iteration is implemented in C, it will always be faster if you intend to append successive items from an iterable onto a list.

Performance

You may wonder what is more performant, since append can be used to achieve the same outcome as extend. The following functions do the same thing:

def append(alist, iterable):
    for item in iterable:
        alist.append(item)

def extend(alist, iterable):
    alist.extend(iterable)

So let’s time them:

import timeit

>>> min(timeit.repeat(lambda: append([], "abcdefghijklmnopqrstuvwxyz")))
2.867846965789795
>>> min(timeit.repeat(lambda: extend([], "abcdefghijklmnopqrstuvwxyz")))
0.8060121536254883

Addressing a comment on timings

A commenter said:

Perfect answer, I just miss the timing of comparing adding only one element

Do the semantically correct thing. If you want to append all elements in an iterable, use extend. If you’re just adding one element, use append.

Ok, so let’s create an experiment to see how this works out in time:

def append_one(a_list, element):
    a_list.append(element)

def extend_one(a_list, element):
    """creating a new list is semantically the most direct
    way to create an iterable to give to extend"""
    a_list.extend([element])

import timeit

And we see that going out of our way to create an iterable just to use extend is a (minor) waste of time:

>>> min(timeit.repeat(lambda: append_one([], 0)))
0.2082819009956438
>>> min(timeit.repeat(lambda: extend_one([], 0)))
0.2397019260097295

We learn from this that there’s nothing gained from using extend when we have only one element to append.

Also, these timings are not that important. I am just showing them to make the point that, in Python, doing the semantically correct thing is doing things the Right Way™.

It’s conceivable that you might test timings on two comparable operations and get an ambiguous or inverse result. Just focus on doing the semantically correct thing.

Conclusion

We see that extend is semantically clearer, and that it can run much faster than append, when you intend to append each element in an iterable to a list.

If you only have a single element (not in an iterable) to add to the list, use append.


Answer 3

append appends a single element. extend appends a list of elements.

Note that if you pass a list to append, it still adds one element:

>>> a = [1, 2, 3]
>>> a.append([4, 5, 6])
>>> a
[1, 2, 3, [4, 5, 6]]

Answer 4

Append vs Extend

With append you can append a single element that will extend the list:

>>> a = [1,2]
>>> a.append(3)
>>> a
[1,2,3]

If you want to add more than one element you should use extend, because append always adds exactly one item, even when that item is itself a list of elements:

>>> a.append([4,5])
>>> a
[1, 2, 3, [4, 5]]

This gives you a nested list.

With extend, instead, you can add a single element like this:

>>> a = [1,2]
>>> a.extend([3])
>>> a
[1,2,3]

Or, unlike append, you can add several elements at once without nesting the new list inside the original one (hence the name extend):

>>> a.extend([4,5,6])
>>> a
[1,2,3,4,5,6]

Adding one element with both methods

Both append and extend can add one element to the end of the list, though append is simpler.

append 1 element

>>> x = [1,2]
>>> x.append(3)
>>> x
[1,2,3]

extend one element

>>> x = [1,2]
>>> x.extend([3])
>>> x
[1,2,3]

Adding more elements… with different results

If you use append for more than one element, you have to pass the elements as a single list argument, and you will get a NESTED list!

>>> x = [1,2]
>>> x.append([3,4])
>>> x
[1, 2, [3, 4]]

With extend, you also pass a list as the argument, but you get a list in which the new elements are not nested inside the old one.

>>> z = [1,2]
>>> z.extend([3,4])
>>> z
[1, 2, 3, 4]

So, with more elements, use extend to get a list with more items. Appending a list, by contrast, does not add several elements: it adds one element that is itself a nested list, as you can clearly see in the output of the code.


Answer 5

The following two snippets are semantically equivalent:

for item in iterator:
    a_list.append(item)

and

a_list.extend(iterator)

The latter may be faster as the loop is implemented in C.
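
As a quick check of the equivalence, here is a small sketch that feeds the same kind of iterator to both forms; the generator squares is a hypothetical sample input, not something from the original answer.

```python
def squares(n):
    """Hypothetical generator used only as sample input."""
    for i in range(n):
        yield i * i


looped = []
for item in squares(4):
    looped.append(item)

extended = []
extended.extend(squares(4))

print(looped)              # [0, 1, 4, 9]
print(looped == extended)  # True
```

A fresh generator is created for each form because an iterator can only be consumed once.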


Answer 6

The append() method adds a single item to the end of the list.

x = [1, 2, 3]
x.append([4, 5])
x.append('abc')
print(x)
# gives you
[1, 2, 3, [4, 5], 'abc']

The extend() method takes one argument, an iterable (typically a list), and appends each of the items of the argument to the original list. (Lists are implemented as classes. “Creating” a list is really instantiating a class. As such, a list has methods that operate on it.)

x = [1, 2, 3]
x.extend([4, 5])
x.extend('abc')
print(x)
# gives you
[1, 2, 3, 4, 5, 'a', 'b', 'c']

From Dive Into Python.


Answer 7

You can use “+” to return a new, extended list instead of extending in place.

l1 = list(range(10))

l1+[11]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11]

l2 = list(range(10, 1, -1))

l1+l2

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 9, 8, 7, 6, 5, 4, 3, 2]

Similarly, += gives in-place behavior, but with slight differences from append and extend. One of the biggest differences between += and append/extend shows up when it is used in function scopes; see this blog post.
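
The linked post is not reproduced here, so the following is only a guess at the kind of function-scope difference meant: += is an assignment, so inside a function Python treats the name as local unless it is declared global or nonlocal, whereas extend is just a method call on the existing object. The names shared, grow_with_extend and grow_with_iadd are made up for this sketch.

```python
shared = [1, 2]


def grow_with_extend():
    # Method call on the module-level list: no assignment, so this works.
    shared.extend([3])


def grow_with_iadd():
    # "shared += ..." is an assignment, so "shared" is treated as a local
    # name here and reading it before it is bound raises an error.
    shared += [4]


grow_with_extend()
print(shared)  # [1, 2, 3]

try:
    grow_with_iadd()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```

Declaring global shared inside grow_with_iadd would make the += version work as well.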


Answer 8

append(object) – Updates the list by adding an object to the list.

x = [20]
# List passed to the append(object) method is treated as a single object.
x.append([21, 22, 23])
# Hence the resultant list length will be 2
print(x)
--> [20, [21, 22, 23]]

extend(list) – Essentially concatenates two lists.

x = [20]
# The parameter passed to extend(list) method is treated as a list.
# Eventually it is two lists being concatenated.
x.extend([21, 22, 23])
# Here the resultant list's length is 4
print(x)
[20, 21, 22, 23]

Answer 9

extend() can be used with an iterator argument. Here is an example. You wish to make a list out of a list of lists this way:

From

list2d = [[1,2,3],[4,5,6], [7], [8,9]]

you want

>>>
[1, 2, 3, 4, 5, 6, 7, 8, 9]

You may use itertools.chain.from_iterable() to do so. This method’s output is an iterator. Its implementation is equivalent to

def from_iterable(iterables):
    # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
    for it in iterables:
        for element in it:
            yield element

Back to our example, we can do

import itertools
list2d = [[1,2,3],[4,5,6], [7], [8,9]]
merged = list(itertools.chain.from_iterable(list2d))

and get the wanted list.

Here is how equivalently extend() can be used with an iterator argument:

merged = []
merged.extend(itertools.chain.from_iterable(list2d))
print(merged)
>>>
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Answer 10

This is the equivalent of append and extend using the + operator:

>>> x = [1,2,3]
>>> x
[1, 2, 3]
>>> x = x + [4,5,6] # Extend
>>> x
[1, 2, 3, 4, 5, 6]
>>> x = x + [[7,8]] # Append
>>> x
[1, 2, 3, 4, 5, 6, [7, 8]]

Answer 11

append(): adds a single element to the end of a list.

Example 1:

>>> a = [1, 2, 3, 4]
>>> a.append(5)
>>> print(a)
[1, 2, 3, 4, 5]

Example 2:

>>> a = [1, 2, 3, 4]
>>> a.append([5, 6])
>>> print(a)
[1, 2, 3, 4, [5, 6]]

extend(): merges two lists, or adds multiple elements to one list.

Example 1:

>>> a = [1, 2, 3, 4]
>>> b = [5, 6, 7, 8]
>>> a.extend(b)
>>> print(a)
[1, 2, 3, 4, 5, 6, 7, 8]

Example 2:

>>> a = [1, 2, 3, 4]
>>> a.extend([5, 6])
>>> print(a)
[1, 2, 3, 4, 5, 6]

Answer 12

An interesting point that has been hinted at, but not explained, is that extend is faster than append. Any loop that calls append on each iteration is a candidate for replacement by list.extend(processed_elements).

Bear in mind that appending new elements might result in the reallocation of the whole list to a better location in memory. If this happens several times because we are appending one element at a time, overall performance suffers. In this sense, list.extend is analogous to "".join(stringlist).
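
As a rough sketch of the replacement suggested above, the loop and the single extend call below build the same list. The function process is a hypothetical stand-in for whatever per-element work the loop does, and no timing claim is made here; measure with timeit on your own data.

```python
def process(x):
    """Hypothetical per-element transformation."""
    return x * 2


raw = [1, 2, 3]

# Loop with append: one method call per element.
result_a = [0]
for x in raw:
    result_a.append(process(x))

# Same outcome with a single extend call over a generator expression.
result_b = [0]
result_b.extend(process(x) for x in raw)

print(result_a)              # [0, 2, 4, 6]
print(result_a == result_b)  # True
```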


Answer 13

append adds its argument as a whole: the entire object is stored at a single newly created index. extend, on the other hand, as its name suggests, extends the current list with the individual elements of its argument.

For example

list1 = [123, 456, 678]
list2 = [111, 222]

With append we get:

result = [123, 456, 678, [111, 222]]

With extend we get:

result = [123, 456, 678, 111, 222]

Answer 14

An English dictionary defines the words append and extend as:

append: add (something) to the end of a written document.
extend: make larger. Enlarge or expand


With that knowledge, now let’s understand

1) The difference between append and extend

append:

  • Appends any Python object as-is to the end of the list (i.e. as the last element in the list).
  • The resulting list may be nested and contain heterogeneous elements (i.e. list, string, tuple, dictionary, set, etc.)

extend:

  • Accepts any iterable as its argument and makes the list larger.
  • The list grows by one element per item of the iterable, so extend itself adds no extra level of nesting; the result may contain heterogeneous elements (e.g. characters, integers, floats) as a result of applying list(iterable).

2) Similarity between append and extend

  • Both take exactly one argument.
  • Both modify the list in-place.
  • As a result, both return None.

Example

lis = [1, 2, 3]
iterable = 'ab'  # sample input; any iterable works

# 'extend' has the same effect as this, except that it works in place
lis = lis + list(iterable)

# 'append' simply appends its argument as the last element to the list
# as long as the argument is a valid Python object
lis.append(iterable)
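
Because both methods return None, assigning their result back to the list name is a common slip. A short illustrative sketch (the variable names are chosen only for this example):

```python
nums = [1, 2, 3]

result = nums.append(4)    # mutates nums in place and returns None
print(result)              # None
print(nums)                # [1, 2, 3, 4]

result = nums.extend([5])  # the same holds for extend
print(result)              # None
print(nums)                # [1, 2, 3, 4, 5]
```

Writing nums = nums.append(4) would therefore replace the list with None.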

Answer 15

I hope I can add a useful supplement to this question. If your list stores objects of a specific type, for example Info, here is a situation where the extend method is not suitable: if you generate an Info object on each pass of a for loop and use extend to store it in your list, it will fail. The exception looks like this:

TypeError: 'Info' object is not iterable

If you use the append method instead, the result is fine. That is because extend always treats its argument as a list or other collection type, iterates over it, and places the items after the existing ones. An object that is not iterable obviously cannot be handled this way.
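
The situation can be reproduced with a small sketch. The Info class below is a hypothetical stand-in, since the original answer does not show its definition; the only assumption is that it defines no __iter__.

```python
class Info:
    """Hypothetical stand-in for the Info type mentioned above."""

    def __init__(self, value):
        self.value = value


infos = []
for n in range(3):
    infos.append(Info(n))   # one object per call: fine

try:
    infos.extend(Info(99))  # Info is not iterable, so extend fails
except TypeError as exc:
    print(exc)              # 'Info' object is not iterable

infos.extend([Info(99)])    # wrapping the object in a list works
print(len(infos))           # 4
```

Wrapping a single object in a list just to call extend works, but append states the intent more directly.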


Answer 16

To distinguish them intuitively

l1 = ['a', 'b', 'c']
l2 = ['d', 'e', 'f']
l1.append(l2)
l1
['a', 'b', 'c', ['d', 'e', 'f']]

It’s like l1 reproducing a body inside its own body (nested).

l1 = ['a', 'b', 'c']  # reset l1
l1.extend(l2)
l1
['a', 'b', 'c', 'd', 'e', 'f']

It's like two separate individuals getting married and forming one united family.

In addition, I made an exhaustive cheatsheet of all list methods for your reference.

list_methods = {'Add': {'extend', 'append', 'insert'},
                'Remove': {'pop', 'remove', 'clear'},
                'Sort': {'reverse', 'sort'},
                'Search': {'count', 'index'},
                'Copy': {'copy'},
                }

回答 17

extend(L)通过在给定列表中追加所有项目来扩展列表L

>>> a = [1, 2, 3]
>>> a
[1, 2, 3]
>>> a.extend([4])  # is equivalent to a[len(a):] = [4]
>>> a
[1, 2, 3, 4]
>>> a = [1, 2, 3]
>>> a
[1, 2, 3]
>>> a[len(a):] = [4]
>>> a
[1, 2, 3, 4]

extend(L) extends the list by appending all the items in the given list L.

>>> a = [1, 2, 3]
>>> a
[1, 2, 3]
>>> a.extend([4])  # is equivalent to a[len(a):] = [4]
>>> a
[1, 2, 3, 4]
>>> a = [1, 2, 3]
>>> a
[1, 2, 3]
>>> a[len(a):] = [4]
>>> a
[1, 2, 3, 4]
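
As a small illustration, extend and the slice assignment shown above both accept any iterable, not only a list. The variable names below are used only for this sketch.

```python
a = [1, 2, 3]
b = [1, 2, 3]

# Both forms add every item of the iterable to the end, in place
a.extend((4, 5))       # a tuple works as well as a list
b[len(b):] = (4, 5)    # slice assignment at the end of the list

# A generator is consumed item by item
a.extend(n * 10 for n in range(2))
b[len(b):] = (n * 10 for n in range(2))

print(a)
print(b)
```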

回答 18

append列表仅将一项 “扩展”(就地),即传递的单个对象(作为参数)。

extend“扩展”的名单(到位)尽可能多的项目对象传递(作为参数)包含的内容。

这可能会使str对象有些混乱。

  1. 如果您将字符串作为参数传递: append将在末尾添加单个字符串项,但 extend将添加与该字符串的长度一样多的“单个”“ str”项。
  2. 如果您将字符串列表作为参数传递:: append仍将在末尾添加单个“列表”项, extend并将添加与所传递列表的长度一样多的“列表”项。
def append_o(a_list, element):
    a_list.append(element)
    print('append:', end = ' ')
    for item in a_list:
        print(item, end = ',')
    print()

def extend_o(a_list, element):
    a_list.extend(element)
    print('extend:', end = ' ')
    for item in a_list:
        print(item, end = ',')
    print()
append_o(['ab'],'cd')

extend_o(['ab'],'cd')
append_o(['ab'],['cd', 'ef'])
extend_o(['ab'],['cd', 'ef'])
append_o(['ab'],['cd'])
extend_o(['ab'],['cd'])

生成:

append: ab,cd,
extend: ab,c,d,
append: ab,['cd', 'ef'],
extend: ab,cd,ef,
append: ab,['cd'],
extend: ab,cd,

append “extends” the list (in place) by only one item, the single object passed (as argument).

extend “extends” the list (in place) by as many items as the object passed (as argument) contains.

This may be slightly confusing for str objects.

  1. If you pass a string as argument: append will add a single string item at the end but extend will add as many “single” ‘str’ items as the length of that string.
  2. If you pass a list of strings as argument: append will still add a single 'list' item at the end, while extend will add as many 'str' items as the length of the passed list.

def append_o(a_list, element):
    a_list.append(element)
    print('append:', end = ' ')
    for item in a_list:
        print(item, end = ',')
    print()

def extend_o(a_list, element):
    a_list.extend(element)
    print('extend:', end = ' ')
    for item in a_list:
        print(item, end = ',')
    print()

append_o(['ab'], 'cd')
extend_o(['ab'], 'cd')
append_o(['ab'], ['cd', 'ef'])
extend_o(['ab'], ['cd', 'ef'])
append_o(['ab'], ['cd'])
extend_o(['ab'], ['cd'])

produces:

append: ab,cd,
extend: ab,c,d,
append: ab,['cd', 'ef'],
extend: ab,cd,ef,
append: ab,['cd'],
extend: ab,cd,

回答 19

追加和扩展是python中的可扩展性机制之一。

追加:将元素添加到列表的末尾。

my_list = [1,2,3,4]

要向列表中添加新元素,我们可以通过以下方式使用append方法。

my_list.append(5)

将要添加新元素的默认位置始终位于(length + 1)位置。

插入:使用插入方法来克服附加的限制。使用insert,我们可以显式定义要在其中插入新元素的确切位置。

insert(index,object)的方法描述符。它有两个参数,第一个是我们要插入元素的索引,第二个是元素本身。

Example: my_list = [1,2,3,4]
my_list.insert(4, 'a')
my_list
[1,2,3,4,'a']

扩展:当我们要将两个或多个列表合并为一个列表时,这非常有用。如果不扩展,如果我们要连接两个列表,则生成的对象将包含一个列表列表。

a = [1,2]
b = [3]
a.append(b)
print (a)
[1,2,[3]]

如果尝试访问位置2的元素,则会得到一个列表([3]),而不是元素。要加入两个列表,我们必须使用append。

a = [1,2]
b = [3]
a.extend(b)
print (a)
[1,2,3]

加入多个列表

a = [1]
b = [2]
c = [3]
a.extend(b+c)
print (a)
[1,2,3]

Append and extend are two of the extensibility mechanisms in Python.

Append: Adds an element to the end of the list.

my_list = [1,2,3,4]

To add a new element to the list, we can use the append method in the following way.

my_list.append(5)

The new element is always added at the end of the list, that is, at position length + 1.

Insert: The insert method overcomes this limitation of append. With insert, we can explicitly specify the exact position at which the new element should be inserted.

The method signature is insert(index, object). It takes two arguments: the first is the index at which to insert the element, and the second is the element itself.

Example: my_list = [1,2,3,4]
my_list.insert(4, 'a')
my_list
[1,2,3,4,'a']

Extend: This is very useful when we want to join two or more lists into a single list. If we use append to join two lists instead, the resulting object will contain a nested list.

a = [1,2]
b = [3]
a.append(b)
print (a)
[1,2,[3]]

If we try to access the element at position 2, we get a list ([3]) instead of the element. To join two lists, we have to use extend.

a = [1,2]
b = [3]
a.extend(b)
print (a)
[1,2,3]

To join multiple lists

a = [1]
b = [2]
c = [3]
a.extend(b+c)
print (a)
[1,2,3]
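
As a variation on the last example, here is a hedged sketch of another way to join several lists. itertools.chain comes from the standard library; it is not part of the original answer and is shown only as an alternative that avoids building the temporary list b+c.

```python
from itertools import chain

a = [1]
b = [2]
c = [3]

# extend() accepts any iterable, so the lists can be chained lazily
a.extend(chain(b, c))
print(a)

# The + operator builds a new list and leaves its operands unchanged
joined = b + c
print(joined, b, c)
```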

如何克隆或复制列表?

问题:如何克隆或复制列表?

在Python中克隆或复制列表有哪些选项?

在使用new_list = my_list,任何修改new_list改变my_list每次。为什么是这样?

What are the options to clone or copy a list in Python?

When using new_list = my_list, any modification to new_list changes my_list every time. Why is this?


回答 0

使用new_list = my_list,您实际上没有两个列表。分配只是将引用复制到列表,而不是实际列表,因此将两者复制new_listmy_list在分配后引用同一列表。

要实际复制列表,您有多种可能:

  • 您可以使用内建list.copy()方法(自Python 3.3起可用):

    new_list = old_list.copy()
  • 您可以将其切片:

    new_list = old_list[:]

    Alex Martelli对此看法(至少是在2007年)是,这是一种怪异的语法,永远不要使用它。;)(在他看来,下一个更具可读性)。

  • 您可以使用内置list()函数:

    new_list = list(old_list)
  • 您可以使用generic copy.copy()

    import copy
    new_list = copy.copy(old_list)

    这比list()因为必须找出old_listfirst 的数据类型慢一些。

  • 如果列表包含对象,并且您也想复制它们,请使用generic copy.deepcopy()

    import copy
    new_list = copy.deepcopy(old_list)

    显然,这是最慢且最需要内存的方法,但有时是不可避免的。

例:

import copy

class Foo(object):
    def __init__(self, val):
         self.val = val

    def __repr__(self):
        return 'Foo({!r})'.format(self.val)

foo = Foo(1)

a = ['foo', foo]
b = a.copy()
c = a[:]
d = list(a)
e = copy.copy(a)
f = copy.deepcopy(a)

# edit original list and instance
a.append('baz')
foo.val = 5

print('original: %r\nlist.copy(): %r\nslice: %r\nlist(): %r\ncopy: %r\ndeepcopy: %r'
      % (a, b, c, d, e, f))

结果:

original: ['foo', Foo(5), 'baz']
list.copy(): ['foo', Foo(5)]
slice: ['foo', Foo(5)]
list(): ['foo', Foo(5)]
copy: ['foo', Foo(5)]
deepcopy: ['foo', Foo(1)]

With new_list = my_list, you don’t actually have two lists. The assignment just copies the reference to the list, not the actual list, so both new_list and my_list refer to the same list after the assignment.

To actually copy the list, you have various possibilities:

  • You can use the builtin list.copy() method (available since Python 3.3):

    new_list = old_list.copy()
    
  • You can slice it:

    new_list = old_list[:]
    

    Alex Martelli’s opinion (at least back in 2007) about this is, that it is a weird syntax and it does not make sense to use it ever. ;) (In his opinion, the next one is more readable).

  • You can use the built in list() function:

    new_list = list(old_list)
    
  • You can use generic copy.copy():

    import copy
    new_list = copy.copy(old_list)
    

    This is a little slower than list() because it has to find out the datatype of old_list first.

  • If the list contains objects and you want to copy them as well, use generic copy.deepcopy():

    import copy
    new_list = copy.deepcopy(old_list)
    

    Obviously the slowest and most memory-needing method, but sometimes unavoidable.

Example:

import copy

class Foo(object):
    def __init__(self, val):
         self.val = val

    def __repr__(self):
        return 'Foo({!r})'.format(self.val)

foo = Foo(1)

a = ['foo', foo]
b = a.copy()
c = a[:]
d = list(a)
e = copy.copy(a)
f = copy.deepcopy(a)

# edit original list and instance
a.append('baz')
foo.val = 5

print('original: %r\nlist.copy(): %r\nslice: %r\nlist(): %r\ncopy: %r\ndeepcopy: %r'
      % (a, b, c, d, e, f))

Result:

original: ['foo', Foo(5), 'baz']
list.copy(): ['foo', Foo(5)]
slice: ['foo', Foo(5)]
list(): ['foo', Foo(5)]
copy: ['foo', Foo(5)]
deepcopy: ['foo', Foo(1)]

回答 1

Felix已经提供了一个很好的答案,但是我想我将对各种方法进行速度比较:

  1. 10.59秒(105.9us / itn)- copy.deepcopy(old_list)
  2. 10.16秒(101.6us / itn)-使用Deepcopy Copy()复制类的纯python 方法
  3. 1.488秒(14.88us / itn)-纯python Copy()方法不复制类(仅字典/列表/元组)
  4. 0.325秒(3.25us / itn)- for item in old_list: new_list.append(item)
  5. 0.217秒(2.17us / itn)- [i for i in old_list]列表理解
  6. 0.186秒(1.86us / itn)- copy.copy(old_list)
  7. 0.075秒(0.75us / itn)- list(old_list)
  8. 0.053秒(0.53us / itn)- new_list = []; new_list.extend(old_list)
  9. 0.039秒(0.39us / itn)- old_list[:]列表切片

因此最快的是列表切片。但是请注意copy.copy()list[:]和和python版本list(list)不同,和copy.deepcopy()和不会在列表中复制任何列表,字典和类实例,因此,如果原始版本更改,它们也会在复制的列表中更改,反之亦然。

(如果有人有兴趣或想提出任何问题,请使用以下脚本:)

from copy import deepcopy

class old_class:
    def __init__(self):
        self.blah = 'blah'

class new_class(object):
    def __init__(self):
        self.blah = 'blah'

dignore = {str: None, unicode: None, int: None, type(None): None}

def Copy(obj, use_deepcopy=True):
    t = type(obj)

    if t in (list, tuple):
        if t == tuple:
            # Convert to a list if a tuple to 
            # allow assigning to when copying
            is_tuple = True
            obj = list(obj)
        else: 
            # Otherwise just do a quick slice copy
            obj = obj[:]
            is_tuple = False

        # Copy each item recursively
        for x in xrange(len(obj)):
            if type(obj[x]) in dignore:
                continue
            obj[x] = Copy(obj[x], use_deepcopy)

        if is_tuple: 
            # Convert back into a tuple again
            obj = tuple(obj)

    elif t == dict: 
        # Use the fast shallow dict copy() method and copy any 
        # values which aren't immutable (like lists, dicts etc)
        obj = obj.copy()
        for k in obj:
            if type(obj[k]) in dignore:
                continue
            obj[k] = Copy(obj[k], use_deepcopy)

    elif t in dignore: 
        # Numeric or string/unicode? 
        # It's immutable, so ignore it!
        pass 

    elif use_deepcopy: 
        obj = deepcopy(obj)
    return obj

if __name__ == '__main__':
    import copy
    from time import time

    num_times = 100000
    L = [None, 'blah', 1, 543.4532, 
         ['foo'], ('bar',), {'blah': 'blah'},
         old_class(), new_class()]

    t = time()
    for i in xrange(num_times):
        Copy(L)
    print 'Custom Copy:', time()-t

    t = time()
    for i in xrange(num_times):
        Copy(L, use_deepcopy=False)
    print 'Custom Copy Only Copying Lists/Tuples/Dicts (no classes):', time()-t

    t = time()
    for i in xrange(num_times):
        copy.copy(L)
    print 'copy.copy:', time()-t

    t = time()
    for i in xrange(num_times):
        copy.deepcopy(L)
    print 'copy.deepcopy:', time()-t

    t = time()
    for i in xrange(num_times):
        L[:]
    print 'list slicing [:]:', time()-t

    t = time()
    for i in xrange(num_times):
        list(L)
    print 'list(L):', time()-t

    t = time()
    for i in xrange(num_times):
        [i for i in L]
    print 'list expression(L):', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(L)
    print 'list extend:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        for y in L:
            a.append(y)
    print 'list append:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(i for i in L)
    print 'generator expression extend:', time()-t

Felix already provided an excellent answer, but I thought I’d do a speed comparison of the various methods:

  1. 10.59 sec (105.9us/itn) – copy.deepcopy(old_list)
  2. 10.16 sec (101.6us/itn) – pure python Copy() method copying classes with deepcopy
  3. 1.488 sec (14.88us/itn) – pure python Copy() method not copying classes (only dicts/lists/tuples)
  4. 0.325 sec (3.25us/itn) – for item in old_list: new_list.append(item)
  5. 0.217 sec (2.17us/itn) – [i for i in old_list] (a list comprehension)
  6. 0.186 sec (1.86us/itn) – copy.copy(old_list)
  7. 0.075 sec (0.75us/itn) – list(old_list)
  8. 0.053 sec (0.53us/itn) – new_list = []; new_list.extend(old_list)
  9. 0.039 sec (0.39us/itn) – old_list[:] (list slicing)

So the fastest is list slicing. But be aware that copy.copy(), list[:] and list(list), unlike copy.deepcopy() and the pure-Python version, don't copy any lists, dictionaries or class instances contained in the list, so if those originals change, they will change in the copied list too, and vice versa.

(Here’s the script if anyone’s interested or wants to raise any issues:)

from copy import deepcopy

class old_class:
    def __init__(self):
        self.blah = 'blah'

class new_class(object):
    def __init__(self):
        self.blah = 'blah'

dignore = {str: None, unicode: None, int: None, type(None): None}

def Copy(obj, use_deepcopy=True):
    t = type(obj)

    if t in (list, tuple):
        if t == tuple:
            # Convert to a list if a tuple to 
            # allow assigning to when copying
            is_tuple = True
            obj = list(obj)
        else: 
            # Otherwise just do a quick slice copy
            obj = obj[:]
            is_tuple = False

        # Copy each item recursively
        for x in xrange(len(obj)):
            if type(obj[x]) in dignore:
                continue
            obj[x] = Copy(obj[x], use_deepcopy)

        if is_tuple: 
            # Convert back into a tuple again
            obj = tuple(obj)

    elif t == dict: 
        # Use the fast shallow dict copy() method and copy any 
        # values which aren't immutable (like lists, dicts etc)
        obj = obj.copy()
        for k in obj:
            if type(obj[k]) in dignore:
                continue
            obj[k] = Copy(obj[k], use_deepcopy)

    elif t in dignore: 
        # Numeric or string/unicode? 
        # It's immutable, so ignore it!
        pass 

    elif use_deepcopy: 
        obj = deepcopy(obj)
    return obj

if __name__ == '__main__':
    import copy
    from time import time

    num_times = 100000
    L = [None, 'blah', 1, 543.4532, 
         ['foo'], ('bar',), {'blah': 'blah'},
         old_class(), new_class()]

    t = time()
    for i in xrange(num_times):
        Copy(L)
    print 'Custom Copy:', time()-t

    t = time()
    for i in xrange(num_times):
        Copy(L, use_deepcopy=False)
    print 'Custom Copy Only Copying Lists/Tuples/Dicts (no classes):', time()-t

    t = time()
    for i in xrange(num_times):
        copy.copy(L)
    print 'copy.copy:', time()-t

    t = time()
    for i in xrange(num_times):
        copy.deepcopy(L)
    print 'copy.deepcopy:', time()-t

    t = time()
    for i in xrange(num_times):
        L[:]
    print 'list slicing [:]:', time()-t

    t = time()
    for i in xrange(num_times):
        list(L)
    print 'list(L):', time()-t

    t = time()
    for i in xrange(num_times):
        [i for i in L]
    print 'list expression(L):', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(L)
    print 'list extend:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        for y in L:
            a.append(y)
    print 'list append:', time()-t

    t = time()
    for i in xrange(num_times):
        a = []
        a.extend(i for i in L)
    print 'generator expression extend:', time()-t
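
The script above uses Python 2 syntax (print statements, xrange, unicode). As a rough sketch only, the shallow-copy part of the comparison could be reproduced on Python 3 with the standard timeit module. The helper name time_copies and the repetition count are assumptions made for illustration, and absolute numbers will differ from the figures quoted above.

```python
import copy
import timeit


def time_copies(sample, number=10000):
    """Return {label: seconds} for several shallow-copy techniques."""
    candidates = {
        "copy.copy": lambda: copy.copy(sample),
        "list()": lambda: list(sample),
        "slice [:]": lambda: sample[:],
        "list.copy()": lambda: sample.copy(),
        "comprehension": lambda: [item for item in sample],
    }
    return {
        label: timeit.timeit(func, number=number)
        for label, func in candidates.items()
    }


sample_list = [None, "blah", 1, 543.4532, ["foo"], ("bar",), {"blah": "blah"}]
results = time_copies(sample_list)

# Print the fastest technique first
for label, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{label:15s} {seconds:.4f} s")
```

Calling each technique through a lambda adds the same call overhead to every candidate, so the ranking is more meaningful than the raw values.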

回答 2

有人告诉我Python 3.3+ 增加了list.copy()方法,该方法应与切片一样快:

newlist = old_list.copy()

I've been told that Python 3.3+ adds the list.copy() method, which should be as fast as slicing:

newlist = old_list.copy()


回答 3

在Python中克隆或复制列表有哪些选项?

在Python 3中,可以使用以下方式创建浅表副本:

a_copy = a_list.copy()

在Python 2和3中,您可以获得包含原始文档完整切片的浅表副本:

a_copy = a_list[:]

说明

复制列表有两种语义方式。浅表副本创建相同对象的新列表,深表副本创建包含新的等效对象的新列表。

浅表副本

浅表副本仅复制列表本身,列表本身是对列表中对象的引用的容器。如果它们本身包含的对象是可变的,并且其中一个被更改,则更改将反映在两个列表中。

在Python 2和3中有不同的方法来执行此操作。Python2的方法也将在Python 3中工作。

Python 2

在Python 2中,制作列表的浅表副本的惯用方法是使用原始列表的完整切片:

a_copy = a_list[:]

您还可以通过将列表通过列表构造函数传递来完成同一件事,

a_copy = list(a_list)

但是使用构造函数的效率较低:

>>> import timeit
>>> l = range(20)
>>> min(timeit.repeat(lambda: l[:]))
0.30504298210144043
>>> min(timeit.repeat(lambda: list(l)))
0.40698814392089844

Python 3

在Python 3中,列表获取list.copy方法:

a_copy = a_list.copy()

在Python 3.5中:

>>> import timeit
>>> l = list(range(20))
>>> min(timeit.repeat(lambda: l[:]))
0.38448613602668047
>>> min(timeit.repeat(lambda: list(l)))
0.6309100328944623
>>> min(timeit.repeat(lambda: l.copy()))
0.38122922903858125

制作另一个指针并没有进行复制

然后,每次使用my_list更改时,使用new_list = my_list都会修改new_list。为什么是这样?

my_list只是指向内存中实际列表的名称。当您说不new_list = my_list制作副本时,只是在添加另一个名称,该名称指向内存中的原始列表。复制列表时,我们可能会遇到类似的问题。

>>> l = [[], [], []]
>>> l_copy = l[:]
>>> l_copy
[[], [], []]
>>> l_copy[0].append('foo')
>>> l_copy
[['foo'], [], []]
>>> l
[['foo'], [], []]

该列表只是指向内容的指针数组,因此浅表副本仅复制指针,因此您有两个不同的列表,但是它们具有相同的内容。要复制内容,您需要一个深层副本。

深拷贝

为了使列表的深层副本,在Python 2或3时,使用deepcopy了在copy模块

import copy
a_deep_copy = copy.deepcopy(a_list)

为了演示这如何使我们创建新的子列表:

>>> import copy
>>> l
[['foo'], [], []]
>>> l_deep_copy = copy.deepcopy(l)
>>> l_deep_copy[0].pop()
'foo'
>>> l_deep_copy
[[], [], []]
>>> l
[['foo'], [], []]

因此,我们看到深度复制的列表与原始列表完全不同。您可以滚动自己的函数-但不能。通过使用标准库的Deepcopy函数,您可能会创建本来没有的bug。

不要使用 eval

您可能会将此视为深度复制的一种方法,但不要这样做:

problematic_deep_copy = eval(repr(a_list))
  1. 这很危险,特别是如果您正在评估来自不信任来源的内容时。
  2. 这是不可靠的,如果您要复制的子元素没有可以用来复制等效元素的表示形式。
  3. 它的性能也较差。

在64位Python 2.7中:

>>> import timeit
>>> import copy
>>> l = range(10)
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
27.55826997756958
>>> min(timeit.repeat(lambda: eval(repr(l))))
29.04534101486206

在64位Python 3.5上:

>>> import timeit
>>> import copy
>>> l = list(range(10))
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
16.84255409205798
>>> min(timeit.repeat(lambda: eval(repr(l))))
34.813894678023644

What are the options to clone or copy a list in Python?

In Python 3, a shallow copy can be made with:

a_copy = a_list.copy()

In Python 2 and 3, you can get a shallow copy with a full slice of the original:

a_copy = a_list[:]

Explanation

There are two semantic ways to copy a list. A shallow copy creates a new list of the same objects, a deep copy creates a new list containing new equivalent objects.

Shallow list copy

A shallow copy only copies the list itself, which is a container of references to the objects in the list. If the objects contained themselves are mutable and one is changed, the change will be reflected in both lists.

There are different ways to do this in Python 2 and 3. The Python 2 ways will also work in Python 3.

Python 2

In Python 2, the idiomatic way of making a shallow copy of a list is with a complete slice of the original:

a_copy = a_list[:]

You can also accomplish the same thing by passing the list through the list constructor,

a_copy = list(a_list)

but using the constructor is less efficient:

>>> import timeit
>>> l = range(20)
>>> min(timeit.repeat(lambda: l[:]))
0.30504298210144043
>>> min(timeit.repeat(lambda: list(l)))
0.40698814392089844

Python 3

In Python 3, lists get the list.copy method:

a_copy = a_list.copy()

In Python 3.5:

>>> import timeit
>>> l = list(range(20))
>>> min(timeit.repeat(lambda: l[:]))
0.38448613602668047
>>> min(timeit.repeat(lambda: list(l)))
0.6309100328944623
>>> min(timeit.repeat(lambda: l.copy()))
0.38122922903858125

Making another pointer does not make a copy

Using new_list = my_list then modifies new_list every time my_list changes. Why is this?

my_list is just a name that points to the actual list in memory. When you say new_list = my_list you’re not making a copy, you’re just adding another name that points at that original list in memory. We can have similar issues when we make copies of lists.

>>> l = [[], [], []]
>>> l_copy = l[:]
>>> l_copy
[[], [], []]
>>> l_copy[0].append('foo')
>>> l_copy
[['foo'], [], []]
>>> l
[['foo'], [], []]

The list is just an array of pointers to the contents, so a shallow copy just copies the pointers, and so you have two different lists, but they have the same contents. To make copies of the contents, you need a deep copy.
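
The sharing described above can be checked directly with the is operator. This is a small sketch; the variable names are illustrative.

```python
outer = [[], [], []]
shallow = outer[:]

# The outer lists are two distinct objects...
print(shallow is outer)

# ...but each position refers to the very same inner list
print(all(x is y for x, y in zip(outer, shallow)))

# Rebinding a slot in the copy does not touch the original
shallow[0] = ['replaced']
print(outer)

# Mutating a shared inner list is visible through both names
shallow[1].append('foo')
print(outer)
```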

Deep copies

To make a deep copy of a list, in Python 2 or 3, use deepcopy in the copy module:

import copy
a_deep_copy = copy.deepcopy(a_list)

To demonstrate how this allows us to make new sub-lists:

>>> import copy
>>> l
[['foo'], [], []]
>>> l_deep_copy = copy.deepcopy(l)
>>> l_deep_copy[0].pop()
'foo'
>>> l_deep_copy
[[], [], []]
>>> l
[['foo'], [], []]

And so we see that the deep copied list is an entirely different list from the original. You could roll your own function, but don't: you are likely to introduce bugs that you would avoid by using the standard library's deepcopy function.

Don’t use eval

You may see this used as a way to deepcopy, but don’t do it:

problematic_deep_copy = eval(repr(a_list))

  1. It's dangerous, particularly if you're evaluating something from a source you don't trust.
  2. It’s not reliable, if a subelement you’re copying doesn’t have a representation that can be eval’d to reproduce an equivalent element.
  3. It’s also less performant.

In 64 bit Python 2.7:

>>> import timeit
>>> import copy
>>> l = range(10)
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
27.55826997756958
>>> min(timeit.repeat(lambda: eval(repr(l))))
29.04534101486206

On 64 bit Python 3.5:

>>> import timeit
>>> import copy
>>> l = list(range(10))
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
16.84255409205798
>>> min(timeit.repeat(lambda: eval(repr(l))))
34.813894678023644
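
To illustrate the reliability point, here is a minimal sketch with a hypothetical Node class that keeps the default repr. The default representation is not valid Python source, so eval(repr(...)) fails, while copy.deepcopy handles the same list.

```python
import copy


class Node:
    """Hypothetical class that keeps the default object repr."""

    def __init__(self, value):
        self.value = value


a_list = [Node(1), [Node(2)]]

try:
    eval(repr(a_list))
    eval_worked = True
except SyntaxError:
    # The repr looks like "<...Node object at 0x...>", which cannot be parsed
    eval_worked = False

a_deep_copy = copy.deepcopy(a_list)

print(eval_worked)
print(a_deep_copy[0] is a_list[0], a_deep_copy[0].value)
```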

回答 4

已经有很多答案可以告诉您如何制作正确的副本,但是没有一个答案说明您原来的“副本”失败的原因。

Python不会将值存储在变量中。它将名称绑定到对象。您的原始任务采用了所引用的对象并将其my_list绑定到该对象new_list。无论您使用哪个名称,都只有一个列表,因此将其引用为时所做的更改my_list将保持不变new_list。该问题的其他每个答案都为您提供了不同的方法来创建要绑定的新对象new_list

列表中的每个元素都像名称一样,因为每个元素都非排他地绑定到对象。浅表副本会创建一个新列表,其元素绑定到与以前相同的对象。

new_list = list(my_list)  # or my_list[:], but I prefer this syntax
# is simply a shorter way of:
new_list = [element for element in my_list]

要使列表复制更进一步,请复制列表引用的每个对象,然后将这些元素副本绑定到新列表。

import copy  
# each element must have __copy__ defined for this...
new_list = [copy.copy(element) for element in my_list]

这不是一个深层副本,因为列表的每个元素都可以引用其他对象,就像列表绑定到其元素一样。要递归复制列表中的每个元素,然后递归复制每个元素引用的其他对象,依此类推:执行深层复制。

import copy
# each element must have __deepcopy__ defined for this...
new_list = copy.deepcopy(my_list)

有关复制中极端情况的更多信息,请参见文档

There are many answers already that tell you how to make a proper copy, but none of them say why your original ‘copy’ failed.

Python doesn't store values in variables; it binds names to objects. Your original assignment took the object referred to by my_list and bound it to new_list as well. No matter which name you use there is still only one list, so changes made when referring to it as my_list will persist when referring to it as new_list. Each of the other answers to this question gives you a different way of creating a new object to bind to new_list.

Each element of a list acts like a name, in that each element binds non-exclusively to an object. A shallow copy creates a new list whose elements bind to the same objects as before.

new_list = list(my_list)  # or my_list[:], but I prefer this syntax
# is simply a shorter way of:
new_list = [element for element in my_list]

To take your list copy one step further, copy each object that your list refers to, and bind those element copies to a new list.

import copy
# copy.copy() uses an element's __copy__ method if it defines one,
# and otherwise falls back to a generic shallow copy
new_list = [copy.copy(element) for element in my_list]

This is not yet a deep copy, because each element of a list may refer to other objects, just like the list is bound to its elements. To recursively copy every element in the list, and then each other object referred to by each element, and so on: perform a deep copy.

import copy
# copy.deepcopy() uses an element's __deepcopy__ method if it defines one,
# and otherwise falls back to a generic recursive copy
new_list = copy.deepcopy(my_list)

See the documentation for more information about corner cases in copying.
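
The three levels of copying described above can be compared side by side. In this sketch the Item class is a hypothetical example; note that it defines neither __copy__ nor __deepcopy__, because the copy module falls back to a generic mechanism for ordinary class instances.

```python
import copy


class Item:
    """Hypothetical element type holding a mutable attribute."""

    def __init__(self, tags):
        self.tags = tags


my_list = [Item(["a"])]

shallow = list(my_list)                        # new list, same elements
one_level = [copy.copy(e) for e in my_list]    # new elements, shared attributes
deep = copy.deepcopy(my_list)                  # everything copied recursively

print(shallow[0] is my_list[0])              # True: the element is shared
print(one_level[0] is my_list[0])            # False: the element was copied
print(one_level[0].tags is my_list[0].tags)  # True: its attribute is still shared
print(deep[0].tags is my_list[0].tags)       # False: the attribute was copied too
```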


回答 5

采用 thing[:]

>>> a = [1,2]
>>> b = a[:]
>>> a += [3]
>>> a
[1, 2, 3]
>>> b
[1, 2]
>>> 

Use thing[:]

>>> a = [1,2]
>>> b = a[:]
>>> a += [3]
>>> a
[1, 2, 3]
>>> b
[1, 2]
>>> 

回答 6

让我们从头开始,探讨这个问题。

因此,假设您有两个列表:

list_1=['01','98']
list_2=[['01','98']]

我们必须复制两个列表,现在从第一个列表开始:

因此,首先让我们尝试将变量设置为copy原始列表list_1

copy=list_1

现在,如果您正在考虑将副本复制到list_1,那么您错了。该id函数可以显示两个变量是否可以指向同一对象。让我们尝试一下:

print(id(copy))
print(id(list_1))

输出为:

4329485320
4329485320

这两个变量是完全相同的参数。你惊喜吗?

因此,我们知道python在变量中不存储任何内容,变量只是引用对象,而对象存储值。这里的对象是a,list但是我们通过两个不同的变量名称创建了对该对象的两个引用。这意味着两个变量都指向相同的对象,只是名称不同。

当您这样做时copy=list_1,它实际上是在做:

在此处输入图片说明

在图像list_1和副本中,这是两个变量名,但是两个变量的对象相同,即 list

因此,如果您尝试修改复制的列表,那么它也将修改原始列表,因为该列表仅存在于此列表中,无论您是从复制列表还是从原始列表进行操作,都将修改该列表:

copy[0]="modify"

print(copy)
print(list_1)

输出:

['modify', '98']
['modify', '98']

因此,它修改了原始列表:

现在,让我们进入用于复制列表的pythonic方法。

copy_1=list_1[:]

此方法解决了我们遇到的第一个问题:

print(id(copy_1))
print(id(list_1))

4338792136
4338791432

因此,如我们所见,两个列表都具有不同的ID,这意味着两个变量都指向不同的对象。所以这里实际发生的是:

在此处输入图片说明

现在,让我们尝试修改列表,看看我们是否仍然面临上一个问题:

copy_1[0]="modify"

print(list_1)
print(copy_1)

输出为:

['01', '98']
['modify', '98']

如您所见,它仅修改了复制的列表。这意味着它有效。

你认为我们完成了吗?否。让我们尝试复制嵌套列表。

copy_2=list_2[:]

list_2应该引用另一个对象,即的副本list_2。让我们检查:

print(id((list_2)),id(copy_2))

我们得到输出:

4330403592 4330403528

现在我们可以假设两个列表都指向不同的对象,所以现在让我们尝试对其进行修改,然后看看它在提供我们想要的东西:

copy_2[0][1]="modify"

print(list_2,copy_2)

这给了我们输出:

[['01', 'modify']] [['01', 'modify']]

这似乎有点令人困惑,因为我们以前使用的相同方法有效。让我们尝试理解这一点。

当您这样做时:

copy_2=list_2[:]

您只复制外部列表,而不复制内部列表。我们可以id再次使用该功能进行检查。

print(id(copy_2[0]))
print(id(list_2[0]))

输出为:

4329485832
4329485832

当我们这样做时copy_2=list_2[:],会发生以下情况:

在此处输入图片说明

它创建列表的副本,但仅创建外部列表副本,而不创建嵌套列表副本,两个变量的嵌套列表相同,因此,如果您尝试修改嵌套列表,则由于嵌套列表对象相同,它也会修改原始列表对于两个列表。

解决办法是什么?解决方案是deepcopy功能。

from copy import deepcopy
deep=deepcopy(list_2)

让我们检查一下:

print(id((list_2)),id(deep))

4322146056 4322148040

两个外部列表都有不同的ID,让我们在内部嵌套列表上尝试一下。

print(id(deep[0]))
print(id(list_2[0]))

输出为:

4322145992
4322145800

如您所见,两个ID不同,这意味着我们可以假设两个嵌套列表现在都指向不同的对象。

这意味着当您执行deep=deepcopy(list_2)实际操作时:

在此处输入图片说明

两个嵌套列表都指向不同的对象,并且它们现在具有单独的嵌套列表副本。

现在,让我们尝试修改嵌套列表,看看它是否解决了先前的问题:

deep[0][1]="modify"
print(list_2,deep)

它输出:

[['01', '98']] [['01', 'modify']]

如您所见,它没有修改原始的嵌套列表,只修改了复制的列表。

Let’s start from the beginning and explore this question.

So let’s suppose you have two lists:

list_1=['01','98']
list_2=[['01','98']]

We want to copy both lists, starting with the first one.

First, let's try assigning our original list, list_1, to the variable copy:

copy=list_1

If you think copy is now a copy of list_1, you are wrong. The id function can show us whether two variables point to the same object. Let's try it:

print(id(copy))
print(id(list_1))

The output is:

4329485320
4329485320

Both variables refer to the exact same object. Are you surprised?

As we know, Python doesn't store anything in a variable; variables just reference objects, and objects store the values. Here the object is a list, but we created two references to that same object under two different variable names. This means that both variables point to the same object, just with different names.

When you do copy=list_1, this is what actually happens:

[image not preserved]

In the image, list_1 and copy are two variable names, but the object is the same for both variables: a single list.

So if you try to modify the copied list, the original list is modified too, because there is only one list. You modify that same list whether you go through copy or through list_1:

copy[0]="modify"

print(copy)
print(list_1)

output:

['modify', '98']
['modify', '98']

So it modified the original list.

Now let's move on to a Pythonic method for copying lists.

copy_1=list_1[:]

This method fixes the first issue we had:

print(id(copy_1))
print(id(list_1))

4338792136
4338791432

As we can see, the two lists have different ids, which means that the two variables point to different objects. What is actually going on here is:

[image not preserved]

Now let's try to modify the list and see whether we still face the previous problem:

copy_1[0]="modify"

print(list_1)
print(copy_1)

The output is:

['01', '98']
['modify', '98']

As you can see, it only modified the copied list. That means it worked.

Do you think we’re done? No. Let’s try to copy our nested list.

copy_2=list_2[:]

copy_2 should reference another object, which is a copy of list_2. Let's check:

print(id((list_2)),id(copy_2))

We get the output:

4330403592 4330403528

Now we can assume that the two names point to different objects, so let's try to modify one and see whether it gives us what we want:

copy_2[0][1]="modify"

print(list_2,copy_2)

This gives us the output:

[['01', 'modify']] [['01', 'modify']]

This may seem a little bit confusing, because the same method we previously used worked. Let’s try to understand this.

When you do:

copy_2=list_2[:]

You’re only copying the outer list, not the inside list. We can use the id function once again to check this.

print(id(copy_2[0]))
print(id(list_2[0]))

The output is:

4329485832
4329485832

When we do copy_2=list_2[:], this happens:

[image not preserved]

It creates a copy of the list, but only of the outer list, not of the nested list. The nested list is the same object for both variables, so if you modify the nested list through the copy, the original list is modified too.

What is the solution? The solution is the deepcopy function.

from copy import deepcopy
deep=deepcopy(list_2)

Let’s check this:

print(id((list_2)),id(deep))

4322146056 4322148040

Both outer lists have different IDs. Let's try this on the inner nested lists.

print(id(deep[0]))
print(id(list_2[0]))

The output is:

4322145992
4322145800

As you can see, both IDs are different, meaning that the two nested lists are now different objects.

This means that when you do deep=deepcopy(list_2), what actually happens is:

[image not preserved]

The two nested lists are different objects, so each outer list now has its own separate copy of the nested list.

Now let's try to modify the nested list and see whether this solved the previous issue:

deep[0][1]="modify"
print(list_2,deep)

It outputs:

[['01', '98']] [['01', 'modify']]

As you can see, it didn’t modify the original nested list, it only modified the copied list.
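
The whole walkthrough can be condensed into one runnable sketch. It uses the is operator instead of comparing id() values by eye; the names alias and shallow are introduced here for illustration.

```python
from copy import deepcopy

list_2 = [['01', '98']]

alias = list_2            # another name for the same list
shallow = list_2[:]       # new outer list, shared inner list
deep = deepcopy(list_2)   # new outer list and new inner list

print(alias is list_2)          # True
print(shallow is list_2)        # False
print(shallow[0] is list_2[0])  # True
print(deep[0] is list_2[0])     # False

deep[0][1] = 'modify'
print(list_2)                   # [['01', '98']]

shallow[0][1] = 'modify'
print(list_2)                   # [['01', 'modify']]
```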


回答 7

Python这样做的习惯是 newList = oldList[:]

Python’s idiom for doing this is newList = oldList[:]


Answer 8

Python 3.6 Timings

Here are the timing results using Python 3.6.8. Keep in mind these times are relative to one another, not absolute.

I stuck to only doing shallow copies, and also added some new methods that weren’t possible in Python2, such as list.copy() (the Python3 slice equivalent) and two forms of list unpacking (*new_list, = list and new_list = [*list]):

METHOD                  TIME TAKEN
b = [*a]                2.75180600000021
b = a * 1               3.50215399999990
b = a[:]                3.78278899999986  # Python2 winner (see above)
b = a.copy()            4.20556500000020  # Python3 "slice equivalent" (see above)
b = []; b.extend(a)     4.68069800000012
b = a[0:len(a)]         6.84498999999959
*b, = a                 7.54031799999984
b = list(a)             7.75815899999997
b = [i for i in a]      18.4886440000000
b = copy.copy(a)        18.8254879999999
b = []
for item in a:
  b.append(item)        35.4729199999997

We can see the Python2 winner still does well, but doesn’t edge out Python3 list.copy() by much, especially considering the superior readability of the latter.

The dark horse is the unpacking and repacking method (b = [*a]), which is ~25% faster than raw slicing, and more than twice as fast as the other unpacking method (*b, = a).

b = a * 1 also does surprisingly well.

Note that these methods do not output equivalent results for any input other than lists. They all work for sliceable objects, a few work for any iterable, but only copy.copy() works for more general Python objects.
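To make the point about non-list inputs concrete, here is a small sketch using a tuple and a hypothetical Point class. Both are illustration-only assumptions and were not part of the timings above. It shows which expressions keep the type of the input and which always build a list.

```python
import copy

source = (1, 2, 3)  # a sliceable object that is not a list

# Slicing and repetition preserve the input type ...
slice_result = source[:]
repeat_result = source * 1

# ... while these always build a new list from any iterable.
list_result = list(source)
unpack_result = [*source]


class Point:
    """Hypothetical non-iterable object, used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


original = Point(1, 2)
# copy.copy() works although Point is neither sliceable nor iterable.
duplicate = copy.copy(original)

print(type(slice_result).__name__, type(list_result).__name__)
```

So when the input might not be a list, the choice of copy idiom also decides the type of the result.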


Here is the testing code for interested parties (Template from here):

import timeit

COUNT = 50000000
print("Array duplicating. Tests run", COUNT, "times")
setup = 'a = [0,1,2,3,4,5,6,7,8,9]; import copy'

print("b = list(a)\t\t", timeit.timeit(stmt='b = list(a)', setup=setup, number=COUNT))
print("b = copy.copy(a)\t", timeit.timeit(stmt='b = copy.copy(a)', setup=setup, number=COUNT))
print("b = a.copy()\t\t", timeit.timeit(stmt='b = a.copy()', setup=setup, number=COUNT))
print("b = a[:]\t\t", timeit.timeit(stmt='b = a[:]', setup=setup, number=COUNT))
print("b = a[0:len(a)]\t\t", timeit.timeit(stmt='b = a[0:len(a)]', setup=setup, number=COUNT))
print("*b, = a\t\t\t", timeit.timeit(stmt='*b, = a', setup=setup, number=COUNT))
print("b = []; b.extend(a)\t", timeit.timeit(stmt='b = []; b.extend(a)', setup=setup, number=COUNT))
print("b = []; for item in a: b.append(item)\t", timeit.timeit(stmt='b = []\nfor item in a:  b.append(item)', setup=setup, number=COUNT))
print("b = [i for i in a]\t", timeit.timeit(stmt='b = [i for i in a]', setup=setup, number=COUNT))
print("b = [*a]\t\t", timeit.timeit(stmt='b = [*a]', setup=setup, number=COUNT))
print("b = a * 1\t\t", timeit.timeit(stmt='b = a * 1', setup=setup, number=COUNT))

Answer 9

All of the other contributors gave great answers, which work when you have a single-dimension (one-level) list. However, of the methods mentioned so far, only copy.deepcopy() clones a list without leaving the copy pointing at the same nested list objects when you are working with multidimensional, nested lists (lists of lists). While Felix Kling refers to this in his answer, there is a little more to the issue, and possibly a workaround using built-ins that might prove a faster alternative to deepcopy.

While new_list = old_list[:], copy.copy(old_list), and, for Py3k, old_list.copy() work for single-level lists, they make only a shallow copy: the list objects nested within old_list and new_list are the same objects, so changes to one of those nested lists show up in the other.

Edit: New information brought to light

As was pointed out by both Aaron Hall and PM 2Ring, using eval() is not only a bad idea, it is also much slower than copy.deepcopy().

This means that for multidimensional lists, the only option is copy.deepcopy(). With that being said, it really isn’t an option as the performance goes way south when you try to use it on a moderately sized multidimensional array. I tried to timeit using a 42×42 array, not unheard of or even that large for bioinformatics applications, and I gave up on waiting for a response and just started typing my edit to this post.

It would seem that the only real option then is to initialize multiple lists and work on them independently. If anyone has any other suggestions for how to handle multidimensional list copying, they would be appreciated.
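One suggestion, offered as a sketch rather than a benchmarked result: when the data is known to be a list of lists whose innermost items are immutable (numbers or strings), copying each row with a slice gives independent rows without the generality of copy.deepcopy(). The function name copy_2d and the 42×42 grid of zeros are assumptions made for illustration.

```python
import copy


def copy_2d(matrix):
    """Copy a list of lists one row at a time.

    Only the two list levels are copied; this is sufficient when the
    innermost items are immutable (ints, floats, strings).
    """
    return [row[:] for row in matrix]


grid = [[0] * 42 for _ in range(42)]
fast = copy_2d(grid)        # row-wise copy
deep = copy.deepcopy(grid)  # general-purpose deep copy, for comparison

fast[0][0] = 1  # modify only the row-wise copy
print(grid[0][0], fast[0][0])
```

If the inner items are themselves mutable, this shortcut no longer produces a full copy, and copy.deepcopy() remains the general tool.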

As others have stated, there are significant performance issues using the copy module and copy.deepcopy for multidimensional lists.


Answer 10

It surprises me that this hasn’t been mentioned yet, so for the sake of completeness…

You can perform list unpacking with the “splat operator”, *, which builds a new list containing the same elements (a shallow copy).

old_list = [1, 2, 3]

new_list = [*old_list]

new_list.append(4)
old_list == [1, 2, 3]
new_list == [1, 2, 3, 4]

The obvious downside to this method is that it is only available in Python 3.5+.

Timing-wise, though, this appears to perform slightly better than other common methods.

import random

x = [random.random() for _ in range(1000)]

%timeit a = list(x)
%timeit a = x.copy()
%timeit a = x[:]

%timeit a = [*x]

#: 2.47 µs ± 38.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#: 2.47 µs ± 54.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#: 2.39 µs ± 58.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

#: 2.22 µs ± 43.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Answer 11

A very simple approach that is independent of the Python version was missing from the answers already given. You can use it most of the time (at least I do):

new_list = my_list * 1       #Solution 1 when you are not using nested lists

However, if my_list contains other containers (e.g. nested lists), you must use deepcopy from the copy library, as others suggested in the answers above. For example:

import copy
new_list = copy.deepcopy(my_list)   #Solution 2 when you are using nested lists

Bonus: if you don’t want to copy the elements themselves (a shallow copy), use:

new_list = my_list[:]

Let’s look at the difference between Solution #1 and Solution #2:

>>> a = list(range(5))
>>> b = a*1
>>> a,b
([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])
>>> a[2] = 55 
>>> a,b
([0, 1, 55, 3, 4], [0, 1, 2, 3, 4])

As you can see, Solution #1 worked perfectly when we were not using nested lists. Let’s check what happens when we apply Solution #1 to nested lists.

>>> from copy import deepcopy
>>> a = [list(range(i, i+4)) for i in range(3)]
>>> a
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
>>> b = a*1
>>> c = deepcopy(a)
>>> for i in (a, b, c): print(i)
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]
>>> a[2].append(99)
>>> for i in (a, b, c): print(i)
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5, 99]]
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5, 99]]   #Solution #1 didn't work in nested list
[[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]       #Solution #2 - DeepCopy worked in nested list

Answer 12

Note that if you have defined your own custom class and want to keep its attributes, you should use copy.copy() or copy.deepcopy() rather than the alternatives. For example, in Python 3:

import copy

class MyList(list):
    pass

lst = MyList([1,2,3])

lst.name = 'custom list'

d = {
'original': lst,
'slicecopy' : lst[:],
'lstcopy' : lst.copy(),
'copycopy': copy.copy(lst),
'deepcopy': copy.deepcopy(lst)
}


for k,v in d.items():
    print('lst: {}'.format(k), end=', ')
    try:
        name = v.name
    except AttributeError:
        name = 'NA'
    print('name: {}'.format(name))

Outputs:

lst: original, name: custom list
lst: slicecopy, name: NA
lst: lstcopy, name: NA
lst: copycopy, name: custom list
lst: deepcopy, name: custom list
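If the list-style spelling lst.copy() should keep the attributes as well, one possible approach is to override copy() in the subclass and delegate to copy.copy(). This is an assumption for illustration, not something the answer proposes, and NamedList is a hypothetical class name.

```python
import copy


class NamedList(list):
    """Hypothetical list subclass whose .copy() keeps instance attributes."""

    def copy(self):
        # copy.copy() rebuilds the object from its class, its items and its
        # __dict__, so the subclass type and extra attributes survive.
        return copy.copy(self)


lst = NamedList([1, 2, 3])
lst.name = 'custom list'

dup = lst.copy()
print(type(dup).__name__, dup, dup.name)
```

Slicing (lst[:]) would still return a plain list here; only the overridden method changes behaviour.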

Answer 13

new_list = my_list[:]

First, consider new_list = my_list. Say my_list refers to a list object at location X in heap memory. Assigning new_list = my_list makes new_list refer to that same object at X. No copy is made at all; both names are aliases for one list.

If you instead write new_list = my_list[:], a new list object is created and filled with references to the same elements as my_list. This is a shallow copy, not a deep copy: the outer list is new, but any nested objects are still shared.

Other ways you can do this are:

  • new_list = list(old_list) (another shallow copy)
  • import copy; new_list = copy.deepcopy(old_list) (a deep copy, which also copies nested objects)
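The three cases (plain assignment, slicing, and copy.deepcopy) can be told apart with a nested list. The sketch below uses illustrative variable names only.

```python
import copy

my_list = [[1, 2], [3, 4]]

alias = my_list                # no copy: a second name for the same list
shallow = my_list[:]           # new outer list, inner lists still shared
deep = copy.deepcopy(my_list)  # new outer list and new inner lists

my_list.append([5, 6])   # visible only through the alias
my_list[0].append(99)    # visible through the alias and the shallow copy

print(alias)    # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```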

Answer 14

I wanted to post something a bit different than some of the other answers. Even though this is most likely not the most understandable or fastest option, it provides a bit of an inside view of how deep copy works, as well as being another alternative for deep copying. It doesn’t really matter if my function has bugs, since the point is to show a way to copy objects as the question asks, and also to use it to explain how deepcopy works at its core.

At the core of any deep copy function is a way to make a shallow copy. How? Simple. Any deep copy function only duplicates the containers of immutable objects. When you deepcopy a nested list, you are only duplicating the lists, at every level, not the immutable objects inside them. You are only duplicating the containers. The same works for classes, too. When you deepcopy a class instance, you deepcopy all of its mutable attributes. So, how? How come you only have to copy the containers, like lists, dicts, tuples, iters, classes, and class instances?

It’s simple. An immutable object can’t really be duplicated. It can never be changed, so it is only a single value. That means you never have to duplicate strings, numbers, bools, or any of those. But how would you duplicate the containers? Simple. You just initialize a new container with all of the values. Deepcopy relies on recursion. It duplicates all the containers, even ones with containers inside of them, until no containers are left. A container is usually a mutable object or, like a tuple, may hold mutable objects.

Once you know that, completely duplicating an object without any shared references is pretty easy. Here’s a function for deepcopying basic data types (it wouldn’t work for custom classes, but you could always add that):

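def deepcopy(x):
  # Values that can be returned as-is because they cannot be mutated
  immutables = (str, int, bool, float, type(None))
  # Containers that must be rebuilt element by element
  containers = (list, dict, tuple)
  if isinstance(x, immutables):
    return x
  elif isinstance(x, containers):
    if isinstance(x, tuple):
      return tuple(deepcopy(y) for y in x)
    elif isinstance(x, list):
      return [deepcopy(y) for y in x]
    elif isinstance(x, dict):
      return {key: deepcopy(value) for key, value in x.items()}
  # Anything else (sets, custom classes, ...) is not handled by this sketch
  raise TypeError('unsupported type: {}'.format(type(x).__name__))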

Python’s own copy.deepcopy (from the standard-library copy module) works along the same lines as that example. The main differences are that it supports other types, supports user classes by duplicating their attributes into a new instance, and blocks infinite recursion by keeping references to the objects it has already seen in a memo dictionary. And that’s really it for making deep copies. At its core, making a deep copy is just making shallow copies. I hope this answer adds something to the question.
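The memo behaviour can be observed directly with copy.deepcopy from the standard library. This is a minimal sketch; the variable names are illustrative.

```python
import copy

cycle = [1, 2]
cycle.append(cycle)  # the list now contains itself

# Finishes instead of recursing forever, because already-copied
# objects are looked up in the memo rather than copied again.
clone = copy.deepcopy(cycle)

shared = [0]
pair = [shared, shared]  # the same inner list referenced twice
pair_clone = copy.deepcopy(pair)

print(clone[2] is clone, pair_clone[0] is pair_clone[1])
```

The copy of the self-referencing list refers to itself, not to the original, and an object referenced twice is copied only once.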

EXAMPLES

Say you have this list: [1, 2, 3]. The immutable numbers cannot be duplicated, but the outer layer can. You can duplicate it using a list comprehension: [x for x in [1, 2, 3]]

Now, imagine you have this list: [[1, 2], [3, 4], [5, 6]]. This time, you want to make a function, which uses recursion to deep copy all layers of the list. Instead of the previous list comprehension:

[x for x in _list]

It uses a new one for lists:

[deepcopy_list(x) for x in _list]

And deepcopy_list looks like this:

def deepcopy_list(x):
  if isinstance(x, (str, bool, float, int)):
    return x
  else:
    return [deepcopy_list(y) for y in x]

Now you have a function that can deepcopy any list of strs, bools, floats, ints and even lists, to arbitrarily many layers (up to Python’s recursion limit), using recursion. And there you have it, deepcopying.

TLDR: Deepcopy uses recursion to duplicate objects, and merely returns the same immutable objects as before, as immutable objects cannot be duplicated. However, it duplicates every mutable layer of an object, from the outermost container down to the innermost one.


Answer 15

A slightly more practical perspective: looking into memory through id and gc.

>>> b = a = ['hell', 'word']
>>> c = ['hell', 'word']

>>> id(a), id(b), id(c)
(4424020872, 4424020872, 4423979272) 
     |           |
      -----------

>>> id(a[0]), id(b[0]), id(c[0])
(4424018328, 4424018328, 4424018328) # all referring to same 'hell'
     |           |           |
      -----------------------

>>> id(a[0][0]), id(b[0][0]), id(c[0][0])
(4422785208, 4422785208, 4422785208) # all referring to same 'h'
     |           |           |
      -----------------------

>>> a[0] += 'o'
>>> a,b,c
(['hello', 'word'], ['hello', 'word'], ['hell', 'word'])  # b changed too
>>> id(a[0]), id(b[0]), id(c[0])
(4424018384, 4424018384, 4424018328) # augmented assignment changed a[0],b[0]
     |           |
      -----------

>>> b = a = ['hell', 'word']
>>> id(a[0]), id(b[0]), id(c[0])
(4424018328, 4424018328, 4424018328) # the same hell
     |           |           |
      -----------------------

>>> import gc
>>> gc.get_referrers(a[0]) 
[['hell', 'word'], ['hell', 'word']]  # one copy belong to a,b, the another for c
>>> gc.get_referrers(('hell'))
[['hell', 'word'], ['hell', 'word'], ('hell', None)] # ('hello', None) 

Answer 16

Remember that in Python when you do:

    list1 = ['apples','bananas','pineapples']
    list2 = list1

list2 doesn’t store a separate list; it refers to the same list object as list1. So any change made through list1 is visible through list2 as well. Use the copy module (part of the standard library, so nothing needs to be installed) to make an independent copy of the list: copy.copy() for simple lists, copy.deepcopy() for nested ones. The copy does not change when the first list changes.


Answer 17

The deepcopy option is the only method that works for me:

from copy import deepcopy

a = [   [ list(range(1, 3)) for i in range(3) ]   ]
b = deepcopy(a)
b[0][1]=[3]
print('Deep:')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ]   ]
b = a*1
b[0][1]=[3]
print('*1:')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ] ]
b = a[:]
b[0][1]=[3]
print('Vector copy:')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ]  ]
b = list(a)
b[0][1]=[3]
print('List copy:')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ]  ]
b = a.copy()
b[0][1]=[3]
print('.copy():')
print(a)
print(b)
print('-----------------------------')
a = [   [ list(range(1, 3)) for i in range(3) ]  ]
b = a  # plain assignment, no copy at all: b is the same object as a
b[0][1]=[3]
print('Shallow:')
print(a)
print(b)
print('-----------------------------')

leads to output of:

Deep:
[[[1, 2], [1, 2], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
*1:
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
Vector copy:
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
List copy:
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
.copy():
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------
Shallow:
[[[1, 2], [3], [1, 2]]]
[[[1, 2], [3], [1, 2]]]
-----------------------------

Accessing the index in 'for' loops?

Question: Accessing the index in 'for' loops?

How do I access the index in a for loop like the following?

ints = [8, 23, 45, 12, 78]
for i in ints:
    print('item #{} = {}'.format(???, i))

I want to get this output:

item #1 = 8
item #2 = 23
item #3 = 45
item #4 = 12
item #5 = 78

When I loop through it using a for loop, how do I access the loop index, from 1 to 5 in this case?


Answer 0

Using an additional state variable, such as an index variable (which you would normally use in languages such as C or PHP), is considered non-pythonic.

The better option is to use the built-in function enumerate(), available in both Python 2 and 3:

for idx, val in enumerate(ints):
    print(idx, val)

Check out PEP 279 for more.


Answer 1

Using a for loop, how do I access the loop index, from 1 to 5 in this case?

Use enumerate to get the index with the element as you iterate:

for index, item in enumerate(items):
    print(index, item)

And note that Python’s indexes start at zero, so you would get 0 to 4 with the above. If you want the count, 1 to 5, do this:

for count, item in enumerate(items, start=1):
    print(count, item)
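Applied to the list from the question, the start=1 form can be sketched as follows. The names lines, count and value are chosen here for illustration.

```python
ints = [8, 23, 45, 12, 78]

# enumerate(..., start=1) pairs each value with a 1-based count.
lines = ['item #{} = {}'.format(count, value)
         for count, value in enumerate(ints, start=1)]

print('\n'.join(lines))
```

This prints item #1 = 8 through item #5 = 78, the output the question asks for.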

Unidiomatic control flow

What you are asking for is the Pythonic equivalent of the following, which is the algorithm most programmers of lower-level languages would use:

index = 0            # Python's indexing starts at zero
for item in items:   # Python's for loops are a "for each" loop 
    print(index, item)
    index += 1

Or in languages that do not have a for-each loop:

index = 0
while index < len(items):
    print(index, items[index])
    index += 1

or sometimes more commonly (but unidiomatically) found in Python:

for index in range(len(items)):
    print(index, items[index])

Use the Enumerate Function

Python’s enumerate function reduces the visual clutter by hiding the accounting for the indexes, and encapsulating the iterable into another iterable (an enumerate object) that yields a two-item tuple of the index and the item that the original iterable would provide. That looks like this:

for index, item in enumerate(items, start=0):   # default is zero
    print(index, item)

This code sample is more or less the canonical example of the difference between code that is idiomatic Python and code that is not. Idiomatic code is sophisticated (but not complicated) Python, written in the way that it was intended to be used. Idiomatic code is expected by the designers of the language, which means that usually this code is not just more readable, but also more efficient.

Getting a count

Even if you don’t need indexes as you go but do need a count of the iterations (sometimes desirable), you can start with 1 and the final number will be your count.

for count, item in enumerate(items, start=1):   # default is zero
    print(item)

print('there were {0} items printed'.format(count))

The count seems to be more what you intend to ask for (as opposed to index) when you said you wanted from 1 to 5.
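One caveat with reading the count after the loop: if the iterable is empty, the loop body never runs, count is never assigned, and using it afterwards raises NameError. A minimal sketch of a guard follows; the function name count_items is made up for illustration.

```python
def count_items(items):
    """Print each item and return how many were seen (0 if there were none)."""
    count = 0  # pre-initialised so the name exists even if the loop never runs
    for count, item in enumerate(items, start=1):
        print(item)
    return count


print('there were {0} items printed'.format(count_items(['a', 'b', 'c'])))
print('there were {0} items printed'.format(count_items([])))
```

Because it relies only on iteration, the same function also works for generators, where len() is not available.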


Breaking it down – a step by step explanation

To break these examples down, say we have a list of items that we want to iterate over with an index:

items = ['a', 'b', 'c', 'd', 'e']

Now we pass this iterable to enumerate, creating an enumerate object:

enumerate_object = enumerate(items) # the enumerate object

We can pull the first item out of this iterable that we would get in a loop with the next function:

iteration = next(enumerate_object) # first iteration from enumerate
print(iteration)

And we see we get a tuple of 0, the first index, and 'a', the first item:

(0, 'a')

We can use what is referred to as “sequence unpacking” to extract the elements from this two-tuple:

index, item = iteration
#   0,  'a' = (0, 'a') # essentially this.

When we inspect index, we find it refers to the first index, 0, and item refers to the first item, 'a'.

>>> print(index)
0
>>> print(item)
a

Conclusion

  • Python indexes start at zero
  • To get these indexes from an iterable as you iterate over it, use the enumerate function
  • Using enumerate in the idiomatic way (along with tuple unpacking) creates code that is more readable and maintainable:

So do this:

for index, item in enumerate(items, start=0):   # Python indexes start at zero
    print(index, item)

Answer 2

It’s pretty simple to start it from 1 rather than 0:

for index, item in enumerate(iterable, start=1):
    print(index, item)

Note

Unpack into two names, as above. With a single name (for index in enumerate(iterable, start=1)), the name index would be a little misleading, since it would hold a tuple (idx, item).


Answer 3

for i in range(len(ints)):
    print(i, ints[i])

Answer 4

As is the norm in Python there are several ways to do this. In all examples assume: lst = [1, 2, 3, 4, 5]

1. Using enumerate (considered most idiomatic)

for index, element in enumerate(lst):
    print(index, element)  # do the things that need doing here

This is also the safest option in my opinion because the chance of going into an infinite loop has been eliminated. Both the item and its index are held in variables and there is no need to write any further code to access the item.

2. Creating a variable to hold the index (using for)

for index in range(len(lst)):   # or xrange in Python 2
    element = lst[index]        # extra code is needed to get the element

3. Creating a variable to hold the index (using while)

index = 0
while index < len(lst):
    element = lst[index]  # extra code is needed to get the element
    index += 1  # without this increment the loop never terminates

4. There is always another way

As explained before, there are other ways to do this that have not been covered here, and they may be a better fit in other situations. For example, itertools.chain combined with for can handle nested loops better than the other examples.
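As a rough illustration of the itertools.chain idea mentioned above, the sketch below numbers the items of two lists as if they were one sequence. The second list (extra) and the helper name indexed_chain are made up for this example and are not part of the answer.

```python
from itertools import chain


def indexed_chain(*iterables):
    """Return (index, element) pairs across several iterables, numbered continuously."""
    return list(enumerate(chain(*iterables)))


lst = [1, 2, 3, 4, 5]
extra = [10, 20]  # hypothetical second list, only for illustration

for index, element in indexed_chain(lst, extra):
    print(index, element)
```

The index keeps counting across the boundary between the two lists, which is awkward to get right with two separate loops.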


Answer 5

Old fashioned way:

for ix in range(len(ints)):
    print(ints[ix])

List comprehension:

[ (ix, ints[ix]) for ix in range(len(ints))]

>>> ints
[1, 2, 3, 4, 5]
>>> for ix in range(len(ints)): print(ints[ix])
... 
1
2
3
4
5
>>> [ (ix, ints[ix]) for ix in range(len(ints))]
[(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
>>> lc = [ (ix, ints[ix]) for ix in range(len(ints))]
>>> for tup in lc:
...     print(tup)
... 
(0, 1)
(1, 2)
(2, 3)
(3, 4)
(4, 5)
>>> 

Answer 6

In Python 2.7, the fastest way to access list indexes inside a loop is the range approach for small lists and the enumerate approach for medium and large lists.

The code samples below show several ways to iterate over a list while tracking the index, followed by their performance measurements:

from timeit import timeit

# Using range
def range_loop(iterable):
    for i in range(len(iterable)):
        1 + iterable[i]

# Using xrange
def xrange_loop(iterable):
    for i in xrange(len(iterable)):
        1 + iterable[i]

# Using enumerate
def enumerate_loop(iterable):
    for i, val in enumerate(iterable):
        1 + val

# Manual indexing
def manual_indexing_loop(iterable):
    index = 0
    for item in iterable:
        1 + item
        index += 1

See performance metrics for each method below:

from timeit import timeit

def measure(l, number=10000):
    print "Measure speed for list with %d items" % len(l)
    print "xrange: ", timeit(lambda :xrange_loop(l), number=number)
    print "range: ", timeit(lambda :range_loop(l), number=number)
    print "enumerate: ", timeit(lambda :enumerate_loop(l), number=number)
    print "manual_indexing: ", timeit(lambda :manual_indexing_loop(l), number=number)

measure(range(1000))
# Measure speed for list with 1000 items
# xrange:  0.758321046829
# range:  0.701184988022
# enumerate:  0.724966049194
# manual_indexing:  0.894635915756

measure(range(100000))
# Measure speed for list with 100000 items
# xrange:  81.4756360054
# range:  75.0172479153
# enumerate:  74.687623024
# manual_indexing:  91.6308541298

measure(range(10000000), number=100)
# Measure speed for list with 10000000 items
# xrange:  82.267786026
# range:  84.0493988991
# enumerate:  78.0344707966
# manual_indexing:  95.0491430759

As a result, the range approach is the fastest for lists of up to 1000 items. For lists with more than 10 000 items, enumerate is the winner.
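The measurements above are for Python 2.7. For readers on Python 3, where xrange no longer exists and range is lazy, a comparable check could be sketched roughly as follows. The function names and the dictionary-returning measure helper are assumptions for this example, and no timings are quoted because they depend on the machine and interpreter.

```python
from timeit import timeit


def range_loop(items):
    """Sum the items by indexing through range(len(items))."""
    total = 0
    for i in range(len(items)):
        total += items[i]
    return total


def enumerate_loop(items):
    """Sum the items while enumerate supplies the index."""
    total = 0
    for i, value in enumerate(items):
        total += value
    return total


def measure(items, number=1000):
    """Return the elapsed seconds for each approach, keyed by name."""
    return {
        "range": timeit(lambda: range_loop(items), number=number),
        "enumerate": timeit(lambda: enumerate_loop(items), number=number),
    }


if __name__ == "__main__":
    for size in (1000, 100000):
        print(size, measure(list(range(size)), number=100))
```

Running it on your own interpreter is the only reliable way to see which approach wins for your list sizes.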


Answer 7

First of all, the indexes will be from 0 to 4. Python, like most programming languages, starts counting from 0; don’t forget that or you will run into an index out of range error (IndexError). All you need in the for loop is a variable counting from 0 to 4 like so:

for x in range(0, 5):

Keep in mind that I wrote 0 to 5 because the loop stops one number before the max. :)

To get the value of an index use

list[index]

Answer 8

Here’s what you get when you’re accessing index in for loops:

for i in enumerate(items): print(i)

items = [8, 23, 45, 12, 78]

for i in enumerate(items):
    print("index/value", i)

Result:

# index/value (0, 8)
# index/value (1, 23)
# index/value (2, 45)
# index/value (3, 12)
# index/value (4, 78)

for i, val in enumerate(items): print(i, val)

items = [8, 23, 45, 12, 78]

for i, val in enumerate(items):
    print("index", i, "for value", val)

Result:

# index 0 for value 8
# index 1 for value 23
# index 2 for value 45
# index 3 for value 12
# index 4 for value 78

for i, val in enumerate(items): print(i)

items = [8, 23, 45, 12, 78]

for i, val in enumerate(items):
    print("index", i)

Result:

# index 0
# index 1
# index 2
# index 3
# index 4

Answer 9

According to this discussion: http://bytes.com/topic/python/answers/464012-objects-list-index

Loop counter iteration

The current idiom for looping over the indices makes use of the built-in range function:

for i in range(len(sequence)):
    # work with index i

Looping over both elements and indices can be achieved either by the old idiom or by using the new zip built-in function:

for i in range(len(sequence)):
    e = sequence[i]
    # work with index i and element e

or

for i, e in zip(range(len(sequence)), sequence):
    # work with index i and element e

via http://www.python.org/dev/peps/pep-0212/
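The zip idiom quoted above predates enumerate, which was added in Python 2.3 (PEP 279) and yields the same pairs. A small sketch, using a made-up sequence, that compares the two:

```python
sequence = ["a", "b", "c"]  # hypothetical data, only for illustration

# Idiom quoted from PEP 212: pair a range of indices with the elements.
pairs_zip = list(zip(range(len(sequence)), sequence))

# The later built-in produces the same (index, element) pairs directly.
pairs_enumerate = list(enumerate(sequence))

print(pairs_zip)
print(pairs_enumerate)
```

Both lists hold identical tuples, so enumerate can be used wherever the zip(range(len(...)), ...) form appears.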


Answer 10

You can do it with this code:

ints = [8, 23, 45, 12, 78]
index = 0

for value in ints:
    index += 1  # incremented before printing, so counting starts at 1
    print(index, value)

Use this code if you need to reset the index value at the end of the loop:

ints = [8, 23, 45, 12, 78]
index = 0

for value in ints:
    index += 1
    print(index, value)
    if index >= len(ints):  # reset only after the last item has been printed
        index = 0

Answer 11

The best solution for this problem is to use the built-in enumerate function.
enumerate yields tuples:
the first value is the index,
the second value is the element of the list at that index.

In [1]: ints = [8, 23, 45, 12, 78]

In [2]: for idx, val in enumerate(ints):
   ...:         print(idx, val)
   ...:     
0 8
1 23
2 45
3 12
4 78

Answer 12

In your question, you write “how do I access the loop index, from 1 to 5 in this case?”

However, the index for a list runs from zero. So, then we need to know if what you actually want is the index and item for each item in a list, or whether you really want numbers starting from 1. Fortunately, in Python, it is easy to do either or both.

First, to clarify, the enumerate function iteratively returns the index and corresponding item for each item in a list.

alist = [1, 2, 3, 4, 5]

for n, a in enumerate(alist):
    print("%d %d" % (n, a))

The output for the above is then,

0 1
1 2
2 3
3 4
4 5

Notice that the index runs from 0. This kind of indexing is common among modern programming languages including Python and C.

If you want your loop to span a part of the list, you can use the standard Python syntax for a part of the list. For example, to loop from the second item in a list up to but not including the last item, you could use

for n, a in enumerate(alist[1:-1]):
    print("%d %d" % (n, a))

Note that once again, the output index runs from 0,

0 2
1 3
2 4

That brings us to the start=n argument of enumerate(). It simply offsets the index; equivalently, you could add a number to the index inside the loop.

for n, a in enumerate(alist, start=1):
    print("%d %d" % (n, a))

for which the output is

1 1
2 2
3 3
4 4
5 5
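The slice and the offset discussed above can be combined when the positions in the original list are wanted rather than positions within the slice. The sketch below is one way to do that; enumerate_slice is a hypothetical helper name, and it relies on the fact that a Python 3 range object can be sliced with the same bounds as the list.

```python
def enumerate_slice(seq, begin, end):
    """Pair each item of seq[begin:end] with its index in the original seq."""
    # Slicing the range with the same bounds yields the matching original indices,
    # including when begin or end is negative.
    indices = range(len(seq))[begin:end]
    return list(zip(indices, seq[begin:end]))


alist = [1, 2, 3, 4, 5]

for n, a in enumerate_slice(alist, 1, -1):
    print("%d %d" % (n, a))
```

For a fixed non-negative offset, enumerate(alist[1:-1], start=1) gives the same pairs; the helper only saves working out the offset by hand.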

Answer 13

If I were to iterate nums = [1, 2, 3, 4, 5] I would do

for i, num in enumerate(nums, start=1):
    print(i, num)

Or get the length as l = len(nums)

for i in range(l):
    print(i+1, nums[i])

Answer 14

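If there are no duplicate values in the list (note that ints.index(i) rescans the list from the start on every iteration, so this gets slow for long lists):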

for i in ints:
    indx = ints.index(i)
    print(i, indx)
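To see why the no-duplicates condition matters, here is a small sketch with a made-up list that contains a repeated value. list.index always returns the position of the first match, whereas enumerate reports the actual position.

```python
ints = [8, 23, 8, 12]  # hypothetical list with a repeated value

# index() finds the first occurrence, so the second 8 is reported at position 0.
via_index = [(value, ints.index(value)) for value in ints]

# enumerate tracks the real position of every element.
via_enumerate = [(value, position) for position, value in enumerate(ints)]

print(via_index)
print(via_enumerate)
```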

Answer 15

You can also try this:

data = ['itemA.ABC', 'itemB.defg', 'itemC.drug', 'itemD.ashok']
x = []
for (i, item) in enumerate(data):
    a = (i, str(item).split('.'))
    x.append(a)
for index, value in x:
    print(index, value)

The output is

0 ['itemA', 'ABC']
1 ['itemB', 'defg']
2 ['itemC', 'drug']
3 ['itemD', 'ashok']

Answer 16

You can use the index method

ints = [8, 23, 45, 12, 78]
inds = [ints.index(i) for i in ints]

EDIT: As pointed out in the comments, this method doesn’t work if there are duplicates in ints. The method below works for any values in ints:

ints = [8, 8, 8, 23, 45, 12, 78]
inds = [tup[0] for tup in enumerate(ints)]

Or alternatively

ints = [8, 8, 8, 23, 45, 12, 78]
inds = [tup for tup in enumerate(ints)]

if you want to get both the index and the value in ints as a list of tuples.

It uses enumerate, as in the selected answer to this question, but inside a list comprehension, which makes the code shorter.


Answer 17

Simple answer using While Loop:

arr = [8, 23, 45, 12, 78]
i = 0
while i<len(arr):
    print("Item ",i+1," = ",arr[i])
    i +=1

Output:

Item  1  =  8
Item  2  =  23
Item  3  =  45
Item  4  =  12
Item  5  =  78


Answer 18

To print a list of (index, value) tuples using a list comprehension:

ints = [8, 23, 45, 12, 78]
print([(i, ints[i]) for i in range(len(ints))])

Output:

[(0, 8), (1, 23), (2, 45), (3, 12), (4, 78)]

Answer 19

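This serves the purpose well enough, provided the list contains no duplicate values (list.index returns the position of the first match):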

list1 = [10, 'sumit', 43.21, 'kumar', '43', 'test', 3]
for x in list1:
    print('index:', list1.index(x), 'value:', x)

How to make a flat list out of a list of lists?

Question: How to make a flat list out of a list of lists?

I wonder whether there is a shortcut to make a simple list out of list of lists in Python.

I can do that in a for loop, but maybe there is some cool “one-liner”? I tried it with reduce(), but I get an error.

Code

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
reduce(lambda x, y: x.extend(y), l)

Error message

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
AttributeError: 'NoneType' object has no attribute 'extend'

Answer 0

Given a list of lists l,

flat_list = [item for sublist in l for item in sublist]

which means:

flat_list = []
for sublist in l:
    for item in sublist:
        flat_list.append(item)

is faster than the shortcuts posted so far. (l is the list to flatten.)

Here is the corresponding function:

flatten = lambda l: [item for sublist in l for item in sublist]

As evidence, you can use the timeit module in the standard library:

$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' '[item for sublist in l for item in sublist]'
10000 loops, best of 3: 143 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'sum(l, [])'
1000 loops, best of 3: 969 usec per loop
$ python -mtimeit -s'l=[[1,2,3],[4,5,6], [7], [8,9]]*99' 'reduce(lambda x,y: x+y,l)'
1000 loops, best of 3: 1.1 msec per loop

Explanation: the shortcuts based on + (including the implied use in sum) are, of necessity, O(L**2) when there are L sublists — as the intermediate result list keeps getting longer, at each step a new intermediate result list object gets allocated, and all the items in the previous intermediate result must be copied over (as well as a few new ones added at the end). So, for simplicity and without actual loss of generality, say you have L sublists of I items each: the first I items are copied back and forth L-1 times, the second I items L-2 times, and so on; total number of copies is I times the sum of x for x from 1 to L excluded, i.e., I * (L**2)/2.

The list comprehension just generates one list, once, and copies each item over (from its original place of residence to the result list) also exactly once.
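The quadratic growth described above can be made concrete with a simple counting model. The sketch below does not time anything; it only counts how many list slots get written when L sublists of I items each are joined one at a time with +, compared with a single pass. The function names are made up for this illustration, and the model counts the writes into every intermediate list, so its total is I * L * (L + 1) / 2, the same order of magnitude as the I * (L**2)/2 estimate in the answer.

```python
def writes_with_plus(num_sublists, items_each):
    """Count element writes when sublists are joined one by one with +."""
    writes = 0
    result_len = 0
    for _ in range(num_sublists):
        result_len += items_each  # x + y allocates a new list of this length
        writes += result_len      # and every slot of that new list is filled
    return writes


def writes_with_comprehension(num_sublists, items_each):
    """A single pass writes each element exactly once."""
    return num_sublists * items_each


for L in (10, 100, 1000):
    print(L, writes_with_plus(L, 3), writes_with_comprehension(L, 3))
```

Multiplying the number of sublists by ten multiplies the first count by roughly a hundred, while the second count only grows tenfold.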


Answer 1

You can use itertools.chain():

import itertools
list2d = [[1,2,3], [4,5,6], [7], [8,9]]
merged = list(itertools.chain(*list2d))

Or you can use itertools.chain.from_iterable() which doesn’t require unpacking the list with the * operator:

import itertools
list2d = [[1,2,3], [4,5,6], [7], [8,9]]
merged = list(itertools.chain.from_iterable(list2d))

Answer 2

Note from the author: This is inefficient. But fun, because monoids are awesome. It’s not appropriate for production Python code.

>>> sum(l, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]

This just sums the elements of the iterable passed as the first argument, treating the second argument as the initial value of the sum (if it is not given, 0 is used instead, which in this case raises an error).

Because you are summing nested lists, you actually get [1,3]+[2,4] as a result of sum([[1,3],[2,4]],[]), which is equal to [1,3,2,4].

Note that this only works on lists of lists. For lists of lists of lists, you’ll need another solution.
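A short sketch of that limitation, using made-up inputs: sum with a list start value removes exactly one level of nesting, and it raises TypeError as soon as an element is not a list. The helper name flatten_once is an assumption for this example.

```python
def flatten_once(list_of_lists):
    """Concatenate the sublists with sum(); removes one nesting level only."""
    return sum(list_of_lists, [])


nested_twice = [[1, [2, 3]], [4, [5]]]  # hypothetical input, two levels deep
print(flatten_once(nested_twice))       # the inner lists survive untouched

try:
    flatten_once([[1, 2], 3])           # 3 is not a list, so list + int fails
except TypeError as exc:
    print("TypeError:", exc)
```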


Answer 3

I tested most suggested solutions with perfplot (a pet project of mine, essentially a wrapper around timeit), and found

functools.reduce(operator.iconcat, a, [])

to be the fastest solution, both when many small lists and few long lists are concatenated. (operator.iadd is equally fast.)

[Figures: perfplot run-time comparisons of the kernels listed below, for many short lists and for few long lists]


Code to reproduce the plot:

import functools
import itertools
import numpy
import operator
import perfplot


def forfor(a):
    return [item for sublist in a for item in sublist]


def sum_brackets(a):
    return sum(a, [])


def functools_reduce(a):
    return functools.reduce(operator.concat, a)


def functools_reduce_iconcat(a):
    return functools.reduce(operator.iconcat, a, [])


def itertools_chain(a):
    return list(itertools.chain.from_iterable(a))


def numpy_flat(a):
    return list(numpy.array(a).flat)


def numpy_concatenate(a):
    return list(numpy.concatenate(a))


perfplot.show(
    setup=lambda n: [list(range(10))] * n,
    # setup=lambda n: [list(range(n))] * 10,
    kernels=[
        forfor,
        sum_brackets,
        functools_reduce,
        functools_reduce_iconcat,
        itertools_chain,
        numpy_flat,
        numpy_concatenate,
    ],
    n_range=[2 ** k for k in range(16)],
    xlabel="num lists (of length 10)",
    # xlabel="len lists (10 lists total)"
)

Answer 4

from functools import reduce #python 3

>>> l = [[1,2,3],[4,5,6], [7], [8,9]]
>>> reduce(lambda x,y: x+y,l)
[1, 2, 3, 4, 5, 6, 7, 8, 9]

The extend() method in your example modifies x instead of returning a useful value (which reduce() expects).

A faster way to do the reduce version would be

>>> import operator
>>> l = [[1,2,3],[4,5,6], [7], [8,9]]
>>> reduce(operator.concat, l)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
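To make the point about extend() concrete: list.extend mutates the list and returns None, so the lambda in the question hands None to the next reduce step. If an extend-based version is wanted anyway, the callback has to return the accumulator explicitly. The sketch below shows one way; extend_and_return is a hypothetical helper name, and the empty-list initial value is added so that the first sublist of l is not modified.

```python
from functools import reduce


def extend_and_return(accumulator, sublist):
    """Extend the accumulator in place and hand it back to reduce()."""
    accumulator.extend(sublist)
    return accumulator


l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

print([].extend([1]))  # extend() itself returns None

# Start from a fresh list so the sublists in l stay unchanged.
flat = reduce(extend_and_return, l, [])
print(flat)
```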

Answer 5

Don’t reinvent the wheel if you use Django:

>>> from django.contrib.admin.utils import flatten
>>> l = [[1,2,3], [4,5], [6]]
>>> flatten(l)
[1, 2, 3, 4, 5, 6]

Pandas:

>>> from pandas.core.common import flatten
>>> list(flatten(l))

Itertools:

>>> import itertools
>>> flatten = itertools.chain.from_iterable
>>> list(flatten(l))

Matplotlib:

>>> from matplotlib.cbook import flatten
>>> list(flatten(l))

Unipath:

>>> from unipath.path import flatten
>>> list(flatten(l))

Setuptools:

>>> from setuptools.namespaces import flatten
>>> list(flatten(l))

Answer 6

Here is a general approach that applies to numbers, strings, nested lists and mixed containers.

Code

from collections.abc import Iterable                        # Python 3.3+
# from collections import Iterable                          # Python 2


def flatten(items):
    """Yield items from any nested iterable; see Reference."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            for sub_x in flatten(x):
                yield sub_x
        else:
            yield x

Notes:

  • In Python 3.3 and later, yield from flatten(x) can replace for sub_x in flatten(x): yield sub_x
  • Abstract base classes such as Iterable have lived in collections.abc since Python 3.3; the old aliases in collections were deprecated and removed in Python 3.10.

Demo

lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(flatten(lst))                                         # nested lists
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

mixed = [[1, [2]], (3, 4, {5, 6}, 7), 8, "9"]              # numbers, strs, nested & mixed
list(flatten(mixed))
# [1, 2, 3, 4, 5, 6, 7, 8, '9']
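One caveat of the recursive generator is Python's recursion limit: input nested a few thousand levels deep can raise RecursionError. A possible workaround, sketched here as a variation and not taken from the cookbook recipe, keeps an explicit stack of iterators instead of recursing. The name flatten_iterative is made up for this example.

```python
from collections.abc import Iterable


def flatten_iterative(items):
    """Yield items from any nested iterable without using recursion."""
    stack = [iter(items)]
    while stack:
        for x in stack[-1]:
            if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
                stack.append(iter(x))  # descend into the nested iterable
                break
            yield x
        else:
            stack.pop()  # the innermost iterator is exhausted; resume its parent


mixed = [[1, [2]], (3, 4, [5, 6], 7), 8, "9"]
print(list(flatten_iterative(mixed)))
```

Because each iterator on the stack remembers its position, the parent resumes exactly where it left off once a nested iterable has been consumed.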

Reference

  • This solution is modified from a recipe in Beazley, D. and B. Jones. Recipe 4.14, Python Cookbook 3rd Ed., O’Reilly Media Inc. Sebastopol, CA: 2013.
  • Found an earlier SO post, possibly the original demonstration.

Answer 7

If you want to flatten a data structure where you don’t know how deeply it’s nested, you could use iteration_utilities.deepflatten [1]:

>>> from iteration_utilities import deepflatten

>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(deepflatten(l, depth=1))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

>>> l = [[1, 2, 3], [4, [5, 6]], 7, [8, 9]]
>>> list(deepflatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

It’s a generator so you need to cast the result to a list or explicitly iterate over it.


To flatten only one level and if each of the items is itself iterable you can also use iteration_utilities.flatten which itself is just a thin wrapper around itertools.chain.from_iterable:

>>> from iteration_utilities import flatten
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> list(flatten(l))
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Just to add some timings (based on Nico Schlömer’s answer, which didn’t include the function presented in this answer):

[Figure: benchmark plot of run time against the number of inner lists]

It’s a log-log plot to accommodate for the huge range of values spanned. For qualitative reasoning: Lower is better.

The results show that if the iterable contains only a few inner iterables then sum is fastest; for long iterables, however, only itertools.chain.from_iterable, iteration_utilities.deepflatten and the nested comprehension have reasonable performance, with itertools.chain.from_iterable being the fastest (as already noticed by Nico Schlömer).

from itertools import chain
from functools import reduce
from collections.abc import Iterable  # collections.Iterable was removed in Python 3.10
import operator
from iteration_utilities import deepflatten

def nested_list_comprehension(lsts):
    return [item for sublist in lsts for item in sublist]

def itertools_chain_from_iterable(lsts):
    return list(chain.from_iterable(lsts))

def pythons_sum(lsts):
    return sum(lsts, [])

def reduce_add(lsts):
    return reduce(lambda x, y: x + y, lsts)

def pylangs_flatten(lsts):
    return list(flatten(lsts))

def flatten(items):
    """Yield items from any nested iterable; see REF."""
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, (str, bytes)):
            yield from flatten(x)
        else:
            yield x

def reduce_concat(lsts):
    return reduce(operator.concat, lsts)

def iteration_utilities_deepflatten(lsts):
    return list(deepflatten(lsts, depth=1))


from simple_benchmark import benchmark

b = benchmark(
    [nested_list_comprehension, itertools_chain_from_iterable, pythons_sum, reduce_add,
     pylangs_flatten, reduce_concat, iteration_utilities_deepflatten],
    arguments={2**i: [[0]*5]*(2**i) for i in range(1, 13)},
    argument_name='number of inner lists'
)

b.plot()

[1] Disclaimer: I’m the author of that library.


Answer 8

I take my statement back: sum is not the winner. It is faster when the list is small, but its performance degrades significantly with larger lists.

>>> timeit.Timer(
        '[item for sublist in l for item in sublist]',
        'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10000'
    ).timeit(100)
2.0440959930419922

The sum version had been running for more than a minute and still had not finished!

For medium lists:

>>> timeit.Timer(
        '[item for sublist in l for item in sublist]',
        'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10'
    ).timeit()
20.126545906066895
>>> timeit.Timer(
        'reduce(lambda x,y: x+y,l)',
        'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10'
    ).timeit()
22.242258071899414
>>> timeit.Timer(
        'sum(l, [])',
        'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]] * 10'
    ).timeit()
16.449732065200806

Using small lists and timeit: number=1000000

>>> timeit.Timer(
        '[item for sublist in l for item in sublist]',
        'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]]'
    ).timeit()
2.4598159790039062
>>> timeit.Timer(
        'reduce(lambda x,y: x+y,l)',
        'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]]'
    ).timeit()
1.5289170742034912
>>> timeit.Timer(
        'sum(l, [])',
        'l=[[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]]'
    ).timeit()
1.0598428249359131
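
The slowdown is consistent with sum(l, []) building a brand-new list at every step, so the total work grows roughly quadratically with the number of sublists. Here is a minimal sketch for reproducing the trend yourself. The function names, sizes and repeat count are my own arbitrary choices, and absolute timings will differ by machine:

```python
import timeit

def flatten_comprehension(lists):
    return [item for sublist in lists for item in sublist]

def flatten_sum(lists):
    # Every "+" copies the accumulated list, so each step costs more than the last.
    return sum(lists, [])

base = [[1, 2, 3], [4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7]]

for factor in (10, 100, 1000):
    data = base * factor
    t_comp = timeit.timeit(lambda: flatten_comprehension(data), number=5)
    t_sum = timeit.timeit(lambda: flatten_sum(data), number=5)
    print(f"{len(data):>5} sublists: comprehension {t_comp:.4f}s, sum {t_sum:.4f}s")
```

Both functions return the same list; only the running time diverges as the input grows.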

Answer 9

There seems to be a confusion with operator.add! When you add two lists together, the correct term for that is concat, not add. operator.concat is what you need to use.

If you’re thinking functional, it is as easy as this:

>>> import operator
>>> from functools import reduce
>>> list2d = ((1, 2, 3), (4, 5, 6), (7,), (8, 9))
>>> reduce(operator.concat, list2d)
(1, 2, 3, 4, 5, 6, 7, 8, 9)

You see reduce respects the sequence type, so when you supply a tuple, you get back a tuple. Let’s try with a list:

>>> list2d = [[1, 2, 3],[4, 5, 6], [7], [8, 9]]
>>> reduce(operator.concat, list2d)
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Aha, you get back a list.

How about performance:

>>> import itertools
>>> list2d = [[1, 2, 3],[4, 5, 6], [7], [8, 9]]
>>> %timeit list(itertools.chain.from_iterable(list2d))
1000000 loops, best of 3: 1.36 µs per loop

from_iterable is pretty fast! But on this small input it is no match for reduce with concat:

>>> list2d = ((1, 2, 3),(4, 5, 6), (7,), (8, 9))
>>> %timeit reduce(operator.concat, list2d)
1000000 loops, best of 3: 492 ns per loop
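
One caveat: reduce with operator.concat creates a new sequence at every step, so the small-input timing above may not carry over to long inputs. A sketch of an in-place variant follows. Using operator.iconcat with a fresh [] as the initial value is my own suggestion, not part of the answer above:

```python
from functools import reduce
import operator

list2d = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]

# concat returns a new list each time; the inputs are left untouched.
flat_concat = reduce(operator.concat, list2d)

# iconcat extends the accumulator in place (like +=). Starting from a fresh []
# keeps the first sublist of list2d from being modified.
flat_iconcat = reduce(operator.iconcat, list2d, [])

print(flat_concat)
print(flat_iconcat)
```

Note that the iconcat variant always returns a list, so it does not preserve tuples the way concat does.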

Answer 10

Why do you use extend?

reduce(lambda x, y: x+y, l)

This should work fine.


Answer 11

Consider installing the more_itertools package.

> pip install more_itertools

It ships with an implementation for flatten (source, from the itertools recipes):

import more_itertools


lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.flatten(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

As of version 2.4, you can flatten more complicated, nested iterables with more_itertools.collapse (source, contributed by abarnet).

lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
list(more_itertools.collapse(lst)) 
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

lst = [[1, 2, 3], [[4, 5, 6]], [[[7]]], 8, 9]              # complex nesting
list(more_itertools.collapse(lst))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

Answer 12

Your function didn’t work because extend modifies the list in place and returns None rather than the list. You can still return x from the lambda, using something like this:

from functools import reduce  # reduce is a builtin only on Python 2
reduce(lambda x,y: x.extend(y) or x, l)

Note: extend is more efficient than + on lists.
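
One thing to watch for: without an initial value, reduce uses the first sublist as the accumulator, so extend modifies it in place. A small sketch of the effect and one possible way around it. The variable names are illustrative only:

```python
from functools import reduce

l = [[1, 2], [3], [4, 5]]

# No initial value: l[0] itself becomes the accumulator and grows.
flat = reduce(lambda x, y: x.extend(y) or x, l)
print(flat)
print(l[0] is flat)  # the result is the very same object as the first sublist

# Passing a fresh list as the initial value leaves the input untouched.
l2 = [[1, 2], [3], [4, 5]]
flat2 = reduce(lambda x, y: x.extend(y) or x, l2, [])
print(flat2, l2[0])
```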


Answer 13

def flatten(l, a):
    for i in l:
        if isinstance(i, list):
            flatten(i, a)
        else:
            a.append(i)
    return a

print(flatten([[[1, [1,1, [3, [4,5,]]]], 2, 3], [4, 5],6], []))

# [1, 1, 1, 3, 4, 5, 2, 3, 4, 5, 6]

Answer 14

Recursive version

x = [1,2,[3,4],[5,[6,[7]]],8,9,[10]]

def flatten_list(k):
    result = list()
    for i in k:
        if isinstance(i,list):

            #The isinstance() function checks if the object (first argument) is an 
            #instance or subclass of classinfo class (second argument)

            result.extend(flatten_list(i)) #Recursive call
        else:
            result.append(i)
    return result

flatten_list(x)
#result = [1,2,3,4,5,6,7,8,9,10]
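
For very deeply nested input, a recursive version can hit Python's recursion limit. Here is a sketch of the same idea with an explicit stack instead. flatten_list_iterative is a hypothetical name, and it assumes, like the version above, that only list instances should be descended into:

```python
def flatten_list_iterative(nested):
    """Flatten arbitrarily nested lists without recursion."""
    result = []
    # Each stack entry is an iterator over one nesting level.
    stack = [iter(nested)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, list):
                stack.append(iter(item))  # descend into the sublist
                break
            result.append(item)
        else:
            stack.pop()  # this level is exhausted; resume the parent
    return result

x = [1, 2, [3, 4], [5, [6, [7]]], 8, 9, [10]]
print(flatten_list_iterative(x))
```

Because the parent iterators stay on the stack, each one resumes exactly where it left off once the nested level is finished, so the order of the items is preserved.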

Answer 15

matplotlib.cbook.flatten() will work for nested lists even if they nest more deeply than the example.

import matplotlib.cbook
l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
print(list(matplotlib.cbook.flatten(l)))
l2 = [[1, 2, 3], [4, 5, 6], [7], [8, [9, 10, [11, 12, [13]]]]]
print(list(matplotlib.cbook.flatten(l2)))

Result:

[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

This is 18x faster than underscore._.flatten:

Average time over 1000 trials of matplotlib.cbook.flatten: 2.55e-05 sec
Average time over 1000 trials of underscore._.flatten: 4.63e-04 sec
(time for underscore._)/(time for matplotlib.cbook) = 18.1233394636

Answer 16

The accepted answer did not work for me when dealing with text-based lists of variable lengths. Here is an alternate approach that did work for me.

l = ['aaa', 'bb', 'cccccc', ['xx', 'yyyyyyy']]

Accepted answer that did not work:

flat_list = [item for sublist in l for item in sublist]
print(flat_list)
['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'c', 'c', 'xx', 'yyyyyyy']

New proposed solution that did work for me:

flat_list = []
_ = [flat_list.extend(item) if isinstance(item, list) else flat_list.append(item) for item in l if item]
print(flat_list)
['aaa', 'bb', 'cccccc', 'xx', 'yyyyyyy']
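
The same one-level logic can also be written as a plain loop, which avoids using a list comprehension purely for its side effects. This is a sketch under my own assumptions: flatten_one_level is a made-up name, and unlike the line above it does not skip falsy items such as empty strings:

```python
def flatten_one_level(items):
    """Unpack list elements one level deep; keep everything else whole."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)   # unpack the sublist
        else:
            flat.append(item)   # strings stay intact instead of being split
    return flat

l = ['aaa', 'bb', 'cccccc', ['xx', 'yyyyyyy']]
print(flatten_one_level(l))
```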

Answer 17

A bad feature of Anil’s function above is that it requires the user to always manually pass an empty list [] as the second argument. This should instead be a default. Because Python evaluates default argument values only once, when the function is defined, a mutable default such as [] would be shared between calls; the list should therefore be created inside the function, not in the argument list.

Here’s a working function:

def list_flatten(l, a=None):
    #check a
    if a is None:
        #initialize with empty list
        a = []

    for i in l:
        if isinstance(i, list):
            list_flatten(i, a)
        else:
            a.append(i)
    return a

Testing:

In [2]: lst = [1, 2, [3], [[4]],[5,[6]]]

In [3]: lst
Out[3]: [1, 2, [3], [[4]], [5, [6]]]

In [11]: list_flatten(lst)
Out[11]: [1, 2, 3, 4, 5, 6]
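
To see why the default is None rather than [], here is a small sketch of what happens with a mutable default. flatten_shared is a deliberately flawed, hypothetical variant written only for this demonstration:

```python
def flatten_shared(l, a=[]):  # the default list is created once, at definition time
    for i in l:
        if isinstance(i, list):
            flatten_shared(i, a)
        else:
            a.append(i)
    return a

first = flatten_shared([1, [2]])
second = flatten_shared([3])

# Both calls returned the same list object, so results leak between calls.
print(first is second)
print(second)
```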

Answer 18

The following seems simplest to me:

>>> import numpy as np
>>> l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
>>> print (np.concatenate(l))
[1 2 3 4 5 6 7 8 9]

Answer 19

One can also use NumPy’s flat:

import numpy as np
list(np.array(l).flat)

Edit 11/02/2016: Only works when sublists have identical dimensions.


Answer 20

You can use NumPy:

import numpy as np
flat_list = list(np.concatenate(list_of_list))


Answer 21

If you are willing to give up some speed (roughly 10x in the timings below) for a cleaner look, then you could use numpy.concatenate().tolist() or numpy.concatenate().ravel().tolist():

import numpy

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 99

%timeit numpy.concatenate(l).ravel().tolist()
1000 loops, best of 3: 313 µs per loop

%timeit numpy.concatenate(l).tolist()
1000 loops, best of 3: 312 µs per loop

%timeit [item for sublist in l for item in sublist]
1000 loops, best of 3: 31.5 µs per loop

You can find out more in the docs for numpy.concatenate and numpy.ravel.


Answer 22

Fastest solution I have found (for large list anyway):

import numpy as np
#turn list into an array and flatten()
np.array(l).flatten()

Done! You can of course turn the result back into a list by calling .tolist() on the flattened array.


Answer 23

Simple code for fans of the underscore.py package:

from underscore import _
_.flatten([[1, 2, 3], [4, 5, 6], [7], [8, 9]])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

It handles all flattening cases (non-list items or complex nesting):

from underscore import _
# 1 is a non-list item
# [2, [3]] is complex nesting
_.flatten([1, [2, [3]], [4, 5, 6], [7], [8, 9]])
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

You can install underscore.py with pip

pip install underscore.py

Answer 24

def flatten(alist):
    if alist == []:
        return []
    elif type(alist) is not list:
        return [alist]
    else:
        return flatten(alist[0]) + flatten(alist[1:])

Answer 25

Note: The code below applies to Python 3.3+ because it uses yield from. six is also a third-party package, though it is stable. Alternatively, you could check sys.version.


In the case of obj = [[1, 2,], [3, 4], [5, 6]], all of the solutions here are good, including list comprehension and itertools.chain.from_iterable.

However, consider this slightly more complex case:

>>> obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]

There are several problems here:

  • One element, 6, is just a scalar; it’s not iterable, so the above routes will fail here.
  • One element, 'abc', is technically iterable (all strs are). However, reading between the lines a bit, you don’t want to treat it as such–you want to treat it as a single element.
  • The final element, [8, [9, 10]] is itself a nested iterable. Basic list comprehension and chain.from_iterable only extract “1 level down.”

You can remedy this as follows:

>>> from collections.abc import Iterable
>>> from six import string_types

>>> def flatten(obj):
...     for i in obj:
...         if isinstance(i, Iterable) and not isinstance(i, string_types):
...             yield from flatten(i)
...         else:
...             yield i


>>> list(flatten(obj))
[1, 2, 3, 4, 5, 6, 'abc', 7, 8, 9, 10]

Here, you check that the sub-element (1) is iterable with Iterable, an ABC from collections.abc, but also want to ensure that (2) the element is not “string-like.”
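
If you only target Python 3, the same idea can be sketched with the standard library alone. Treating both str and bytes as "string-like" is my assumption here, standing in for six.string_types:

```python
from collections.abc import Iterable

def flatten(obj):
    """Yield the leaves of arbitrarily nested iterables, keeping strings whole."""
    for item in obj:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item

obj = [[1, 2, 3], [4, 5], 6, 'abc', [7], [8, [9, 10]]]
print(list(flatten(obj)))
```

Since flatten is a generator, wrap it in list() when you need the whole result at once.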


Answer 26

flat_list = []
for i in list_of_list:
    flat_list+=i

This code also works fine, as it just extends the list in place at each step. It is much like the other solutions but has only one explicit for loop, so it reads more simply than the version with two nested for loops (the inner iteration still happens inside +=).


Answer 27

from nltk import flatten

l = [[1, 2, 3], [4, 5, 6], [7], [8, 9]]
flatten(l)

The advantage of this solution over most others here is that it also handles a list that mixes bare items and sublists, such as:

l = [1, [2, 3], [4, 5, 6], [7], [8, 9]]

Most other solutions throw an error on such input, while this one handles it.


Answer 28

This may not be the most efficient way, but I thought I would offer a one-liner (actually a two-liner). Both versions work on arbitrarily nested lists and exploit language features (Python 3.5) and recursion.

def make_list_flat (l):
    flist = []
    flist.extend ([l]) if (type (l) is not list) else [flist.extend (make_list_flat (e)) for e in l]
    return flist

a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = make_list_flat(a)
print (flist)

The output is

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]

This works in a depth-first manner. The recursion goes down until it finds a non-list element, extends the local variable flist, and then returns it to the parent. Whenever flist is returned, it is used to extend the parent’s flist in the list comprehension. Therefore, at the root, a flat list is returned.

The version above creates several local lists and returns them to extend the parent’s list. One way around this may be to use a global flist, as below.

a = [[1, 2], [[[[3, 4, 5], 6]]], 7, [8, [9, [10, 11], 12, [13, 14, [15, [[16, 17], 18]]]]]]
flist = []
def make_list_flat (l):
    flist.extend ([l]) if (type (l) is not list) else [make_list_flat (e) for e in l]

make_list_flat(a)
print (flist)

The output is again

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]

Although I am not sure at this time about the efficiency.


Answer 29

Another unusual approach that works for hetero- and homogeneous lists of integers:

from typing import List


def flatten(l: list) -> List[int]:
    """Flatten an arbitrarily deep nested list of lists of integers.

    Examples:
        >>> flatten([1, 2, [1, [10]]])
        [1, 2, 1, 10]

    Args:
        l: A list whose items are integers or (nested) lists of integers.

    Returns:
        Flattened list of integers.
    """
    return [int(i.strip('[ ]')) for i in str(l).split(',')]
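
One limitation of the string-splitting trick is that an empty sublist leaves an empty token behind, and int('') raises ValueError. Here is a sketch of a regex-based variant that sidesteps this. flatten_ints is a hypothetical name, and like the original it only makes sense when every leaf is an integer:

```python
import re

def flatten_ints(nested):
    """Collect every integer that appears in the repr of a nested list."""
    # Match optionally negative integer literals and ignore brackets and commas.
    return [int(tok) for tok in re.findall(r'-?\d+', str(nested))]

print(flatten_ints([1, 2, [1, [10]]]))
print(flatten_ints([1, [], [-3, [4]]]))
```

Floats or strings containing digits would be misread by this pattern, so treat it as a curiosity rather than a general solution.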

Understanding slice notation

Question: Understanding slice notation

I need a good explanation (references are a plus) on Python’s slice notation.

To me, this notation needs a bit of picking up.

It looks extremely powerful, but I haven’t quite got my head around it.


Answer 0

It’s pretty simple really:

a[start:stop]  # items start through stop-1
a[start:]      # items start through the rest of the array
a[:stop]       # items from the beginning through stop-1
a[:]           # a copy of the whole array

There is also the step value, which can be used with any of the above:

a[start:stop:step] # start through not past stop, by step

The key point to remember is that the :stop value represents the first value that is not in the selected slice. So, the difference between stop and start is the number of elements selected (if step is 1, the default).

The other feature is that start or stop may be a negative number, which means it counts from the end of the array instead of the beginning. So:

a[-1]    # last item in the array
a[-2:]   # last two items in the array
a[:-2]   # everything except the last two items

Similarly, step may be a negative number:

a[::-1]    # all items in the array, reversed
a[1::-1]   # the first two items, reversed
a[:-3:-1]  # the last two items, reversed
a[-3::-1]  # everything except the last two items, reversed

Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.

Relation to slice() object

The slicing operator [] is actually being used in the above code with a slice() object using the : notation (which is only valid within []), i.e.:

a[start:stop:step]

is equivalent to:

a[slice(start, stop, step)]

Slice objects also behave slightly differently depending on the number of arguments, similarly to range(), i.e. both slice(stop) and slice(start, stop[, step]) are supported. To skip specifying a given argument, one might use None, so that e.g. a[start:] is equivalent to a[slice(start, None)] or a[::-1] is equivalent to a[slice(None, None, -1)].

While the :-based notation is very helpful for simple slicing, the explicit use of slice() objects simplifies the programmatic generation of slicing.
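
As a rough illustration of that last point, a slice object can be named once and reused, or built from variables. The names LAST_THREE and window below are hypothetical, chosen only for this sketch:

```python
a = list(range(10))

# The colon notation and an explicit slice object select the same items.
print(a[1:8:2] == a[slice(1, 8, 2)])

# A slice object can be named once and reused on different sequences.
LAST_THREE = slice(-3, None)
print(a[LAST_THREE])
print("Python"[LAST_THREE])

# Building a slice from variables, e.g. a window whose bounds come from data.
def window(seq, start, width):
    return seq[slice(start, start + width)]

print(window(a, 2, 4))
```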


Answer 1

The Python tutorial talks about it (scroll down a bit until you get to the part about slicing).

The ASCII art diagram is helpful too for remembering how slices work:

 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1

One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n.
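
The "indices point between characters" picture can be checked with a few lines. This is only a sketch of the properties that follow from the diagram:

```python
s = "Python"
n = len(s)

# Cutting at any boundary i splits the string into two pieces that rejoin exactly.
for i in range(n + 1):
    assert s[:i] + s[i:] == s

# A slice between boundaries i and j contains j - i characters.
print(s[2:5], len(s[2:5]))

# Negative boundary -k is the same position as boundary n - k.
print(s[-2:] == s[n - 2:])
```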


Answer 2

Enumerating the possibilities allowed by the grammar:

>>> seq[:]                # [seq[0],   seq[1],          ..., seq[-1]    ]
>>> seq[low:]             # [seq[low], seq[low+1],      ..., seq[-1]    ]
>>> seq[:high]            # [seq[0],   seq[1],          ..., seq[high-1]]
>>> seq[low:high]         # [seq[low], seq[low+1],      ..., seq[high-1]]
>>> seq[::stride]         # [seq[0],   seq[stride],     ..., seq[-1]    ]
>>> seq[low::stride]      # [seq[low], seq[low+stride], ..., seq[-1]    ]
>>> seq[:high:stride]     # [seq[0],   seq[stride],     ..., seq[high-1]]
>>> seq[low:high:stride]  # [seq[low], seq[low+stride], ..., seq[high-1]]

Of course, if (high-low)%stride != 0, then the end point will be a little lower than high-1.

If stride is negative, the ordering is changed a bit since we’re counting down:

>>> seq[::-stride]        # [seq[-1],   seq[-1-stride],   ..., seq[0]    ]
>>> seq[high::-stride]    # [seq[high], seq[high-stride], ..., seq[0]    ]
>>> seq[:low:-stride]     # [seq[-1],   seq[-1-stride],   ..., seq[low+1]]
>>> seq[high:low:-stride] # [seq[high], seq[high-stride], ..., seq[low+1]]

Extended slicing (with commas and ellipses) is mostly used only by special data structures (like NumPy); the basic sequences don’t support it.

>>> class slicee:
...     def __getitem__(self, item):
...         return repr(item)
...
>>> slicee()[0, 1:2, ::5, ...]
'(0, slice(1, 2, None), slice(None, None, 5), Ellipsis)'
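
To see which positions a given low/high/stride combination actually touches, one option is to expand the slice into explicit indices with slice.indices. The helper selected_indices is a hypothetical name used only for this sketch:

```python
def selected_indices(length, low=None, high=None, stride=None):
    """Return the concrete indices that seq[low:high:stride] would select."""
    # slice.indices() resolves None and negative values for the given length.
    return list(range(*slice(low, high, stride).indices(length)))

print(selected_indices(10, 2, 9, 3))         # last index lands exactly on high-1
print(selected_indices(10, 2, 8, 3))         # (high-low) % stride != 0: ends below high-1
print(selected_indices(10, 8, 2, -3))        # negative stride counts down, stops above low
print(selected_indices(10, None, None, -4))  # omitted bounds with a negative stride
```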

Answer 3

The answers above don’t discuss slice assignment. To understand slice assignment, it’s helpful to add another concept to the ASCII art:

                +---+---+---+---+---+---+
                | P | y | t | h | o | n |
                +---+---+---+---+---+---+
Slice position: 0   1   2   3   4   5   6
Index position:   0   1   2   3   4   5

>>> p = ['P','y','t','h','o','n']
# Why the two sets of numbers:
# indexing gives items, not lists
>>> p[0]
 'P'
>>> p[5]
 'n'

# Slicing gives lists
>>> p[0:1]
 ['P']
>>> p[0:2]
 ['P','y']

One heuristic is, for a slice from zero to n, think: “zero is the beginning, start at the beginning and take n items in a list”.

>>> p[5] # the last of six items, indexed from zero
 'n'
>>> p[0:5] # does NOT include the last item!
 ['P','y','t','h','o']
>>> p[0:6] # not p[0:5]!!!
 ['P','y','t','h','o','n']

Another heuristic is, “for any slice, replace the start by zero, apply the previous heuristic to get the end of the list, then count the first number back up to chop items off the beginning”

>>> p[0:4] # Start at the beginning and count out 4 items
 ['P','y','t','h']
>>> p[1:4] # Take one item off the front
 ['y','t','h']
>>> p[2:4] # Take two items off the front
 ['t','h']
# etc.

The first rule of slice assignment is that since slicing returns a list, slice assignment requires a list (or other iterable):

>>> p[2:3]
 ['t']
>>> p[2:3] = ['T']
>>> p
 ['P','y','T','h','o','n']
>>> p[2:3] = 7  # an int is not iterable
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only assign an iterable

The second rule of slice assignment, which you can also see above, is that whatever portion of the list is returned by slice indexing, that’s the same portion that is changed by slice assignment:

>>> p[2:4]
 ['T','h']
>>> p[2:4] = ['t','r']
>>> p
 ['P','y','t','r','o','n']

The third rule of slice assignment is, the assigned list (iterable) doesn’t have to have the same length; the indexed slice is simply sliced out and replaced en masse by whatever is being assigned:

>>> p = ['P','y','t','h','o','n'] # Start over
>>> p[2:4] = ['s','p','a','m']
>>> p
 ['P','y','s','p','a','m','o','n']

The trickiest part to get used to is assignment to empty slices. Using heuristic 1 and 2 it’s easy to get your head around indexing an empty slice:

>>> p = ['P','y','t','h','o','n']
>>> p[0:4]
 ['P','y','t','h']
>>> p[1:4]
 ['y','t','h']
>>> p[2:4]
 ['t','h']
>>> p[3:4]
 ['h']
>>> p[4:4]
 []

And then once you’ve seen that, slice assignment to the empty slice makes sense too:

>>> p = ['P','y','t','h','o','n']
>>> p[2:4] = ['x','y'] # Assigned list is same length as slice
>>> p
 ['P','y','x','y','o','n'] # Result is same length
>>> p = ['P','y','t','h','o','n']
>>> p[3:4] = ['x','y'] # Assigned list is longer than slice
>>> p
 ['P','y','t','x','y','o','n'] # The result is longer
>>> p = ['P','y','t','h','o','n']
>>> p[4:4] = ['x','y']
>>> p
 ['P','y','t','h','x','y','o','n'] # The result is longer still

Note that, since we are not changing the second number of the slice (4), the inserted items always stack right up against the ‘o’, even when we’re assigning to the empty slice. So the position for the empty slice assignment is the logical extension of the positions for the non-empty slice assignments.
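
A small sketch of the empty-slice rule, using only built-in list operations: assigning to p[i:i] inserts items before index i without removing anything, which for a single item should match list.insert. The variable names here are illustrative.

```python
# Assigning to an empty slice inserts without removing anything.
p = ['P', 'y', 't', 'h', 'o', 'n']
q = list(p)                  # independent copy for comparison

p[4:4] = ['x']               # empty slice at position 4
q.insert(4, 'x')             # the single-item equivalent

same_as_insert = (p == q)
print(p)
print(same_as_insert)
```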

Backing up a little bit, what happens when you keep going with our procession of counting up the slice beginning?

>>> p = ['P','y','t','h','o','n']
>>> p[0:4]
 ['P','y','t','h']
>>> p[1:4]
 ['y','t','h']
>>> p[2:4]
 ['t','h']
>>> p[3:4]
 ['h']
>>> p[4:4]
 []
>>> p[5:4]
 []
>>> p[6:4]
 []

With slicing, once you’re done, you’re done; it doesn’t start slicing backwards. In Python you don’t get negative strides unless you explicitly ask for them by using a negative number.

>>> p[5:3:-1]
 ['n','o']

There are some weird consequences to the “once you’re done, you’re done” rule:

>>> p[4:4]
 []
>>> p[5:4]
 []
>>> p[6:4]
 []
>>> p[6]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range

In fact, compared to indexing, Python slicing is bizarrely error-proof:

>>> p[100:200]
 []
>>> p[int(2e99):int(1e99)]
 []

This can come in handy sometimes, but it can also lead to somewhat strange behavior:

>>> p
 ['P', 'y', 't', 'h', 'o', 'n']
>>> p[int(2e99):int(1e99)] = ['p','o','w','e','r']
>>> p
 ['P', 'y', 't', 'h', 'o', 'n', 'p', 'o', 'w', 'e', 'r']

Depending on your application, that might… or might not… be what you were hoping for there!
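
One way to see why out-of-range slices never raise is to ask a slice object which indices it would use for a given length. The sketch below relies on the built-in slice.indices method; p is the same six-letter list used above, and the other names are illustrative.

```python
p = ['P', 'y', 't', 'h', 'o', 'n']

# slice.indices(length) returns the clamped (start, stop, step).
clamped = slice(100, 200).indices(len(p))
huge = slice(int(2e99), int(1e99)).indices(len(p))

print(clamped)   # both bounds are pulled back to len(p)
print(huge)

# Assigning to such a slice therefore behaves like extending the list.
p[100:200] = ['!']
print(p)
```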


Below is the text of my original answer. It has been useful to many people, so I didn’t want to delete it.

>>> r=[1,2,3,4]
>>> r[1:1]
[]
>>> r[1:1]=[9,8]
>>> r
[1, 9, 8, 2, 3, 4]
>>> r[1:1]=['blah']
>>> r
[1, 'blah', 9, 8, 2, 3, 4]

This may also clarify the difference between slicing and indexing.


Answer 4

Explain Python’s slice notation

In short, the colons (:) in subscript notation (subscriptable[subscriptarg]) make slice notation – which has the optional arguments, start, stop, step:

sliceable[start:stop:step]

Python slicing is a computationally fast way to methodically access parts of your data. In my opinion, to be even an intermediate Python programmer, it’s one aspect of the language that it is necessary to be familiar with.

Important Definitions

To begin with, let’s define a few terms:

start: the beginning index of the slice, it will include the element at this index unless it is the same as stop, defaults to 0, i.e. the first index. If it’s negative, it means to start n items from the end.

stop: the ending index of the slice, it does not include the element at this index, defaults to length of the sequence being sliced, that is, up to and including the end.

step: the amount by which the index increases, defaults to 1. If it’s negative, you’re slicing over the iterable in reverse.

How Indexing Works

You can make any of these positive or negative numbers. The meaning of the positive numbers is straightforward, but for negative numbers, just like indexes in Python, you count backwards from the end for the start and stop, and for the step, you simply decrement your index. This example is from the documentation’s tutorial, but I’ve modified it slightly to indicate which item in a sequence each index references:

 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
   0   1   2   3   4   5 
  -6  -5  -4  -3  -2  -1

How Slicing Works

To use slice notation with a sequence that supports it, you must include at least one colon in the square brackets that follow the sequence (the brackets actually call the sequence’s __getitem__ method, according to the Python data model).

Slice notation works like this:

sequence[start:stop:step]

And recall that there are defaults for start, stop, and step, so to access the defaults, simply leave out the argument.

Slice notation to get the last nine elements from a list (or any other sequence that supports it, like a string) would look like this:

my_list[-9:]

When I see this, I read the part in the brackets as “9th from the end, to the end.” (Actually, I abbreviate it mentally as “-9, on”)

Explanation:

The full notation is

my_list[-9:None:None]

and to substitute the defaults (actually when step is negative, stop’s default is -len(my_list) - 1, so None for stop really just means it goes to whichever end step takes it to):

my_list[-9:len(my_list):1]

The colon, :, is what tells Python you’re giving it a slice and not a regular index. That’s why the idiomatic way of making a shallow copy of lists in Python 2 is

list_copy = sequence[:]

And clearing them is with:

del my_list[:]

(Python 3 gets a list.copy and list.clear method.)

When step is negative, the defaults for start and stop change

By default, when the step argument is empty (or None), it is assigned to +1.

But you can pass in a negative integer, and the list (or most other standard sliceables) will be sliced from the end to the beginning.

Thus a negative step will change the defaults for start and stop!

Confirming this in the source

I like to encourage users to read the source as well as the documentation. The source code for slice objects and this logic is found in CPython’s Objects/sliceobject.c. First we determine if step is negative:

 step_is_negative = step_sign < 0;

If so, the lower bound is -1, meaning we slice all the way up to and including the beginning, and the upper bound is the length minus 1, meaning we start at the end. (Note that this -1 has different semantics from a -1 that users may pass as an index in Python to indicate the last item.)

if (step_is_negative) {
    lower = PyLong_FromLong(-1L);
    if (lower == NULL)
        goto error;

    upper = PyNumber_Add(length, lower);
    if (upper == NULL)
        goto error;
}

Otherwise step is positive, the lower bound will be zero, and the upper bound (which we go up to but do not include) will be the length of the sliced list.

else {
    lower = _PyLong_Zero;
    Py_INCREF(lower);
    upper = length;
    Py_INCREF(upper);
}

Then, we may need to apply the defaults for start and stop – the default then for start is calculated as the upper bound when step is negative:

if (self->start == Py_None) {
    start = step_is_negative ? upper : lower;
    Py_INCREF(start);
}

and stop, the lower bound:

if (self->stop == Py_None) {
    stop = step_is_negative ? lower : upper;
    Py_INCREF(stop);
}
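
The C logic above can be paraphrased in Python. This is only a sketch: default_start_stop is a made-up helper name, and the real code also handles explicit start and stop values, which are left out here. The loop compares the result with what the built-in slice.indices reports.

```python
def default_start_stop(length, step):
    """Mirror the default bounds chosen when start and stop are both None."""
    if step < 0:
        lower, upper = -1, length - 1   # -1 here means "just before index 0"
    else:
        lower, upper = 0, length
    start = upper if step < 0 else lower
    stop = lower if step < 0 else upper
    return start, stop

for step in (1, 2, -1, -3):
    expected = slice(None, None, step).indices(6)[:2]
    print(step, default_start_stop(6, step), expected)
```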

Give your slices a descriptive name!

You may find it useful to separate forming the slice from passing it to the list.__getitem__ method (that’s what the square brackets do). Even if you’re not new to it, it keeps your code more readable so that others that may have to read your code can more readily understand what you’re doing.

However, you can’t just assign some integers separated by colons to a variable. You need to use the slice object:

last_nine_slice = slice(-9, None)

The second argument, None, is required so that the first argument is interpreted as the start argument; otherwise it would be the stop argument.

You can then pass the slice object to your sequence:

>>> list(range(100))[last_nine_slice]
[91, 92, 93, 94, 95, 96, 97, 98, 99]

It’s interesting that ranges also take slices:

>>> range(100)[last_nine_slice]
range(91, 100)

Memory Considerations:

Since slices of Python lists create new objects in memory, another important function to be aware of is itertools.islice. Typically you’ll want to iterate over a slice, not just have it created statically in memory. islice is perfect for this. A caveat, it doesn’t support negative arguments to start, stop, or step, so if that’s an issue you may need to calculate indices or reverse the iterable in advance.

import itertools

length = 100
last_nine_iter = itertools.islice(list(range(length)), length-9, None, 1)
list_last_nine = list(last_nine_iter)

and now:

>>> list_last_nine
[91, 92, 93, 94, 95, 96, 97, 98, 99]

The fact that list slices make a copy is a feature of lists themselves. If you’re slicing advanced objects like a Pandas DataFrame, it may return a view on the original, and not a copy.
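
To illustrate the memory point, here is a minimal sketch that uses itertools.islice on a generator, which cannot be sliced with brackets at all, and contrasts it with a list slice. The generator squares and the other names are purely illustrative.

```python
import itertools

def squares():
    """Endless generator; slicing it with [] would raise TypeError."""
    n = 0
    while True:
        yield n * n
        n += 1

# Take items 3, 4 and 5 lazily, without building any intermediate list.
middle = list(itertools.islice(squares(), 3, 6))
print(middle)

# A list slice, by contrast, is a new list object.
data = list(range(10))
copy_of_part = data[3:6]
copy_of_part[0] = 'changed'
print(data[3])   # the original list is untouched
```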


Answer 5

And a couple of things that weren’t immediately obvious to me when I first saw the slicing syntax:

>>> x = [1,2,3,4,5,6]
>>> x[::-1]
[6,5,4,3,2,1]

Easy way to reverse sequences!

And if you wanted, for some reason, every second item in the reversed sequence:

>>> x = [1,2,3,4,5,6]
>>> x[::-2]
[6,4,2]
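
For comparison, a short sketch of the two usual ways to reverse: a slice with step -1 builds a new list, while the built-in reversed() returns an iterator that is consumed here with list(). Variable names are illustrative.

```python
x = [1, 2, 3, 4, 5, 6]

by_slice = x[::-1]                # new list
by_iterator = list(reversed(x))   # iterator consumed into a list
every_second = x[::-2]

print(by_slice == by_iterator)
print(every_second)
```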

Answer 6

In Python 2.7

Slicing in Python

[a:b:c]

len = length of string, tuple or list

c -- default is +1. The sign of c indicates forward or backward, absolute value of c indicates steps. Default is forward with step size 1. Positive means forward, negative means backward.

a --  When c is positive or blank, default is 0. When c is negative, default is -1.

b --  When c is positive or blank, default is len. When c is negative, default is -(len+1).

Understanding index assignment is very important.

In forward direction, starts at 0 and ends at len-1

In backward direction, starts at -1 and ends at -len

When you say [a:b:c], you are saying: depending on the sign of c (forward or backward), start at a and end at b (excluding the element at index b). Use the indexing rule above and remember you will only find elements in this range:

-len, -len+1, -len+2, ..., 0, 1, 2,3,4 , len -1

But this range continues in both directions infinitely:

...,-len -2 ,-len-1,-len, -len+1, -len+2, ..., 0, 1, 2,3,4 , len -1, len, len +1, len+2 , ....

For example:

             0    1    2   3    4   5   6   7   8   9   10   11
             a    s    t   r    i   n   g
    -9  -8  -7   -6   -5  -4   -3  -2  -1

If your choice of a, b, and c overlaps with the range above as you traverse using the rules for a, b, and c, you will get a list with the elements touched during traversal; otherwise you will get an empty list.

One last thing: if a and b are equal, you also get an empty list:

>>> l1
[2, 3, 4]

>>> l1[:]
[2, 3, 4]

>>> l1[::-1] # a default is -1 , b default is -(len+1)
[4, 3, 2]

>>> l1[:-4:-1] # a default is -1
[4, 3, 2]

>>> l1[:-3:-1] # a default is -1
[4, 3]

>>> l1[::] # c default is +1, so a default is 0, b default is len
[2, 3, 4]

>>> l1[::-1] # c is -1 , so a default is -1 and b default is -(len+1)
[4, 3, 2]


>>> l1[-100:-200:-1] # Interesting
[]

>>> l1[-1:-200:-1] # Interesting
[4, 3, 2]


>>> l1[-1:-1:1]
[]


>>> l1[-1:5:1] # Interesting
[4]


>>> l1[1:-7:1]
[]

>>> l1[1:-7:-1] # Interesting
[3, 2]

>>> l1[:-2:-2] # a default is -1, stop(b) at -2 , step(c) by 2 in reverse direction
[4]
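
To check examples like these without guessing, one option is to ask Python which indices a slice visits. The helper name visited is an assumption made for illustration; it relies only on the built-in slice.indices and range.

```python
def visited(seq, a=None, b=None, c=None):
    """Return the list of indices that seq[a:b:c] would read."""
    return list(range(*slice(a, b, c).indices(len(seq))))

l1 = [2, 3, 4]
print(visited(l1, None, -4, -1))    # indices behind l1[:-4:-1]
print(visited(l1, 1, -7, -1))       # indices behind l1[1:-7:-1]
print(visited(l1, -100, -200, -1))  # nothing is visited
```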

Answer 7

Found this great table at http://wiki.python.org/moin/MovingToPythonFromOtherLanguages

Python indexes and slices for a six-element list.
Indexes enumerate the elements, slices enumerate the spaces between the elements.

Index from rear:    -6  -5  -4  -3  -2  -1      a=[0,1,2,3,4,5]    a[1:]==[1,2,3,4,5]
Index from front:    0   1   2   3   4   5      len(a)==6          a[:5]==[0,1,2,3,4]
                   +---+---+---+---+---+---+    a[0]==0            a[:-2]==[0,1,2,3]
                   | a | b | c | d | e | f |    a[5]==5            a[1:2]==[1]
                   +---+---+---+---+---+---+    a[-1]==5           a[1:-1]==[1,2,3,4]
Slice from front:  :   1   2   3   4   5   :    a[-2]==4
Slice from rear:   :  -5  -4  -3  -2  -1   :
                                                b=a[:]
                                                b==[0,1,2,3,4,5] (shallow copy of a)

Answer 8

After using it a bit I realise that the simplest description is that it is exactly the same as the arguments in a for loop…

(from:to:step)

Any of them are optional:

(:to:step)
(from::step)
(from:to)

Then the negative indexing just needs you to add the length of the string to the negative indices to understand it.

This works for me anyway…
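
A quick sketch of that idea, assuming a string s chosen for illustration: a slice picks the same positions that range(from, to, step) would produce, and a negative index can be understood by adding len(s) to it.

```python
s = 'abcdefgh'

# Same positions as a for loop over range(1, 7, 2).
via_loop = ''.join(s[i] for i in range(1, 7, 2))
via_slice = s[1:7:2]

# Negative indices: add the length to get the equivalent positive ones.
n = len(s)
negative = s[-5:-1]
positive = s[n - 5:n - 1]

print(via_loop, via_slice)
print(negative, positive)
```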


Answer 9

I find it easier to remember how it works, and then I can figure out any specific start/stop/step combination.

It’s instructive to understand range() first:

def range(start=0, stop, step=1):  # Illegal syntax, but that's the effect
    i = start
    while (i < stop if step > 0 else i > stop):
        yield i
        i += step

Begin from start, increment by step, do not reach stop. Very simple.

The thing to remember about negative step is that stop is always the excluded end, whether it’s higher or lower. If you want the same slice in the opposite order, it’s much cleaner to do the reversal separately: e.g. 'abcde'[1:-2][::-1] slices off one char from the left, two from the right, then reverses. (See also reversed().)

Sequence slicing is the same, except it first normalizes negative indexes, and it can never go outside the sequence:

TODO: The code below had a bug with “never go outside the sequence” when abs(step)>1; I think I patched it to be correct, but it’s hard to understand.

def this_is_how_slicing_works(seq, start=None, stop=None, step=1):
    if start is None:
        start = (0 if step > 0 else len(seq)-1)
    elif start < 0:
        start += len(seq)
    if step > 0:  # clip start so it never lies before the first reachable item
        start = max(start, 0)
    else:
        start = min(start, len(seq)-1)
    if stop is None:
        stop = (len(seq) if step > 0 else -1)  # really -1, not last element
    elif stop < 0:
        stop += len(seq)
    for i in range(start, stop, step):
        if 0 <= i < len(seq):
            yield seq[i]

Don’t worry about the is None details – just remember that omitting start and/or stop always does the right thing to give you the whole sequence.

Normalizing negative indexes first allows start and/or stop to be counted from the end independently: 'abcde'[1:-2] == 'abcde'[1:3] == 'bc' despite list(range(1,-2)) == []. The normalization is sometimes thought of as “modulo the length”, but note it adds the length just once: e.g. 'abcde'[-53:42] is just the whole string.
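
The point about adding the length only once can be sketched as follows. normalize is a hypothetical helper that covers only the positive-step case; it is not CPython's code.

```python
def normalize(index, length):
    """Positive-step normalization: add length once, then clamp to [0, length]."""
    if index < 0:
        index += length          # added once, not taken modulo length
    return max(0, min(index, length))

s = 'abcde'
start = normalize(-53, len(s))
stop = normalize(42, len(s))
print(start, stop, s[start:stop] == s[-53:42])
```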


Answer 10

I use the “an index points between elements” method of thinking about it myself, but one way of describing it which sometimes helps others get it is this:

mylist[X:Y]

X is the index of the first element you want.
Y is the index of the first element you don’t want.
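
A minimal sketch of this reading, with a list and values for X, Y and k picked only for illustration:

```python
mylist = ['a', 'b', 'c', 'd', 'e', 'f']
X, Y = 1, 4

part = mylist[X:Y]
print(part)                    # starts at mylist[X], stops before mylist[Y]
print(len(part) == Y - X)      # holds when 0 <= X <= Y <= len(mylist)

# Splitting at any k loses nothing and repeats nothing.
k = 2
print(mylist[:k] + mylist[k:] == mylist)
```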


Answer 11

Index:
      ------------>
  0   1   2   3   4
+---+---+---+---+---+
| a | b | c | d | e |
+---+---+---+---+---+
 -5  -4  -3  -2  -1
      <------------

Slice:
    <---------------|
|--------------->
:   1   2   3   4   :
+---+---+---+---+---+
| a | b | c | d | e |
+---+---+---+---+---+
:  -4  -3  -2  -1   :
|--------------->
    <---------------|

I hope this will help you to model the list in Python.

Reference: http://wiki.python.org/moin/MovingToPythonFromOtherLanguages


Answer 12

Python slicing notation:

a[start:end:step]
  • For start and end, negative values are interpreted as being relative to the end of the sequence.
  • Positive indices for end indicate the position after the last element to be included.
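  • Blank values default to the start of the sequence, the end of the sequence, and a step of 1 (schematically [+0:-0:1], although a literal -0 is just 0).
  • Using a negative step reverses the interpretation of start and end.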

The notation extends to (numpy) matrices and multidimensional arrays. For example, to slice entire columns you can use:

m[::,0:2:] ## slice the first two columns

Slices hold references to, not copies of, the array elements. If you want to make a separate copy of an array, you can use copy.deepcopy().
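
For plain Python lists, the distinction can be sketched like this (NumPy arrays behave differently, since basic slicing there returns a view). The nested list is illustrative; copy.deepcopy is the standard-library function.

```python
import copy

rows = [[1, 2], [3, 4], [5, 6]]

shallow = rows[0:2]               # new outer list, same inner lists
deep = copy.deepcopy(rows[0:2])   # inner lists are copied too

rows[0][0] = 'changed'

print(shallow[0])              # shares the inner list with rows
print(deep[0])                 # fully independent
```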


Answer 13

You can also use slice assignment to remove one or more elements from a list:

>>> r = [1, 'blah', 9, 8, 2, 3, 4]
>>> r[1:4] = []
>>> r
[1, 2, 3, 4]
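
As a small sketch, the same removal can also be written with the del statement; both forms modify the list in place. The second list s exists only for the comparison.

```python
r = [1, 'blah', 9, 8, 2, 3, 4]
s = list(r)

r[1:4] = []        # slice assignment with an empty list
del s[1:4]         # the del statement on the same slice

print(r, s, r == s)
```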

Answer 14

This is just for some extra info… Consider the list below

>>> l=[12,23,345,456,67,7,945,467]

A few other tricks for reversing the list:

>>> l[len(l):-len(l)-1:-1]
[467, 945, 7, 67, 456, 345, 23, 12]

>>> l[:-len(l)-1:-1]
[467, 945, 7, 67, 456, 345, 23, 12]

>>> l[len(l)::-1]
[467, 945, 7, 67, 456, 345, 23, 12]

>>> l[::-1]
[467, 945, 7, 67, 456, 345, 23, 12]

>>> l[-1:-len(l)-1:-1]
[467, 945, 7, 67, 456, 345, 23, 12]

Answer 15

This is how I teach slices to newbies:

Understanding the difference between indexing and slicing:

Wiki Python has this amazing picture which clearly distinguishes indexing and slicing.

It is a list with six elements in it. To understand slicing better, consider that list as a set of six boxes placed together. Each box has a letter in it.

Indexing is like dealing with the contents of a box. You can check the contents of any box, but you can’t check the contents of multiple boxes at once. You can even replace the contents of a box, but you can’t place two letters in one box or replace two letters at a time.

In [122]: alpha = ['a', 'b', 'c', 'd', 'e', 'f']

In [123]: alpha
Out[123]: ['a', 'b', 'c', 'd', 'e', 'f']

In [124]: alpha[0]
Out[124]: 'a'

In [127]: alpha[0] = 'A'

In [128]: alpha
Out[128]: ['A', 'b', 'c', 'd', 'e', 'f']

In [129]: alpha[0,1]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-129-c7eb16585371> in <module>()
----> 1 alpha[0,1]

TypeError: list indices must be integers, not tuple

Slicing is like dealing with boxes themselves. You can pick up the first box and place it on another table. To pick up the box, all you need to know is the position of beginning and ending of the box.

You can even pick up the first three boxes or the last two boxes or all boxes between 1 and 4. So, you can pick any set of boxes if you know the beginning and ending. These positions are called start and stop positions.

The interesting thing is that you can replace multiple boxes at once. Also you can place multiple boxes wherever you like.

In [130]: alpha[0:1]
Out[130]: ['A']

In [131]: alpha[0:1] = 'a'

In [132]: alpha
Out[132]: ['a', 'b', 'c', 'd', 'e', 'f']

In [133]: alpha[0:2] = ['A', 'B']

In [134]: alpha
Out[134]: ['A', 'B', 'c', 'd', 'e', 'f']

In [135]: alpha[2:2] = ['x', 'xx']

In [136]: alpha
Out[136]: ['A', 'B', 'x', 'xx', 'c', 'd', 'e', 'f']

Slicing With Step:

Till now you have picked boxes contiguously. But sometimes you need to pick them up at intervals. For example, you can pick up every second box. You can even pick up every third box from the end. This value is called the step size. It represents the gap between your successive pickups. The step size should be positive if you are picking boxes from the beginning to the end, and negative if you are picking from the end to the beginning.

In [137]: alpha = ['a', 'b', 'c', 'd', 'e', 'f']

In [142]: alpha[1:5:2]
Out[142]: ['b', 'd']

In [143]: alpha[-1:-5:-2]
Out[143]: ['f', 'd']

In [144]: alpha[1:5:-2]
Out[144]: []

In [145]: alpha[-1:-5:2]
Out[145]: []

How Python Figures Out Missing Parameters:

When slicing, if you leave out any parameter, Python tries to figure it out automatically.

If you check the source code of CPython, you will find a function called PySlice_GetIndicesEx() which figures out indices to a slice for any given parameters. Here is the logical equivalent code in Python.

This function takes a Python object and optional parameters for slicing and returns the start, stop, step, and slice length for the requested slice.

def py_slice_get_indices_ex(obj, start=None, stop=None, step=None):

    length = len(obj)

    if step is None:
        step = 1
    if step == 0:
        raise ValueError("slice step cannot be zero")

    if start is None:
        start = 0 if step > 0 else length - 1
    else:
        if start < 0:
            start += length
        if start < 0:
            start = 0 if step > 0 else -1
        if start >= length:
            start = length if step > 0 else length - 1

    if stop is None:
        stop = length if step > 0 else -1
    else:
        if stop < 0:
            stop += length
        if stop < 0:
            stop = 0 if step > 0 else -1
        if stop >= length:
            stop = length if step > 0 else length - 1

    if (step < 0 and stop >= start) or (step > 0 and start >= stop):
        slice_length = 0
    elif step < 0:
        slice_length = (stop - start + 1) // step + 1  # floor division keeps the length an int
    else:
        slice_length = (stop - start - 1) // step + 1

    return (start, stop, step, slice_length)
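
As a sanity check on the length arithmetic, the sketch below compares a floor-division version of the same formula against what range reports. The function name slice_len is an assumption, and the (start, stop, step) triples come from the built-in slice.indices rather than from the function above.

```python
def slice_len(start, stop, step):
    """Number of items for already-clamped start, stop and step."""
    if (step < 0 and stop >= start) or (step > 0 and start >= stop):
        return 0
    if step < 0:
        return (stop - start + 1) // step + 1
    return (stop - start - 1) // step + 1

mismatches = 0
bounds = [None] + list(range(-8, 9))
for a in bounds:
    for b in bounds:
        for c in (-3, -2, -1, 1, 2, 3):
            start, stop, step = slice(a, b, c).indices(6)
            if slice_len(start, stop, step) != len(range(start, stop, step)):
                mismatches += 1
print(mismatches)
```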

This is the intelligence that is present behind slices. Since Python has a built-in function called slice, you can pass some parameters and check how smartly it calculates missing parameters.

In [21]: alpha = ['a', 'b', 'c', 'd', 'e', 'f']

In [22]: s = slice(None, None, None)

In [23]: s
Out[23]: slice(None, None, None)

In [24]: s.indices(len(alpha))
Out[24]: (0, 6, 1)

In [25]: list(range(*s.indices(len(alpha))))
Out[25]: [0, 1, 2, 3, 4, 5]

In [26]: s = slice(None, None, -1)

In [27]: list(range(*s.indices(len(alpha))))
Out[27]: [5, 4, 3, 2, 1, 0]

In [28]: s = slice(None, 3, -1)

In [29]: list(range(*s.indices(len(alpha))))
Out[29]: [5, 4]
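
As a quick cross-check of the session above, the triple returned by slice.indices() can be fed to range() to recover both the positions a slice visits and the slice length. This is only a sketch: the helper name slice_positions is an assumption made for illustration and does not appear in CPython or in the original post.

```python
def slice_positions(seq_len, start=None, stop=None, step=None):
    """Return the list of positions that seq[start:stop:step] would visit."""
    # slice.indices() fills in missing values and clips out-of-range ones.
    return list(range(*slice(start, stop, step).indices(seq_len)))


alpha = ['a', 'b', 'c', 'd', 'e', 'f']

for args in [(None, None, None), (None, None, -1), (None, 3, -1), (1, 5, 2)]:
    positions = slice_positions(len(alpha), *args)
    # The positions must select exactly what the slice itself selects.
    assert [alpha[i] for i in positions] == alpha[slice(*args)]
    print(args, positions, len(positions))
```

The length of the positions list plays the role of slice_length in the function shown earlier.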

Note: This post was originally written in my blog, The Intelligence Behind Python Slices.


Answer 16

As a general rule, writing code with a lot of hardcoded index values leads to a readability and maintenance mess. For example, if you come back to the code a year later, you’ll look at it and wonder what you were thinking when you wrote it. The solution shown is simply a way of more clearly stating what your code is actually doing. In general, the built-in slice() creates a slice object that can be used anywhere a slice is allowed. For example:

>>> items = [0, 1, 2, 3, 4, 5, 6]
>>> a = slice(2, 4)
>>> items[2:4]
[2, 3]
>>> items[a]
[2, 3]
>>> items[a] = [10,11]
>>> items
[0, 1, 10, 11, 4, 5, 6]
>>> del items[a]
>>> items
[0, 1, 4, 5, 6]

If you have a slice instance s, you can get more information about it by looking at its s.start, s.stop, and s.step attributes, respectively. For example:

>>> a = slice(10, 50, 2)
>>> a.start
10
>>> a.stop
50
>>> a.step
2
>>>
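
To make the readability argument concrete, here is a small sketch of named slice objects replacing hardcoded indexes. The record layout (a 10-character name field followed by a 4-character year) and the names NAME and YEAR are assumptions invented for this example.

```python
# Hypothetical fixed-width record: columns 0-9 hold a name, 10-13 a year.
record = "Guido".ljust(10) + "1991"

NAME = slice(0, 10)
YEAR = slice(10, 14)

name = record[NAME].strip()
year = int(record[YEAR])
print(name, year)

# indices() maps a slice onto a sequence of a given length, clipping as needed.
clipped = slice(5, 50, 2).indices(10)
print(clipped)
```

If the layout changes, only the two slice definitions need to be updated, not every place the fields are read.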

Answer 17

1. Slice Notation

To make it simple, remember slice has only one form:

s[start:end:step]

and here is how it works:

  • s: an object that can be sliced
  • start: the first index of the iteration (included)
  • end: the index at which iteration stops; NOTE that the element at end is not included in the resulting slice
  • step: pick every step-th element

Another important thing: start, end, and step can all be omitted! If they are omitted, their default values are used: 0, len(s), and 1 respectively (for a positive step).

So possible variations are:

# Mostly used variations
s[start:end]
s[start:]
s[:end]

# Step-related variations
s[:end:step]
s[start::step]
s[::step]

# Make a copy
s[:]

NOTE: If start >= end (considering only step > 0), Python returns an empty slice [].

2. Pitfalls

The above part explains the core features of how slicing works, and it will work on most occasions. However, there are pitfalls you should watch out for, and this part explains them.

Negative indexes

The very first thing that confuses Python learners is that an index can be negative! Don’t panic: a negative index means counting backwards from the end.

For example:

s[-5:]    # Start at the 5th index from the end of array,
          # thus returning the last 5 elements.
s[:-5]    # Start at index 0 and stop before the 5th index from the end of the array,
          # thus returning s[0:len(s)-5].

Negative step

To make things more confusing, step can be negative too!

A negative step means iterating over the array backwards, from start down towards end: the start index is included and the end index is excluded from the result.

NOTE: when step is negative, the default value for start is the last index, len(s) - 1 (writing len(s) gives the same result because it is clipped), while the default for end is not 0 but "one position before index 0", because s[::-1] contains s[0]. For example:

s[::-1]            # Reversed slice
s[len(s)::-1]      # The same as above, reversed slice
s[0:len(s):-1]     # Empty list
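
The defaults for a negative step can be inspected with slice.indices(), which reports the concrete start, end, and step Python uses for a sequence of a given length. This is a small sketch using a ten-element list chosen for illustration.

```python
s = list(range(10))

# With step=-1 and nothing else given, iteration starts at the last index
# (len(s) - 1) and stops "before index 0", which indices() reports as -1.
defaults = slice(None, None, -1).indices(len(s))
print(defaults)

# An explicit start of len(s) is clipped to the last valid index.
assert s[len(s)::-1] == s[::-1]

# An explicit end of 0 excludes s[0]; only omitting end reaches it.
assert s[:0:-1] == [9, 8, 7, 6, 5, 4, 3, 2, 1]
assert s[::-1][-1] == 0
```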

Out of range error?

Surprise: slicing does not raise an IndexError when an index is out of range!

If the index is out of range, Python will try its best to set the index to 0 or len(s) according to the situation. For example:

s[:len(s)+5]      # The same as s[:len(s)]
s[-len(s)-5::]    # The same as s[0:]
s[len(s)+5::-1]   # The same as s[len(s)::-1], and the same as s[::-1]
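
The three equivalences above can be checked directly. The sketch below also contrasts them with plain indexing, which is not clipped.

```python
s = list(range(10))

assert s[:len(s) + 5] == s            # end is clipped to len(s)
assert s[-len(s) - 5:] == s           # start is clipped to 0
assert s[len(s) + 5::-1] == s[::-1]   # start is clipped to the last index

# Plain indexing, by contrast, is not clipped.
try:
    s[len(s) + 5]
    raised = False
except IndexError:
    raised = True
print(raised)
```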

3. Examples

Let’s finish this answer with examples, explaining everything we have discussed:

# Create our array for demonstration
In [1]: s = [i for i in range(10)]

In [2]: s
Out[2]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [3]: s[2:]   # From index 2 to last index
Out[3]: [2, 3, 4, 5, 6, 7, 8, 9]

In [4]: s[:8]   # From index 0 up to index 8
Out[4]: [0, 1, 2, 3, 4, 5, 6, 7]

In [5]: s[4:7]  # From index 4 (included) up to index 7(excluded)
Out[5]: [4, 5, 6]

In [6]: s[:-2]  # Up to second last index (negative index)
Out[6]: [0, 1, 2, 3, 4, 5, 6, 7]

In [7]: s[-2:]  # From second last index (negative index)
Out[7]: [8, 9]

In [8]: s[::-1] # From last to first in reverse order (negative step)
Out[8]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In [9]: s[::-2] # All odd numbers in reversed order
Out[9]: [9, 7, 5, 3, 1]

In [11]: s[-2::-2] # All even numbers in reversed order
Out[11]: [8, 6, 4, 2, 0]

In [12]: s[3:15]   # End is out of range, and Python will set it to len(s).
Out[12]: [3, 4, 5, 6, 7, 8, 9]

In [14]: s[5:1]    # Start > end; return empty list
Out[14]: []

In [15]: s[11]     # Access index 11 (greater than len(s)) will raise an IndexError
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-15-79ffc22473a3> in <module>()
----> 1 s[11]

IndexError: list index out of range

Answer 18

The previous answers don’t discuss multi-dimensional array slicing which is possible using the famous NumPy package:

Slicing can also be applied to multi-dimensional arrays.

# Here, a is a NumPy array

>>> a
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
>>> a[:2, 0:3:2]
array([[1, 3],
       [5, 7]])

The “:2” before the comma operates on the first dimension and the “0:3:2” after the comma operates on the second dimension.
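
To see what each part of the index does without installing NumPy, the same selection can be sketched on a plain list of lists. This is only an emulation for illustration: nested lists do not accept a[:2, 0:3:2] directly, so the row slice and the column slice are applied in two separate steps.

```python
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

# Equivalent of NumPy's a[:2, 0:3:2] for a list of lists:
# ":2" selects rows, "0:3:2" selects columns within each selected row.
result = [row[0:3:2] for row in a[:2]]
print(result)
```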


Answer 19

#!/usr/bin/env python3

def fmt(value):
    """Format one slice field: blank if None, else right-aligned in 2 columns."""
    return '  ' if value is None else '{:2d}'.format(value)


def slicegraphical(s, lista):
    if len(s) > 9:
        print("Enter a string of at most 9 characters, so the output lines up nicely")
        return

    # Draw the boxes with the characters inside.
    border = '  ' + '+---' * len(s) + '+'
    print(border)
    print('  ' + ''.join('| {} '.format(letter) for letter in s) + '|')
    print(border)

    # Non-negative boundary positions, then negative indexes.
    print('  ' + ''.join('{}   '.format(i) for i in range(len(s) + 1)))
    print(' ' + ''.join('{}  '.format(i) for i in range(-len(s), 0)))
    print()

    for triada in lista:
        if len(triada) == 1:
            # A single value is plain indexing, not a slice.
            label = '[{} ]'.format(fmt(triada[0]))
            result = s[triada[0]]
        else:
            # Two or three values become s[start:stop] or s[start:stop:step].
            label = '[' + ':'.join(fmt(v) + ' ' for v in triada) + ']'
            result = s[slice(*triada)]
        print('{}{:<13} =  {}'.format(s, label, result))


if __name__ == '__main__':
    # Change "s" to whatever string you like; make it 9 characters for
    # the best representation.
    s = 'COMPUTERS'

    # Add different lists to this list to experiment with indexes.
    # To represent e.g. s[::], use [None, None, None]; for s[2:] use [2, None].

    lista = [[4,7],[2,5,2],[-5,1,-1],[4],[-4,-6,-1], [2,-3,1],[2,-3,-1], [None,None,-1],[-5,None],[-5,0,-1],[-5,None,-1],[-1,1,-2]]

    slicegraphical(s, lista)

You can run this script and experiment with it; below are some samples that I got from the script.

  +---+---+---+---+---+---+---+---+---+
  | C | O | M | P | U | T | E | R | S |
  +---+---+---+---+---+---+---+---+---+
  0   1   2   3   4   5   6   7   8   9   
 -9  -8  -7  -6  -5  -4  -3  -2  -1 

COMPUTERS[ 4 : 7 ]     =  UTE
COMPUTERS[ 2 : 5 : 2 ] =  MU
COMPUTERS[-5 : 1 :-1 ] =  UPM
COMPUTERS[ 4 ]         =  U
COMPUTERS[-4 :-6 :-1 ] =  TU
COMPUTERS[ 2 :-3 : 1 ] =  MPUT
COMPUTERS[ 2 :-3 :-1 ] =  
COMPUTERS[   :   :-1 ] =  SRETUPMOC
COMPUTERS[-5 :   ]     =  UTERS
COMPUTERS[-5 : 0 :-1 ] =  UPMO
COMPUTERS[-5 :   :-1 ] =  UPMOC
COMPUTERS[-1 : 1 :-2 ] =  SEUM

When using a negative step, notice that the answer is shifted to the right by 1.


Answer 20

My brain seems happy to accept that lst[start:end] contains the start-th item. I might even say that it is a ‘natural assumption’.

But occasionally a doubt creeps in and my brain asks for reassurance that it does not contain the end-th element.

In these moments I rely on this simple theorem:

for any n,    lst = lst[:n] + lst[n:]

This pretty property tells me that lst[start:end] does not contain the end-th item because it is in lst[end:].

Note that this theorem is true for any n at all. For example, you can check that

lst = list(range(10))
lst[:-42] + lst[-42:] == lst

returns True.
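
The property is easy to test exhaustively for a small list. The sketch below tries every cut point n from -42 to 42, which covers positions inside the list as well as far outside it; the range of n is an arbitrary choice for illustration.

```python
lst = list(range(10))

# Try cut points well outside the valid index range as well as inside it.
for n in range(-42, 43):
    assert lst[:n] + lst[n:] == lst

holds = all(lst[:n] + lst[n:] == lst for n in range(-42, 43))
print(holds)
```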


Answer 21

In my opinion, you will understand and memorize the Python string slicing notation better if you look at it the following way (read on).

Let’s work with the following string …

azString = "abcdefghijklmnopqrstuvwxyz"

For those who don’t know, you can create any substring from azString using the notation azString[x:y]

Coming from other programming languages, that’s when the common sense gets compromised. What are x and y?

I had to sit down and run several scenarios in my quest for a memorization technique that will help me remember what x and y are and help me slice strings properly at the first attempt.

My conclusion is that x and y should be seen as the boundary indexes that surround the string we want to extract. So we should read the expression as azString[index1:index2] or, even more clearly, as azString[index_of_first_character:index_after_the_last_character].

Here is an example visualization of that …

Letters   a b c d e f g h i j ...
         ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑ ↑
             ┊           ┊
Indexes  0 1 2 3 4 5 6 7 8 9 ...
             ┊           ┊
cdefgh    index1       index2

So all you have to do is set index1 and index2 to the values that surround the desired substring. For instance, to get the substring “cdefgh”, you can use azString[2:8], because the index on the left side of “c” is 2 and the one on the right side of “h” is 8.

Remember that we are setting the boundaries. And those boundaries are the positions where you could place some brackets that will be wrapped around the substring like this …

a b [ c d e f g h ] i j

That trick works all the time and is easy to memorize.
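
The boundary idea can be sketched as a tiny helper that computes index1 and index2 from the first and last character of the wanted substring. The function name between is an assumption made for this example, and the helper relies on each character occurring only once in the string, as is the case for azString.

```python
azString = "abcdefghijklmnopqrstuvwxyz"


def between(text, first_char, last_char):
    """Slice text from first_char through last_char (both included)."""
    index1 = text.index(first_char)        # boundary to the left of first_char
    index2 = text.index(last_char) + 1     # boundary to the right of last_char
    return text[index1:index2]


print(between(azString, "c", "h"))
```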


Answer 22

Most of the previous answers clear up questions about slice notation.

The extended indexing syntax used for slicing is aList[start:stop:step].

More slicing examples: 15 Extended Slices
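
A few basic uses of the aList[start:stop:step] syntax are sketched below. The list contents and the particular index values are assumptions chosen for illustration; they are not taken from the original answer.

```python
aList = list(range(10))

start_stop = aList[2:7]        # start and stop
with_step = aList[2:7:2]       # start, stop and step
step_only = aList[::3]         # step only
backwards = aList[7:2:-1]      # a negative step walks backwards

print(start_stop, with_step, step_only, backwards)
```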


Answer 23

In Python, the most basic form for slicing is the following:

l[start:end]

where l is some collection, start is an inclusive index, and end is an exclusive index.

In [1]: l = list(range(10))

In [2]: l[:5] # First five elements
Out[2]: [0, 1, 2, 3, 4]

In [3]: l[-5:] # Last five elements
Out[3]: [5, 6, 7, 8, 9]

When slicing from the start, you can omit the zero index, and when slicing to the end, you can omit the final index since it is redundant, so do not be verbose:

In [5]: l[:3] == l[0:3]
Out[5]: True

In [6]: l[7:] == l[7:len(l)]
Out[6]: True

Negative integers are useful when doing offsets relative to the end of a collection:

In [7]: l[:-1] # Include all elements but the last one
Out[7]: [0, 1, 2, 3, 4, 5, 6, 7, 8]

In [8]: l[-3:] # Take the last three elements
Out[8]: [7, 8, 9]

It is possible to provide indices that are out of bounds when slicing such as:

In [9]: l[:20] # 20 is out of index bounds, and l[20] will raise an IndexError exception
Out[9]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [11]: l[-20:] # -20 is out of index bounds, and l[-20] will raise an IndexError exception
Out[11]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Keep in mind that the result of slicing a collection is a whole new collection. In addition, when using slice notation in assignments, the length of the assigned value does not need to match the length of the slice. The values before and after the assigned slice will be kept, and the collection will shrink or grow to contain the new values:

In [16]: l[2:6] = list('abc') # Assigning fewer elements than the ones contained in the sliced collection l[2:6]

In [17]: l
Out[17]: [0, 1, 'a', 'b', 'c', 6, 7, 8, 9]

In [18]: l[2:5] = list('hello') # Assigning more elements than the ones contained in the sliced collection l [2:5]

In [19]: l
Out[19]: [0, 1, 'h', 'e', 'l', 'l', 'o', 6, 7, 8, 9]

If you omit the start and end index, you will make a copy of the collection:

In [14]: l_copy = l[:]

In [15]: l == l_copy and l is not l_copy
Out[15]: True

If the start and end indexes are omitted when performing an assignment operation, the entire content of the collection will be replaced with a copy of what is referenced:

In [20]: l[:] = list('hello...')

In [21]: l
Out[21]: ['h', 'e', 'l', 'l', 'o', '.', '.', '.']
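
The difference between assigning to l[:] and assigning to l itself shows up when a second name refers to the same list. This is a small sketch; the name alias is introduced only for this example.

```python
l = list(range(5))
alias = l            # second name for the same list object

l[:] = list('xyz')   # replaces the contents in place
assert alias is l and alias == ['x', 'y', 'z']

l = list('abc')      # rebinds the name l to a brand-new list
assert alias is not l and alias == ['x', 'y', 'z']
```

Slice assignment mutates the existing object, so every reference to it sees the change, whereas plain assignment only changes what the name l points to.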

Besides basic slicing, it is also possible to apply the following notation:

l[start:end:step]

where l is a collection, start is an inclusive index, end is an exclusive index, and step is a stride that can be used to take every nth item in l.

In [22]: l = list(range(10))

In [23]: l[::2] # Take the elements whose indexes are even
Out[23]: [0, 2, 4, 6, 8]

In [24]: l[1::2] # Take the elements whose indexes are odd
Out[24]: [1, 3, 5, 7, 9]

Using step provides a useful trick to reverse a collection in Python:

In [25]: l[::-1]
Out[25]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

It is also possible to use other negative integers for step, as in the following example:

In[28]:  l[::-2]
Out[28]: [9, 7, 5, 3, 1]

However, using a negative value for step could become very confusing. Moreover, in order to be Pythonic, you should avoid using start, end, and step in a single slice. In case this is required, consider doing this in two assignments (one to slice, and the other to stride).

In [29]: l = l[::2] # This step is for striding

In [30]: l
Out[30]: [0, 2, 4, 6, 8]

In [31]: l = l[1:-1] # This step is for slicing

In [32]: l
Out[32]: [2, 4, 6]

Answer 24

I want to add one Hello, World! example that explains the basics of slices for the very beginners. It helped me a lot.

Let’s have a list with six values ['P', 'Y', 'T', 'H', 'O', 'N']:

+---+---+---+---+---+---+
| P | Y | T | H | O | N |
+---+---+---+---+---+---+
  0   1   2   3   4   5

Now the simplest slices of that list are its sublists. The notation is [<index>:<index>] and the key is to read it like this:

[ start cutting before this index : end cutting before this index ]

Now if you make a slice [2:5] of the list above, this will happen:

        |           |
+---+---|---+---+---|---+
| P | Y | T | H | O | N |
+---+---|---+---+---|---+
  0   1 | 2   3   4 | 5

You made a cut before the element with index 2 and another cut before the element with index 5. So the result will be a slice between those two cuts, a list ['T', 'H', 'O'].


Answer 25

I personally think about it like a for loop:

a[start:end:step]
# for(i = start; i < end; i += step)

Also, note that negative values for start and end are relative to the end of the list and computed in the example above by given_index + a.shape[0].
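
The for-loop mental model can be written out in Python as a sketch. It deliberately covers only a positive step and indexes that fall within the sequence; the function name slice_like_for_loop is an assumption for this example, and len(a) is used in place of a.shape[0] so that it also works for plain lists.

```python
def slice_like_for_loop(a, start, end, step):
    """Emulate a[start:end:step] for step > 0 and in-range indexes only."""
    n = len(a)
    # Negative indexes count from the end of the sequence.
    if start < 0:
        start += n
    if end < 0:
        end += n
    result = []
    i = start
    while i < end:          # for(i = start; i < end; i += step)
        result.append(a[i])
        i += step
    return result


a = list(range(10))
print(slice_like_for_loop(a, 1, 8, 3))
print(slice_like_for_loop(a, -7, -1, 2))
```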


Answer 26

Below is an example of the indexes of a string:

 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1

text = "Name string"

Slicing example: [start:end:step]

text[start:end] # Items start through end-1
text[start:]    # Items start through the rest of the string
text[:end]      # Items from the beginning through end-1
text[:]         # A copy of the whole string

Below is example usage:

print(text[0])       # N
print(text[0:2])     # Na
print(text[0:7])     # Name st
print(text[0:7:2])   # Nm t
print(text[0:-1:2])  # Nm ti

Answer 27

If you feel negative indices in slicing are confusing, here’s a very easy way to think about it: just replace the negative index with len + index. So for example, replace -3 with len(list) - 3.

The best way to illustrate what slicing does internally is to show it in code that implements the operation:

def my_slice(seq, start=None, end=None, step=1):
  # Simplified model: assumes step > 0 and indexes within range.
  # Take care of missing start/end parameters
  start = 0 if start is None else start
  end = len(seq) if end is None else end

  # Take care of negative start/end parameters
  start = len(seq) + start if start < 0 else start
  end = len(seq) + end if end < 0 else end

  # Now just execute a for-loop with start, end and step
  return [seq[i] for i in range(start, end, step)]

Answer 28

The basic slicing technique is to define the starting point, the stopping point, and the step size – also known as stride.

First, we will create a list of values to use in our slicing.

Create two lists to slice. The first is a numeric list from 1 to 9 (List A). The second is also a numeric list, from 0 to 8 (List B):

A = list(range(1, 10, 1)) # Start, stop, and step
B = list(range(9))

print("This is List A:", A)
print("This is List B:", B)

Index the number 3 from A and the number 6 from B.

print(A[2])
print(B[6])

Basic Slicing

The extended indexing syntax used for slicing is aList[start:stop:step]. In the equivalent slice(start, stop, step) object, the start and step arguments both default to None, so the only required argument is stop. Did you notice this is similar to how range was used to define lists A and B? This is because, according to the Python 3.4 documentation, a slice object represents the set of indices specified by range(start, stop, step).
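The relationship between a slice object and range can be checked directly. This is a small sketch: the variable names `s` and `indices` are illustrative, and `slice.indices()` is used here only to turn the slice into concrete range arguments for a list of this length.

```python
A = list(range(1, 10, 1))

# A slice object built explicitly behaves like the [start:stop:step] syntax.
s = slice(1, 7, 2)
print(A[s] == A[1:7:2])         # True

# The indices it selects are those produced by range(start, stop, step).
indices = list(range(*s.indices(len(A))))
print(indices)                  # [1, 3, 5]
print([A[i] for i in indices])  # [2, 4, 6]

# With only stop given, start and step are None.
print(slice(3))                 # slice(None, 3, None)
```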

As you can see, supplying a single number with no colon, as in A[2], returns one element. This is plain indexing rather than slicing: no start, stop, or step is involved, so only one element is retrieved.

It is important to note that the first element is at index 0, not index 1. This is why we are using two lists for this exercise. List A’s elements are numbered according to their ordinal position (the first element is 1, the second element is 2, etc.), while List B’s elements are the numbers that would be used to index them ([0] for the first element, 0, etc.).

With extended indexing syntax, we retrieve a range of values. For example, all values are retrieved with a colon.

A[:]

To retrieve a subset of elements, the start and stop positions need to be defined.

Given the pattern aList[start:stop], retrieve the first two elements from List A.
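One way to complete this exercise, as a minimal sketch using the List A defined earlier (the name `first_two` is only for illustration):

```python
A = list(range(1, 10, 1))

# start=0 is included, stop=2 is excluded, so indices 0 and 1 are returned.
first_two = A[0:2]
print(first_two)  # [1, 2]

# Omitting start is equivalent, because a missing start means the beginning.
print(A[:2])      # [1, 2]

# A subset from the middle: indices 2, 3 and 4.
print(A[2:5])     # [3, 4, 5]
```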


Answer 29

I don’t think the Python tutorial diagram (cited in various other answers) is a good guide, because its suggestion works for a positive stride but not for a negative stride.

This is the diagram:

 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1

From the diagram, I expect a[-4:-6:-1] to be yP, but it is ty.

>>> a = "Python"
>>> a[2:4:1] # as expected
'th'
>>> a[-4:-6:-1] # off by 1
'ty'

What always works is to think in characters or slots and treat the indices as a half-open interval: right-open for a positive stride, left-open for a negative stride.

This way, I can think of a[-4:-6:-1] as a(-6,-4] in interval terminology.

 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
   0   1   2   3   4   5  
  -6  -5  -4  -3  -2  -1

 +---+---+---+---+---+---+---+---+---+---+---+---+
 | P | y | t | h | o | n | P | y | t | h | o | n |
 +---+---+---+---+---+---+---+---+---+---+---+---+
  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5
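The half-open interval reading can be checked with range, which visits the same indices as the slice when both bounds are in range and have the same sign. This is a sketch under that assumption; the helper name `half_open_indices` is hypothetical.

```python
def half_open_indices(start, stop, step):
    """Indices visited by seq[start:stop:step] for in-range, same-sign bounds."""
    # start is always included and stop is always excluded, whatever the direction.
    return list(range(start, stop, step))


a = "Python"

forward = half_open_indices(2, 4, 1)      # interval [2, 4)
backward = half_open_indices(-4, -6, -1)  # interval (-6, -4], walked right to left

print(forward, "".join(a[i] for i in forward))    # [2, 3] th
print(backward, "".join(a[i] for i in backward))  # [-4, -5] ty
```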