Python List vs. Array: When to Use?

Question: Python List vs. Array: When to Use?

If you are creating a 1d array, you can implement it as a List, or else use the ‘array’ module in the standard library. I have always used Lists for 1d arrays.

What is the reason or circumstance where I would want to use the array module instead?

Is it for performance and memory optimization, or am I missing something obvious?


Answer 0

Basically, Python lists are very flexible and can hold completely heterogeneous, arbitrary data, and they can be appended to very efficiently, in amortized constant time. If you need to shrink and grow your list time-efficiently and without hassle, they are the way to go. But they use a lot more space than C arrays.

The array.array type, on the other hand, is just a thin wrapper on C arrays. It can hold only homogeneous data, all of the same type, and so it uses only sizeof(one object) * length bytes of memory. Mostly, you should use it when you need to expose a C array to an extension or a system call (for example, ioctl or fcntl).
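To make the sizeof(one object) * length point concrete, here is a minimal sketch using only the standard library. The typecode 'i' and the variable names are my own choices for illustration, and the exact item size depends on the platform's C int.

```python
from array import array

# A typed array of C ints; every element occupies exactly a.itemsize bytes.
a = array('i', range(10))

# Payload size is simply item size times element count.
payload_bytes = a.itemsize * len(a)

# buffer_info() returns the memory address of the underlying C buffer and
# the number of elements, which is what C-level code would be handed.
address, length = a.buffer_info()

print(a.itemsize, payload_bytes, length)
```

The address from buffer_info() is only valid while the array exists and is not resized, so treat it as something to pass along immediately rather than store.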

array.array is also a reasonable way to represent a mutable string in Python 2.x (array('B', bytes)). However, Python 2.6+ and 3.x offer a mutable byte string as bytearray.
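A short sketch of both options in Python 3, assuming a small bytes value named data that I made up for the example. Both containers can be edited in place, while the original bytes object stays untouched.

```python
from array import array

data = b"hello"

# Older approach: an array of unsigned bytes built from the bytes object.
arr = array('B', data)
arr[0] = ord('j')          # mutate in place

# Built-in approach: bytearray supports the same in-place edits
# plus the usual bytes-style methods.
ba = bytearray(data)
ba[0] = ord('j')
ba.extend(b"!")

print(arr.tobytes(), bytes(ba), data)
```

For plain byte buffers bytearray is usually the more convenient of the two, since it needs no import and no typecode.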

However, if you want to do math on a homogeneous array of numeric data, then you’re much better off using NumPy, which can automatically vectorize operations on complex multi-dimensional arrays.

To make a long story short: array.array is useful when you need a homogeneous C array of data for reasons other than doing math.


Answer 1

For almost all cases the normal list is the right choice. The array module is more like a thin wrapper over C arrays, which gives you a kind of strongly typed container (see docs), with access to more C-like types such as signed/unsigned short or double, which are not part of the built-in types. I’d say use the array module only if you really need it; in all other cases, stick with lists.
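As a rough illustration of the "strongly typed container" idea, the sketch below tries to append values that do not fit an unsigned short ('H') array. The sample values are arbitrary, and it assumes the usual 16-bit unsigned short.

```python
from array import array

shorts = array('H', [1, 2, 3])   # 'H' = C unsigned short

errors = []
for bad in (-1, 70000, "x"):     # negative, too large, wrong type
    try:
        shorts.append(bad)
    except (OverflowError, TypeError) as exc:
        errors.append(type(exc).__name__)

# The array is unchanged because every append was rejected.
print(list(shorts), errors)
```

A list would have accepted all three values without complaint, which is exactly the flexibility (and the lack of checking) being traded away.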


Answer 2

The array module is kind of one of those things that you probably don’t have a need for if you don’t know why you would use it (and take note that I’m not trying to say that in a condescending manner!). Most of the time, the array module is used to interface with C code. To give you a more direct answer to your question about performance:

Arrays are more efficient than lists for some uses. If you need to allocate an array that you KNOW will not change, then arrays can be faster and use less memory. GvR has an optimization anecdote in which the array module comes out to be the winner (long read, but worth it).

On the other hand, part of the reason why lists eat up more memory than arrays is that Python will allocate a few extra elements when all allocated elements get used. This means that appending items to lists is faster. So if you plan on adding items, a list is the way to go.
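The over-allocation can be observed, at least on CPython, by watching sys.getsizeof while appending. This is only a sketch of that behavior: the growth pattern is an implementation detail, and the count of 64 appends is an arbitrary choice.

```python
import sys

items = []
sizes = []
for i in range(64):
    items.append(i)
    # Size of the list object itself (its pointer table), not its elements.
    sizes.append(sys.getsizeof(items))

# If the list grew by exactly one slot per append, every size would differ.
# With over-allocation the size stays flat for a while and then jumps.
resize_count = len(set(sizes))
print(resize_count, "distinct sizes over", len(sizes), "appends")
```

Far fewer distinct sizes than appends means most appends reused spare capacity instead of triggering a reallocation.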

TL;DR I’d only use an array if you had an exceptional optimization need or you need to interface with C code (and can’t use Pyrex).


Answer 3

It’s a trade-off!

Pros of each one:

list

  • flexible
  • can be heterogeneous

array (e.g., NumPy array)

  • array of uniform values
  • homogeneous
  • compact (in size)
  • efficient (functionality and speed)
  • convenient

Answer 4

My understanding is that arrays are stored more efficiently (i.e. as contiguous blocks of memory vs. pointers to Python objects), but I am not aware of any performance benefit. Additionally, with arrays you must store primitives of the same type, whereas lists can store anything.
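One way to see the storage difference is to compare total footprints on CPython. This is a rough estimate under my own assumptions: it counts the list's pointer table plus each float object, uses 1000 doubles as an arbitrary sample, and ignores allocator overhead.

```python
import sys
from array import array

values = [float(i) for i in range(1000)]
packed = array('d', values)      # 'd' = C double, stored contiguously

# List: the pointer table plus one full Python float object per element.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# Array: one object whose size already includes the raw C buffer.
array_bytes = sys.getsizeof(packed)

print(list_bytes, array_bytes)
```

The exact numbers vary by Python version and platform, but the array should come out several times smaller for numeric data like this.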


Answer 5

The standard library arrays are useful for binary I/O, such as translating a list of ints to a string to write to, say, a wave file. That said, as many have already noted, if you’re going to do any real work then you should consider using NumPy.
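Here is a small sketch of that binary I/O use case with the standard wave module. The sample values, the 8000 Hz frame rate, and the in-memory io.BytesIO buffer (standing in for a real file) are all assumptions made for the example.

```python
import io
import wave
from array import array

# Signed 16-bit samples; 'h' is a C short.
samples = array('h', [0, 1000, -1000, 32767, -32768])

buf = io.BytesIO()               # stands in for a file opened in 'wb' mode
with wave.open(buf, 'wb') as w:
    w.setnchannels(1)            # mono
    w.setsampwidth(samples.itemsize)
    w.setframerate(8000)
    w.writeframes(samples.tobytes())

# Read the frames back and unpack them into a new typed array.
buf.seek(0)
with wave.open(buf, 'rb') as r:
    restored = array('h')
    restored.frombytes(r.readframes(r.getnframes()))

print(list(restored))
```

The conversion between Python ints and packed bytes is a single tobytes() or frombytes() call, which is the convenience the answer is pointing at.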


Answer 6

If you’re going to be using arrays, consider the numpy or scipy packages, which give you arrays with a lot more flexibility.


Answer 7

Arrays can only be used for specific types, whereas lists can be used for any object.

Arrays can also only hold data of one type, whereas a list can have entries of various object types.

Arrays are also more efficient for some numerical computation.


Answer 8

An important difference between a NumPy array and a list is that array slices are views on the original array. This means that the data is not copied, and any modifications to the view will be reflected in the source array.
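A minimal sketch of the view behavior, assuming NumPy is installed (it is a third-party package, not part of the standard library). The variable names and values are just for illustration.

```python
import numpy as np

arr = np.arange(5)        # array([0, 1, 2, 3, 4])
view = arr[1:4]           # basic slicing returns a view, no copy
view[0] = 99              # writes through to arr

lst = list(range(5))
piece = lst[1:4]          # list slicing returns a new list
piece[0] = 99             # lst is unaffected

print(arr.tolist(), lst)
```

If an independent copy of the slice is needed, calling .copy() on it breaks the link to the source array.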


Answer 9

This answer sums up almost all the queries about when to use a list and when to use an array:

  1. The main difference between these two data types is the operations you can perform on them. For example, you can divide an array by 3 and it will divide each element of the array by 3. The same cannot be done with a list.

  2. The list is part of Python’s syntax, so it doesn’t need to be declared, whereas you have to declare an array before using it.

  3. You can store values of different data types in a list (heterogeneous), whereas in an array you can only store values of the same data type (homogeneous).

  4. Arrays are rich in functionality and fast, so they are widely used for arithmetic operations and for storing large amounts of data, compared to lists.

  5. Arrays take less memory than lists.
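A few of these points can be checked with the standard library alone. Note that the element-wise division in point 1 presumably refers to NumPy arrays; the stdlib array.array does not support it either, so this sketch shows the plain-list equivalent instead. All names and values here are my own illustration.

```python
from array import array   # point 2: the array type has to be imported first

numbers = [3, 6, 9]

# Point 1: a list does not support element-wise division.
try:
    numbers / 3
    list_supports_division = True
except TypeError:
    list_supports_division = False

# The usual plain-Python equivalent is a comprehension.
thirds = [n / 3 for n in numbers]

# Point 3: a list can mix types, a typed array cannot.
mixed = [1, "two", 3.0]
try:
    array('d', mixed)
    array_accepts_mixed = True
except TypeError:
    array_accepts_mixed = False

print(list_supports_division, thirds, array_accepts_mixed)
```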