Question: Python: fastest way to create a list of n lists

So I was wondering how to best create a list of blank lists:

[[],[],[]...]

Because multiplying a list copies references to its elements rather than the elements themselves, this doesn’t work:

[[]]*n

This does create [[],[],...] but each element is the same list:

d = [[]]*n
d[0].append(1)
#[[1],[1],...]
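
The aliasing can be checked directly with the `is` operator. The following is a minimal sketch in Python 3 (so `range` stands in for `xrange`); `n = 3` and the variable names are arbitrary choices for illustration.

```python
n = 3  # arbitrary size, for illustration only

# Multiplying the outer list copies the reference, not the inner list.
shared = [[]] * n
shared[0].append(1)

# A comprehension evaluates [] once per iteration, giving n distinct lists.
distinct = [[] for _ in range(n)]
distinct[0].append(1)

print(shared)                      # [[1], [1], [1]]
print(shared[0] is shared[1])      # True
print(distinct)                    # [[1], [], []]
print(distinct[0] is distinct[1])  # False
```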

Something like a list comprehension works:

d = [[] for x in xrange(0,n)]

But this uses the Python VM for looping. Is there any way to use an implied loop (taking advantage of it being written in C)?

d = []
map(lambda n: d.append([]),xrange(0,10))

This is actually slower. :(


Answer 0

Probably the only way that is marginally faster than

d = [[] for x in xrange(n)]

is

from itertools import repeat
d = [[] for i in repeat(None, n)]

It does not have to create a new int object in every iteration and is about 15 % faster on my machine.
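
A comparison like this can be reproduced with `timeit`. The sketch below assumes Python 3 (so `range` replaces `xrange`) and an arbitrary size `n = 1000`; the function names are made up for illustration, and the measured difference will vary with the interpreter version and the machine.

```python
from itertools import repeat
from timeit import timeit

n = 1000  # assumed list size; pick one that matches your workload

def with_range():
    return [[] for _ in range(n)]

def with_repeat():
    # repeat(None, n) yields the same None object n times,
    # so the loop does not need a fresh counter object per iteration.
    return [[] for _ in repeat(None, n)]

# Both builders must produce the same value before timing means anything.
assert with_range() == with_repeat()

for fn in (with_range, with_repeat):
    seconds = timeit(fn, number=1000)
    print(f"{fn.__name__}: {seconds:.4f} s for 1000 runs")
```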

Edit: With NumPy, you can avoid the Python loop:

import numpy
d = numpy.empty((n, 0)).tolist()

but this is actually 2.5 times slower than the list comprehension.


Answer 1

List comprehensions are actually implemented more efficiently than explicit looping, and the map approach has to invoke an opaque callable object on every iteration, which incurs considerable overhead.

Regardless, [[] for _dummy in xrange(n)] is the right way to do it and none of the tiny (if existent at all) speed differences between various other ways should matter. Unless of course you spend most of your time doing this – but in that case, you should work on your algorithms instead. How often do you create these lists?
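
For reference, the three approaches under discussion can be put side by side. This is a sketch in Python 3; the function names are hypothetical, and the comments describe CPython behaviour rather than a language guarantee.

```python
def build_with_loop(n):
    result = []
    for _ in range(n):
        # Looks up the append attribute and makes a method call on every pass.
        result.append([])
    return result

def build_with_comprehension(n):
    # CPython compiles the append step into a dedicated LIST_APPEND instruction.
    return [[] for _ in range(n)]

def build_with_map(n):
    # map has to call the lambda once per element.
    return list(map(lambda _: [], range(n)))

print(build_with_loop(3) == build_with_comprehension(3) == build_with_map(3))  # True
```

All three return n distinct empty lists, so the choice between them comes down to readability and speed only.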


Answer 2

Here are two methods: one sweet and simple (and conceptual), the other more formal and extensible to a variety of situations once a dataset has been read.

Method 1: Conceptual

X2=[]
X1=[1,2,3]
X2.append(X1)
X3=[4,5,6]
X2.append(X3)
X2 now holds [[1,2,3],[4,5,6]], i.e. a list of lists.

Method 2: Formal and extensible

Another way is to build the list of lists from data read from a file. Here train is a dataset with, say, 50 rows and 20 columns, so train[0] is the first row of a CSV file, train[1] is the second row, and so on. I want each of the 50 rows as its own list, without column 0, which is my explained variable here and so must be dropped from the original train dataset. The rows are then collected one after another into a list of lists. Here’s the code that does that.

Note that the inner loop starts at 1 because I am interested in the explanatory variables only. I also re-initialize X1=[] inside the outer loop; otherwise X1 would keep growing across rows, and X2.append(X1[0:(len(train[0])-1)]) would append the first row's values for every row. Besides, it is more memory efficient.

X2=[]
for j in range(0,len(train)):
    X1=[]
    for k in range(1,len(train[0])):
        txt2=train[j][k]
        X1.append(txt2)
    X2.append(X1[0:(len(train[0])-1)])
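
The same result can be had more compactly by slicing each row. The sketch below uses a small made-up `train` (3 rows, 3 columns) as a stand-in for the real dataset, which is not shown in the answer.

```python
# Hypothetical stand-in for the dataset: column 0 is the explained variable.
train = [
    ["y1", "a1", "b1"],
    ["y2", "a2", "b2"],
    ["y3", "a3", "b3"],
]

# row[1:] copies every column except the first into a new list.
X2 = [row[1:] for row in train]

print(X2)  # [['a1', 'b1'], ['a2', 'b2'], ['a3', 'b3']]
```

Because each slice is a fresh list, changing X2 afterwards leaves train untouched.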

Answer 3

To create a list of lists, use the syntax below:

     x = [[] for i in range(10)]

This creates a list of 10 empty lists. To give each inner list an initial value, put it inside the inner brackets ([number]); to set the length of the outer list, put it in range(length).

  • To create a two-dimensional list of lists, use the syntax below.
    x = [[[0] for i in range(3)] for i in range(10)]

This initializes a nested list with 10 rows and 3 columns, where every cell holds the one-element list [0].

  • To access or change an element (the second index must be below 3 here)
    x[1][2]=value
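
A sketch of a variant that stores the value itself in each cell instead of wrapping it in a list. `make_grid` and its parameters are hypothetical names used for illustration, and `fill` is assumed to be an immutable value such as a number, since a mutable `fill` would be shared by every cell.

```python
def make_grid(rows, cols, fill=0):
    """Return a rows x cols list of lists with every cell set to fill."""
    return [[fill for _ in range(cols)] for _ in range(rows)]

grid = make_grid(10, 3)
grid[1][2] = 7  # valid indices are 0..9 for rows and 0..2 for columns

print(grid[1])  # [0, 0, 7]
print(grid[0])  # [0, 0, 0]
```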

Answer 4

So I did some speed comparisons to find the fastest way. List comprehensions are indeed very fast. The only way to get close is to avoid executing bytecode during construction of the list. My first attempt was the following method, which would appear to be faster in principle:

l = [[]]
for _ in range(n): l.extend(map(list,l))

This produces a list of length 2**n, of course. According to timeit, the construction is twice as slow as the list comprehension, for both short and long (a million) lists.

My second attempt was to use starmap to call the list constructor for me. This construction appears to run the list constructor at top speed, but it is still slower than the comprehension, if only by a tiny amount:

from itertools import starmap
l = list(starmap(list,[()]*(1<<n)))

Interestingly, the execution time suggests that it is the final list call that makes the starmap solution slow, since its execution time is almost exactly equal to that of:

l = list([] for _ in range(1<<n))

My third attempt came when I realized that list(()) also produces a list, so I tried the apparently simple:

l = list(map(list, [()]*(1<<n)))

but this was slower than the starmap call.

Conclusion for the speed maniacs: use the list comprehension. Only call functions if you have to. Use builtins.
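
To repeat such measurements, the candidates can be collected in one small harness. This is a sketch only: the size `k`, the dictionary of candidates and the repeat count are assumptions for illustration, and the ranking it prints may differ from the one reported above depending on the Python version and the machine.

```python
from itertools import starmap
from timeit import timeit

k = 1 << 10  # assumed size (2**10 lists); adjust as needed

candidates = {
    "comprehension": lambda: [[] for _ in range(k)],
    "starmap": lambda: list(starmap(list, [()] * k)),
    "generator": lambda: list([] for _ in range(k)),
    "map": lambda: list(map(list, [()] * k)),
}

expected = [[] for _ in range(k)]
for name, build in candidates.items():
    # Check correctness first, then time 200 runs of each builder.
    assert build() == expected
    print(f"{name}: {timeit(build, number=200):.4f} s")
```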

