Why do two identical lists have a different memory footprint?

Question

I created two lists l1 and l2, but each one with a different creation method:

import sys

l1 = [None] * 10
l2 = [None for _ in range(10)]

print('Size of l1 =', sys.getsizeof(l1))
print('Size of l2 =', sys.getsizeof(l2))

But the output surprised me:

Size of l1 = 144
Size of l2 = 192

The list created with a list comprehension takes up more memory, but the two lists are otherwise identical in Python.

Why is that? Is this a CPython implementation detail, or is there some other explanation?

Answer 1

When you write [None] * 10, Python knows that it will need a list of exactly 10 objects, so it allocates exactly that.

When you use a list comprehension, Python doesn’t know how much it will need. So it gradually grows the list as elements are added. For each reallocation it allocates more room than is immediately needed, so that it doesn’t have to reallocate for each element. The resulting list is likely to be somewhat bigger than needed.

You can see this behavior when comparing lists created with similar sizes:

>>> import sys
>>> sys.getsizeof([None]*15)
184
>>> sys.getsizeof([None]*16)
192
>>> sys.getsizeof([None for _ in range(15)])
192
>>> sys.getsizeof([None for _ in range(16)])
192
>>> sys.getsizeof([None for _ in range(17)])
264

You can see that the first method allocates just what is needed, while the second one grows in steps. In this example, the comprehension allocates enough room for 16 elements and has to reallocate when it reaches the 17th.
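To watch the growth happen, here is a minimal sketch that appends one element at a time and records each length at which sys.getsizeof reports a new size. The helper name growth_points is made up for this illustration, and the exact byte counts depend on your CPython version and build, so no expected output is shown.

```python
import sys

def growth_points(n):
    """Append n items one at a time and record each length at which
    sys.getsizeof reports a new size for the list."""
    items = []
    points = [(0, sys.getsizeof(items))]
    for _ in range(n):
        items.append(None)
        size = sys.getsizeof(items)
        if size != points[-1][1]:
            # The reported size changed, so the list was reallocated.
            points.append((len(items), size))
    return points

for length, size in growth_points(40):
    print(f"len={length:2d}  size={size} bytes")
```

The list of growth points should be much shorter than the number of appends, which is the over-allocation described above.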

Answer 2

As noted in this question, a list comprehension uses list.append under the hood, so it goes through the list-resize routine, which over-allocates.

To demonstrate this to yourself, you can actually use the dis disassembler:

>>> code = compile('[x for x in iterable]', '', 'eval')
>>> import dis
>>> dis.dis(code)
  1           0 LOAD_CONST               0 (<code object <listcomp> at 0x10560b810, file "", line 1>)
              2 LOAD_CONST               1 ('<listcomp>')
              4 MAKE_FUNCTION            0
              6 LOAD_NAME                0 (iterable)
              8 GET_ITER
             10 CALL_FUNCTION            1
             12 RETURN_VALUE

Disassembly of <code object <listcomp> at 0x10560b810, file "", line 1>:
  1           0 BUILD_LIST               0
              2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 8 (to 14)
              6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 LIST_APPEND              2
             12 JUMP_ABSOLUTE            4
        >>   14 RETURN_VALUE
>>>

Notice the LIST_APPEND opcode in the disassembly of the <listcomp> code object. From the docs:

LIST_APPEND(i)

Calls list.append(TOS[-i], TOS). Used to implement list comprehensions.
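The disassembly layout differs between CPython versions (from 3.12 on, comprehensions are inlined rather than compiled to a separate <listcomp> code object), so here is a small sketch that checks for the opcode without depending on the layout. The helper opnames is a name invented for this example; it walks the code object and any nested code objects.

```python
import dis
import types

def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names

comp = compile('[x for x in iterable]', '<sketch>', 'eval')
rep = compile('[None] * n', '<sketch>', 'eval')

# Is LIST_APPEND used by the comprehension? By the repetition?
print('LIST_APPEND' in opnames(comp))
print('LIST_APPEND' in opnames(rep))
```

On the CPython versions I am aware of, the comprehension contains LIST_APPEND and the repetition expression does not.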

Now, for the list-repetition operation, we have a hint about what is going on if we consider:

>>> import sys
>>> sys.getsizeof([])
64
>>> 8*10
80
>>> 64 + 80
144
>>> sys.getsizeof([None]*10)
144

So, it seems to be able to exactly allocate the size. Looking at the source code, we see this is exactly what happens:

static PyObject *
list_repeat(PyListObject *a, Py_ssize_t n)
{
    Py_ssize_t i, j;
    Py_ssize_t size;
    PyListObject *np;
    PyObject **p, **items;
    PyObject *elem;
    if (n < 0)
        n = 0;
    if (n > 0 && Py_SIZE(a) > PY_SSIZE_T_MAX / n)
        return PyErr_NoMemory();
    size = Py_SIZE(a) * n;
    if (size == 0)
        return PyList_New(0);
    np = (PyListObject *) PyList_New(size);

Namely, here: size = Py_SIZE(a) * n;. The rest of the function simply fills the array.
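The arithmetic from the session above (size of an empty list plus one pointer per element) can be checked for other lengths with a sketch like the following. It assumes CPython, where sys.getsizeof of a list reports the header plus the allocated pointer slots; expected_repeat_size is a name made up for this example.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize('P')   # size of a C pointer; 8 on a typical 64-bit build
EMPTY_LIST_SIZE = sys.getsizeof([])   # list header only; 64 in the session above

def expected_repeat_size(n):
    """Predicted sys.getsizeof([None] * n) if exactly n slots are allocated."""
    return EMPTY_LIST_SIZE + POINTER_SIZE * n

for n in (1, 10, 15, 16, 100):
    actual = sys.getsizeof([None] * n)
    print(n, actual, expected_repeat_size(n), actual == expected_repeat_size(n))
```

If list repetition allocates exactly size slots, as the C source suggests, the last column should be True for every n.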

Answer 3

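A list does not hold its elements inline. In CPython it stores a pointer to each element (8 bytes on a 64-bit build) after a fixed-size header, and the pointer array may include spare slots. So the size of a list is not the sum of the sizes of its elements. You can see this yourself by running: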

for ele in l2:
    print(sys.getsizeof(ele))

16
16
16
16
16
16
16
16
16
16

These add up to 160 bytes, which is not the size of l2 (192 bytes).

print(sys.getsizeof([None]))
72

And this is much greater than one tenth of the size of l1.

Your numbers may vary with the CPython version and with whether the build is 32-bit or 64-bit. The list's pointer array is one contiguous block, so if the list later grows past its allocated capacity, the array may have to be reallocated and moved.
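One more check that may help here: per the sys.getsizeof documentation, only the memory directly attributed to the object is counted, not the objects it refers to. The sketch below compares two lists of the same length whose elements differ greatly in size; the variable names are just for illustration.

```python
import sys

nones = [None] * 10
big_strings = ['x' * 1000] * 10   # same length, much larger element

# The two lists report the same size even though their elements do not.
print(sys.getsizeof(nones), sys.getsizeof(big_strings))
print(sys.getsizeof(None), sys.getsizeof('x' * 1000))

# Every slot in nones refers to the same single None object.
print(all(item is None for item in nones))
```

So the size of a list tells you about its header and slots, not about what the slots point to.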
