Tag Archives: timeit

Why is it slower to iterate over a small string than a small list?

Question: Why is it slower to iterate over a small string than a small list?

I was playing around with timeit and noticed that doing a simple list comprehension over a small string took longer than doing the same operation on a list of small single character strings. Any explanation? It’s almost 1.35 times as much time.

>>> from timeit import timeit
>>> timeit("[x for x in 'abc']")
2.0691067844831528
>>> timeit("[x for x in ['a', 'b', 'c']]")
1.5286479570345861

What’s happening on a lower level that’s causing this?


Answer 0

TL;DR

  • The actual speed difference is closer to 70% (or more) once a lot of the overhead is removed, for Python 2.

  • Object creation is not at fault. Neither method creates a new object, as one-character strings are cached.

  • The difference is unobvious, but is likely created from a greater number of checks on string indexing, with regards to the type and well-formedness. It is also quite likely thanks to the need to check what to return.

  • List indexing is remarkably fast.



>>> python3 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.388 usec per loop

>>> python3 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.436 usec per loop

This disagrees with what you’ve found…

You must be using Python 2, then.

>>> python2 -m timeit '[x for x in "abc"]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit '[x for x in ["a", "b", "c"]]'
1000000 loops, best of 3: 0.212 usec per loop

Let’s explain the difference between the versions. I’ll examine the compiled code.

For Python 3:

import dis

def list_iterate():
    [item for item in ["a", "b", "c"]]

dis.dis(list_iterate)
#>>>   4           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b118a0, file "", line 4>)
#>>>               3 LOAD_CONST               2 ('list_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('a')
#>>>              12 LOAD_CONST               4 ('b')
#>>>              15 LOAD_CONST               5 ('c')
#>>>              18 BUILD_LIST               3
#>>>              21 GET_ITER
#>>>              22 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              25 POP_TOP
#>>>              26 LOAD_CONST               0 (None)
#>>>              29 RETURN_VALUE

def string_iterate():
    [item for item in "abc"]

dis.dis(string_iterate)
#>>>  21           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d06b17150, file "", line 21>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               3 ('abc')
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE

You see here that the list variant is likely to be slower due to the building of the list each time.

This is the

 9 LOAD_CONST   3 ('a')
12 LOAD_CONST   4 ('b')
15 LOAD_CONST   5 ('c')
18 BUILD_LIST   3

part. The string variant only has

 9 LOAD_CONST   3 ('abc')

You can check that this does seem to make a difference:

def string_iterate():
    [item for item in ("a", "b", "c")]

dis.dis(string_iterate)
#>>>  35           0 LOAD_CONST               1 (<code object <listcomp> at 0x7f4d068be660, file "", line 35>)
#>>>               3 LOAD_CONST               2 ('string_iterate.<locals>.<listcomp>')
#>>>               6 MAKE_FUNCTION            0
#>>>               9 LOAD_CONST               6 (('a', 'b', 'c'))
#>>>              12 GET_ITER
#>>>              13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
#>>>              16 POP_TOP
#>>>              17 LOAD_CONST               0 (None)
#>>>              20 RETURN_VALUE

This produces just

 9 LOAD_CONST               6 (('a', 'b', 'c'))

as tuples are immutable. Test:

>>> python3 -m timeit '[x for x in ("a", "b", "c")]'
1000000 loops, best of 3: 0.369 usec per loop

Great, back up to speed.

For Python 2:

def list_iterate():
    [item for item in ["a", "b", "c"]]

dis.dis(list_iterate)
#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('a')
#>>>               6 LOAD_CONST               2 ('b')
#>>>               9 LOAD_CONST               3 ('c')
#>>>              12 BUILD_LIST               3
#>>>              15 GET_ITER            
#>>>         >>   16 FOR_ITER                12 (to 31)
#>>>              19 STORE_FAST               0 (item)
#>>>              22 LOAD_FAST                0 (item)
#>>>              25 LIST_APPEND              2
#>>>              28 JUMP_ABSOLUTE           16
#>>>         >>   31 POP_TOP             
#>>>              32 LOAD_CONST               0 (None)
#>>>              35 RETURN_VALUE        

def string_iterate():
    [item for item in "abc"]

dis.dis(string_iterate)
#>>>   2           0 BUILD_LIST               0
#>>>               3 LOAD_CONST               1 ('abc')
#>>>               6 GET_ITER            
#>>>         >>    7 FOR_ITER                12 (to 22)
#>>>              10 STORE_FAST               0 (item)
#>>>              13 LOAD_FAST                0 (item)
#>>>              16 LIST_APPEND              2
#>>>              19 JUMP_ABSOLUTE            7
#>>>         >>   22 POP_TOP             
#>>>              23 LOAD_CONST               0 (None)
#>>>              26 RETURN_VALUE        

The odd thing is that we have the same building of the list, but it’s still faster for this. Python 2 is acting strangely fast.

Let’s remove the comprehensions and re-time. The _ = is to prevent it getting optimised out.

>>> python3 -m timeit '_ = ["a", "b", "c"]'
10000000 loops, best of 3: 0.0707 usec per loop

>>> python3 -m timeit '_ = "abc"'
100000000 loops, best of 3: 0.0171 usec per loop

We can see that initialization is not significant enough to account for the difference between the versions (those numbers are small)! We can thus conclude that Python 3 has slower comprehensions. This makes sense as Python 3 changed comprehensions to have safer scoping.
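
As an aside, here is a minimal sketch of that scoping change (not part of the original benchmark): in Python 2 the comprehension’s loop variable leaked into the enclosing scope, while Python 3 runs the comprehension in its own function-like scope, which is part of the extra cost measured above.

x = 'outer'
[x for x in 'abc']
print(x)   # Python 2 prints 'c' (the loop variable leaked); Python 3 prints 'outer'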

Well, now improve the benchmark (I’m just removing overhead that isn’t iteration). This removes the building of the iterable by pre-assigning it:

>>> python3 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.387 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
1000000 loops, best of 3: 0.368 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           '[x for x in iterable]'
1000000 loops, best of 3: 0.309 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' '[x for x in iterable]'
10000000 loops, best of 3: 0.164 usec per loop

We can check if calling iter is the overhead:

>>> python3 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.099 usec per loop

>>> python3 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.1 usec per loop
>>> python2 -m timeit -s 'iterable = "abc"'           'iter(iterable)'
10000000 loops, best of 3: 0.0913 usec per loop

>>> python2 -m timeit -s 'iterable = ["a", "b", "c"]' 'iter(iterable)'
10000000 loops, best of 3: 0.0854 usec per loop

No. No it is not. The difference is too small, especially for Python 3.

So let’s remove yet more unwanted overhead… by making the whole thing slower! The aim is just to have a longer iteration so the time hides overhead.

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 3.12 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.77 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' '[x for x in iterable]'
100 loops, best of 3: 2.32 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' '[x for x in iterable]'
100 loops, best of 3: 2.09 msec per loop

This hasn’t actually changed much, but it’s helped a little.

So remove the comprehension. It’s overhead that’s not part of the question:

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.71 msec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 1.36 msec per loop
>>> python2 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'for x in iterable: pass'
1000 loops, best of 3: 1.27 msec per loop

>>> python2 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'for x in iterable: pass'
1000 loops, best of 3: 935 usec per loop

That’s more like it! We can get slightly faster still by using deque to iterate. It’s basically the same, but it’s faster:

>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop
>>> python2 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 805 usec per loop

>>> python2 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 438 usec per loop

What impresses me is that Unicode is competitive with bytestrings. We can check this explicitly by trying bytes and unicode in both:

  • bytes

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127)).encode("ascii") for _ in range(100000))' 'deque(iterable, maxlen=0)'                                                                    :(
    1000 loops, best of 3: 571 usec per loop
    
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127)).encode("ascii") for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = b"".join(chr(random.randint(0, 127))                 for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 757 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [chr(random.randint(0, 127))                 for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 438 usec per loop
    

    Here you see Python 3 actually faster than Python 2.

  • unicode

    >>> python3 -m timeit -s 'import random; from collections import deque; iterable = u"".join(   chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 800 usec per loop
    
    >>> python3 -m timeit -s 'import random; from collections import deque; iterable =         [   chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 394 usec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable = u"".join(unichr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 1.07 msec per loop
    
    >>> python2 -m timeit -s 'import random; from collections import deque; iterable =         [unichr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
    1000 loops, best of 3: 469 usec per loop
    

    Again, Python 3 is faster, although this is to be expected (str has had a lot of attention in Python 3).

In fact, the difference here between unicode and bytes is very small, which is impressive.

So let’s analyse this one case, seeing as it’s fast and convenient for me:

>>> python3 -m timeit -s 'import random; from collections import deque; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 777 usec per loop

>>> python3 -m timeit -s 'import random; from collections import deque; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'deque(iterable, maxlen=0)'
1000 loops, best of 3: 405 usec per loop

We can actually rule out Tim Peters’ 10-times-upvoted answer!

>>> foo = iterable[123]
>>> iterable[36] is foo
True

These are not new objects!

But this is worth mentioning: indexing costs. The difference will likely be in the indexing, so remove the iteration and just index:

>>> python3 -m timeit -s 'import random; iterable = "".join(chr(random.randint(0, 127)) for _ in range(100000))' 'iterable[123]'
10000000 loops, best of 3: 0.0397 usec per loop

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable[123]'
10000000 loops, best of 3: 0.0374 usec per loop

The difference seems small, but at least half of the cost is overhead:

>>> python3 -m timeit -s 'import random; iterable =        [chr(random.randint(0, 127)) for _ in range(100000)]' 'iterable; 123'
100000000 loops, best of 3: 0.0173 usec per loop

so the speed difference is sufficient to decide to blame it. I think.

So why is indexing a list so much faster?

Well, I’ll come back to you on that, but my guess is that it’s down to the check for interned strings (or cached characters, if it’s a separate mechanism). That will be less than optimally fast. But I’ll go check the source (although I’m not comfortable in C…) :).


So here’s the source:

static PyObject *
unicode_getitem(PyObject *self, Py_ssize_t index)
{
    void *data;
    enum PyUnicode_Kind kind;
    Py_UCS4 ch;
    PyObject *res;

    if (!PyUnicode_Check(self) || PyUnicode_READY(self) == -1) {
        PyErr_BadArgument();
        return NULL;
    }
    if (index < 0 || index >= PyUnicode_GET_LENGTH(self)) {
        PyErr_SetString(PyExc_IndexError, "string index out of range");
        return NULL;
    }
    kind = PyUnicode_KIND(self);
    data = PyUnicode_DATA(self);
    ch = PyUnicode_READ(kind, data, index);
    if (ch < 256)
        return get_latin1_char(ch);

    res = PyUnicode_New(1, ch);
    if (res == NULL)
        return NULL;
    kind = PyUnicode_KIND(res);
    data = PyUnicode_DATA(res);
    PyUnicode_WRITE(kind, data, 0, ch);
    assert(_PyUnicode_CheckConsistency(res, 1));
    return res;
}

Walking from the top, we’ll have some checks. These are boring. Then some assigns, which should also be boring. The first interesting line is

ch = PyUnicode_READ(kind, data, index);

but we’d hope that is fast, as we’re reading from a contiguous C array by indexing it. The result, ch, will be less than 256 so we’ll return the cached character in get_latin1_char(ch).

So we’ll run (dropping the first checks)

kind = PyUnicode_KIND(self);
data = PyUnicode_DATA(self);
ch = PyUnicode_READ(kind, data, index);
return get_latin1_char(ch);

Where

#define PyUnicode_KIND(op) \
    (assert(PyUnicode_Check(op)), \
     assert(PyUnicode_IS_READY(op)),            \
     ((PyASCIIObject *)(op))->state.kind)

(which is boring because the asserts get compiled out in non-debug builds [so I can assume they’re fast], and ((PyASCIIObject *)(op))->state.kind is (I think) an indirection and a C-level cast);

#define PyUnicode_DATA(op) \
    (assert(PyUnicode_Check(op)), \
     PyUnicode_IS_COMPACT(op) ? _PyUnicode_COMPACT_DATA(op) :   \
     _PyUnicode_NONCOMPACT_DATA(op))

(which is also boring for similar reasons, assuming the macros (Something_CAPITALIZED) are all fast),

#define PyUnicode_READ(kind, data, index) \
    ((Py_UCS4) \
    ((kind) == PyUnicode_1BYTE_KIND ? \
        ((const Py_UCS1 *)(data))[(index)] : \
        ((kind) == PyUnicode_2BYTE_KIND ? \
            ((const Py_UCS2 *)(data))[(index)] : \
            ((const Py_UCS4 *)(data))[(index)] \
        ) \
    ))

(which involves indexes but really isn’t slow at all) and

static PyObject*
get_latin1_char(unsigned char ch)
{
    PyObject *unicode = unicode_latin1[ch];
    if (!unicode) {
        unicode = PyUnicode_New(1, ch);
        if (!unicode)
            return NULL;
        PyUnicode_1BYTE_DATA(unicode)[0] = ch;
        assert(_PyUnicode_CheckConsistency(unicode, 1));
        unicode_latin1[ch] = unicode;
    }
    Py_INCREF(unicode);
    return unicode;
}

Which confirms my suspicion that:

  • This is cached:

    PyObject *unicode = unicode_latin1[ch];
    
  • This should be fast. The if (!unicode) is not run, so it’s literally equivalent in this case to

    PyObject *unicode = unicode_latin1[ch];
    Py_INCREF(unicode);
    return unicode;
    

Honestly, after testing that the asserts are fast (by disabling them [I think that works on the C-level asserts…]), the only plausibly-slow parts are:

PyUnicode_IS_COMPACT(op)
_PyUnicode_COMPACT_DATA(op)
_PyUnicode_NONCOMPACT_DATA(op)

Which are:

#define PyUnicode_IS_COMPACT(op) \
    (((PyASCIIObject*)(op))->state.compact)

(fast, as before),

#define _PyUnicode_COMPACT_DATA(op)                     \
    (PyUnicode_IS_ASCII(op) ?                   \
     ((void*)((PyASCIIObject*)(op) + 1)) :              \
     ((void*)((PyCompactUnicodeObject*)(op) + 1)))

(fast if the macro IS_ASCII is fast), and

#define _PyUnicode_NONCOMPACT_DATA(op)                  \
    (assert(((PyUnicodeObject*)(op))->data.any),        \
     ((((PyUnicodeObject *)(op))->data.any)))

(also fast as it’s an assert plus an indirection plus a cast).

So we’re down (the rabbit hole) to:

PyUnicode_IS_ASCII

which is

#define PyUnicode_IS_ASCII(op)                   \
    (assert(PyUnicode_Check(op)),                \
     assert(PyUnicode_IS_READY(op)),             \
     ((PyASCIIObject*)op)->state.ascii)

Hmm… that seems fast too…


Well, OK, but let’s compare it to PyList_GetItem. (Yeah, thanks Tim Peters for giving me more work to do :P.)

PyObject *
PyList_GetItem(PyObject *op, Py_ssize_t i)
{
    if (!PyList_Check(op)) {
        PyErr_BadInternalCall();
        return NULL;
    }
    if (i < 0 || i >= Py_SIZE(op)) {
        if (indexerr == NULL) {
            indexerr = PyUnicode_FromString(
                "list index out of range");
            if (indexerr == NULL)
                return NULL;
        }
        PyErr_SetObject(PyExc_IndexError, indexerr);
        return NULL;
    }
    return ((PyListObject *)op) -> ob_item[i];
}

We can see that on non-error cases this is just going to run:

PyList_Check(op)
Py_SIZE(op)
((PyListObject *)op) -> ob_item[i]

Where PyList_Check is

#define PyList_Check(op) \
     PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LIST_SUBCLASS)

(TABS! TABS!!!) (issue21587) That got fixed and merged in 5 minutes. Like… yeah. Damn. They put Skeet to shame.

#define Py_SIZE(ob)             (((PyVarObject*)(ob))->ob_size)
#define PyType_FastSubclass(t,f)  PyType_HasFeature(t,f)
#ifdef Py_LIMITED_API
#define PyType_HasFeature(t,f)  ((PyType_GetFlags(t) & (f)) != 0)
#else
#define PyType_HasFeature(t,f)  (((t)->tp_flags & (f)) != 0)
#endif

So this is normally really trivial (two indirections and a couple of boolean checks) unless Py_LIMITED_API is on, in which case… ???

Then there’s the indexing and a cast (((PyListObject *)op) -> ob_item[i]) and we’re done.

So there are definitely fewer checks for lists, and the small speed differences certainly imply that it could be relevant.


I think in general, there’s just more type-checking and indirection (->) for Unicode. It seems I’m missing a point, but what?
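
As a small Python-level companion to the TL;DR point that one-character strings are cached (a CPython implementation detail, so treat this as an observation rather than a language guarantee):

s = "abc" * 1000
print(s[0] is s[3])          # True on CPython: both are the cached 'a'
print(s[0] is "a")           # typically also True on CPython
chars = list(s)
print(chars[0] is chars[3])  # True: the list stores references to that same cached object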


Answer 1

When you iterate over most container objects (lists, tuples, dicts, …), the iterator delivers the objects in the container.

But when you iterate over a string, a new object has to be created for each character delivered – a string is not “a container” in the same sense a list is a container. The individual characters in a string don’t exist as distinct objects before iteration creates those objects.
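
A minimal sketch of that distinction (the identity check is a CPython observation, not a language guarantee):

lst = ['a', 'b', 'c']
print(next(iter(lst)) is lst[0])  # True: the list iterator hands back the object already stored

s = 'abc'
first = next(iter(s))             # the string has no per-character objects to hand back;
print(type(first), first)         # a length-1 str must be produced for each step
                                  # (CPython may return a cached one, as the answer above notes)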


Answer 2

You could be incurring an overhead for creating the iterator for the string, whereas the array already contains an iterator upon instantiation.

EDIT:

>>> timeit("[x for x in ['a','b','c']]")
0.3818681240081787
>>> timeit("[x for x in 'abc']")
0.3732869625091553

This was run using 2.7, but on my MacBook Pro i7. The discrepancy could be the result of a system configuration difference.


How can I time code segments with Python’s timeit to test performance?

Question: How can I time code segments with Python’s timeit to test performance?

I’ve a python script which works just as it should, but I need to write the execution time. I’ve googled that I should use timeit but I can’t seem to get it to work.

My Python script looks like this:

import sys
import getopt
import timeit
import random
import os
import re
import ibm_db
import time
from string import maketrans
myfile = open("results_update.txt", "a")

for r in range(100):
    rannumber = random.randint(0, 100)

    update = "update TABLE set val = %i where MyCount >= '2010' and MyCount < '2012' and number = '250'" % rannumber
    #print rannumber

    conn = ibm_db.pconnect("dsn=myDB","usrname","secretPWD")

for r in range(5):
    print "Run %s\n" % r        
    ibm_db.execute(query_stmt)
 query_stmt = ibm_db.prepare(conn, update)

myfile.close()
ibm_db.close(conn)

What I need is the time it takes to execute the query and write it to the file results_update.txt. The purpose is to test an update statement for my database with different indexes and tuning mechanisms.


Answer 0

You can use time.time() or time.clock() before and after the block you want to time.

import time

t0 = time.time()
code_block
t1 = time.time()

total = t1-t0

This method is not as exact as timeit (it does not average several runs) but it is straightforward.

time.time() (in Windows and Linux) and time.clock() (in Linux) are not precise enough for fast functions (you get total = 0). In this case, or if you want to average the elapsed time over several runs, you have to call the function multiple times manually (as I think you already do in your example code, and as timeit does automatically when you set its number argument):

import time

def myfast():
   code

n = 10000
t0 = time.time()
for i in range(n): myfast()
t1 = time.time()

total_n = t1-t0

In Windows, as Corey stated in the comment, time.clock() has much higher precision (microsecond instead of second) and is preferred over time.time().
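
On Python 3.3 and later, time.perf_counter() is the usual portable high-resolution choice for this kind of wall-clock measurement (time.clock() was eventually deprecated and removed in Python 3.8); a minimal sketch:

import time

t0 = time.perf_counter()
sum(range(10**6))            # stand-in for the code block being timed
t1 = time.perf_counter()

print("elapsed: %.6f s" % (t1 - t0))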


Answer 1

If you are profiling your code and can use IPython, it has the magic function %timeit.

%%timeit operates on cells.

In [2]: %timeit cos(3.14)
10000000 loops, best of 3: 160 ns per loop

In [3]: %%timeit
   ...: cos(3.14)
   ...: x = 2 + 3
   ...: 
10000000 loops, best of 3: 196 ns per loop

Answer 2

Quite apart from the timing, this code you show is simply incorrect: you execute 100 connections (completely ignoring all but the last one), and then when you do the first execute call you pass it a local variable query_stmt which you only initialize after the execute call.

First, make your code correct, without worrying about timing yet: i.e. a function that makes or receives a connection and performs 100 or 500 or whatever number of updates on that connection, then closes the connection. Once you have your code working correctly is the correct point at which to think about using timeit on it!
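
As a rough, untested sketch of that restructuring (the ibm_db calls are copied from the question; the function name and everything else here are assumptions for illustration only):

import random
import ibm_db

def run_updates(conn, n=100):
    # perform n updates on an already-open connection
    for _ in range(n):
        rannumber = random.randint(0, 100)
        update = ("update TABLE set val = %i where MyCount >= '2010' "
                  "and MyCount < '2012' and number = '250'" % rannumber)
        stmt = ibm_db.prepare(conn, update)   # prepare first...
        ibm_db.execute(stmt)                  # ...then execute

conn = ibm_db.pconnect("dsn=myDB", "usrname", "secretPWD")
run_updates(conn, 100)
ibm_db.close(conn)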

Specifically, if the function you want to time is a parameter-less one called foobar you can use timeit.timeit (2.6 or later — it’s more complicated in 2.5 and before):

timeit.timeit('foobar()', number=1000)

You’d better specify the number of runs because the default, a million, may be high for your use case (leading to spending a lot of time in this code;-).


Answer 3

Focus on one specific thing. Disk I/O is slow, so I’d take that out of the test if all you are going to tweak is the database query.

And if you need to time your database execution, look for database tools instead, like asking for the query plan, and note that performance varies not only with the exact query and what indexes you have, but also with the data load (how much data you have stored).

That said, you can simply put your code in a function and run that function with timeit.timeit():

def function_to_repeat():
    # ...

duration = timeit.timeit(function_to_repeat, number=1000)

This would disable the garbage collection, repeatedly call the function_to_repeat() function, and time the total duration of those calls using timeit.default_timer(), which is the most accurate available clock for your specific platform.

You should move setup code out of the repeated function; for example, you should connect to the database first, then time only the queries. Use the setup argument to either import or create those dependencies, and pass them into your function:

def function_to_repeat(var1, var2):
    # ...

duration = timeit.timeit(
    'function_to_repeat(var1, var2)',
    'from __main__ import function_to_repeat, var1, var2', 
    number=1000)

would grab the globals function_to_repeat, var1 and var2 from your script and pass those to the function each repetition.
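
On Python 3.5 and later the same thing can be done without the from __main__ import setup string, because timeit.timeit() accepts a globals argument; a small sketch with placeholder names:

import timeit

def function_to_repeat(var1, var2):
    return var1 + var2        # placeholder body

var1, var2 = 1, 2
duration = timeit.timeit('function_to_repeat(var1, var2)',
                         globals=globals(), number=1000)
print(duration)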


Answer 4

I see the question has already been answered, but I still want to add my 2 cents.

I have also faced a similar scenario, in which I had to test the execution times of several approaches, and so I wrote a small script that calls timeit on all the functions written in it.

The script is also available as a GitHub gist.

Hope it will help you and others.

from random import random
import types

def list_without_comprehension():
    l = []
    for i in xrange(1000):
        l.append(int(random()*100 % 100))
    return l

def list_with_comprehension():
    # 1K random numbers between 0 to 100
    l = [int(random()*100 % 100) for _ in xrange(1000)]
    return l


# operations on list_without_comprehension
def sort_list_without_comprehension():
    list_without_comprehension().sort()

def reverse_sort_list_without_comprehension():
    list_without_comprehension().sort(reverse=True)

def sorted_list_without_comprehension():
    sorted(list_without_comprehension())


# operations on list_with_comprehension
def sort_list_with_comprehension():
    list_with_comprehension().sort()

def reverse_sort_list_with_comprehension():
    list_with_comprehension().sort(reverse=True)

def sorted_list_with_comprehension():
    sorted(list_with_comprehension())


def main():
    objs = globals()
    funcs = []
    f = open("timeit_demo.sh", "w+")

    for objname in objs:
        if objname != 'main' and type(objs[objname]) == types.FunctionType:
            funcs.append(objname)
    funcs.sort()
    for func in funcs:
        f.write('''echo "Timing: %(funcname)s"
python -m timeit "import timeit_demo; timeit_demo.%(funcname)s();"\n\n
echo "------------------------------------------------------------"
''' % dict(
                funcname = func,
                )
            )

    f.close()

if __name__ == "__main__":
    main()

    from os import system

    #Works only for *nix platforms
    system("/bin/bash timeit_demo.sh")

    #un-comment below for windows
    #system("cmd timeit_demo.sh")

Answer 5

Here’s a simple wrapper for steven’s answer. This function doesn’t do repeated runs/averaging, just saves you from having to repeat the timing code everywhere :)

'''function which prints the wall time it takes to execute the given command'''
def time_func(func, *args): #*args can take 0 or more 
  import time
  start_time = time.time()
  func(*args)
  end_time = time.time()
  print("it took this long to run: {}".format(end_time-start_time))

Answer 6

The testing suite doesn’t make an attempt at using the imported timeit so it’s hard to tell what the intent was. Nonetheless, this is a canonical answer so a complete example of timeit seems in order, elaborating on Martijn’s answer.

The docs for timeit offer many examples and flags worth checking out. The basic usage on the command line is:

$ python -mtimeit "all(True for _ in range(1000))"
2000 loops, best of 5: 161 usec per loop
$ python -mtimeit "all([True for _ in range(1000)])"
2000 loops, best of 5: 116 usec per loop

Run with -h to see all options. Python MOTW has a great section on timeit that shows how to run modules via import and multiline code strings from the command line.

In script form, I typically use it like this:

import argparse
import copy
import dis
import inspect
import random
import sys
import timeit

def test_slice(L):
    L[:]

def test_copy(L):
    L.copy()

def test_deepcopy(L):
    copy.deepcopy(L)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--n", type=int, default=10 ** 5)
    parser.add_argument("--trials", type=int, default=100)
    parser.add_argument("--dis", action="store_true")
    args = parser.parse_args()
    n = args.n
    trials = args.trials
    namespace = dict(L = random.sample(range(n), k=n))
    funcs_to_test = [x for x in locals().values() 
                     if callable(x) and x.__module__ == __name__]
    print(f"{'-' * 30}\nn = {n}, {trials} trials\n{'-' * 30}\n")

    for func in funcs_to_test:
        fname = func.__name__
        fargs = ", ".join(inspect.signature(func).parameters)
        stmt = f"{fname}({fargs})"
        setup = f"from __main__ import {fname}"
        time = timeit.timeit(stmt, setup, number=trials, globals=namespace)
        print(inspect.getsource(globals().get(fname)))

        if args.dis:
            dis.dis(globals().get(fname))

        print(f"time (s) => {time}\n{'-' * 30}\n")

You can pretty easily drop in the functions and arguments you need. Use caution when using impure functions and take care of state.

Sample output:

$ python benchmark.py --n 10000
------------------------------
n = 10000, 100 trials
------------------------------

def test_slice(L):
    L[:]

time (s) => 0.015502399999999972
------------------------------

def test_copy(L):
    L.copy()

time (s) => 0.01651419999999998
------------------------------

def test_deepcopy(L):
    copy.deepcopy(L)

time (s) => 2.136012
------------------------------

Creating an empty list in Python

Question: Creating an empty list in Python

What is the best way to create a new empty list in Python?

l = [] 

or

l = list()

I am asking this because of two reasons:

  1. Technical reasons, as to which is faster. (creating a class causes overhead?)
  2. Code readability – which one is the standard convention.

Answer 0

Here is how you can test which piece of code is faster:

% python -mtimeit  "l=[]"
10000000 loops, best of 3: 0.0711 usec per loop

% python -mtimeit  "l=list()"
1000000 loops, best of 3: 0.297 usec per loop

However, in practice, this initialization is most likely an extremely small part of your program, so worrying about this is probably wrong-headed.

Readability is very subjective. I prefer [], but some very knowledgable people, like Alex Martelli, prefer list() because it is pronounceable.


Answer 1

list() is inherently slower than [], because

  1. there is symbol lookup (no way for python to know in advance if you did not just redefine list to be something else!),

  2. there is function invocation,

  3. then it has to check whether an iterable argument was passed (so it can create a list with elements from it). There is none in our case, but the “if” check still happens.

In most cases the speed difference won’t make any practical difference, though; the small disassembly sketch below makes the extra work visible.
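
A tiny disassembly sketch that makes the points above visible (exact opcode names vary a little between CPython versions):

import dis

dis.dis(compile("[]", "<demo>", "eval"))      # a single BUILD_LIST 0 (plus the return)
dis.dis(compile("list()", "<demo>", "eval"))  # a name lookup of list followed by a call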


Answer 2

I use [].

  1. It’s faster because the list notation is a short circuit.
  2. Creating a list with items should look about the same as creating a list without, why should there be a difference?

Answer 3

I do not really know about it, but it seems to me, by experience, that jpcgt is actually right. Following example: If I use following code

t = [] # implicit instantiation
t = t.append(1)

in the interpreter, then calling t gives me just “t” without any list, and if I append something else, e.g.

t = t.append(2)

I get the error “‘NoneType’ object has no attribute ‘append'”. If, however, I create the list by

t = list() # explicit instantiation

then it works fine.


回答 4

只是强调@Darkonaut 答案因为我认为它应该更明显。

new_list = [] 或 new_list = list() 两者都可以(忽略性能差异),但 append() 返回 None,因此您不能写 new_list = new_list.append(something)。

这种返回值的设计确实令人困惑。

Just to highlight @Darkonaut answer because I think it should be more visible.

new_list = [] or new_list = list() are both fine (ignoring performance), but append() returns None, as result you can’t do new_list = new_list.append(something).
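A minimal illustration of the point above:

t = []
print(t.append(1))   # None: append mutates the list in place and returns None
print(t)             # [1]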


如何使用timeit模块

问题:如何使用timeit模块

我了解做什么的概念,timeit但不确定如何在代码中实现。

我怎样才能比较两个功能,比方说insertion_sorttim_sort,用timeit

I understand the concept of what timeit does but I am not sure how to implement it in my code.

How can I compare two functions, say insertion_sort and tim_sort, with timeit?


回答 0

timeit的工作方式是先运行一次 setup 代码,然后重复调用一系列语句。因此,如果要测试排序,就需要格外小心,确保一次就地排序不会让下一遍处理的已经是排好序的数据(否则 Timsort 当然会大放异彩,因为它在数据已经部分有序时表现最好)。

这是有关如何设置排序测试的示例:

>>> import timeit

>>> setup = '''
import random

random.seed('slartibartfast')
s = [random.random() for i in range(1000)]
timsort = list.sort
'''

>>> print min(timeit.Timer('a=s[:]; timsort(a)', setup=setup).repeat(7, 1000))
0.334147930145

请注意,这一系列语句在每次通过时都会对未排序的数据进行全新复制。

另外,请注意运行测量套件七次并仅保留最佳时间的计时技术-这确实可以帮助减少由于系统上正在运行其他进程而导致的测量失真。

这些是我正确使用timeit的技巧。希望这可以帮助 :-)

The way timeit works is to run setup code once and then make repeated calls to a series of statements. So, if you want to test sorting, some care is required so that one pass at an in-place sort doesn’t affect the next pass with already sorted data (that, of course, would make the Timsort really shine because it performs best when the data already partially ordered).

Here is an example of how to set up a test for sorting:

>>> import timeit

>>> setup = '''
import random

random.seed('slartibartfast')
s = [random.random() for i in range(1000)]
timsort = list.sort
'''

>>> print min(timeit.Timer('a=s[:]; timsort(a)', setup=setup).repeat(7, 1000))
0.334147930145

Note that the series of statements makes a fresh copy of the unsorted data on every pass.

Also, note the timing technique of running the measurement suite seven times and keeping only the best time — this can really help reduce measurement distortions due to other processes running on your system.

Those are my tips for using timeit correctly. Hope this helps :-)
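Following the same pattern, here is a hedged sketch of how the two sorts from the question could be compared; insertion_sort below is a simple stand-in implementation, not code from the original poster:

import timeit

def insertion_sort(a):
    # plain reference implementation, used only for the comparison
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

setup = '''
import random
from __main__ import insertion_sort

random.seed('slartibartfast')
s = [random.random() for i in range(1000)]
timsort = list.sort
'''

# Each statement copies the unsorted data first, exactly as in the answer above.
print(min(timeit.Timer('a=s[:]; insertion_sort(a)', setup=setup).repeat(7, 10)))
print(min(timeit.Timer('a=s[:]; timsort(a)', setup=setup).repeat(7, 10)))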


回答 1

如果要timeit在交互式Python会话中使用,有两个方便的选项:

  1. 使用IPython Shell。它具有方便的%timeit特殊功能:

    In [1]: def f(x):
       ...:     return x*x
       ...: 
    
    In [2]: %timeit for x in range(100): f(x)
    100000 loops, best of 3: 20.3 us per loop
  2. 在标准的Python解释器中,您可以通过__main__在setup语句中导入它们来访问在交互式会话期间先前定义的函数和其他名称:

    >>> def f(x):
    ...     return x * x 
    ... 
    >>> import timeit
    >>> timeit.repeat("for x in range(100): f(x)", "from __main__ import f",
                      number=100000)
    [2.0640320777893066, 2.0876040458679199, 2.0520210266113281]

If you want to use timeit in an interactive Python session, there are two convenient options:

  1. Use the IPython shell. It features the convenient %timeit special function:

    In [1]: def f(x):
       ...:     return x*x
       ...: 
    
    In [2]: %timeit for x in range(100): f(x)
    100000 loops, best of 3: 20.3 us per loop
    
  2. In a standard Python interpreter, you can access functions and other names you defined earlier during the interactive session by importing them from __main__ in the setup statement:

    >>> def f(x):
    ...     return x * x 
    ... 
    >>> import timeit
    >>> timeit.repeat("for x in range(100): f(x)", "from __main__ import f",
                      number=100000)
    [2.0640320777893066, 2.0876040458679199, 2.0520210266113281]
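On Python 3.5+ there is a third convenient option: pass globals=globals() so the timed statement can see names defined in your session, without the from __main__ import dance. A small sketch:

import timeit

def f(x):
    return x * x

# globals=globals() makes f visible inside the timed statement
print(timeit.timeit("for x in range(100): f(x)", globals=globals(), number=100000))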
    

回答 2

我告诉您一个秘密:使用 timeit 的最佳方式是在命令行上。

在命令行上,timeit 会做恰当的统计分析:它告诉您最短的一次运行花了多长时间。这很好,因为计时中的所有误差都是正的,因此最短时间的误差最小。不可能出现负误差,因为计算机的运行速度永远不可能超过它自身的极限!

因此,命令行界面:

%~> python -m timeit "1 + 2"
10000000 loops, best of 3: 0.0468 usec per loop

这很简单,是吗?

您可以设置以下内容:

%~> python -m timeit -s "x = range(10000)" "sum(x)"
1000 loops, best of 3: 543 usec per loop

也是有用的!

如果需要多行,则可以使用外壳程序的自动延续或使用单独的参数:

%~> python -m timeit -s "x = range(10000)" -s "y = range(100)" "sum(x)" "min(y)"
1000 loops, best of 3: 554 usec per loop

这给出的 setup 为

x = range(10000)
y = range(100)

并对以下语句计时

sum(x)
min(y)

如果脚本较长,您可能会倾向于把 timeit 移到 Python 脚本里使用。我建议避免这样做,因为在命令行上做分析和计时效果更好。相反,我倾向于编写 shell 脚本:

 SETUP="

 ... # lots of stuff

 "

 echo Minmod arr1
 python -m timeit -s "$SETUP" "Minmod(arr1)"

 echo pure_minmod arr1
 python -m timeit -s "$SETUP" "pure_minmod(arr1)"

 echo better_minmod arr1
 python -m timeit -s "$SETUP" "better_minmod(arr1)"

 ... etc

由于要进行多次初始化,因此可能需要更长的时间,但是通常这没什么大不了的。


但是,如果timeit在模块内部使用该怎么办?

好吧,简单的方法是:

def function(...):
    ...

timeit.Timer(function).timeit(number=NUMBER)

这样得到的是运行该次数所需的累积时间(而不是最短时间!)。

为了获得良好的分析效果,请使用 .repeat 并取其中的最小值:

min(timeit.Timer(function).repeat(repeat=REPEATS, number=NUMBER))

通常应将其与 functools.partial 而不是 lambda: ... 结合使用,以降低开销。这样您可以写出类似下面的代码:

from functools import partial

def to_time(items):
    ...

test_items = [1, 2, 3] * 100
times = timeit.Timer(partial(to_time, test_items)).repeat(3, 1000)

# Divide by the number of repeats
time_taken = min(times) / 1000

您也可以:

timeit.timeit("...", setup="from __main__ import ...", number=NUMBER)

这会让您得到更接近命令行界面的用法,但灵活性差得多。"from __main__ import ..." 让您可以在 timeit 创建的人工环境中使用主模块里的代码。

值得注意的是,这只是 Timer(...).timeit(...) 的便捷包装,因此计时效果并不特别好。我个人更喜欢使用上面展示的 Timer(...).repeat(...)。


警告事项

timeit 有一些在任何场合都适用的注意事项。

  • 开销没有被计入。假设您要计时 x += 1,想知道加法需要多长时间:

    >>> python -m timeit -s "x = 0" "x += 1"
    10000000 loops, best of 3: 0.0476 usec per loop

    好吧,它并不是 0.0476 µs,您只知道它比这更短。所有误差都是正的。

    因此,试着找出纯粹的开销:

    >>> python -m timeit -s "x = 0" ""      
    100000000 loops, best of 3: 0.014 usec per loop

    仅计时本身就带来了 30% 的开销!这会严重扭曲相对计时。但您真正关心的只是加法本身;查找 x 的时间也需要计入开销:

    >>> python -m timeit -s "x = 0" "x"
    100000000 loops, best of 3: 0.0166 usec per loop

    差别不大,但是就在那里。

  • 变异方法很危险。

    >>> python -m timeit -s "x = [0]*100000" "while x: x.pop()"
    10000000 loops, best of 3: 0.0436 usec per loop

    但这是完全错误的! x是第一次迭代后的空列表。您需要重新初始化:

    >>> python -m timeit "x = [0]*100000" "while x: x.pop()"
    100 loops, best of 3: 9.79 msec per loop

    但是那样就包含了很多开销,需要单独衡量它。

    >>> python -m timeit "x = [0]*100000"                   
    1000 loops, best of 3: 261 usec per loop

    请注意,在这里减去开销是合理的,仅是因为开销只是时间的一小部分。

    对于你的例子,值得一提的是,插入排序和 Timsort 在已排序的列表上都有很不寻常的计时行为。这意味着如果您不想让计时结果失真,就需要在两次排序之间进行 random.shuffle。

I’ll let you in on a secret: the best way to use timeit is on the command line.

On the command line, timeit does proper statistical analysis: it tells you how long the shortest run took. This is good because all error in timing is positive. So the shortest time has the least error in it. There’s no way to get negative error because a computer can’t ever compute faster than it can compute!

So, the command-line interface:

%~> python -m timeit "1 + 2"
10000000 loops, best of 3: 0.0468 usec per loop

That’s quite simple, eh?

You can set stuff up:

%~> python -m timeit -s "x = range(10000)" "sum(x)"
1000 loops, best of 3: 543 usec per loop

which is useful, too!

If you want multiple lines, you can either use the shell’s automatic continuation or use separate arguments:

%~> python -m timeit -s "x = range(10000)" -s "y = range(100)" "sum(x)" "min(y)"
1000 loops, best of 3: 554 usec per loop

That gives a setup of

x = range(10000)
y = range(100)

and times

sum(x)
min(y)

If you want to have longer scripts you might be tempted to move to timeit inside a Python script. I suggest avoiding that because the analysis and timing is simply better on the command line. Instead, I tend to make shell scripts:

 SETUP="

 ... # lots of stuff

 "

 echo Minmod arr1
 python -m timeit -s "$SETUP" "Minmod(arr1)"

 echo pure_minmod arr1
 python -m timeit -s "$SETUP" "pure_minmod(arr1)"

 echo better_minmod arr1
 python -m timeit -s "$SETUP" "better_minmod(arr1)"

 ... etc

This can take a bit longer due to the multiple initialisations, but normally that’s not a big deal.


But what if you want to use timeit inside your module?

Well, the simple way is to do:

def function(...):
    ...

timeit.Timer(function).timeit(number=NUMBER)

and that gives you cumulative (not minimum!) time to run that number of times.

To get a good analysis, use .repeat and take the minimum:

min(timeit.Timer(function).repeat(repeat=REPEATS, number=NUMBER))

You should normally combine this with functools.partial instead of lambda: ... to lower overhead. Thus you could have something like:

from functools import partial

def to_time(items):
    ...

test_items = [1, 2, 3] * 100
times = timeit.Timer(partial(to_time, test_items)).repeat(3, 1000)

# Divide by the number of repeats
time_taken = min(times) / 1000

You can also do:

timeit.timeit("...", setup="from __main__ import ...", number=NUMBER)

which would give you something closer to the interface from the command-line, but in a much less cool manner. The "from __main__ import ..." lets you use code from your main module inside the artificial environment created by timeit.

It’s worth noting that this is a convenience wrapper for Timer(...).timeit(...) and so isn’t particularly good at timing. I personally far prefer using Timer(...).repeat(...) as I’ve shown above.


Warnings

There are a few caveats with timeit that hold everywhere.

  • Overhead is not accounted for. Say you want to time x += 1, to find out how long addition takes:

    >>> python -m timeit -s "x = 0" "x += 1"
    10000000 loops, best of 3: 0.0476 usec per loop
    

    Well, it’s not 0.0476 µs. You only know that it’s less than that. All error is positive.

    So try and find pure overhead:

    >>> python -m timeit -s "x = 0" ""      
    100000000 loops, best of 3: 0.014 usec per loop
    

    That’s a good 30% overhead just from timing! This can massively skew relative timings. But you only really cared about the adding timings; the look-up timings for x also need to be included in overhead:

    >>> python -m timeit -s "x = 0" "x"
    100000000 loops, best of 3: 0.0166 usec per loop
    

    The difference isn’t much larger, but it’s there.

  • Mutating methods are dangerous.

    >>> python -m timeit -s "x = [0]*100000" "while x: x.pop()"
    10000000 loops, best of 3: 0.0436 usec per loop
    

    But that’s completely wrong! x is the empty list after the first iteration. You’ll need to reinitialize:

    >>> python -m timeit "x = [0]*100000" "while x: x.pop()"
    100 loops, best of 3: 9.79 msec per loop
    

    But then you have lots of overhead. Account for that separately.

    >>> python -m timeit "x = [0]*100000"                   
    1000 loops, best of 3: 261 usec per loop
    

    Note that subtracting the overhead is reasonable here only because the overhead is a small-ish fraction of the time.

    For your example, it’s worth noting that both Insertion Sort and Tim Sort have completely unusual timing behaviours for already-sorted lists. This means you will require a random.shuffle between sorts if you want to avoid wrecking your timings.
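A hedged sketch of what that per-pass shuffle can look like when driving timeit from Python (list.sort is used here as a stand-in for whichever sort you are measuring):

import timeit

setup = "import random; random.seed(0); data = [random.random() for _ in range(1000)]"

# Shuffle inside the timed statement so every pass sorts unsorted data.
# The shuffle cost is included, so time it on its own and subtract if it matters.
print(min(timeit.Timer("random.shuffle(data); data.sort()", setup=setup).repeat(5, 100)))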


回答 3

如果要快速比较两个代码/功能块,可以执行以下操作:

import timeit

start_time = timeit.default_timer()
func1()
print(timeit.default_timer() - start_time)

start_time = timeit.default_timer()
func2()
print(timeit.default_timer() - start_time)

If you want to compare two blocks of code / functions quickly you could do:

import timeit

start_time = timeit.default_timer()
func1()
print(timeit.default_timer() - start_time)

start_time = timeit.default_timer()
func2()
print(timeit.default_timer() - start_time)

回答 4

我发现使用timeit的最简单方法是从命令行:

给定test.py

def InsertionSort(): ...
def TimSort(): ...

运行timeit是这样的:

% python -mtimeit -s'import test' 'test.InsertionSort()'
% python -mtimeit -s'import test' 'test.TimSort()'

I find the easiest way to use timeit is from the command line:

Given test.py:

def InsertionSort(): ...
def TimSort(): ...

run timeit like this:

% python -mtimeit -s'import test' 'test.InsertionSort()'
% python -mtimeit -s'import test' 'test.TimSort()'

回答 5

对我来说,这是最快的方法:

import timeit
def foo():
    print("here is my code to time...")


timeit.timeit(stmt=foo, number=1234567)

for me, this is the fastest way:

import timeit
def foo():
    print("here is my code to time...")


timeit.timeit(stmt=foo, number=1234567)

回答 6

# Generate the integers (simple prime sieve)

def gen_prime(x):
    multiples = []
    results = []
    for i in range(2, x+1):
        if i not in multiples:
            results.append(i)
            for j in range(i*i, x+1, i):
                multiples.append(j)

    return results


import timeit

# Start timing

start_time = timeit.default_timer()
gen_prime(3000)
print(timeit.default_timer() - start_time)

# start_time = timeit.default_timer()
# gen_prime(1001)
# print(timeit.default_timer() - start_time)

回答 7

这很好用:

  python -m timeit -c "$(cat file_name.py)"

This works great:

  python -m timeit -c "$(cat file_name.py)"

回答 8

让我们在下面的每个测试中设置相同的字典,并测量执行时间。

setup 参数用来构建这个字典。

number 表示代码要运行 1000000 次:不包括 setup,只针对 stmt。

运行这段代码时,您可以看到按索引取值比用 get 快得多。您可以多运行几次来验证。

这段代码的作用是获取字典中键 c 的值。

import timeit

print('Getting value of C by index:', timeit.timeit(stmt="mydict['c']", setup="mydict={'a':5, 'b':6, 'c':7}", number=1000000))
print('Getting value of C by get:', timeit.timeit(stmt="mydict.get('c')", setup="mydict={'a':5, 'b':6, 'c':7}", number=1000000))

这是我的结果,您的结果会有所不同。

按索引:0.20900007452246427

通过获取:0.54841166886888

Let's set up the same dictionary in each of the following tests and measure the execution time.

The setup argument builds the dictionary.

number runs the code 1,000,000 times: not the setup, just the stmt.

When you run this you can see that index is way faster than get. You can run it multiple times to see.

The code basically tries to get the value of c in the dictionary.

import timeit

print('Getting value of C by index:', timeit.timeit(stmt="mydict['c']", setup="mydict={'a':5, 'b':6, 'c':7}", number=1000000))
print('Getting value of C by get:', timeit.timeit(stmt="mydict.get('c')", setup="mydict={'a':5, 'b':6, 'c':7}", number=1000000))

Here are my results, yours will differ.

by index: 0.20900007452246427

by get: 0.54841166886888
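Part of that gap is attribute lookup and call overhead rather than the dictionary access itself; a small sketch that pre-binds the bound method in setup narrows, though does not remove, the difference:

import timeit

print('Getting value of C by pre-bound get:', timeit.timeit(
    stmt="mydict_get('c')",
    setup="mydict={'a':5, 'b':6, 'c':7}; mydict_get=mydict.get",
    number=1000000))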


回答 9

只需将整个代码作为timeit的参数传递:

import timeit

print(timeit.timeit(

"""   
limit = 10000
prime_list = [i for i in range(2, limit+1)]

for prime in prime_list:
    for elem in range(prime*2, max(prime_list)+1, prime):
        if elem in prime_list:
            prime_list.remove(elem)
"""   
, number=10))

simply pass your entire code as an argument of timeit:

import timeit

print(timeit.timeit(

"""   
limit = 10000
prime_list = [i for i in range(2, limit+1)]

for prime in prime_list:
    for elem in range(prime*2, max(prime_list)+1, prime):
        if elem in prime_list:
            prime_list.remove(elem)
"""   
, number=10))

回答 10

import timeit


def oct(x):
   return x*x


timeit.Timer("for x in range(100): oct(x)", "gc.enable()").timeit()

回答 11

内置的timeit模块在IPython命令行中效果最佳。

要从模块内计时功能:

from timeit import default_timer as timer
import sys

def timefunc(func, *args, **kwargs):
    """Time a function. 

    args:
        iterations=3

    Usage example:
        timefunc(myfunc, 1, b=2)
    """
    try:
        iterations = kwargs.pop('iterations')
    except KeyError:
        iterations = 3
    elapsed = sys.maxsize
    for _ in range(iterations):
        start = timer()
        result = func(*args, **kwargs)
        elapsed = min(timer() - start, elapsed)
    print(('Best of {} {}(): {:.9f}'.format(iterations, func.__name__, elapsed)))
    return result

The built-in timeit module works best from the IPython command line.

To time functions from within a module:

from timeit import default_timer as timer
import sys

def timefunc(func, *args, **kwargs):
    """Time a function. 

    args:
        iterations=3

    Usage example:
        timefunc(myfunc, 1, b=2)
    """
    try:
        iterations = kwargs.pop('iterations')
    except KeyError:
        iterations = 3
    elapsed = sys.maxsize
    for _ in range(iterations):
        start = timer()
        result = func(*args, **kwargs)
        elapsed = min(timer() - start, elapsed)
    print(('Best of {} {}(): {:.9f}'.format(iterations, func.__name__, elapsed)))
    return result
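A possible usage sketch for the helper above; sum and its argument are arbitrary stand-ins:

# assumes the timefunc helper defined above is in the same module
result = timefunc(sum, range(1000000), iterations=5)
# prints something like: Best of 5 sum(): 0.012345678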

回答 12

如何将Python REPL解释器与接受参数的函数一起使用的示例。

>>> import timeit                                                                                         

>>> def naive_func(x):                                                                                    
...     a = 0                                                                                             
...     for i in range(a):                                                                                
...         a += i                                                                                        
...     return a                                                                                          

>>> def wrapper(func, *args, **kwargs):                                                                   
...     def wrapper():                                                                                    
...         return func(*args, **kwargs)                                                                  
...     return wrapper                                                                                    

>>> wrapped = wrapper(naive_func, 1_000)                                                                  

>>> timeit.timeit(wrapped, number=1_000_000)                                                              
0.4458435332577161                                                                                        

Example of how to use Python REPL interpreter with function that accepts parameters.

>>> import timeit                                                                                         

>>> def naive_func(x):                                                                                    
...     a = 0                                                                                             
...     for i in range(a):                                                                                
...         a += i                                                                                        
...     return a                                                                                          

>>> def wrapper(func, *args, **kwargs):                                                                   
...     def wrapper():                                                                                    
...         return func(*args, **kwargs)                                                                  
...     return wrapper                                                                                    

>>> wrapped = wrapper(naive_func, 1_000)                                                                  

>>> timeit.timeit(wrapped, number=1_000_000)                                                              
0.4458435332577161                                                                                        

回答 13

您将创建两个函数,然后运行与此类似的操作。请注意,您要选择相同的执行/运行次数来比较apple与apple。
这已在Python 3.7下进行了测试。

这是易于复制的代码

#!/usr/local/bin/python3
import timeit

def fibonacci(n):
    """
    Returns the n-th Fibonacci number.
    """
    if(n == 0):
        result = 0
    elif(n == 1):
        result = 1
    else:
        result = fibonacci(n-1) + fibonacci(n-2)
    return result

if __name__ == '__main__':
    import timeit
    t1 = timeit.Timer("fibonacci(13)", "from __main__ import fibonacci")
    print("fibonacci ran:",t1.timeit(number=1000), "milliseconds")

You would create two functions and then run something similar to this. Notice, you want to choose the same number of execution/run to compare apple to apple.
This was tested under Python 3.7.

Here is the code for ease of copying it

#!/usr/local/bin/python3
import timeit

def fibonacci(n):
    """
    Returns the n-th Fibonacci number.
    """
    if(n == 0):
        result = 0
    elif(n == 1):
        result = 1
    else:
        result = fibonacci(n-1) + fibonacci(n-2)
    return result

if __name__ == '__main__':
    import timeit
    t1 = timeit.Timer("fibonacci(13)", "from __main__ import fibonacci")
    print("fibonacci ran:",t1.timeit(number=1000), "milliseconds")

如何在Python中测量经过时间?

问题:如何在Python中测量经过时间?

我想要的是在代码中的某个位置开始计时,然后获取经过的时间,以衡量执行几个函数所花费的时间。我觉得我用错了 timeit 模块,但它的文档让我很困惑。

import timeit

start = timeit.timeit()
print("hello")
end = timeit.timeit()
print(end - start)

What I want is to start counting time somewhere in my code and then get the passed time, to measure the time it took to execute few function. I think I’m using the timeit module wrong, but the docs are just confusing for me.

import timeit

start = timeit.timeit()
print("hello")
end = timeit.timeit()
print(end - start)

回答 0

如果您只想测量两点之间经过的挂钟时间,则可以使用 time.time()

import time

start = time.time()
print("hello")
end = time.time()
print(end - start)

这给出了执行时间(以秒为单位)。

自3.3起,另一个选择可能是使用perf_counterprocess_time,具体取决于您的要求。在3.3之前,建议使用time.clock(感谢Amber)。但是,目前不推荐使用:

在Unix上,以秒为单位返回当前处理器时间,以浮点数表示。精度,实际上是“处理器时间”含义的确切定义,取决于同名C函数的精度。

在Windows上,此函数基于 Win32 函数 QueryPerformanceCounter(),返回自第一次调用该函数以来经过的挂钟秒数,以浮点数表示。分辨率通常优于一微秒。

自版本3.3起弃用:此函数的行为取决于平台:请根据需要改用 perf_counter() 或 process_time(),以获得定义明确的行为。

If you just want to measure the elapsed wall-clock time between two points, you could use time.time():

import time

start = time.time()
print("hello")
end = time.time()
print(end - start)

This gives the execution time in seconds.

Another option since 3.3 might be to use perf_counter or process_time, depending on your requirements. Before 3.3 it was recommended to use time.clock (thanks Amber). However, it is currently deprecated:

On Unix, return the current processor time as a floating point number expressed in seconds. The precision, and in fact the very definition of the meaning of “processor time”, depends on that of the C function of the same name.

On Windows, this function returns wall-clock seconds elapsed since the first call to this function, as a floating point number, based on the Win32 function QueryPerformanceCounter(). The resolution is typically better than one microsecond.

Deprecated since version 3.3: The behaviour of this function depends on the platform: use perf_counter() or process_time() instead, depending on your requirements, to have a well defined behaviour.
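For completeness, the same measurement with perf_counter looks like this (a minimal sketch):

import time

start = time.perf_counter()   # monotonic, high-resolution clock meant for benchmarking
print("hello")
end = time.perf_counter()
print(end - start)            # elapsed seconds as a float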


回答 1

使用timeit.default_timer代替timeit.timeit。前者会自动在您的平台和Python版本上提供最佳时钟:

from timeit import default_timer as timer

start = timer()
# ...
end = timer()
print(end - start) # Time in seconds, e.g. 5.38091952400282

根据操作系统,将timeit.default_timer分配给time.time()或time.clock()。在Python 3.3+ 上,在所有平台上,default_timertime.perf_counter()。参见Python-time.clock()与time.time()-准确性?


Use timeit.default_timer instead of timeit.timeit. The former provides the best clock available on your platform and version of Python automatically:

from timeit import default_timer as timer

start = timer()
# ...
end = timer()
print(end - start) # Time in seconds, e.g. 5.38091952400282

timeit.default_timer is assigned to time.time() or time.clock() depending on OS. On Python 3.3+ default_timer is time.perf_counter() on all platforms. See Python – time.clock() vs. time.time() – accuracy?



回答 2

仅限Python 3:

由于 time.clock() 从 Python 3.3 起已被弃用,您应当像以前使用 time.clock() 那样,改用 time.perf_counter() 进行系统级计时,或用 time.process_time() 进行进程级计时:

import time

t = time.process_time()
#do some stuff
elapsed_time = time.process_time() - t

新功能process_time将不包括睡眠期间经过的时间。

Python 3 only:

Since time.clock() is deprecated as of Python 3.3, you will want to use time.perf_counter() for system-wide timing, or time.process_time() for process-wide timing, just the way you used to use time.clock():

import time

t = time.process_time()
#do some stuff
elapsed_time = time.process_time() - t

The new function process_time will not include time elapsed during sleep.


回答 3

有了您想计时的功能,

test.py:

def foo(): 
    # print "hello"   
    return "hello"

最简单的使用方法timeit是从命令行调用它:

% python -mtimeit -s'import test' 'test.foo()'
1000000 loops, best of 3: 0.254 usec per loop

请勿尝试使用time.timetime.clock(天真)比较功能的速度。他们可能会产生误导性的结果

PS。不要将打印语句放在您希望计时的函数中;否则,测量的时间将取决于终端速度

Given a function you’d like to time,

test.py:

def foo(): 
    # print "hello"   
    return "hello"

the easiest way to use timeit is to call it from the command line:

% python -mtimeit -s'import test' 'test.foo()'
1000000 loops, best of 3: 0.254 usec per loop

Do not try to use time.time or time.clock (naively) to compare the speed of functions. They can give misleading results.

PS. Do not put print statements in a function you wish to time; otherwise the time measured will depend on the speed of the terminal.


回答 4

使用上下文管理器执行此操作很有趣,该上下文管理器会自动记住进入with块时的开始时间,然后冻结块退出时的结束时间。只需一点技巧,您甚至可以通过相同的上下文管理器功能在块内获得运行时间计数。

核心库没有这个(但应该这样做)。放置到位后,您可以执行以下操作:

with elapsed_timer() as elapsed:
    # some lengthy code
    print( "midpoint at %.2f seconds" % elapsed() )  # time so far
    # other lengthy code

print( "all done at %.2f seconds" % elapsed() )

这是足以完成此任务的contextmanager代码:

from contextlib import contextmanager
from timeit import default_timer

@contextmanager
def elapsed_timer():
    start = default_timer()
    elapser = lambda: default_timer() - start
    yield lambda: elapser()
    end = default_timer()
    elapser = lambda: end-start

还有一些可运行的演示代码:

import time

with elapsed_timer() as elapsed:
    time.sleep(1)
    print(elapsed())
    time.sleep(2)
    print(elapsed())
    time.sleep(3)

请注意,按照该函数的设计,elapsed() 的返回值在块退出时被冻结,之后的调用都会返回相同的持续时间(在这个玩具示例中约为6秒)。

It’s fun to do this with a context-manager that automatically remembers the start time upon entry to a with block, then freezes the end time on block exit. With a little trickery, you can even get a running elapsed-time tally inside the block from the same context-manager function.

The core library doesn’t have this (but probably ought to). Once in place, you can do things like:

with elapsed_timer() as elapsed:
    # some lengthy code
    print( "midpoint at %.2f seconds" % elapsed() )  # time so far
    # other lengthy code

print( "all done at %.2f seconds" % elapsed() )

Here’s contextmanager code sufficient to do the trick:

from contextlib import contextmanager
from timeit import default_timer

@contextmanager
def elapsed_timer():
    start = default_timer()
    elapser = lambda: default_timer() - start
    yield lambda: elapser()
    end = default_timer()
    elapser = lambda: end-start

And some runnable demo code:

import time

with elapsed_timer() as elapsed:
    time.sleep(1)
    print(elapsed())
    time.sleep(2)
    print(elapsed())
    time.sleep(3)

Note that by design of this function, the return value of elapsed() is frozen on block exit, and further calls return the same duration (of about 6 seconds in this toy example).


回答 5

以秒为单位的测量时间

from timeit import default_timer as timer
from datetime import timedelta

start = timer()
end = timer()
print(timedelta(seconds=end-start))

输出

0:00:01.946339

Measuring time in seconds:

from timeit import default_timer as timer
from datetime import timedelta

start = timer()
end = timer()
print(timedelta(seconds=end-start))

Output:

0:00:01.946339

回答 6

我更喜欢这种方式。timeit 的文档实在太令人困惑了。

from datetime import datetime 

start_time = datetime.now() 

# INSERT YOUR CODE 

time_elapsed = datetime.now() - start_time 

print('Time elapsed (hh:mm:ss.ms) {}'.format(time_elapsed))

请注意,这里并没有做任何格式化,我只是把 hh:mm:ss 写进了打印输出里,以便读者理解 time_elapsed 的含义。

I prefer this. timeit doc is far too confusing.

from datetime import datetime 

start_time = datetime.now() 

# INSERT YOUR CODE 

time_elapsed = datetime.now() - start_time 

print('Time elapsed (hh:mm:ss.ms) {}'.format(time_elapsed))

Note, that there isn’t any formatting going on here, I just wrote hh:mm:ss into the printout so one can interpret time_elapsed


回答 7

这是执行此操作的另一种方法:

>> from pytictoc import TicToc
>> t = TicToc() # create TicToc instance
>> t.tic() # Start timer
>> # do something
>> t.toc() # Print elapsed time
Elapsed time is 2.612231 seconds.

与传统方式比较:

>> from time import time
>> t1 = time()
>> # do something
>> t2 = time()
>> elapsed = t2 - t1
>> print('Elapsed time is %f seconds.' % elapsed)
Elapsed time is 2.612231 seconds.

安装:

pip install pytictoc

有关更多详细信息,请参阅PyPi页面

Here’s another way to do this:

>> from pytictoc import TicToc
>> t = TicToc() # create TicToc instance
>> t.tic() # Start timer
>> # do something
>> t.toc() # Print elapsed time
Elapsed time is 2.612231 seconds.

Comparing with traditional way:

>> from time import time
>> t1 = time()
>> # do something
>> t2 = time()
>> elapsed = t2 - t1
>> print('Elapsed time is %f seconds.' % elapsed)
Elapsed time is 2.612231 seconds.

Installation:

pip install pytictoc

Refer to the PyPi page for more details.


回答 8

这是我在这里经过许多不错的回答以及其他几篇文章后的发现。

首先,如果您在 timeit 和 time.time 之间犹豫,timeit 有两个优点:

  1. timeit 会选择您的操作系统和 Python 版本上可用的最佳计时器。
  2. timeit 会在计时期间禁用垃圾回收;这可能是您想要的,也可能不是。

现在的问题是,timeit使用起来并不是那么简单,因为它需要设置,并且当您进行大量导入时,情况变得很糟。理想情况下,您只需要一个装饰器或使用with块并测量时间。不幸的是,对此没有内置的功能,因此您有两个选择:

选项1:使用时间预算库

timebudget是一个多功能的,非常简单的库,你可以在一行代码只使用PIP后安装。

@timebudget  # Record how long this function takes
def my_method():
    # my code

选项2:直接使用代码模块

我在下面创建了小实用程序模块。

# utils.py
from functools import wraps
import gc
import timeit

def MeasureTime(f, no_print=False, disable_gc=False):
    @wraps(f)
    def _wrapper(*args, **kwargs):
        gcold = gc.isenabled()
        if disable_gc:
            gc.disable()
        start_time = timeit.default_timer()
        try:
            result = f(*args, **kwargs)
        finally:
            elapsed = timeit.default_timer() - start_time
            if disable_gc and gcold:
                gc.enable()
            if not no_print:
                print('"{}": {}s'.format(f.__name__, elapsed))
        return result
    return _wrapper

class MeasureBlockTime:
    def __init__(self,name="(block)", no_print=False, disable_gc=False):
        self.name = name
        self.no_print = no_print
        self.disable_gc = disable_gc
    def __enter__(self):
        self.gcold = gc.isenabled()
        if self.disable_gc:
            gc.disable()
        self.start_time = timeit.default_timer()
        return self  # return the instance so "with ... as t" can expose t.elapsed later
    def __exit__(self,ty,val,tb):
        self.elapsed = timeit.default_timer() - self.start_time
        if self.disable_gc and self.gcold:
            gc.enable()
        if not self.no_print:
            print('Function "{}": {}s'.format(self.name, self.elapsed))
        return False #re-raise any exceptions

现在,您只需在其前面放置装饰器即可计时任何功能:

import utils

@utils.MeasureTime
def MyBigFunc():
    #do something time consuming
    for i in range(10000):
        print(i)

如果要计时部分代码,只需将其放在代码with块中:

import utils

#somewhere in my code

with utils.MeasureBlockTime("MyBlock"):
    #do something time consuming
    for i in range(10000):
        print(i)

# rest of my code

优点:

网上流传着几个不太完善的版本,所以我想指出几个要点:

  1. 出于前面所述的原因,请使用timeit中的timer代替time.time。
  2. 如果需要,可以在计时期间禁用GC。
  3. 装饰器接受带有已命名或未命名参数的函数。
  4. 能够按块定时禁用打印(先使用with utils.MeasureBlockTime() as t,然后再使用t.elapsed)。
  5. 能够为块定时保持启用gc的能力。

Here are my findings after going through many good answers here as well as a few other articles.

First, if you are debating between timeit and time.time, the timeit has two advantages:

  1. timeit selects the best timer available on your OS and Python version.
  2. timeit disables garbage collection, however, this is not something you may or may not want.

Now the problem is that timeit is not that simple to use because it needs setup and things get ugly when you have a bunch of imports. Ideally, you just want a decorator or use with block and measure time. Unfortunately, there is nothing built-in available for this so you have two options:

Option 1: Use timebudget library

The timebudget is a versatile and very simple library that you can use just in one line of code after pip install.

@timebudget  # Record how long this function takes
def my_method():
    # my code

Option 2: Use code module directly

I created below little utility module.

# utils.py
from functools import wraps
import gc
import timeit

def MeasureTime(f, no_print=False, disable_gc=False):
    @wraps(f)
    def _wrapper(*args, **kwargs):
        gcold = gc.isenabled()
        if disable_gc:
            gc.disable()
        start_time = timeit.default_timer()
        try:
            result = f(*args, **kwargs)
        finally:
            elapsed = timeit.default_timer() - start_time
            if disable_gc and gcold:
                gc.enable()
            if not no_print:
                print('"{}": {}s'.format(f.__name__, elapsed))
        return result
    return _wrapper

class MeasureBlockTime:
    def __init__(self,name="(block)", no_print=False, disable_gc=False):
        self.name = name
        self.no_print = no_print
        self.disable_gc = disable_gc
    def __enter__(self):
        self.gcold = gc.isenabled()
        if self.disable_gc:
            gc.disable()
        self.start_time = timeit.default_timer()
        return self  # return the instance so "with ... as t" can expose t.elapsed later
    def __exit__(self,ty,val,tb):
        self.elapsed = timeit.default_timer() - self.start_time
        if self.disable_gc and self.gcold:
            gc.enable()
        if not self.no_print:
            print('Function "{}": {}s'.format(self.name, self.elapsed))
        return False #re-raise any exceptions

Now you can time any function just by putting a decorator in front of it:

import utils

@utils.MeasureTime
def MyBigFunc():
    #do something time consuming
    for i in range(10000):
        print(i)

If you want to time portion of code then just put it inside with block:

import utils

#somewhere in my code

with utils.MeasureBlockTime("MyBlock"):
    #do something time consuming
    for i in range(10000):
        print(i)

# rest of my code

Advantages:

There are several half-baked versions floating around, so I want to point out a few highlights:

  1. Use timer from timeit instead of time.time for reasons described earlier.
  2. You can disable GC during timing if you want.
  3. Decorator accepts functions with named or unnamed params.
  4. Ability to disable printing in block timing (use with utils.MeasureBlockTime() as t and then t.elapsed).
  5. Ability to keep gc enabled for block timing.
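For example, advantage 4 would look like this, assuming the utils.py module from the answer above (with __enter__ returning self):

import utils

with utils.MeasureBlockTime("MyBlock", no_print=True) as t:
    total = sum(range(100000))

print("elapsed seconds:", t.elapsed)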

回答 9

使用 time.time度量执行可以为您提供命令的整体执行时间,包括计算机上其他进程花费的运行时间。这是用户注意到的时间,但是如果您要比较不同的代码段/算法/函数/ …,则效果不佳。

更新:去年我大量使用了http://pythonhosted.org/line_profiler/,发现它非常有用,建议使用它代替Pythons profile模块。

Using time.time to measure execution gives you the overall execution time of your commands including running time spent by other processes on your computer. It is the time the user notices, but is not good if you want to compare different code snippets / algorithms / functions / …

Update: I used http://pythonhosted.org/line_profiler/ a lot during the last year and find it very helpfull and recommend to use it instead of Pythons profile module.


回答 10

这是一个微小的计时器类,它返回“ hh:mm:ss”字符串:

class Timer:
  def __init__(self):
    self.start = time.time()

  def restart(self):
    self.start = time.time()

  def get_time_hhmmss(self):
    end = time.time()
    m, s = divmod(end - self.start, 60)
    h, m = divmod(m, 60)
    time_str = "%02d:%02d:%02d" % (h, m, s)
    return time_str

用法:

# Start timer
my_timer = Timer()

# ... do something

# Get time string:
time_hhmmss = my_timer.get_time_hhmmss()
print("Time elapsed: %s" % time_hhmmss )

# ... use the timer again
my_timer.restart()

# ... do something

# Get time:
time_hhmmss = my_timer.get_time_hhmmss()

# ... etc

Here is a tiny timer class that returns “hh:mm:ss” string:

class Timer:
  def __init__(self):
    self.start = time.time()

  def restart(self):
    self.start = time.time()

  def get_time_hhmmss(self):
    end = time.time()
    m, s = divmod(end - self.start, 60)
    h, m = divmod(m, 60)
    time_str = "%02d:%02d:%02d" % (h, m, s)
    return time_str

Usage:

# Start timer
my_timer = Timer()

# ... do something

# Get time string:
time_hhmmss = my_timer.get_time_hhmmss()
print("Time elapsed: %s" % time_hhmmss )

# ... use the timer again
my_timer.restart()

# ... do something

# Get time:
time_hhmmss = my_timer.get_time_hhmmss()

# ... etc

回答 11

python cProfile和pstats模块为测量某些功能所经过的时间提供了强大的支持,而无需在现有功能周围添加任何代码。

例如,如果您有python脚本timeFunctions.py:

import time

def hello():
    print "Hello :)"
    time.sleep(0.1)

def thankyou():
    print "Thank you!"
    time.sleep(0.05)

for idx in range(10):
    hello()

for idx in range(100):
    thankyou()

要运行事件探查器并为文件生成统计信息,您可以运行:

python -m cProfile -o timeStats.profile timeFunctions.py

这样做是使用cProfile模块来分析timeFunctions.py中的所有功能,并在timeStats.profile文件中收集统计信息。请注意,我们不必向现有模块(timeFunctions.py)添加任何代码,并且可以使用任何模块来完成此操作。

拥有stats文件后,您可以按以下方式运行pstats模块:

python -m pstats timeStats.profile

这将运行交互式统计浏览器,从而为您提供许多不错的功能。对于您的特定用例,您可以只检查功能的统计信息。在我们的示例中,检查这两个功能的统计信息显示以下内容:

Welcome to the profile statistics browser.
timeStats.profile% stats hello
<timestamp>    timeStats.profile

         224 function calls in 6.014 seconds

   Random listing order was used
   List reduced from 6 to 1 due to restriction <'hello'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       10    0.000    0.000    1.001    0.100 timeFunctions.py:3(hello)

timeStats.profile% stats thankyou
<timestamp>    timeStats.profile

         224 function calls in 6.014 seconds

   Random listing order was used
   List reduced from 6 to 1 due to restriction <'thankyou'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      100    0.002    0.000    5.012    0.050 timeFunctions.py:7(thankyou)

这个虚拟的例子并没有做太多,但是让您知道可以做什么。关于这种方法的最好之处在于,我不必编辑任何现有代码即可获得这些数字,并且显然可以帮助进行性能分析。

The python cProfile and pstats modules offer great support for measuring time elapsed in certain functions without having to add any code around the existing functions.

For example if you have a python script timeFunctions.py:

import time

def hello():
    print "Hello :)"
    time.sleep(0.1)

def thankyou():
    print "Thank you!"
    time.sleep(0.05)

for idx in range(10):
    hello()

for idx in range(100):
    thankyou()

To run the profiler and generate stats for the file you can just run:

python -m cProfile -o timeStats.profile timeFunctions.py

What this is doing is using the cProfile module to profile all functions in timeFunctions.py and collecting the stats in the timeStats.profile file. Note that we did not have to add any code to existing module (timeFunctions.py) and this can be done with any module.

Once you have the stats file you can run the pstats module as follows:

python -m pstats timeStats.profile

This runs the interactive statistics browser which gives you a lot of nice functionality. For your particular use case you can just check the stats for your function. In our example checking stats for both functions shows us the following:

Welcome to the profile statistics browser.
timeStats.profile% stats hello
<timestamp>    timeStats.profile

         224 function calls in 6.014 seconds

   Random listing order was used
   List reduced from 6 to 1 due to restriction <'hello'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       10    0.000    0.000    1.001    0.100 timeFunctions.py:3(hello)

timeStats.profile% stats thankyou
<timestamp>    timeStats.profile

         224 function calls in 6.014 seconds

   Random listing order was used
   List reduced from 6 to 1 due to restriction <'thankyou'>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      100    0.002    0.000    5.012    0.050 timeFunctions.py:7(thankyou)

The dummy example does not do much but give you an idea of what can be done. The best part about this approach is that I dont have to edit any of my existing code to get these numbers and obviously help with profiling.
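If you prefer to stay inside Python rather than the interactive browser, pstats can also be driven programmatically against the same profile file (a small sketch):

import pstats

p = pstats.Stats("timeStats.profile")
p.strip_dirs().sort_stats("cumulative").print_stats("timeFunctions.py")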


回答 12

这是另一个用于计时代码的上下文管理器-

用法:

from benchmark import benchmark

with benchmark("Test 1+1"):
    1+1
=>
Test 1+1 : 1.41e-06 seconds

或者,如果您需要时间值

with benchmark("Test 1+1") as b:
    1+1
print(b.time)
=>
Test 1+1 : 7.05e-07 seconds
7.05233786763e-07

benchmark.py:

from timeit import default_timer as timer

class benchmark(object):

    def __init__(self, msg, fmt="%0.3g"):
        self.msg = msg
        self.fmt = fmt

    def __enter__(self):
        self.start = timer()
        return self

    def __exit__(self, *args):
        t = timer() - self.start
        print(("%s : " + self.fmt + " seconds") % (self.msg, t))
        self.time = t

改编自http://dabeaz.blogspot.fr/2010/02/context-manager-for-timing-benchmarks.html

Here’s another context manager for timing code –

Usage:

from benchmark import benchmark

with benchmark("Test 1+1"):
    1+1
=>
Test 1+1 : 1.41e-06 seconds

or, if you need the time value

with benchmark("Test 1+1") as b:
    1+1
print(b.time)
=>
Test 1+1 : 7.05e-07 seconds
7.05233786763e-07

benchmark.py:

from timeit import default_timer as timer

class benchmark(object):

    def __init__(self, msg, fmt="%0.3g"):
        self.msg = msg
        self.fmt = fmt

    def __enter__(self):
        self.start = timer()
        return self

    def __exit__(self, *args):
        t = timer() - self.start
        print(("%s : " + self.fmt + " seconds") % (self.msg, t))
        self.time = t

Adapted from http://dabeaz.blogspot.fr/2010/02/context-manager-for-timing-benchmarks.html


回答 13

使用 profile 模块。它给出非常详细的性能分析结果。

import profile
profile.run('main()')

它输出类似:

          5 function calls in 0.047 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 :0(exec)
        1    0.047    0.047    0.047    0.047 :0(setprofile)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.047    0.047 profile:0(main())
        1    0.000    0.000    0.000    0.000 two_sum.py:2(twoSum)

我发现它非常有用。

Use profiler module. It gives a very detailed profile.

import profile
profile.run('main()')

it outputs something like:

          5 function calls in 0.047 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 :0(exec)
        1    0.047    0.047    0.047    0.047 :0(setprofile)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.047    0.047 profile:0(main())
        1    0.000    0.000    0.000    0.000 two_sum.py:2(twoSum)

I’ve found it very informative.


回答 14

我喜欢它简单(python 3):

from timeit import timeit

timeit(lambda: print("hello"))

单个执行的输出为微秒

2.430883963010274

说明:timeit 默认将该匿名函数执行一百万次,并以秒为单位给出结果。因此,单次执行的结果在数值上相同,只是读作平均每次调用的微秒数。

对于较慢的操作,请减少迭代次数,否则您可能要等很久:

import time

timeit(lambda: time.sleep(1.5), number=1)

输出始终是全部迭代次数所花费的总秒数:

1.5015795179999714

I like it simple (python 3):

from timeit import timeit

timeit(lambda: print("hello"))

Output is microseconds for a single execution:

2.430883963010274

Explanation: timeit executes the anonymous function 1 million times by default and gives the result in seconds. The result for a single execution is therefore the same number, read as microseconds per call on average.


For slow operations add a lower number of iterations or you could be waiting forever:

import time

timeit(lambda: time.sleep(1.5), number=1)

Output is always in seconds for the total number of iterations:

1.5015795179999714

回答 15

(仅适用于 IPython)可以使用 %timeit 来衡量平均处理时间:

def foo():
    print "hello"

然后:

%timeit foo()

结果是这样的:

10000 loops, best of 3: 27 µs per loop

(With IPython only) you can use %timeit to measure average processing time:

def foo():
    print "hello"

and then:

%timeit foo()

the result is something like:

10000 loops, best of 3: 27 µs per loop

回答 16

使用timeit的另一种方法:

from timeit import timeit

def func():
    return 1 + 1

time = timeit(func, number=1)
print(time)

One more way to use timeit:

from timeit import timeit

def func():
    return 1 + 1

time = timeit(func, number=1)
print(time)

回答 17

在python3上:

from time import sleep, perf_counter as pc
t0 = pc()
sleep(1)
print(pc()-t0)

优雅而短暂。

on python3:

from time import sleep, perf_counter as pc
t0 = pc()
sleep(1)
print(pc()-t0)

elegant and short.


回答 18

有点超级后来的反应,但也许对某人有用。我认为这是一种超级干净的方法。

import time

def timed(fun, *args):
    s = time.time()
    r = fun(*args)
    print('{} execution took {} seconds.'.format(fun.__name__, time.time()-s))
    return(r)

timed(print, "Hello")

请记住,print 在 Python 3 中是函数,而在 Python 2.7 中不是。不过,这个方法可以用于任何其他函数。干杯!

Kind of a super later response, but maybe it serves a purpose for someone. This is a way to do it which I think is super clean.

import time

def timed(fun, *args):
    s = time.time()
    r = fun(*args)
    print('{} execution took {} seconds.'.format(fun.__name__, time.time()-s))
    return(r)

timed(print, "Hello")

Keep in mind that “print” is a function in Python 3 and not Python 2.7. However, it works with any other function. Cheers!


回答 19

您可以使用timeit。

这是一个有关如何使用Python REPL测试带参数的naive_func的示例:

>>> import timeit                                                                                         

>>> def naive_func(x):                                                                                    
...     a = 0                                                                                             
...     for i in range(a):                                                                                
...         a += i                                                                                        
...     return a                                                                                          

>>> def wrapper(func, *args, **kwargs):                                                                   
...     def wrapper():                                                                                    
...         return func(*args, **kwargs)                                                                  
...     return wrapper                                                                                    

>>> wrapped = wrapper(naive_func, 1_000)                                                                  

>>> timeit.timeit(wrapped, number=1_000_000)                                                              
0.4458435332577161  

如果function没有任何参数,则不需要包装函数。

You can use timeit.

Here is an example on how to test naive_func that takes parameter using Python REPL:

>>> import timeit                                                                                         

>>> def naive_func(x):                                                                                    
...     a = 0                                                                                             
...     for i in range(a):                                                                                
...         a += i                                                                                        
...     return a                                                                                          

>>> def wrapper(func, *args, **kwargs):                                                                   
...     def wrapper():                                                                                    
...         return func(*args, **kwargs)                                                                  
...     return wrapper                                                                                    

>>> wrapped = wrapper(naive_func, 1_000)                                                                  

>>> timeit.timeit(wrapped, number=1_000_000)                                                              
0.4458435332577161  

You don't need the wrapper function if the function doesn't take any parameters.
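functools.partial is a shorter alternative to the hand-written wrapper; a minimal sketch with a hypothetical square function:

import timeit
from functools import partial

def square(x):
    return x * x

# timeit accepts any zero-argument callable as the statement
print(timeit.timeit(partial(square, 1000), number=1000000))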


回答 20

我们还可以将时间转换为人类可以理解的时间。

import time, datetime

start = time.perf_counter()  # time.clock() was removed in Python 3.8

def num_multi1(max):
    result = 0
    for num in range(0, 1000):
        if (num % 3 == 0 or num % 5 == 0):
            result += num

    print("Sum is %d " % result)

num_multi1(1000)

end = time.perf_counter()
value = end - start
# timedelta turns the elapsed seconds into a readable H:MM:SS.microseconds string
print(datetime.timedelta(seconds=value))

We can also convert time into human-readable time.

import time, datetime

start = time.perf_counter()  # time.clock() was removed in Python 3.8

def num_multi1(max):
    result = 0
    for num in range(0, 1000):
        if (num % 3 == 0 or num % 5 == 0):
            result += num

    print("Sum is %d " % result)

num_multi1(1000)

end = time.perf_counter()
value = end - start
# timedelta turns the elapsed seconds into a readable H:MM:SS.microseconds string
print(datetime.timedelta(seconds=value))

回答 21

我为此创建了一个库,如果您想测量一个函数,可以像这样做


from pythonbenchmark import compare, measure
import time

a,b,c,d,e = 10,10,10,10,10
something = [a,b,c,d,e]

@measure
def myFunction(something):
    time.sleep(0.4)

@measure
def myOptimizedFunction(something):
    time.sleep(0.2)

myFunction(something)
myOptimizedFunction(something)

https://github.com/Karlheinzniebuhr/pythonbenchmark

I made a library for this, if you want to measure a function you can just do it like this


from pythonbenchmark import compare, measure
import time

a,b,c,d,e = 10,10,10,10,10
something = [a,b,c,d,e]

@measure
def myFunction(something):
    time.sleep(0.4)

@measure
def myOptimizedFunction(something):
    time.sleep(0.2)

myFunction(something)
myOptimizedFunction(something)

https://github.com/Karlheinzniebuhr/pythonbenchmark


回答 22

要了解每个函数的递归调用,请执行以下操作:

%load_ext snakeviz
%%snakeviz

在 Jupyter 笔记本中只需要这两行代码,它就会生成一个漂亮的交互式图表。例如:

这是代码。再说明一次,以 % 开头的那两行是使用 snakeviz 所需的仅有的额外代码:

# !pip install snakeviz
%load_ext snakeviz
import glob
import hashlib

%%snakeviz

files = glob.glob('*.txt')
def print_files_hashed(files):
    for file in files:
        with open(file) as f:
            print(hashlib.md5(f.read().encode('utf-8')).hexdigest())
print_files_hashed(files)

在笔记本外运行snakeviz也似乎是可能的。在snakeviz网站上有更多信息。

To get insight on every function calls recursively, do:

%load_ext snakeviz
%%snakeviz

It just takes those 2 lines of code in a Jupyter notebook, and it generates a nice interactive diagram. For example:

Here is the code. Again, the 2 lines starting with % are the only extra lines of code needed to use snakeviz:

# !pip install snakeviz
%load_ext snakeviz
import glob
import hashlib

%%snakeviz

files = glob.glob('*.txt')
def print_files_hashed(files):
    for file in files:
        with open(file) as f:
            print(hashlib.md5(f.read().encode('utf-8')).hexdigest())
print_files_hashed(files)

It also seems possible to run snakeviz outside notebooks. More info on the snakeviz website.


回答 23

import time

def getElapsedTime(startTime, units):
    elapsedInSeconds = time.time() - startTime
    if units == 'sec':
        return elapsedInSeconds
    if units == 'min':
        return elapsedInSeconds/60
    if units == 'hour':
        return elapsedInSeconds/(60*60)

回答 24

这种独特的基于类的方法提供了可打印的字符串表示形式,可自定义的舍入功能,并且可以方便地访问作为字符串或浮点数的经过时间。它是使用Python 3.7开发的。

import datetime
import timeit


class Timer:
    """Measure time used."""
    # Ref: https://stackoverflow.com/a/57931660/

    def __init__(self, round_ndigits: int = 0):
        self._round_ndigits = round_ndigits
        self._start_time = timeit.default_timer()

    def __call__(self) -> float:
        return timeit.default_timer() - self._start_time

    def __str__(self) -> str:
        return str(datetime.timedelta(seconds=round(self(), self._round_ndigits)))

用法:

# Setup timer
>>> timer = Timer()

# Access as a string
>>> print(f'Time elapsed is {timer}.')
Time elapsed is 0:00:03.
>>> print(f'Time elapsed is {timer}.')
Time elapsed is 0:00:04.

# Access as a float
>>> timer()
6.841332235
>>> timer()
7.970274425

This unique class-based approach offers a printable string representation, customizable rounding, and convenient access to the elapsed time as a string or a float. It was developed with Python 3.7.

import datetime
import timeit


class Timer:
    """Measure time used."""
    # Ref: https://stackoverflow.com/a/57931660/

    def __init__(self, round_ndigits: int = 0):
        self._round_ndigits = round_ndigits
        self._start_time = timeit.default_timer()

    def __call__(self) -> float:
        return timeit.default_timer() - self._start_time

    def __str__(self) -> str:
        return str(datetime.timedelta(seconds=round(self(), self._round_ndigits)))

Usage:

# Setup timer
>>> timer = Timer()

# Access as a string
>>> print(f'Time elapsed is {timer}.')
Time elapsed is 0:00:03.
>>> print(f'Time elapsed is {timer}.')
Time elapsed is 0:00:04.

# Access as a float
>>> timer()
6.841332235
>>> timer()
7.970274425

回答 25

测量小代码段的执行时间。

时间单位以秒为单位,以浮点数表示

import timeit
t = timeit.Timer('li = list(map(lambda x:x*2,[1,2,3,4,5]))')
t.timeit()
t.repeat()
>[1.2934070999999676, 1.3335035000000062, 1.422568500000125]

repeat()方法可以方便地多次调用timeit()并返回结果列表。

repeat(repeat=3)

有了这个列表,我们就可以对所有时间取平均值。

默认情况下,timeit() 在计时期间会临时关闭垃圾回收;如有需要,可以在 timeit.Timer() 的 setup 中重新启用它来解决这个问题。

优点:

timeit.Timer()使独立计时更具可比性。gc可能是被测功能性能的重要组成部分。如果是这样,可以将gc(垃圾收集器)作为设置字符串中的第一条语句重新启用。例如:

timeit.Timer('li = list(map(lambda x:x*2,[1,2,3,4,5]))',setup='gc.enable()')

Python文档

Measure execution time of small code snippets.

Unit of time: measured in seconds as a float

import timeit
t = timeit.Timer('li = list(map(lambda x:x*2,[1,2,3,4,5]))')
t.timeit()
t.repeat()
>[1.2934070999999676, 1.3335035000000062, 1.422568500000125]

The repeat() method is a convenience to call timeit() multiple times and return a list of results.

repeat(repeat=3)

With this list we can take a mean of all times.

By default, timeit() temporarily turns off garbage collection during the timing. If needed, re-enabling gc in the setup of timeit.Timer() addresses this, as shown below.

Pros:

timeit.Timer() makes independent timings more comparable. The gc may be an important component of the performance of the function being measured. If so, gc(garbage collector) can be re-enabled as the first statement in the setup string. For example:

timeit.Timer('li = list(map(lambda x:x*2,[1,2,3,4,5]))',setup='gc.enable()')

Source Python Docs!


回答 26

如果您希望能够方便地为函数计时,可以使用一个简单的装饰器:

import time

def timing_decorator(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        original_return_val = func(*args, **kwargs)
        end = time.time()
        print("time elapsed in ", func.__name__, ": ", end - start, sep='')
        return original_return_val

    return wrapper

您可以在想要计时的函数上使用它,如下所示:

@timing_decorator
def function_to_time():
    time.sleep(1)

然后,每次调用 function_to_time 时,它都会打印所花费的时间以及被计时函数的名称。

If you want to be able to time functions conveniently, you can use a simple decorator:

import time

def timing_decorator(func):
    def wrapper(*args, **kwargs):
        start = time.time()
        original_return_val = func(*args, **kwargs)
        end = time.time()
        print("time elapsed in ", func.__name__, ": ", end - start, sep='')
        return original_return_val

    return wrapper

You can use it on a function that you want to time like this:

@timing_decorator
def function_to_time():
    time.sleep(1)

Then any time you call function_to_time, it will print how long it took and the name of the function being timed.


回答 27

基于 https://stackoverflow.com/a/30024601/5095636 给出的 contextmanager 解决方案,下面是不使用 lambda 的版本,因为 flake8 会根据 E731 对 lambda 的用法发出警告:

from contextlib import contextmanager
from timeit import default_timer

@contextmanager
def elapsed_timer():
    start_time = default_timer()

    class _Timer():
      start = start_time
      end = default_timer()
      duration = end - start

    yield _Timer

    end_time = default_timer()
    _Timer.end = end_time
    _Timer.duration = end_time - start_time

测试:

from time import sleep

with elapsed_timer() as t:
    print("start:", t.start)
    sleep(1)
    print("end:", t.end)

t.start
t.end
t.duration

based on the contextmanager solution given by https://stackoverflow.com/a/30024601/5095636, hereunder the lambda free version, as flake8 warns on the usage of lambda as per E731:

from contextlib import contextmanager
from timeit import default_timer

@contextmanager
def elapsed_timer():
    start_time = default_timer()

    class _Timer():
      start = start_time
      end = default_timer()
      duration = end - start

    yield _Timer

    end_time = default_timer()
    _Timer.end = end_time
    _Timer.duration = end_time - start_time

test:

from time import sleep

with elapsed_timer() as t:
    print("start:", t.start)
    sleep(1)
    print("end:", t.end)

t.start
t.end
t.duration

回答 28

我能想到的唯一方法是使用time.time()

import time
start = time.time()
time.sleep(5)  # just to give it some delay to show it working
finish = time.time()
elapsed = finish - start
print(elapsed)

希望对您有所帮助。

The only way I can think of is using time.time().

import time
start = time.time()
time.sleep(5)  # just to give it some delay to show it working
finish = time.time()
elapsed = finish - start
print(elapsed)

Hope that will help.


回答 29

timeit模块非常适合计时一小段Python代码。它至少可以以三种形式使用:

1-作为命令行模块

python2 -m timeit 'for i in xrange(10): oct(i)' 

2-对于短代码,请将其作为参数传递。

import timeit
timeit.Timer('for i in xrange(10): oct(i)').timeit()

3-对于更长的代码为:

import timeit
code_to_test = """
a = range(100000)
b = []
for i in a:
    b.append(i*2)
"""
elapsed_time = timeit.timeit(code_to_test, number=100)/100
print(elapsed_time)

The timeit module is good for timing a small piece of Python code. It can be used at least in three forms:

1- As a command-line module

python2 -m timeit 'for i in xrange(10): oct(i)' 

2- For a short code, pass it as arguments.

import timeit
timeit.Timer('for i in xrange(10): oct(i)').timeit()

3- For longer code as:

import timeit
code_to_test = """
a = range(100000)
b = []
for i in a:
    b.append(i*2)
"""
elapsed_time = timeit.timeit(code_to_test, number=100)/100
print(elapsed_time)