如何使用timeit模块

问题:如何使用timeit模块

我了解做什么的概念,timeit但不确定如何在代码中实现。

我怎样才能比较两个功能,比方说insertion_sorttim_sort,用timeit

I understand the concept of what timeit does but I am not sure how to implement it in my code.

How can I compare two functions, say insertion_sort and tim_sort, with timeit?


回答 0

timeit的工作方式是运行一次安装代码,然后重复调用一系列语句。因此,如果要测试排序,则需要格外小心,以免就地进行一次排序不会影响已排序数据的下一遍(这当然会使Timsort发光,因为它表现最佳当数据已经部分排序时)。

这是有关如何设置排序测试的示例:

>>> import timeit

>>> setup = '''
import random

random.seed('slartibartfast')
s = [random.random() for i in range(1000)]
timsort = list.sort
'''

>>> print min(timeit.Timer('a=s[:]; timsort(a)', setup=setup).repeat(7, 1000))
0.334147930145

请注意,这一系列语句在每次通过时都会对未排序的数据进行全新复制。

另外,请注意运行测量套件七次并仅保留最佳时间的计时技术-这确实可以帮助减少由于系统上正在运行其他进程而导致的测量失真。

这些是我正确使用timeit的技巧。希望这可以帮助 :-)

The way timeit works is to run setup code once and then make repeated calls to a series of statements. So, if you want to test sorting, some care is required so that one pass at an in-place sort doesn’t affect the next pass with already sorted data (that, of course, would make the Timsort really shine because it performs best when the data already partially ordered).

Here is an example of how to set up a test for sorting:

>>> import timeit

>>> setup = '''
import random

random.seed('slartibartfast')
s = [random.random() for i in range(1000)]
timsort = list.sort
'''

>>> print min(timeit.Timer('a=s[:]; timsort(a)', setup=setup).repeat(7, 1000))
0.334147930145

Note that the series of statements makes a fresh copy of the unsorted data on every pass.

Also, note the timing technique of running the measurement suite seven times and keeping only the best time — this can really help reduce measurement distortions due to other processes running on your system.

Those are my tips for using timeit correctly. Hope this helps :-)


回答 1

如果要timeit在交互式Python会话中使用,有两个方便的选项:

  1. 使用IPython Shell。它具有方便的%timeit特殊功能:

    In [1]: def f(x):
       ...:     return x*x
       ...: 
    
    In [2]: %timeit for x in range(100): f(x)
    100000 loops, best of 3: 20.3 us per loop
  2. 在标准的Python解释器中,您可以通过__main__在setup语句中导入它们来访问在交互式会话期间先前定义的函数和其他名称:

    >>> def f(x):
    ...     return x * x 
    ... 
    >>> import timeit
    >>> timeit.repeat("for x in range(100): f(x)", "from __main__ import f",
                      number=100000)
    [2.0640320777893066, 2.0876040458679199, 2.0520210266113281]

If you want to use timeit in an interactive Python session, there are two convenient options:

  1. Use the IPython shell. It features the convenient %timeit special function:

    In [1]: def f(x):
       ...:     return x*x
       ...: 
    
    In [2]: %timeit for x in range(100): f(x)
    100000 loops, best of 3: 20.3 us per loop
    
  2. In a standard Python interpreter, you can access functions and other names you defined earlier during the interactive session by importing them from __main__ in the setup statement:

    >>> def f(x):
    ...     return x * x 
    ... 
    >>> import timeit
    >>> timeit.repeat("for x in range(100): f(x)", "from __main__ import f",
                      number=100000)
    [2.0640320777893066, 2.0876040458679199, 2.0520210266113281]
    

回答 2

我会告诉您一个秘密:最好的使用方法timeit是在命令行上。

在命令行上,进行timeit适当的统计分析:它告诉您最短运行花费了多长时间。这很好,因为所有计时错误都是正的。因此,最短的时间误差最小。没有办法得到负错误,因为计算机的计算速度永远不可能超过其计算速度!

因此,命令行界面:

%~> python -m timeit "1 + 2"
10000000 loops, best of 3: 0.0468 usec per loop

这很简单,是吗?

您可以设置以下内容:

%~> python -m timeit -s "x = range(10000)" "sum(x)"
1000 loops, best of 3: 543 usec per loop

也是有用的!

如果需要多行,则可以使用外壳程序的自动延续或使用单独的参数:

%~> python -m timeit -s "x = range(10000)" -s "y = range(100)" "sum(x)" "min(y)"
1000 loops, best of 3: 554 usec per loop

这给出了一个设置

x = range(1000)
y = range(100)

和时代

sum(x)
min(y)

如果您想要更长的脚本,则可能会倾向于timeit使用Python脚本。我建议避免这种情况,因为在命令行上分析和计时会更好。相反,我倾向于制作shell脚本:

 SETUP="

 ... # lots of stuff

 "

 echo Minmod arr1
 python -m timeit -s "$SETUP" "Minmod(arr1)"

 echo pure_minmod arr1
 python -m timeit -s "$SETUP" "pure_minmod(arr1)"

 echo better_minmod arr1
 python -m timeit -s "$SETUP" "better_minmod(arr1)"

 ... etc

由于要进行多次初始化,因此可能需要更长的时间,但是通常这没什么大不了的。


但是,如果timeit在模块内部使用该怎么办?

好吧,简单的方法是:

def function(...):
    ...

timeit.Timer(function).timeit(number=NUMBER)

这样您就可以累积(而不是最短!)时间来运行该次数。

为了获得良好的分析效果,请使用.repeat并采取以下最低限度的措施:

min(timeit.Timer(function).repeat(repeat=REPEATS, number=NUMBER))

通常应将此与结合使用,functools.partial而不是lambda: ...降低开销。因此,您可能会遇到以下情况:

from functools import partial

def to_time(items):
    ...

test_items = [1, 2, 3] * 100
times = timeit.Timer(partial(to_time, test_items)).repeat(3, 1000)

# Divide by the number of repeats
time_taken = min(times) / 1000

您也可以:

timeit.timeit("...", setup="from __main__ import ...", number=NUMBER)

这将使您从命令行更接近界面,但是方式要少得多。将"from __main__ import ..."让您使用代码从您的主模块所创造的人工环境内timeit

值得注意的是,这是一个方便包装Timer(...).timeit(...),因此在时间安排上并不是特别好。我个人更喜欢使用Timer(...).repeat(...)上面显示的内容。


警告事项

timeit到处都有一些警告。

  • 开销不占。说您要计时x += 1,找出加法需要多长时间:

    >>> python -m timeit -s "x = 0" "x += 1"
    10000000 loops, best of 3: 0.0476 usec per loop

    好吧,这不是 0.0476 µs。您只知道它比这还。所有错误均为正。

    因此,尝试找到开销:

    >>> python -m timeit -s "x = 0" ""      
    100000000 loops, best of 3: 0.014 usec per loop

    仅从定时开始,这就是30%的开销!这会大大歪曲相对的时间安排。但是您只真正关心添加的时间。查找时间x也需要包含在开销中:

    >>> python -m timeit -s "x = 0" "x"
    100000000 loops, best of 3: 0.0166 usec per loop

    差别不大,但是就在那里。

  • 变异方法很危险。

    >>> python -m timeit -s "x = [0]*100000" "while x: x.pop()"
    10000000 loops, best of 3: 0.0436 usec per loop

    但这是完全错误的! x是第一次迭代后的空列表。您需要重新初始化:

    >>> python -m timeit "x = [0]*100000" "while x: x.pop()"
    100 loops, best of 3: 9.79 msec per loop

    但是那样您就会有很多开销。分别说明。

    >>> python -m timeit "x = [0]*100000"                   
    1000 loops, best of 3: 261 usec per loop

    请注意,在这里减去开销是合理的,仅是因为开销只是时间的一小部分。

    对于你的榜样,值得一提的是,这两个插入排序和蒂姆排序有完全不同寻常的已排序的列表时序行为。这意味着,random.shuffle如果您想避免破坏时间安排,就需要进行两次排序。

I’ll let you in on a secret: the best way to use timeit is on the command line.

On the command line, timeit does proper statistical analysis: it tells you how long the shortest run took. This is good because all error in timing is positive. So the shortest time has the least error in it. There’s no way to get negative error because a computer can’t ever compute faster than it can compute!

So, the command-line interface:

%~> python -m timeit "1 + 2"
10000000 loops, best of 3: 0.0468 usec per loop

That’s quite simple, eh?

You can set stuff up:

%~> python -m timeit -s "x = range(10000)" "sum(x)"
1000 loops, best of 3: 543 usec per loop

which is useful, too!

If you want multiple lines, you can either use the shell’s automatic continuation or use separate arguments:

%~> python -m timeit -s "x = range(10000)" -s "y = range(100)" "sum(x)" "min(y)"
1000 loops, best of 3: 554 usec per loop

That gives a setup of

x = range(1000)
y = range(100)

and times

sum(x)
min(y)

If you want to have longer scripts you might be tempted to move to timeit inside a Python script. I suggest avoiding that because the analysis and timing is simply better on the command line. Instead, I tend to make shell scripts:

 SETUP="

 ... # lots of stuff

 "

 echo Minmod arr1
 python -m timeit -s "$SETUP" "Minmod(arr1)"

 echo pure_minmod arr1
 python -m timeit -s "$SETUP" "pure_minmod(arr1)"

 echo better_minmod arr1
 python -m timeit -s "$SETUP" "better_minmod(arr1)"

 ... etc

This can take a bit longer due to the multiple initialisations, but normally that’s not a big deal.


But what if you want to use timeit inside your module?

Well, the simple way is to do:

def function(...):
    ...

timeit.Timer(function).timeit(number=NUMBER)

and that gives you cumulative (not minimum!) time to run that number of times.

To get a good analysis, use .repeat and take the minimum:

min(timeit.Timer(function).repeat(repeat=REPEATS, number=NUMBER))

You should normally combine this with functools.partial instead of lambda: ... to lower overhead. Thus you could have something like:

from functools import partial

def to_time(items):
    ...

test_items = [1, 2, 3] * 100
times = timeit.Timer(partial(to_time, test_items)).repeat(3, 1000)

# Divide by the number of repeats
time_taken = min(times) / 1000

You can also do:

timeit.timeit("...", setup="from __main__ import ...", number=NUMBER)

which would give you something closer to the interface from the command-line, but in a much less cool manner. The "from __main__ import ..." lets you use code from your main module inside the artificial environment created by timeit.

It’s worth noting that this is a convenience wrapper for Timer(...).timeit(...) and so isn’t particularly good at timing. I personally far prefer using Timer(...).repeat(...) as I’ve shown above.


Warnings

There are a few caveats with timeit that hold everywhere.

  • Overhead is not accounted for. Say you want to time x += 1, to find out how long addition takes:

    >>> python -m timeit -s "x = 0" "x += 1"
    10000000 loops, best of 3: 0.0476 usec per loop
    

    Well, it’s not 0.0476 µs. You only know that it’s less than that. All error is positive.

    So try and find pure overhead:

    >>> python -m timeit -s "x = 0" ""      
    100000000 loops, best of 3: 0.014 usec per loop
    

    That’s a good 30% overhead just from timing! This can massively skew relative timings. But you only really cared about the adding timings; the look-up timings for x also need to be included in overhead:

    >>> python -m timeit -s "x = 0" "x"
    100000000 loops, best of 3: 0.0166 usec per loop
    

    The difference isn’t much larger, but it’s there.

  • Mutating methods are dangerous.

    >>> python -m timeit -s "x = [0]*100000" "while x: x.pop()"
    10000000 loops, best of 3: 0.0436 usec per loop
    

    But that’s completely wrong! x is the empty list after the first iteration. You’ll need to reinitialize:

    >>> python -m timeit "x = [0]*100000" "while x: x.pop()"
    100 loops, best of 3: 9.79 msec per loop
    

    But then you have lots of overhead. Account for that separately.

    >>> python -m timeit "x = [0]*100000"                   
    1000 loops, best of 3: 261 usec per loop
    

    Note that subtracting the overhead is reasonable here only because the overhead is a small-ish fraction of the time.

    For your example, it’s worth noting that both Insertion Sort and Tim Sort have completely unusual timing behaviours for already-sorted lists. This means you will require a random.shuffle between sorts if you want to avoid wrecking your timings.


回答 3

如果要快速比较两个代码/功能块,可以执行以下操作:

import timeit

start_time = timeit.default_timer()
func1()
print(timeit.default_timer() - start_time)

start_time = timeit.default_timer()
func2()
print(timeit.default_timer() - start_time)

If you want to compare two blocks of code / functions quickly you could do:

import timeit

start_time = timeit.default_timer()
func1()
print(timeit.default_timer() - start_time)

start_time = timeit.default_timer()
func2()
print(timeit.default_timer() - start_time)

回答 4

我发现使用timeit的最简单方法是从命令行:

给定test.py

def InsertionSort(): ...
def TimSort(): ...

运行timeit是这样的:

% python -mtimeit -s'import test' 'test.InsertionSort()'
% python -mtimeit -s'import test' 'test.TimSort()'

I find the easiest way to use timeit is from the command line:

Given test.py:

def InsertionSort(): ...
def TimSort(): ...

run timeit like this:

% python -mtimeit -s'import test' 'test.InsertionSort()'
% python -mtimeit -s'import test' 'test.TimSort()'

回答 5

对我来说,这是最快的方法:

import timeit
def foo():
    print("here is my code to time...")


timeit.timeit(stmt=foo, number=1234567)

for me, this is the fastest way:

import timeit
def foo():
    print("here is my code to time...")


timeit.timeit(stmt=foo, number=1234567)

回答 6

# Генерация целых чисел

def gen_prime(x):
    multiples = []
    results = []
    for i in range(2, x+1):
        if i not in multiples:
            results.append(i)
            for j in range(i*i, x+1, i):
                multiples.append(j)

    return results


import timeit

# Засекаем время

start_time = timeit.default_timer()
gen_prime(3000)
print(timeit.default_timer() - start_time)

# start_time = timeit.default_timer()
# gen_prime(1001)
# print(timeit.default_timer() - start_time)
# Генерация целых чисел

def gen_prime(x):
    multiples = []
    results = []
    for i in range(2, x+1):
        if i not in multiples:
            results.append(i)
            for j in range(i*i, x+1, i):
                multiples.append(j)

    return results


import timeit

# Засекаем время

start_time = timeit.default_timer()
gen_prime(3000)
print(timeit.default_timer() - start_time)

# start_time = timeit.default_timer()
# gen_prime(1001)
# print(timeit.default_timer() - start_time)

回答 7

这很好用:

  python -m timeit -c "$(cat file_name.py)"

This works great:

  python -m timeit -c "$(cat file_name.py)"

回答 8

让我们在以下每个目录中设置相同的字典并测试执行时间。

setup参数基本上是在设置字典

编号是要运行的代码1000000次。不是设置而是stmt

运行此命令时,您可以看到索引比获取索引快得多。您可以多次运行以查看。

该代码基本上试图获取字典中c的值。

import timeit

print('Getting value of C by index:', timeit.timeit(stmt="mydict['c']", setup="mydict={'a':5, 'b':6, 'c':7}", number=1000000))
print('Getting value of C by get:', timeit.timeit(stmt="mydict.get('c')", setup="mydict={'a':5, 'b':6, 'c':7}", number=1000000))

这是我的结果,您的结果会有所不同。

按索引:0.20900007452246427

通过获取:0.54841166886888

lets setup the same dictionary in each of the following and test the execution time.

The setup argument is basically setting up the dictionary

Number is to run the code 1000000 times. Not the setup but the stmt

When you run this you can see that index is way faster than get. You can run it multiple times to see.

The code basically tries to get the value of c in the dictionary.

import timeit

print('Getting value of C by index:', timeit.timeit(stmt="mydict['c']", setup="mydict={'a':5, 'b':6, 'c':7}", number=1000000))
print('Getting value of C by get:', timeit.timeit(stmt="mydict.get('c')", setup="mydict={'a':5, 'b':6, 'c':7}", number=1000000))

Here are my results, yours will differ.

by index: 0.20900007452246427

by get: 0.54841166886888


回答 9

只需将整个代码作为timeit的参数传递:

import timeit

print(timeit.timeit(

"""   
limit = 10000
prime_list = [i for i in range(2, limit+1)]

for prime in prime_list:
    for elem in range(prime*2, max(prime_list)+1, prime):
        if elem in prime_list:
            prime_list.remove(elem)
"""   
, number=10))

simply pass your entire code as an argument of timeit:

import timeit

print(timeit.timeit(

"""   
limit = 10000
prime_list = [i for i in range(2, limit+1)]

for prime in prime_list:
    for elem in range(prime*2, max(prime_list)+1, prime):
        if elem in prime_list:
            prime_list.remove(elem)
"""   
, number=10))

回答 10

import timeit


def oct(x):
   return x*x


timeit.Timer("for x in range(100): oct(x)", "gc.enable()").timeit()
import timeit


def oct(x):
   return x*x


timeit.Timer("for x in range(100): oct(x)", "gc.enable()").timeit()

回答 11

内置的timeit模块在IPython命令行中效果最佳。

要从模块内计时功能:

from timeit import default_timer as timer
import sys

def timefunc(func, *args, **kwargs):
    """Time a function. 

    args:
        iterations=3

    Usage example:
        timeit(myfunc, 1, b=2)
    """
    try:
        iterations = kwargs.pop('iterations')
    except KeyError:
        iterations = 3
    elapsed = sys.maxsize
    for _ in range(iterations):
        start = timer()
        result = func(*args, **kwargs)
        elapsed = min(timer() - start, elapsed)
    print(('Best of {} {}(): {:.9f}'.format(iterations, func.__name__, elapsed)))
    return result

The built-in timeit module works best from the IPython command line.

To time functions from within a module:

from timeit import default_timer as timer
import sys

def timefunc(func, *args, **kwargs):
    """Time a function. 

    args:
        iterations=3

    Usage example:
        timeit(myfunc, 1, b=2)
    """
    try:
        iterations = kwargs.pop('iterations')
    except KeyError:
        iterations = 3
    elapsed = sys.maxsize
    for _ in range(iterations):
        start = timer()
        result = func(*args, **kwargs)
        elapsed = min(timer() - start, elapsed)
    print(('Best of {} {}(): {:.9f}'.format(iterations, func.__name__, elapsed)))
    return result

回答 12

如何将Python REPL解释器与接受参数的函数一起使用的示例。

>>> import timeit                                                                                         

>>> def naive_func(x):                                                                                    
...     a = 0                                                                                             
...     for i in range(a):                                                                                
...         a += i                                                                                        
...     return a                                                                                          

>>> def wrapper(func, *args, **kwargs):                                                                   
...     def wrapper():                                                                                    
...         return func(*args, **kwargs)                                                                  
...     return wrapper                                                                                    

>>> wrapped = wrapper(naive_func, 1_000)                                                                  

>>> timeit.timeit(wrapped, number=1_000_000)                                                              
0.4458435332577161                                                                                        

Example of how to use Python REPL interpreter with function that accepts parameters.

>>> import timeit                                                                                         

>>> def naive_func(x):                                                                                    
...     a = 0                                                                                             
...     for i in range(a):                                                                                
...         a += i                                                                                        
...     return a                                                                                          

>>> def wrapper(func, *args, **kwargs):                                                                   
...     def wrapper():                                                                                    
...         return func(*args, **kwargs)                                                                  
...     return wrapper                                                                                    

>>> wrapped = wrapper(naive_func, 1_000)                                                                  

>>> timeit.timeit(wrapped, number=1_000_000)                                                              
0.4458435332577161                                                                                        

回答 13

您将创建两个函数,然后运行与此类似的操作。请注意,您要选择相同的执行/运行次数来比较apple与apple。
这已在Python 3.7下进行了测试。

这是易于复制的代码

!/usr/local/bin/python3
import timeit

def fibonacci(n):
    """
    Returns the n-th Fibonacci number.
    """
    if(n == 0):
        result = 0
    elif(n == 1):
        result = 1
    else:
        result = fibonacci(n-1) + fibonacci(n-2)
    return result

if __name__ == '__main__':
    import timeit
    t1 = timeit.Timer("fibonacci(13)", "from __main__ import fibonacci")
    print("fibonacci ran:",t1.timeit(number=1000), "milliseconds")

You would create two functions and then run something similar to this. Notice, you want to choose the same number of execution/run to compare apple to apple.
This was tested under Python 3.7.

Here is the code for ease of copying it

!/usr/local/bin/python3
import timeit

def fibonacci(n):
    """
    Returns the n-th Fibonacci number.
    """
    if(n == 0):
        result = 0
    elif(n == 1):
        result = 1
    else:
        result = fibonacci(n-1) + fibonacci(n-2)
    return result

if __name__ == '__main__':
    import timeit
    t1 = timeit.Timer("fibonacci(13)", "from __main__ import fibonacci")
    print("fibonacci ran:",t1.timeit(number=1000), "milliseconds")