如何使用Python的timeit计时代码段以测试性能？

Question 1

I’ve a python script which works just as it should, but I need to write the execution time. I’ve googled that I should use timeit but I can’t seem to get it to work.

My Python script looks like this:

import sys
import getopt
import timeit
import random
import os
import re
import ibm_db
import time
from string import maketrans
myfile = open("results_update.txt", "a")

for r in range(100):
    rannumber = random.randint(0, 100)

    update = "update TABLE set val = %i where MyCount >= '2010' and MyCount < '2012' and number = '250'" % rannumber
    #print rannumber

    conn = ibm_db.pconnect("dsn=myDB","usrname","secretPWD")

for r in range(5):
    print "Run %s\n" % r        
    ibm_db.execute(query_stmt)
 query_stmt = ibm_db.prepare(conn, update)

myfile.close()
ibm_db.close(conn)

What I need is the time it takes to execute the query and write it to the file results_update.txt. The purpose is to test an update statement for my database with different indexes and tuning mechanisms.

Question 2

You can use time.time() or time.clock() before and after the block you want to time.

import time

t0 = time.time()
code_block
t1 = time.time()

total = t1-t0

This method is not as exact as timeit (it does not average several runs) but it is straightforward.

time.time() (in Windows and Linux) and time.clock() (in Linux) are not precise enough for fast functions (you get total = 0). In this case or if you want to average the time elapsed by several runs, you have to manually call the function multiple times (As I think you already do in you example code and timeit does automatically when you set its number argument)

import time

def myfast():
   code

n = 10000
t0 = time.time()
for i in range(n): myfast()
t1 = time.time()

total_n = t1-t0

In Windows, as Corey stated in the comment, time.clock() has much higher precision (microsecond instead of second) and is preferred over time.time().

Question 3

If you are profiling your code and can use IPython, it has the magic function %timeit.

%%timeit operates on cells.

In [2]: %timeit cos(3.14)
10000000 loops, best of 3: 160 ns per loop

In [3]: %%timeit
   ...: cos(3.14)
   ...: x = 2 + 3
   ...: 
10000000 loops, best of 3: 196 ns per loop

Question 4

Quite apart from the timing, this code you show is simply incorrect: you execute 100 connections (completely ignoring all but the last one), and then when you do the first execute call you pass it a local variable query_stmt which you only initialize after the execute call.

First, make your code correct, without worrying about timing yet: i.e. a function that makes or receives a connection and performs 100 or 500 or whatever number of updates on that connection, then closes the connection. Once you have your code working correctly is the correct point at which to think about using timeit on it!

Specifically, if the function you want to time is a parameter-less one called foobar you can use timeit.timeit (2.6 or later — it’s more complicated in 2.5 and before):

timeit.timeit('foobar()', number=1000)

You’d better specify the number of runs because the default, a million, may be high for your use case (leading to spending a lot of time in this code;-).

Question 5

Focus on one specific thing. Disk I/O is slow, so I’d take that out of the test if all you are going to tweak is the database query.

And if you need to time your database execution, look for database tools instead, like asking for the query plan, and note that performance varies not only with the exact query and what indexes you have, but also with the data load (how much data you have stored).

That said, you can simply put your code in a function and run that function with timeit.timeit():

def function_to_repeat():
    # ...

duration = timeit.timeit(function_to_repeat, number=1000)

This would disable the garbage collection, repeatedly call the function_to_repeat() function, and time the total duration of those calls using timeit.default_timer(), which is the most accurate available clock for your specific platform.

You should move setup code out of the repeated function; for example, you should connect to the database first, then time only the queries. Use the setup argument to either import or create those dependencies, and pass them into your function:

def function_to_repeat(var1, var2):
    # ...

duration = timeit.timeit(
    'function_to_repeat(var1, var2)',
    'from __main__ import function_to_repeat, var1, var2', 
    number=1000)

would grab the globals function_to_repeat, var1 and var2 from your script and pass those to the function each repetition.

Question 6

I see the question has already been answered, but still want to add my 2 cents for the same.

I have also faced similar scenario in which I have to test the execution times for several approaches and hence written a small script, which calls timeit on all functions written in it.

The script is also available as github gist here.

Hope it will help you and others.

from random import random
import types

def list_without_comprehension():
    l = []
    for i in xrange(1000):
        l.append(int(random()*100 % 100))
    return l

def list_with_comprehension():
    # 1K random numbers between 0 to 100
    l = [int(random()*100 % 100) for _ in xrange(1000)]
    return l


# operations on list_without_comprehension
def sort_list_without_comprehension():
    list_without_comprehension().sort()

def reverse_sort_list_without_comprehension():
    list_without_comprehension().sort(reverse=True)

def sorted_list_without_comprehension():
    sorted(list_without_comprehension())


# operations on list_with_comprehension
def sort_list_with_comprehension():
    list_with_comprehension().sort()

def reverse_sort_list_with_comprehension():
    list_with_comprehension().sort(reverse=True)

def sorted_list_with_comprehension():
    sorted(list_with_comprehension())


def main():
    objs = globals()
    funcs = []
    f = open("timeit_demo.sh", "w+")

    for objname in objs:
        if objname != 'main' and type(objs[objname]) == types.FunctionType:
            funcs.append(objname)
    funcs.sort()
    for func in funcs:
        f.write('''echo "Timing: %(funcname)s"
python -m timeit "import timeit_demo; timeit_demo.%(funcname)s();"\n\n
echo "------------------------------------------------------------"
''' % dict(
                funcname = func,
                )
            )

    f.close()

if __name__ == "__main__":
    main()

    from os import system

    #Works only for *nix platforms
    system("/bin/bash timeit_demo.sh")

    #un-comment below for windows
    #system("cmd timeit_demo.sh")

Question 7

Here’s a simple wrapper for steven’s answer. This function doesn’t do repeated runs/averaging, just saves you from having to repeat the timing code everywhere :)

'''function which prints the wall time it takes to execute the given command'''
def time_func(func, *args): #*args can take 0 or more 
  import time
  start_time = time.time()
  func(*args)
  end_time = time.time()
  print("it took this long to run: {}".format(end_time-start_time))

Question 8

The testing suite doesn’t make an attempt at using the imported timeit so it’s hard to tell what the intent was. Nonetheless, this is a canonical answer so a complete example of timeit seems in order, elaborating on Martijn’s answer.

The docs for timeit offer many examples and flags worth checking out. The basic usage on the command line is:

$ python -mtimeit "all(True for _ in range(1000))"
2000 loops, best of 5: 161 usec per loop
$ python -mtimeit "all([True for _ in range(1000)])"
2000 loops, best of 5: 116 usec per loop

Run with -h to see all options. Python MOTW has a great section on timeit that shows how to run modules via import and multiline code strings from the command line.

In script form, I typically use it like this:

import argparse
import copy
import dis
import inspect
import random
import sys
import timeit

def test_slice(L):
    L[:]

def test_copy(L):
    L.copy()

def test_deepcopy(L):
    copy.deepcopy(L)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--n", type=int, default=10 ** 5)
    parser.add_argument("--trials", type=int, default=100)
    parser.add_argument("--dis", action="store_true")
    args = parser.parse_args()
    n = args.n
    trials = args.trials
    namespace = dict(L = random.sample(range(n), k=n))
    funcs_to_test = [x for x in locals().values() 
                     if callable(x) and x.__module__ == __name__]
    print(f"{'-' * 30}\nn = {n}, {trials} trials\n{'-' * 30}\n")

    for func in funcs_to_test:
        fname = func.__name__
        fargs = ", ".join(inspect.signature(func).parameters)
        stmt = f"{fname}({fargs})"
        setup = f"from __main__ import {fname}"
        time = timeit.timeit(stmt, setup, number=trials, globals=namespace)
        print(inspect.getsource(globals().get(fname)))

        if args.dis:
            dis.dis(globals().get(fname))

        print(f"time (s) => {time}\n{'-' * 30}\n")

You can pretty easily drop in the functions and arguments you need. Use caution when using impure functions and take care of state.

Sample output:

$ python benchmark.py --n 10000
------------------------------
n = 10000, 100 trials
------------------------------

def test_slice(L):
    L[:]

time (s) => 0.015502399999999972
------------------------------

def test_copy(L):
    L.copy()

time (s) => 0.01651419999999998
------------------------------

def test_deepcopy(L):
    copy.deepcopy(L)

time (s) => 2.136012
------------------------------

如何使用Python的timeit计时代码段以测试性能？

问题：如何使用Python的timeit计时代码段以测试性能？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

Python 流程图 — 一键转化代码为流程图

7行代码 Python热力图可视化分析缺失数据处理

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

Virtualenvs中的参考损坏

如何打开文件夹中的每个文件？

将字符串插入列表中而不会拆分为字符

利用 Python 理解设计模式之委托模式

用单个空格替换非ASCII字符

DeprecationWarning：无效的转义序列-使用什么代替\ d？

如何使用Python的timeit计时代码段以测试性能？

问题：如何使用Python的timeit计时代码段以测试性能？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

相关文章

排行榜展示

文章展示