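26.6. timeit — Measure execution time of small code snippets

New in version 2.3.

Source code: Lib/timeit.py


This module provides a simple way to time small bits of Python code. It has both a Command-Line Interface and a callable one. It avoids a number of common traps for measuring execution times. See also Tim Peters' introduction to the "Algorithms" chapter in the Python Cookbook, published by O'Reilly.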

26.6.1. Basic Examples

The following example shows how the Command-Line Interface can be used to compare three different expressions:

$ python -m timeit '"-".join(str(n) for n in range(100))'
10000 loops, best of 3: 40.3 usec per loop
$ python -m timeit '"-".join([str(n) for n in range(100)])'
10000 loops, best of 3: 33.4 usec per loop
$ python -m timeit '"-".join(map(str, range(100)))'
10000 loops, best of 3: 25.2 usec per loop

The same can be achieved using the Python Interface:

>>> import timeit
>>> timeit.timeit('"-".join(str(n) for n in range(100))', number=10000)
0.8187260627746582
>>> timeit.timeit('"-".join([str(n) for n in range(100)])', number=10000)
0.7288308143615723
>>> timeit.timeit('"-".join(map(str, range(100)))', number=10000)
0.5858950614929199

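Note, however, that timeit automatically determines the number of repetitions only when the command-line interface is used. More advanced examples can be found in the Examples section.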
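The two sets of figures above are not in the same unit: the command line reports a time per loop, whereas timeit.timeit() returns the total number of seconds for all number executions. A minimal sketch of the conversion follows; usec_per_loop is a hypothetical helper name chosen for illustration and is not part of the module.

```python
import timeit

def usec_per_loop(stmt, number=10000):
    """Return the average time of one execution of stmt, in microseconds."""
    total_seconds = timeit.timeit(stmt, number=number)  # total for all loops
    return total_seconds / number * 1e6

for stmt in ('"-".join(str(n) for n in range(100))',
             '"-".join(map(str, range(100)))'):
    print("%-45s %.2f usec per loop" % (stmt, usec_per_loop(stmt)))
```

The printed values depend on the machine and interpreter, so no expected output is shown.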

26.6.2. Python Interface

The module defines three convenience functions and a public class:

timeit.timeit(stmt='pass', setup='pass', timer=<default timer>, number=1000000)

Create a Timer instance with the given statement, setup code and timer function and run its timeit() method with number executions.

New in version 2.6.

timeit.repeat(stmt='pass', setup='pass', timer=<default timer>, repeat=3, number=1000000)

Create a Timer instance with the given statement, setup code and timer function and run its repeat() method with the given repeat count and number executions.

New in version 2.6.

timeit.default_timer()

Define a default timer, in a platform-specific manner. On Windows, time.clock() has microsecond granularity, but time.time()’s granularity is 1/60th of a second. On Unix, time.clock() has 1/100th of a second granularity, and time.time() is much more precise. On either platform, default_timer() measures wall clock time, not the CPU time. This means that other processes running on the same computer may interfere with the timing.
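Because default_timer() returns wall clock time as a float, it can also be called directly to bracket a block of code. The following is a small sketch; the workload is arbitrary and only serves as something to time.

```python
import timeit

start = timeit.default_timer()
total = sum(i * i for i in range(100000))   # arbitrary workload being timed
elapsed = timeit.default_timer() - start    # wall clock seconds, not CPU time

print("computed %d in %.6f seconds" % (total, elapsed))
```

A single measurement like this is exposed to interference from other processes, which is why the Timer class runs the statement many times.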

class timeit.Timer(stmt='pass', setup='pass', timer=<timer function>)

Class for timing execution speed of small code snippets.

The constructor takes a statement to be timed, an additional statement used for setup, and a timer function. Both statements default to 'pass'; the timer function is platform-dependent (see the module doc string). stmt and setup may also contain multiple statements separated by ; or newlines, as long as they don’t contain multi-line string literals.

To measure the execution time of the first statement, use the timeit() method. The repeat() method is a convenience to call timeit() multiple times and return a list of results.

Changed in version 2.6: The stmt and setup parameters can now also take objects that are callable without arguments. This will embed calls to them in a timer function that will then be executed by timeit(). Note that the timing overhead is a little larger in this case because of the extra function calls.

timeit(number=1000000)

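Time number executions of the main statement. This executes the setup statement once, and then returns the time it takes to execute the main statement number times, measured in seconds as a float. The argument is the number of times through the loop, defaulting to one million. The main statement, the setup statement and the timer function to be used are passed to the constructor.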

Note

By default, timeit() temporarily turns off garbage collection during the timing. The advantage of this approach is that it makes independent timings more comparable. The disadvantage is that GC may be an important component of the performance of the function being measured. If so, GC can be re-enabled as the first statement in the setup string. For example:

timeit.Timer('for i in xrange(10): oct(i)', 'gc.enable()').timeit()

repeat(repeat=3, number=1000000)

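Call timeit() a few times.

This is a convenience function that calls timeit() repeatedly, returning a list of results. The first argument specifies how many times to call timeit(). The second argument specifies the number argument for timeit().

Note

It is tempting to calculate the mean and standard deviation from the result vector and report these. However, this is not very useful. In a typical case, the lowest value gives a lower bound for how fast your machine can run the given code snippet; higher values in the result vector are typically not caused by variability in Python's speed, but by other processes interfering with your timing accuracy. So the min() of the result is probably the only number you should be interested in. After that, you should look at the entire vector and apply common sense rather than statistics.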
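The advice above can be sketched as follows: collect the result vector, report its minimum, and keep the raw values available for inspection. The statement being timed and the loop count are arbitrary choices for the example.

```python
import timeit

t = timeit.Timer('"-".join(map(str, range(100)))')
number = 10000
results = t.repeat(repeat=3, number=number)   # three total times, in seconds

best = min(results)
print("raw results: %s" % results)
print("best of %d: %.2f usec per loop" % (len(results), best / number * 1e6))
```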

print_exc(file=None)

Helper to print a traceback from the timed code.

Typical use:

t = Timer(...)       # outside the try/except
try:
    t.timeit(...)    # or t.repeat(...)
except:
    t.print_exc()

The advantage over the standard traceback is that source lines in the compiled template will be displayed. The optional file argument directs where the traceback is sent; it defaults to sys.stderr.
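As a concrete sketch of the file argument, the traceback can be captured in an in-memory buffer instead of going to sys.stderr. The failing statement and the buffer are chosen purely for illustration, and the import fallback is there only so the snippet runs on both Python 2 and Python 3.

```python
import timeit

try:
    from StringIO import StringIO   # Python 2
except ImportError:
    from io import StringIO         # Python 3

t = timeit.Timer("1/0")             # statement that raises ZeroDivisionError
buf = StringIO()                    # stand-in for a log file
try:
    t.timeit(number=1)
except Exception:
    t.print_exc(file=buf)           # traceback goes to buf, not sys.stderr

report = buf.getvalue()
print(report)
```

print_exc() has to be called while the exception is being handled, which is why it sits inside the except block.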

26.6.3. Command-Line Interface

When called as a program from the command line, the following form is used:

python -m timeit [-n N] [-r N] [-s S] [-t] [-c] [-v] [-h] [statement ...]

Where the following options are understood:

-n N, --number=N

how many times to execute 'statement'

-r N, --repeat=N

how many times to repeat the timer (default 3)

-s S, --setup=S

statement to be executed once initially (default pass)

-t, --time

use time.time() (default on all platforms but Windows)

-c, --clock

use time.clock() (default on Windows)

-v, --verbose

print raw timing results; repeat for more digits of precision

-h, --help

print a short usage message and exit

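A multi-line statement may be given by specifying each line as a separate statement argument; indented lines are possible by enclosing an argument in quotes and using leading spaces. Multiple -s options are treated similarly.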
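From the Python interface, the counterpart of several statement arguments is a single string whose lines are separated by newlines. A sketch, in which the lines list is a hypothetical stand-in for the command-line arguments:

```python
import timeit

# Each element corresponds to one quoted command-line argument;
# leading spaces inside an element provide the indentation.
lines = ['try:', '  int("x")', 'except ValueError:', '  pass']
stmt = "\n".join(lines)

print(stmt)
print(timeit.timeit(stmt, number=1000))
```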

If -n is not given, a suitable number of loops is calculated by trying successive powers of 10 until the total time is at least 0.2 seconds.
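The Python interface does not pick a loop count on its own, but the search described above is easy to approximate. The following is a sketch of the idea rather than the module's actual code; pick_loop_count and its min_total parameter are hypothetical names.

```python
import timeit

def pick_loop_count(timer, min_total=0.2):
    """Try 10, 100, 1000, ... loops until one run takes at least min_total seconds."""
    for exponent in range(1, 10):
        number = 10 ** exponent
        elapsed = timer.timeit(number)
        if elapsed >= min_total:
            break
    return number, elapsed

t = timeit.Timer('"-".join(map(str, range(100)))')
number, elapsed = pick_loop_count(t)
print("%d loops took %.3f seconds" % (number, elapsed))
```

The search gives up at 10**9 loops, so a very cheap statement on a very fast machine could return before reaching the threshold.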

default_timer() measurements can be affected by other programs running on the same machine, so the best thing to do when accurate timing is necessary is to repeat the timing a few times and use the best time. The -r option is good for this; the default of 3 repetitions is probably enough in most cases. On Unix, you can use time.clock() to measure CPU time.

Note

There is a certain baseline overhead associated with executing a pass statement. The code here doesn’t try to hide it, but you should be aware of it. The baseline overhead can be measured by invoking the program without arguments, and it might differ between Python versions. Also, to fairly compare older Python versions to Python 2.3, you may want to use Python’s -O option (see Optimizations) for the older versions to avoid timing SET_LINENO instructions.
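The same baseline can be estimated from the Python interface, since both stmt and setup default to 'pass' there as well. A short sketch, with the loop count set explicitly for clarity:

```python
import timeit

number = 1000000
# With no stmt or setup given, both default to 'pass'.
baseline = min(timeit.repeat(number=number))
print("baseline: %.4f usec per loop" % (baseline / number * 1e6))
```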

26.6.4. Examples

It is possible to provide a setup statement that is executed only once at the beginning:

$ python -m timeit -s 'text = "sample string"; char = "g"'  'char in text'
10000000 loops, best of 3: 0.0877 usec per loop
$ python -m timeit -s 'text = "sample string"; char = "g"'  'text.find(char)'
1000000 loops, best of 3: 0.342 usec per loop

>>> import timeit
>>> timeit.timeit('char in text', setup='text = "sample string"; char = "g"')
0.41440500499993504
>>> timeit.timeit('text.find(char)', setup='text = "sample string"; char = "g"')
1.7246671520006203

The same can be done using the Timer class and its methods:

>>> import timeit
>>> t = timeit.Timer('char in text', setup='text = "sample string"; char = "g"')
>>> t.timeit()
0.3955516149999312
>>> t.repeat()
[0.40193588800002544, 0.3960157959998014, 0.39594301399984033]

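The following examples show how to time expressions that contain multiple lines. Here we compare the cost of using hasattr() versus try/except to test for missing and present object attributes: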

$ python -m timeit 'try:' '  str.__nonzero__' 'except AttributeError:' '  pass'
100000 loops, best of 3: 15.7 usec per loop
$ python -m timeit 'if hasattr(str, "__nonzero__"): pass'
100000 loops, best of 3: 4.26 usec per loop

$ python -m timeit 'try:' '  int.__nonzero__' 'except AttributeError:' '  pass'
1000000 loops, best of 3: 1.43 usec per loop
$ python -m timeit 'if hasattr(int, "__nonzero__"): pass'
100000 loops, best of 3: 2.23 usec per loop

>>> import timeit
>>> # attribute is missing
>>> s = """\
... try:
...     str.__nonzero__
... except AttributeError:
...     pass
... """
>>> timeit.timeit(stmt=s, number=100000)
0.9138244460009446
>>> s = "if hasattr(str, '__nonzero__'): pass"
>>> timeit.timeit(stmt=s, number=100000)
0.5829014980008651
>>>
>>> # attribute is present
>>> s = """\
... try:
...     int.__nonzero__
... except AttributeError:
...     pass
... """
>>> timeit.timeit(stmt=s, number=100000)
0.04215312199994514
>>> s = "if hasattr(int, '__nonzero__'): pass"
>>> timeit.timeit(stmt=s, number=100000)
0.08588060699912603

To give the timeit module access to functions you define, you can pass a setup parameter which contains an import statement:

def test():
    """Stupid test function"""
    L = []
    for i in range(100):
        L.append(i)

if __name__ == '__main__':
    import timeit
    print(timeit.timeit("test()", setup="from __main__ import test"))
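As an alternative sketch to the import-based setup, the function object itself can be passed as the statement, since stmt accepts callables from version 2.6 on. The function body mirrors the one above; the return value and the reduced loop count are additions made only to keep the example quick and checkable.

```python
import timeit

def test():
    """Build a list of 100 integers."""
    L = []
    for i in range(100):
        L.append(i)
    return L

# The function object is the statement; no setup import is needed.
elapsed = timeit.timeit(test, number=10000)
print(elapsed)
```

This form adds one extra function call per loop, so its timings are slightly higher than those of the string form.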