Why is pow(a, d, n) so much faster than a**d % n?

Question: Why is pow(a, d, n) so much faster than a**d % n?

I was trying to implement a Miller-Rabin primality test, and was puzzled why it was taking so long (> 20 seconds) for midsize numbers (~7 digits). I eventually found the following line of code to be the source of the problem:

x = a**d % n

(where a, d, and n are all similar, but unequal, midsize numbers, ** is the exponentiation operator, and % is the modulo operator)

I then tried replacing it with the following:

x = pow(a, d, n)

and by comparison it is almost instantaneous.

For context, here is the original function:

from random import randint

def primalityTest(n, k):
    if n < 2:
        return False
    if n < 4:
        return True         # 2 and 3 are prime; also avoids randint(2, n - 2) failing for n = 3
    if n % 2 == 0:
        return False
    s = 0
    d = n - 1
    while d % 2 == 0:
        s += 1
        d >>= 1
    for i in range(k):
        rand = randint(2, n - 2)
        x = rand**d % n         # offending line
        if x == 1 or x == n - 1:
            continue
        for r in range(s):
            toReturn = True
            x = pow(x, 2, n)
            if x == 1:
                return False
            if x == n - 1:
                toReturn = False
                break
        if toReturn:
            return False
    return True

print(primalityTest(2700643,1))

An example timed calculation:

from timeit import timeit

a = 2505626
d = 1520321
n = 2700643

def testA():
    print(a**d % n)

def testB():
    print(pow(a, d, n))

print("time: %(time)fs" % {"time":timeit("testA()", setup="from __main__ import testA", number=1)})
print("time: %(time)fs" % {"time":timeit("testB()", setup="from __main__ import testB", number=1)})

Output (run with PyPy 1.9.0):

2642565
time: 23.785543s
2642565
time: 0.000030s

Output (run with Python 3.3.0, 2.7.2 returns very similar times):

2642565
time: 14.426975s
2642565
time: 0.000021s

And a related question, why is this calculation almost twice as fast when run with Python 2 or 3 than with PyPy, when usually PyPy is much faster?


Answer 0

See the Wikipedia article on modular exponentiation. Basically, when you do a**d % n, you actually have to calculate a**d, which could be quite large. But there are ways of computing a**d % n without having to compute a**d itself, and that is what pow does. The ** operator can’t do this because it can’t “see into the future” to know that you are going to immediately take the modulus.
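As a rough illustration of what "computing a**d % n without computing a**d" can look like, here is a minimal square-and-multiply sketch. The name mod_pow is hypothetical, it assumes a non-negative exponent and a positive modulus, and it is not claimed to be how CPython or PyPy actually implement 3-argument pow; it only shows that every intermediate value can be kept below n**2.

```python
def mod_pow(base, exponent, modulus):
    """Right-to-left square-and-multiply (illustrative helper, not CPython's code).

    Assumes exponent >= 0 and modulus >= 1.
    """
    if modulus == 1:
        return 0
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:
            # This bit of the exponent is set: fold the current power in.
            result = (result * base) % modulus
        # Square for the next bit, reducing immediately so values stay small.
        base = (base * base) % modulus
        exponent >>= 1
    return result


a, d, n = 2505626, 1520321, 2700643
print(mod_pow(a, d, n) == pow(a, d, n))
```

The loop runs once per bit of d (about 21 iterations for the numbers in the question), which is why the modular version finishes almost instantly.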


Answer 1

BrenBarn answered your main question. For your aside:

why is it almost twice as fast when run with Python 2 or 3 than PyPy, when usually PyPy is much faster?

If you read PyPy’s performance page, this is exactly the kind of thing PyPy is not good at. In fact, it is the very first example they give:

Bad examples include doing computations with large longs – which is performed by unoptimizable support code.

Theoretically, turning a huge exponentiation followed by a mod into a modular exponentiation (at least after the first pass) is a transformation a JIT might be able to make… but not PyPy’s JIT.

As a side note, if you need to do calculations with huge integers, you may want to look at third-party modules like gmpy, which can be much faster than CPython’s native implementation in some cases outside the mainstream uses, and which also has a lot of additional functionality that you’d otherwise have to write yourself, at the cost of being less convenient.


Answer 2

There are shortcuts to doing modular exponentiation: for instance, you can compute a**(2**i) mod n for every i from 0 to floor(log2(d)) by repeated squaring, and then multiply together (mod n) the intermediate results that correspond to the set bits of d. A dedicated modular-exponentiation function like 3-argument pow() can leverage such tricks because it knows you’re doing modular arithmetic. The Python parser can’t recognize this given the bare expression a**d % n, so it will perform the full calculation (which will take much longer).
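The shortcut described here can be spelled out with an explicit table of repeated squarings. This is only a sketch under the assumption that d >= 0 and n >= 1; mod_pow_table is a made-up name, and a real implementation would not normally store the whole table.

```python
def mod_pow_table(a, d, n):
    """Compute a**d % n from a table of a**(2**i) % n (illustrative only)."""
    # squares[i] holds a**(2**i) % n, built by squaring the previous entry.
    squares = [a % n]
    for _ in range(d.bit_length() - 1):
        squares.append(squares[-1] * squares[-1] % n)

    # Multiply together the entries whose index is a set bit of d.
    result = 1 % n
    for i, square in enumerate(squares):
        if (d >> i) & 1:
            result = result * square % n
    return result


# 13 = 0b1101, so 3**13 = 3**1 * 3**4 * 3**8; modulo 7 the table is [3, 2, 4, 2].
print(mod_pow_table(3, 13, 7))  # 3
```

For the numbers in the question the table has only 21 entries, each smaller than n.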


Answer 3

The way x = a**d % n is calculated is to raise a to the d power, then take that result modulo n. If a and d are large, this creates a huge intermediate number that is only reduced at the very end. However, x = pow(a, d, n) is most likely optimized so that every intermediate product is reduced modulo n as it goes, so the numbers it handles never grow beyond about n**2; the reduced values are all that is required for calculating multiplication modulo a number.
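To get a feel for how huge that intermediate number is, its size can be estimated without computing it, since a**d has about d * log2(a) bits. The helper name estimated_bits below is an assumption for illustration; the values of a, d and n are the ones from the question.

```python
import math


def estimated_bits(a, d):
    """Approximate bit length of a**d without actually computing the power."""
    return d * math.log2(a)


a, d, n = 2505626, 1520321, 2700643

full_bits = estimated_bits(a, d)
# With reduction after every multiplication, no product exceeds (n - 1)**2.
modular_bits = ((n - 1) ** 2).bit_length()

print(f"a**d needs roughly {full_bits / 1e6:.1f} million bits")
print(f"largest intermediate with reduction: {modular_bits} bits")
```

The first figure is in the tens of millions of bits, while the second fits comfortably in a single 64-bit machine word.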