标签归档:math

为什么Python的无穷大散列具有π的数字?

问题:为什么Python的无穷大散列具有π的数字?

Python中无穷大的哈​​希值具有与pi匹配的数字:

>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159

这仅仅是巧合还是故意的?

The hash of infinity in Python has digits matching pi:

>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159

Is that just a coincidence or is it intentional?


回答 0

_PyHASH_INF定义为等于的常数314159

我找不到关于此的任何讨论,也没有提供原因的评论。我认为它或多或少是任意选择的。我想只要它们不将相同的有意义的值用于其他哈希,就没关系。

_PyHASH_INF is defined as a constant equal to 314159.

I can’t find any discussion about this, or comments giving a reason. I think it was chosen more or less arbitrarily. I imagine that as long as they don’t use the same meaningful value for other hashes, it shouldn’t matter.


回答 1

简介:这不是巧合;在Python的默认CPython实现中_PyHASH_INF被硬编码为314159,并在2000年被Tim Peters选为任意值(显然是从π的数字)。


的值hash(float('inf'))是数值类型内置散列函数的系统相关的参数中的一个,并且也可以作为sys.hash_info.inf在Python 3:

>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159

与PyPy的结果相同。)


就代码而言,hash是一个内置函数。在Python float对象上调用它会调用函数,该函数的指针由内置float类型()的tp_hash属性给定,该类型定义为的函数,PyTypeObject PyFloat_Type而该函数又具有float_hashreturn _Py_HashDouble(v->ob_fval)

    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;

其中_PyHASH_INF定义为 314159:

#define _PyHASH_INF 314159

从历史的角度来看,Tim Peters在2000年8月添加了314159此上下文中Python代码中的第一个提及(您可以使用git bisect或找到git log -S 314159 -p),现在在git存储库中提交了39dce293cpython

提交消息说:

修复了http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470的问题。这是一个令人误解的错误-真正的“错误”是hash(x)xinfinity为无限时返回错误。修复了。向添加了新的Py_IS_INFINITYpyport.h。重新排列了代码,以减少浮点数和复数的散列中越来越多的重复,从而将Trent之前的做法推到了合理的结论。修复了一个极其罕见的错误,即即使没有错误,浮点数的哈希也可能返回-1(并没有浪费时间来构造一个测试用例,从代码中可以明显看出它可能发生)。改进了复杂的哈希,因此 hash(complex(x, y))不再系统地相等hash(complex(y, x))

特别是,在此提交中,他撕掉了static long float_hash(PyFloatObject *v)in 的代码Objects/floatobject.c并使它成为just return _Py_HashDouble(v->ob_fval);,并在in的定义long _Py_HashDouble(double v)Objects/object.c添加了以下几行:

        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;

因此,如上所述,这是一个任意选择。请注意,271828由e的前几个十进制数字形成。

相关的以后的提交:

Summary: It’s not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.


The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:

>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159

(Same results with PyPy too.)


In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has

    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;

where _PyHASH_INF is defined as 314159:

#define _PyHASH_INF 314159

In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.

The commit message says:

Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470. This was a misleading bug — the true “bug” was that hash(x) gave an error return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to pyport.h. Rearranged code to reduce growing duplication in hashing of float and complex numbers, pushing Trent’s earlier stab at that to a logical conclusion. Fixed exceedingly rare bug where hashing of floats could return -1 even if there wasn’t an error (didn’t waste time trying to construct a test case, it was simply obvious from the code that it could happen). Improved complex hash so that hash(complex(x, y)) doesn’t systematically equal hash(complex(y, x)) anymore.

In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:

        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;

So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.

Related later commits:


回答 2

确实,

sys.hash_info.inf

返回314159。该值不会生成,而是内置在源代码中。事实上,

hash(float('-inf'))

-271828在python 2中返回或大约为-e(现在为-314159)。

将所有时间中两个最著名的无理数用作哈希值的事实使得它不太可能是巧合。

Indeed,

sys.hash_info.inf

returns 314159. The value is not generated, it’s built into the source code. In fact,

hash(float('-inf'))

returns -271828, or approximately -e, in python 2 (it’s -314159 now).

The fact that the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.


在Python中计算算术平均值(一种平均值)

问题:在Python中计算算术平均值(一种平均值)

Python中是否有内置或标准库方法来计算数字列表的算术平均值(一种平均值)?

Is there a built-in or standard library method in Python to calculate the arithmetic mean (one type of average) of a list of numbers?


回答 0

我不知道标准库中的任何内容。但是,您可以使用类似以下内容的方法:

def mean(numbers):
    return float(sum(numbers)) / max(len(numbers), 1)

>>> mean([1,2,3,4])
2.5
>>> mean([])
0.0

在numpy中,有numpy.mean()

I am not aware of anything in the standard library. However, you could use something like:

def mean(numbers):
    return float(sum(numbers)) / max(len(numbers), 1)

>>> mean([1,2,3,4])
2.5
>>> mean([])
0.0

In numpy, there’s numpy.mean().


回答 1

NumPy的a numpy.mean是算术平均值。用法很简单:

>>> import numpy
>>> a = [1, 2, 4]
>>> numpy.mean(a)
2.3333333333333335

NumPy has a numpy.mean which is an arithmetic mean. Usage is as simple as this:

>>> import numpy
>>> a = [1, 2, 4]
>>> numpy.mean(a)
2.3333333333333335

回答 2

用途statistics.mean

import statistics
print(statistics.mean([1,2,4])) # 2.3333333333333335

从Python 3.4开始可用。对于3.1-3.3用户,该模块的旧版本可在PyPI上以的名称获得stats。只需更改statistics为即可stats

Use statistics.mean:

import statistics
print(statistics.mean([1,2,4])) # 2.3333333333333335

It’s available since Python 3.4. For 3.1-3.3 users, an old version of the module is available on PyPI under the name stats. Just change statistics to stats.


回答 3

您甚至不需要麻木或肮脏的…

>>> a = [1, 2, 3, 4, 5, 6]
>>> print(sum(a) / len(a))
3

You don’t even need numpy or scipy…

>>> a = [1, 2, 3, 4, 5, 6]
>>> print(sum(a) / len(a))
3

回答 4

使用scipy:

import scipy;
a=[1,2,4];
print(scipy.mean(a));

Use scipy:

import scipy;
a=[1,2,4];
print(scipy.mean(a));

回答 5

除了强制浮动之外,您还可以执行以下操作

def mean(nums):
    return sum(nums, 0.0) / len(nums)

或使用lambda

mean = lambda nums: sum(nums, 0.0) / len(nums)

更新日期:2019-12-15

Python 3.8 在统计模块中添加了功能fmean。哪个更快,并且总是返回float。

将数据转换为浮点并计算算术平均值。

它的运行速度比mean()函数快,并且始终返回浮点数。数据可以是序列或可迭代的。如果输入数据集为空,则引发StatisticsError。

fmean([3.5,4.0,5.25])

4.25

3.8版的新功能。

Instead of casting to float you can do following

def mean(nums):
    return sum(nums, 0.0) / len(nums)

or using lambda

mean = lambda nums: sum(nums, 0.0) / len(nums)

UPDATES: 2019-12-15

Python 3.8 added function fmean to statistics module. Which is faster and always returns float.

Convert data to floats and compute the arithmetic mean.

This runs faster than the mean() function and it always returns a float. The data may be a sequence or iterable. If the input dataset is empty, raises a StatisticsError.

fmean([3.5, 4.0, 5.25])

4.25

New in version 3.8.


回答 6

from statistics import mean
avarage=mean(your_list)

例如

from statistics import mean

my_list=[5,2,3,2]
avarage=mean(my_list)
print(avarage)

结果是

3.0
from statistics import mean
avarage=mean(your_list)

for example

from statistics import mean

my_list=[5,2,3,2]
avarage=mean(my_list)
print(avarage)

and result is

3.0

回答 7

def avg(l):
    """uses floating-point division."""
    return sum(l) / float(len(l))

例子:

l1 = [3,5,14,2,5,36,4,3]
l2 = [0,0,0]

print(avg(l1)) # 9.0
print(avg(l2)) # 0.0
def avg(l):
    """uses floating-point division."""
    return sum(l) / float(len(l))

Examples:

l1 = [3,5,14,2,5,36,4,3]
l2 = [0,0,0]

print(avg(l1)) # 9.0
print(avg(l2)) # 0.0

回答 8

def list_mean(nums):
    sumof = 0
    num_of = len(nums)
    mean = 0
    for i in nums:
        sumof += i
    mean = sumof / num_of
    return float(mean)
def list_mean(nums):
    sumof = 0
    num_of = len(nums)
    mean = 0
    for i in nums:
        sumof += i
    mean = sumof / num_of
    return float(mean)

回答 9

我一直认为应该avg从Builtins / stdlib中省略它,因为它很简单

sum(L)/len(L) # L is some list

并且任何告诫将在调用者代码中解决以供本地使用

注意事项:

  1. 非浮点结果:在python2中,9/4为2。解析,使用float(sum(L))/len(L)from __future__ import division

  2. 除以零:列表可能为空。解决:

    if not L:
        raise WhateverYouWantError("foo")
    avg = float(sum(L))/len(L)

I always supposed avg is omitted from the builtins/stdlib because it is as simple as

sum(L)/len(L) # L is some list

and any caveats would be addressed in caller code for local usage already.

Notable caveats:

  1. non-float result: in python2, 9/4 is 2. to resolve, use float(sum(L))/len(L) or from __future__ import division

  2. division by zero: the list may be empty. to resolve:

    if not L:
        raise WhateverYouWantError("foo")
    avg = float(sum(L))/len(L)
    

回答 10

对您问题的正确答案是使用statistics.mean。但是为了好玩,这是一个不使用该len()功能的Mean的版本,因此statistics.mean可以在不支持该功能的生成器上使用它(如)len()

from functools import reduce
from operator import truediv
def ave(seq):
    return truediv(*reduce(lambda a, b: (a[0] + b[1], b[0]), 
                           enumerate(seq, start=1), 
                           (0, 0)))

The proper answer to your question is to use statistics.mean. But for fun, here is a version of mean that does not use the len() function, so it (like statistics.mean) can be used on generators, which do not support len():

from functools import reduce
from operator import truediv
def ave(seq):
    return truediv(*reduce(lambda a, b: (a[0] + b[1], b[0]), 
                           enumerate(seq, start=1), 
                           (0, 0)))

回答 11

其他人已经发布了很好的答案,但有些人可能仍在寻找找到Mean(avg)的经典方法,因此,我在这里发布了此信息(在Python 3.6中测试过的代码):

def meanmanual(listt):

mean = 0
lsum = 0
lenoflist = len(listt)

for i in listt:
    lsum += i

mean = lsum / lenoflist
return float(mean)

a = [1, 2, 3, 4, 5, 6]
meanmanual(a)

Answer: 3.5

Others already posted very good answers, but some people might still be looking for a classic way to find Mean(avg), so here I post this (code tested in Python 3.6):

def meanmanual(listt):

mean = 0
lsum = 0
lenoflist = len(listt)

for i in listt:
    lsum += i

mean = lsum / lenoflist
return float(mean)

a = [1, 2, 3, 4, 5, 6]
meanmanual(a)

Answer: 3.5

将数字范围转换为另一个范围,保持比例

问题:将数字范围转换为另一个范围,保持比例

我正在尝试将一个数字范围转换为另一个数字范围,并保持比率。数学不是我的强项。

我有一个图像文件,其中点值的范围可能在-16000.00到16000.00之间,尽管典型范围可能会小得多。我想做的是将这些值压缩到0-100的整数范围内,其中0是最小点的值,而100是最大点的值。即使丢失了一些精度,它们之间的所有点都应保持相对比率,我想在python中做到这一点,但即使是通用算法也足够。我更喜欢可以调整最小/最大或任意一个范围的算法(即,第二个范围可以是-50到800,而不是0到100)。

I’m trying to convert one range of numbers to another, maintaining ratio. Maths is not my strong point.

I have an image file where point values may range from -16000.00 to 16000.00 though the typical range may be much less. What I want to do is compress these values into the integer range 0-100, where 0 is the value of the smallest point, and 100 is the value of the largest. All points in between should keep a relative ratio even though some precision is being lost I’d like to do this in python but even a general algorithm should suffice. I’d prefer an algorithm where the min/max or either range can be adjusted (ie, the second range could be -50 to 800 instead of 0 to 100).


回答 0

NewValue = (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin

或者更具可读性:

OldRange = (OldMax - OldMin)  
NewRange = (NewMax - NewMin)  
NewValue = (((OldValue - OldMin) * NewRange) / OldRange) + NewMin

或者,如果您想保护旧范围为0(OldMin = OldMax)的情况,请执行以下操作

OldRange = (OldMax - OldMin)
if (OldRange == 0)
    NewValue = NewMin
else
{
    NewRange = (NewMax - NewMin)  
    NewValue = (((OldValue - OldMin) * NewRange) / OldRange) + NewMin
}

请注意,在这种情况下,我们被迫任意选择一个可能的新范围值。根据上下文,明智的选择可能是:NewMin见样本),NewMax(NewMin + NewMax) / 2

NewValue = (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin

Or a little more readable:

OldRange = (OldMax - OldMin)  
NewRange = (NewMax - NewMin)  
NewValue = (((OldValue - OldMin) * NewRange) / OldRange) + NewMin

Or if you want to protect for the case where the old range is 0 (OldMin = OldMax):

OldRange = (OldMax - OldMin)
if (OldRange == 0)
    NewValue = NewMin
else
{
    NewRange = (NewMax - NewMin)  
    NewValue = (((OldValue - OldMin) * NewRange) / OldRange) + NewMin
}

Note that in this case we’re forced to pick one of the possible new range values arbitrarily. Depending on context, sensible choices could be: NewMin (see sample), NewMax or (NewMin + NewMax) / 2


回答 1

这是一个简单的线性转换。

new_value = ( (old_value - old_min) / (old_max - old_min) ) * (new_max - new_min) + new_min

因此,将-16000到16000范围内的10000转换为0到100的新比例会生成:

old_value = 10000
old_min = -16000
old_max = 16000
new_min = 0
new_max = 100

new_value = ( ( 10000 - -16000 ) / (16000 - -16000) ) * (100 - 0) + 0
          = 81.25

That’s a simple linear conversion.

new_value = ( (old_value - old_min) / (old_max - old_min) ) * (new_max - new_min) + new_min

So converting 10000 on the scale of -16000 to 16000 to a new scale of 0 to 100 yields:

old_value = 10000
old_min = -16000
old_max = 16000
new_min = 0
new_max = 100

new_value = ( ( 10000 - -16000 ) / (16000 - -16000) ) * (100 - 0) + 0
          = 81.25

回答 2

实际上,在某些情况下,以上答案可能会中断。例如错误输入值,错误输入范围,负输入/输出范围。

def remap( x, oMin, oMax, nMin, nMax ):

    #range check
    if oMin == oMax:
        print "Warning: Zero input range"
        return None

    if nMin == nMax:
        print "Warning: Zero output range"
        return None

    #check reversed input range
    reverseInput = False
    oldMin = min( oMin, oMax )
    oldMax = max( oMin, oMax )
    if not oldMin == oMin:
        reverseInput = True

    #check reversed output range
    reverseOutput = False   
    newMin = min( nMin, nMax )
    newMax = max( nMin, nMax )
    if not newMin == nMin :
        reverseOutput = True

    portion = (x-oldMin)*(newMax-newMin)/(oldMax-oldMin)
    if reverseInput:
        portion = (oldMax-x)*(newMax-newMin)/(oldMax-oldMin)

    result = portion + newMin
    if reverseOutput:
        result = newMax - portion

    return result

#test cases
print remap( 25.0, 0.0, 100.0, 1.0, -1.0 ), "==", 0.5
print remap( 25.0, 100.0, -100.0, -1.0, 1.0 ), "==", -0.25
print remap( -125.0, -100.0, -200.0, 1.0, -1.0 ), "==", 0.5
print remap( -125.0, -200.0, -100.0, -1.0, 1.0 ), "==", 0.5
#even when value is out of bound
print remap( -20.0, 0.0, 100.0, 0.0, 1.0 ), "==", -0.2

Actually there are some cases that above answers would break. Such as wrongly input value, wrongly input range, negative input/output ranges.

def remap( x, oMin, oMax, nMin, nMax ):

    #range check
    if oMin == oMax:
        print "Warning: Zero input range"
        return None

    if nMin == nMax:
        print "Warning: Zero output range"
        return None

    #check reversed input range
    reverseInput = False
    oldMin = min( oMin, oMax )
    oldMax = max( oMin, oMax )
    if not oldMin == oMin:
        reverseInput = True

    #check reversed output range
    reverseOutput = False   
    newMin = min( nMin, nMax )
    newMax = max( nMin, nMax )
    if not newMin == nMin :
        reverseOutput = True

    portion = (x-oldMin)*(newMax-newMin)/(oldMax-oldMin)
    if reverseInput:
        portion = (oldMax-x)*(newMax-newMin)/(oldMax-oldMin)

    result = portion + newMin
    if reverseOutput:
        result = newMax - portion

    return result

#test cases
print remap( 25.0, 0.0, 100.0, 1.0, -1.0 ), "==", 0.5
print remap( 25.0, 100.0, -100.0, -1.0, 1.0 ), "==", -0.25
print remap( -125.0, -100.0, -200.0, 1.0, -1.0 ), "==", 0.5
print remap( -125.0, -200.0, -100.0, -1.0, 1.0 ), "==", 0.5
#even when value is out of bound
print remap( -20.0, 0.0, 100.0, 0.0, 1.0 ), "==", -0.2

回答 3

有一种情况是,当您检查的所有值都相同时,@ jerryjvl的代码将返回NaN。

if (OldMin != OldMax && NewMin != NewMax):
    return (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin
else:
    return (NewMax + NewMin) / 2

There is a condition, when all of the values that you are checking are the same, where @jerryjvl’s code would return NaN.

if (OldMin != OldMax && NewMin != NewMax):
    return (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin
else:
    return (NewMax + NewMin) / 2

回答 4

我没有挖BNF为此,但是Arduino文档中有一个很好的功能示例,它的功能很强大。我可以在Python中使用此方法,只需添加一个def重命名以重新映射(因为map是内置的)并删除类型强制转换和花括号(即仅删除所有’long’)。

原版的

long map(long x, long in_min, long in_max, long out_min, long out_max)
{
  return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}

Python

def remap(x, in_min, in_max, out_min, out_max):
  return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min

https://www.arduino.cc/en/reference/map

I didn’t dig up the BNF for this, but the Arduino documentation had a great example of the function and it’s breakdown. I was able to use this in Python by simply adding a def renaming to remap (cause map is a built-in) and removing the type casts and curly braces (ie just remove all the ‘long’s).

Original

long map(long x, long in_min, long in_max, long out_min, long out_max)
{
  return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}

Python

def remap(x, in_min, in_max, out_min, out_max):
  return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min

https://www.arduino.cc/en/reference/map


回答 5

在PenguinTD提供的清单中,我不明白为什么范围会被逆转,它无需逆转范围就可以工作。线性范围转换基于线性方程Y=Xm+n,其中mn是从给定范围推导的。与其将范围称为minmax,不如将它们称为1和2更好。因此,公式为:

Y = (((X - x1) * (y2 - y1)) / (x2 - x1)) + y1

Y=y1何时何地X=x1Y=y2何时何地X=x2x1x2y1y2可被赋予任何positivenegative值。在宏中定义表达式使其更有用,然后可以与任何参数名称一起使用。

#define RangeConv(X, x1, x2, y1, y2) (((float)((X - x1) * (y2 - y1)) / (x2 - x1)) + y1)

float投将确保浮点除法在所有参数都是的情况下integer的值。根据应用程序的不同,可能不必检查范围x1=x2y1==y2

In the listing provided by PenguinTD, I do not understand why the ranges are reversed, it works without having to reverse the ranges. Linear range conversion is based upon the linear equation Y=Xm+n, where m and n are derived from the given ranges. Rather than refer to the ranges as min and max, it would be better to refer to them as 1 and 2. So the formula would be:

Y = (((X - x1) * (y2 - y1)) / (x2 - x1)) + y1

Where Y=y1 when X=x1, and Y=y2 when X=x2. x1, x2, y1 & y2 can be given any positive or negative value. Defining the expression in a macro makes it more useful,it can then be used with any argument names.

#define RangeConv(X, x1, x2, y1, y2) (((float)((X - x1) * (y2 - y1)) / (x2 - x1)) + y1)

The float cast would ensure floating point division in the case where all the arguments are integer values. Depending on the application it may not be necessary to check the ranges x1=x2 and y1==y2.


回答 6

以下是一些简短的Python函数,可简化复制和粘贴操作,其中包括缩放整个列表的函数。

def scale_number(unscaled, to_min, to_max, from_min, from_max):
    return (to_max-to_min)*(unscaled-from_min)/(from_max-from_min)+to_min

def scale_list(l, to_min, to_max):
    return [scale_number(i, to_min, to_max, min(l), max(l)) for i in l]

可以这样使用:

scale_list([1,3,4,5], 0, 100)

[0.0,50.0,75.0,100.0]

就我而言,我想按比例绘制对数曲线,如下所示:

scale_list([math.log(i+1) for i in range(5)], 0, 50)

[0.0,21.533827903669653,34.130309724299266,43.06765580733931,50.0]

Here’s some short Python functions for your copy and paste ease, including a function to scale an entire list.

def scale_number(unscaled, to_min, to_max, from_min, from_max):
    return (to_max-to_min)*(unscaled-from_min)/(from_max-from_min)+to_min

def scale_list(l, to_min, to_max):
    return [scale_number(i, to_min, to_max, min(l), max(l)) for i in l]

Which can be used like so:

scale_list([1,3,4,5], 0, 100)

[0.0, 50.0, 75.0, 100.0]

In my case I wanted to scale a logarithmic curve, like so:

scale_list([math.log(i+1) for i in range(5)], 0, 50)

[0.0, 21.533827903669653, 34.130309724299266, 43.06765580733931, 50.0]


回答 7

我在js中解决的一个问题中使用了该解决方案,所以我想我会分享翻译。感谢您的解释和解决方案。

function remap( x, oMin, oMax, nMin, nMax ){
//range check
if (oMin == oMax){
    console.log("Warning: Zero input range");
    return None;
};

if (nMin == nMax){
    console.log("Warning: Zero output range");
    return None
}

//check reversed input range
var reverseInput = false;
oldMin = Math.min( oMin, oMax );
oldMax = Math.max( oMin, oMax );
if (oldMin != oMin){
    reverseInput = true;
}

//check reversed output range
var reverseOutput = false;  
newMin = Math.min( nMin, nMax )
newMax = Math.max( nMin, nMax )
if (newMin != nMin){
    reverseOutput = true;
};

var portion = (x-oldMin)*(newMax-newMin)/(oldMax-oldMin)
if (reverseInput){
    portion = (oldMax-x)*(newMax-newMin)/(oldMax-oldMin);
};

var result = portion + newMin
if (reverseOutput){
    result = newMax - portion;
}

return result;
}

I used this solution in a problem I was solving in js, so I thought I would share the translation. Thanks for the explanation and solution.

function remap( x, oMin, oMax, nMin, nMax ){
//range check
if (oMin == oMax){
    console.log("Warning: Zero input range");
    return None;
};

if (nMin == nMax){
    console.log("Warning: Zero output range");
    return None
}

//check reversed input range
var reverseInput = false;
oldMin = Math.min( oMin, oMax );
oldMax = Math.max( oMin, oMax );
if (oldMin != oMin){
    reverseInput = true;
}

//check reversed output range
var reverseOutput = false;  
newMin = Math.min( nMin, nMax )
newMax = Math.max( nMin, nMax )
if (newMin != nMin){
    reverseOutput = true;
};

var portion = (x-oldMin)*(newMax-newMin)/(oldMax-oldMin)
if (reverseInput){
    portion = (oldMax-x)*(newMax-newMin)/(oldMax-oldMin);
};

var result = portion + newMin
if (reverseOutput){
    result = newMax - portion;
}

return result;
}

回答 8

C ++变体

我发现PenguinTD的解决方案很有用,所以如果有人需要,我可以将其移植到C ++:

浮点重新映射(float x,float oMin,float oMax,float nMin,float nMax){

//range check
if( oMin == oMax) {
    //std::cout<< "Warning: Zero input range";
    return -1;    }

if( nMin == nMax){
    //std::cout<<"Warning: Zero output range";
    return -1;        }

//check reversed input range
bool reverseInput = false;
float oldMin = min( oMin, oMax );
float oldMax = max( oMin, oMax );
if (oldMin == oMin)
    reverseInput = true;

//check reversed output range
bool reverseOutput = false;  
float newMin = min( nMin, nMax );
float newMax = max( nMin, nMax );
if (newMin == nMin)
    reverseOutput = true;

float portion = (x-oldMin)*(newMax-newMin)/(oldMax-oldMin);
if (reverseInput)
    portion = (oldMax-x)*(newMax-newMin)/(oldMax-oldMin);

float result = portion + newMin;
if (reverseOutput)
    result = newMax - portion;

return result; }

C++ Variant

I found PenguinTD’s Solution usefull, so i ported it to C++ if anyone needs it:

float remap(float x, float oMin, float oMax, float nMin, float nMax ){

//range check
if( oMin == oMax) {
    //std::cout<< "Warning: Zero input range";
    return -1;    }

if( nMin == nMax){
    //std::cout<<"Warning: Zero output range";
    return -1;        }

//check reversed input range
bool reverseInput = false;
float oldMin = min( oMin, oMax );
float oldMax = max( oMin, oMax );
if (oldMin == oMin)
    reverseInput = true;

//check reversed output range
bool reverseOutput = false;  
float newMin = min( nMin, nMax );
float newMax = max( nMin, nMax );
if (newMin == nMin)
    reverseOutput = true;

float portion = (x-oldMin)*(newMax-newMin)/(oldMax-oldMin);
if (reverseInput)
    portion = (oldMax-x)*(newMax-newMin)/(oldMax-oldMin);

float result = portion + newMin;
if (reverseOutput)
    result = newMax - portion;

return result; }

回答 9

PHP端口

发现PenguinTD的解决方案很有帮助,因此我将其移植到PHP。救救自己!

/**
* =====================================
*              Remap Range            
* =====================================
* - Convert one range to another. (including value)
*
* @param    int $intValue   The value in the old range you wish to convert
* @param    int $oMin       The minimum of the old range
* @param    int $oMax       The maximum of the old range
* @param    int $nMin       The minimum of the new range
* @param    int $nMax       The maximum of the new range
*
* @return   float $fResult  The old value converted to the new range
*/
function remapRange($intValue, $oMin, $oMax, $nMin, $nMax) {
    // Range check
    if ($oMin == $oMax) {
        echo 'Warning: Zero input range';
        return false;
    }

    if ($nMin == $nMax) {
        echo 'Warning: Zero output range';
        return false;
    }

    // Check reversed input range
    $bReverseInput = false;
    $intOldMin = min($oMin, $oMax);
    $intOldMax = max($oMin, $oMax);
    if ($intOldMin != $oMin) {
        $bReverseInput = true;
    }

    // Check reversed output range
    $bReverseOutput = false;
    $intNewMin = min($nMin, $nMax);
    $intNewMax = max($nMin, $nMax);
    if ($intNewMin != $nMin) {
        $bReverseOutput = true;
    }

    $fRatio = ($intValue - $intOldMin) * ($intNewMax - $intNewMin) / ($intOldMax - $intOldMin);
    if ($bReverseInput) {
        $fRatio = ($intOldMax - $intValue) * ($intNewMax - $intNewMin) / ($intOldMax - $intOldMin);
    }

    $fResult = $fRatio + $intNewMin;
    if ($bReverseOutput) {
        $fResult = $intNewMax - $fRatio;
    }

    return $fResult;
}

PHP Port

Found PenguinTD’s solution helpful so I ported it to PHP. Help yourself!

/**
* =====================================
*              Remap Range            
* =====================================
* - Convert one range to another. (including value)
*
* @param    int $intValue   The value in the old range you wish to convert
* @param    int $oMin       The minimum of the old range
* @param    int $oMax       The maximum of the old range
* @param    int $nMin       The minimum of the new range
* @param    int $nMax       The maximum of the new range
*
* @return   float $fResult  The old value converted to the new range
*/
function remapRange($intValue, $oMin, $oMax, $nMin, $nMax) {
    // Range check
    if ($oMin == $oMax) {
        echo 'Warning: Zero input range';
        return false;
    }

    if ($nMin == $nMax) {
        echo 'Warning: Zero output range';
        return false;
    }

    // Check reversed input range
    $bReverseInput = false;
    $intOldMin = min($oMin, $oMax);
    $intOldMax = max($oMin, $oMax);
    if ($intOldMin != $oMin) {
        $bReverseInput = true;
    }

    // Check reversed output range
    $bReverseOutput = false;
    $intNewMin = min($nMin, $nMax);
    $intNewMax = max($nMin, $nMax);
    if ($intNewMin != $nMin) {
        $bReverseOutput = true;
    }

    $fRatio = ($intValue - $intOldMin) * ($intNewMax - $intNewMin) / ($intOldMax - $intOldMin);
    if ($bReverseInput) {
        $fRatio = ($intOldMax - $intValue) * ($intNewMax - $intNewMin) / ($intOldMax - $intOldMin);
    }

    $fResult = $fRatio + $intNewMin;
    if ($bReverseOutput) {
        $fResult = $intNewMax - $fRatio;
    }

    return $fResult;
}

回答 10

这是一个Javascript版本,它返回一个函数,该函数对预定的源和目标范围进行重新缩放,从而将每次必须执行的计算量减到最少。

// This function returns a function bound to the 
// min/max source & target ranges given.
// oMin, oMax = source
// nMin, nMax = dest.
function makeRangeMapper(oMin, oMax, nMin, nMax ){
    //range check
    if (oMin == oMax){
        console.log("Warning: Zero input range");
        return undefined;
    };

    if (nMin == nMax){
        console.log("Warning: Zero output range");
        return undefined
    }

    //check reversed input range
    var reverseInput = false;
    let oldMin = Math.min( oMin, oMax );
    let oldMax = Math.max( oMin, oMax );
    if (oldMin != oMin){
        reverseInput = true;
    }

    //check reversed output range
    var reverseOutput = false;  
    let newMin = Math.min( nMin, nMax )
    let newMax = Math.max( nMin, nMax )
    if (newMin != nMin){
        reverseOutput = true;
    }

    // Hot-rod the most common case.
    if (!reverseInput && !reverseOutput) {
        let dNew = newMax-newMin;
        let dOld = oldMax-oldMin;
        return (x)=>{
            return ((x-oldMin)* dNew / dOld) + newMin;
        }
    }

    return (x)=>{
        let portion;
        if (reverseInput){
            portion = (oldMax-x)*(newMax-newMin)/(oldMax-oldMin);
        } else {
            portion = (x-oldMin)*(newMax-newMin)/(oldMax-oldMin)
        }
        let result;
        if (reverseOutput){
            result = newMax - portion;
        } else {
            result = portion + newMin;
        }

        return result;
    }   
}

这是使用此功能将0-1缩放为-0x80000000、0x7FFFFFFF的示例

let normTo32Fn = makeRangeMapper(0, 1, -0x80000000, 0x7FFFFFFF);
let fs = normTo32Fn(0.5);
let fs2 = normTo32Fn(0);

Here is a Javascript version that returns a function that does the rescaling for predetermined source and destination ranges, minimizing the amount of computation that has to be done each time.

// This function returns a function bound to the 
// min/max source & target ranges given.
// oMin, oMax = source
// nMin, nMax = dest.
function makeRangeMapper(oMin, oMax, nMin, nMax ){
    //range check
    if (oMin == oMax){
        console.log("Warning: Zero input range");
        return undefined;
    };

    if (nMin == nMax){
        console.log("Warning: Zero output range");
        return undefined
    }

    //check reversed input range
    var reverseInput = false;
    let oldMin = Math.min( oMin, oMax );
    let oldMax = Math.max( oMin, oMax );
    if (oldMin != oMin){
        reverseInput = true;
    }

    //check reversed output range
    var reverseOutput = false;  
    let newMin = Math.min( nMin, nMax )
    let newMax = Math.max( nMin, nMax )
    if (newMin != nMin){
        reverseOutput = true;
    }

    // Hot-rod the most common case.
    if (!reverseInput && !reverseOutput) {
        let dNew = newMax-newMin;
        let dOld = oldMax-oldMin;
        return (x)=>{
            return ((x-oldMin)* dNew / dOld) + newMin;
        }
    }

    return (x)=>{
        let portion;
        if (reverseInput){
            portion = (oldMax-x)*(newMax-newMin)/(oldMax-oldMin);
        } else {
            portion = (x-oldMin)*(newMax-newMin)/(oldMax-oldMin)
        }
        let result;
        if (reverseOutput){
            result = newMax - portion;
        } else {
            result = portion + newMin;
        }

        return result;
    }   
}

Here is an example of using this function to scale 0-1 into -0x80000000, 0x7FFFFFFF

let normTo32Fn = makeRangeMapper(0, 1, -0x80000000, 0x7FFFFFFF);
let fs = normTo32Fn(0.5);
let fs2 = normTo32Fn(0);

回答 11

捷径/简化提案

 NewRange/OldRange = Handy multiplicand or HM
 Convert OldValue in OldRange to NewValue in NewRange = 
 (OldValue - OldMin x HM) + NewMin

韦恩

Short-cut/simplified proposal

 NewRange/OldRange = Handy multiplicand or HM
 Convert OldValue in OldRange to NewValue in NewRange = 
 (OldValue - OldMin x HM) + NewMin

wayne


回答 12

我个人使用了支持泛型的帮助程序类(与Swift 3兼容)

struct Rescale<Type : BinaryFloatingPoint> {
    typealias RescaleDomain = (lowerBound: Type, upperBound: Type)

    var fromDomain: RescaleDomain
    var toDomain: RescaleDomain

    init(from: RescaleDomain, to: RescaleDomain) {
        self.fromDomain = from
        self.toDomain = to
    }

    func interpolate(_ x: Type ) -> Type {
        return self.toDomain.lowerBound * (1 - x) + self.toDomain.upperBound * x;
    }

    func uninterpolate(_ x: Type) -> Type {
        let b = (self.fromDomain.upperBound - self.fromDomain.lowerBound) != 0 ? self.fromDomain.upperBound - self.fromDomain.lowerBound : 1 / self.fromDomain.upperBound;
        return (x - self.fromDomain.lowerBound) / b
    }

    func rescale(_ x: Type )  -> Type {
        return interpolate( uninterpolate(x) )
    }
}

I personally use the helper class which supports generics (Swift 3, 4.x compatible)

struct Rescale<Type : BinaryFloatingPoint> {
    typealias RescaleDomain = (lowerBound: Type, upperBound: Type)

    var fromDomain: RescaleDomain
    var toDomain: RescaleDomain

    init(from: RescaleDomain, to: RescaleDomain) {
        self.fromDomain = from
        self.toDomain = to
    }

    func interpolate(_ x: Type ) -> Type {
        return self.toDomain.lowerBound * (1 - x) + self.toDomain.upperBound * x;
    }

    func uninterpolate(_ x: Type) -> Type {
        let b = (self.fromDomain.upperBound - self.fromDomain.lowerBound) != 0 ? self.fromDomain.upperBound - self.fromDomain.lowerBound : 1 / self.fromDomain.upperBound;
        return (x - self.fromDomain.lowerBound) / b
    }

    func rescale(_ x: Type )  -> Type {
        return interpolate( uninterpolate(x) )
    }
}

Ex:

   let rescaler = Rescale<Float>(from: (-1, 1), to: (0, 100))
    
   print(rescaler.rescale(0)) // OUTPUT: 50

回答 13

本示例将歌曲的当前位置转换为20-40的角度范围。

    /// <summary>
    /// This test converts Current songtime to an angle in a range. 
    /// </summary>
    [Fact]
    public void ConvertRangeTests()
    {            
       //Convert a songs time to an angle of a range 20 - 40
        var result = ConvertAndGetCurrentValueOfRange(
            TimeSpan.Zero, TimeSpan.FromMinutes(5.4),
            20, 40, 
            2.7
            );

        Assert.True(result == 30);
    }

    /// <summary>
    /// Gets the current value from the mixValue maxValue range.        
    /// </summary>
    /// <param name="startTime">Start of the song</param>
    /// <param name="duration"></param>
    /// <param name="minValue"></param>
    /// <param name="maxValue"></param>
    /// <param name="value">Current time</param>
    /// <returns></returns>
    public double ConvertAndGetCurrentValueOfRange(
                TimeSpan startTime,
                TimeSpan duration,
                double minValue,
                double maxValue,
                double value)
    {
        var timeRange = duration - startTime;
        var newRange = maxValue - minValue;
        var ratio = newRange / timeRange.TotalMinutes;
        var newValue = value * ratio;
        var currentValue= newValue + minValue;
        return currentValue;
    }

This example converts a songs current position into an angle range of 20 – 40.

    /// <summary>
    /// This test converts Current songtime to an angle in a range. 
    /// </summary>
    [Fact]
    public void ConvertRangeTests()
    {            
       //Convert a songs time to an angle of a range 20 - 40
        var result = ConvertAndGetCurrentValueOfRange(
            TimeSpan.Zero, TimeSpan.FromMinutes(5.4),
            20, 40, 
            2.7
            );

        Assert.True(result == 30);
    }

    /// <summary>
    /// Gets the current value from the mixValue maxValue range.        
    /// </summary>
    /// <param name="startTime">Start of the song</param>
    /// <param name="duration"></param>
    /// <param name="minValue"></param>
    /// <param name="maxValue"></param>
    /// <param name="value">Current time</param>
    /// <returns></returns>
    public double ConvertAndGetCurrentValueOfRange(
                TimeSpan startTime,
                TimeSpan duration,
                double minValue,
                double maxValue,
                double value)
    {
        var timeRange = duration - startTime;
        var newRange = maxValue - minValue;
        var ratio = newRange / timeRange.TotalMinutes;
        var newValue = value * ratio;
        var currentValue= newValue + minValue;
        return currentValue;
    }

回答 14

列表理解一线解决方案

color_array_new = [int((((x - min(node_sizes)) * 99) / (max(node_sizes) - min(node_sizes))) + 1) for x in node_sizes]

较长的版本

def colour_specter(waste_amount):
color_array = []
OldRange = max(waste_amount) - min(waste_amount)
NewRange = 99
for number_value in waste_amount:
    NewValue = int((((number_value - min(waste_amount)) * NewRange) / OldRange) + 1)
    color_array.append(NewValue)
print(color_array)
return color_array

List comprehension one liner solution

color_array_new = [int((((x - min(node_sizes)) * 99) / (max(node_sizes) - min(node_sizes))) + 1) for x in node_sizes]

Longer version

def colour_specter(waste_amount):
color_array = []
OldRange = max(waste_amount) - min(waste_amount)
NewRange = 99
for number_value in waste_amount:
    NewValue = int((((number_value - min(waste_amount)) * NewRange) / OldRange) + 1)
    color_array.append(NewValue)
print(color_array)
return color_array

回答 15

Java版本

无论您喂什么食物,它始终有效!

我把所有内容都扩展了,以便于学习。当然,最后的舍入是可选的。

    private long remap(long p, long Amin, long Amax, long Bmin, long Bmax ) {

    double deltaA = Amax - Amin;
    double deltaB = Bmax - Bmin;
    double scale  = deltaB / deltaA;
    double negA   = -1 * Amin;
    double offset = (negA * scale) + Bmin;
    double q      = (p * scale) + offset;
    return Math.round(q);

}

Java Version

Always works no matter what you feed it!

I left everything expanded out so that it’s easier to follow for learning. Rounding at the end, of course, is optional.

    private long remap(long p, long Amin, long Amax, long Bmin, long Bmax ) {

    double deltaA = Amax - Amin;
    double deltaB = Bmax - Bmin;
    double scale  = deltaB / deltaA;
    double negA   = -1 * Amin;
    double offset = (negA * scale) + Bmin;
    double q      = (p * scale) + offset;
    return Math.round(q);

}

在Python中将N秒添加到datetime.time的标准方法是什么?

问题:在Python中将N秒添加到datetime.time的标准方法是什么?

给定datetime.timePython中的值,是否有标准的方法向其添加整数秒,例如11:34:59+ 3 = 11:35:02

这些明显的想法行不通:

>>> datetime.time(11, 34, 59) + 3
TypeError: unsupported operand type(s) for +: 'datetime.time' and 'int'
>>> datetime.time(11, 34, 59) + datetime.timedelta(0, 3)
TypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.timedelta'
>>> datetime.time(11, 34, 59) + datetime.time(0, 0, 3)
TypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.time'

最后,我编写了这样的函数:

def add_secs_to_time(timeval, secs_to_add):
    secs = timeval.hour * 3600 + timeval.minute * 60 + timeval.second
    secs += secs_to_add
    return datetime.time(secs // 3600, (secs % 3600) // 60, secs % 60)

我不禁以为我缺少一种更简单的方法来做到这一点。

有关

Given a datetime.time value in Python, is there a standard way to add an integer number of seconds to it, so that 11:34:59 + 3 = 11:35:02, for example?

These obvious ideas don’t work:

>>> datetime.time(11, 34, 59) + 3
TypeError: unsupported operand type(s) for +: 'datetime.time' and 'int'
>>> datetime.time(11, 34, 59) + datetime.timedelta(0, 3)
TypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.timedelta'
>>> datetime.time(11, 34, 59) + datetime.time(0, 0, 3)
TypeError: unsupported operand type(s) for +: 'datetime.time' and 'datetime.time'

In the end I have written functions like this:

def add_secs_to_time(timeval, secs_to_add):
    secs = timeval.hour * 3600 + timeval.minute * 60 + timeval.second
    secs += secs_to_add
    return datetime.time(secs // 3600, (secs % 3600) // 60, secs % 60)

I can’t help thinking that I’m missing an easier way to do this though.

Related


回答 0

您可以将完整datetime变量与一起使用timedelta,并通过提供一个虚拟日期,然后使用time来获取时间值。

例如:

import datetime
a = datetime.datetime(100,1,1,11,34,59)
b = a + datetime.timedelta(0,3) # days, seconds, then other fields.
print(a.time())
print(b.time())

得出两个值,相隔三秒:

11:34:59
11:35:02

您也可以选择更具可读性的

b = a + datetime.timedelta(seconds=3)

如果你这么倾向。


如果您追求的是可以执行此操作的函数,则可以使用addSecs以下方法进行研究:

import datetime

def addSecs(tm, secs):
    fulldate = datetime.datetime(100, 1, 1, tm.hour, tm.minute, tm.second)
    fulldate = fulldate + datetime.timedelta(seconds=secs)
    return fulldate.time()

a = datetime.datetime.now().time()
b = addSecs(a, 300)
print(a)
print(b)

输出:

 09:11:55.775695
 09:16:55

You can use full datetime variables with timedelta, and by providing a dummy date then using time to just get the time value.

For example:

import datetime
a = datetime.datetime(100,1,1,11,34,59)
b = a + datetime.timedelta(0,3) # days, seconds, then other fields.
print(a.time())
print(b.time())

results in the two values, three seconds apart:

11:34:59
11:35:02

You could also opt for the more readable

b = a + datetime.timedelta(seconds=3)

if you’re so inclined.


If you’re after a function that can do this, you can look into using addSecs below:

import datetime

def addSecs(tm, secs):
    fulldate = datetime.datetime(100, 1, 1, tm.hour, tm.minute, tm.second)
    fulldate = fulldate + datetime.timedelta(seconds=secs)
    return fulldate.time()

a = datetime.datetime.now().time()
b = addSecs(a, 300)
print(a)
print(b)

This outputs:

 09:11:55.775695
 09:16:55

回答 1

如此处其他人所述,您可以在整个过程中使用完整的datetime对象:

from datetime import datetime, date, time, timedelta
sometime = time(8,00) # 8am
later = (datetime.combine(date.today(), sometime) + timedelta(seconds=3)).time()

但是,我认为值得解释为什么需要完整的datetime对象。考虑如果我在下午11点增加2个小时会发生什么情况。正确的行为是什么?有一个exceptions,因为您的时间不能超过晚上11:59?它应该回绕吗?

不同的程序员会期望不同的东西,因此他们选择的任何结果都会使很多人感到惊讶。更糟糕的是,程序员最初编写的代码在最初测试时就可以正常工作,然后通过做一些意想不到的事情而使代码中断。这非常糟糕,这就是为什么不允许您向时间对象添加timedelta对象的原因。

As others here have stated, you can just use full datetime objects throughout:

from datetime import datetime, date, time, timedelta
sometime = time(8,00) # 8am
later = (datetime.combine(date.today(), sometime) + timedelta(seconds=3)).time()

However, I think it’s worth explaining why full datetime objects are required. Consider what would happen if I added 2 hours to 11pm. What’s the correct behavior? An exception, because you can’t have a time larger than 11:59pm? Should it wrap back around?

Different programmers will expect different things, so whichever result they picked would surprise a lot of people. Worse yet, programmers would write code that worked just fine when they tested it initially, and then have it break later by doing something unexpected. This is very bad, which is why you’re not allowed to add timedelta objects to time objects.


回答 2

一件事,可能会增加清晰度以覆盖默认值(秒)

>>> b = a + datetime.timedelta(seconds=3000)
>>> b
datetime.datetime(1, 1, 1, 12, 24, 59)

One little thing, might add clarity to override the default value for seconds

>>> b = a + datetime.timedelta(seconds=3000)
>>> b
datetime.datetime(1, 1, 1, 12, 24, 59)

回答 3

感谢@Pax Diablo,@ bvmou和@Arachnid建议在整个过程中使用完整的日期时间。如果我必须从外部来源接受datetime.time对象,那么这似乎是一种替代add_secs_to_time()功能:

def add_secs_to_time(timeval, secs_to_add):
    dummy_date = datetime.date(1, 1, 1)
    full_datetime = datetime.datetime.combine(dummy_date, timeval)
    added_datetime = full_datetime + datetime.timedelta(seconds=secs_to_add)
    return added_datetime.time()

此冗长的代码可以压缩为以下形式:

(datetime.datetime.combine(datetime.date(1, 1, 1), timeval) + datetime.timedelta(seconds=secs_to_add)).time()

但我想我还是要将其包装在一个函数中,以确保代码清晰。

Thanks to @Pax Diablo, @bvmou and @Arachnid for the suggestion of using full datetimes throughout. If I have to accept datetime.time objects from an external source, then this seems to be an alternative add_secs_to_time() function:

def add_secs_to_time(timeval, secs_to_add):
    dummy_date = datetime.date(1, 1, 1)
    full_datetime = datetime.datetime.combine(dummy_date, timeval)
    added_datetime = full_datetime + datetime.timedelta(seconds=secs_to_add)
    return added_datetime.time()

This verbose code can be compressed to this one-liner:

(datetime.datetime.combine(datetime.date(1, 1, 1), timeval) + datetime.timedelta(seconds=secs_to_add)).time()

but I think I’d want to wrap that up in a function for code clarity anyway.


回答 4

如果值得在您的项目中添加另一个文件/依赖项,那么我刚刚编写了一个很小的小类,它datetime.time具有算术能力。当您经过午夜时,它会绕零。现在,“从现在开始24小时将是几点钟”有很多特殊情况,包括夏时制,leap秒,历史时区更改等。但是有时候您确实确实需要简单的案例,这就是这样做的目的。

您的示例将写为:

>>> import datetime
>>> import nptime
>>> nptime.nptime(11, 34, 59) + datetime.timedelta(0, 3)
nptime(11, 35, 2)

nptime继承自datetime.time,因此任何这些方法也应该可用。

可以从PyPi以nptime(“非修整时间”)或在GitHub上获得:https : //github.com/tgs/nptime

If it’s worth adding another file / dependency to your project, I’ve just written a tiny little class that extends datetime.time with the ability to do arithmetic. When you go past midnight, it wraps around zero. Now, “What time will it be, 24 hours from now” has a lot of corner cases, including daylight savings time, leap seconds, historical timezone changes, and so on. But sometimes you really do need the simple case, and that’s what this will do.

Your example would be written:

>>> import datetime
>>> import nptime
>>> nptime.nptime(11, 34, 59) + datetime.timedelta(0, 3)
nptime(11, 35, 2)

nptime inherits from datetime.time, so any of those methods should be usable, too.

It’s available from PyPi as nptime (“non-pedantic time”), or on GitHub: https://github.com/tgs/nptime


回答 5

您不能简单地添加数字,datetime因为不清楚使用的单位是秒,小时,周…

timedelta用于日期和时间操作的类。datetime减去datetimeGives timedeltadatetimePlus timedeltaGives datetimedatetime虽然两个对象可以添加,但不能添加两个对象timedelta

创建timedelta要添加多少秒的datetime对象并将其添加到对象:

>>> from datetime import datetime, timedelta
>>> t = datetime.now() + timedelta(seconds=3000)
>>> print(t)
datetime.datetime(2018, 1, 17, 21, 47, 13, 90244)

C ++中有相同的概念:std::chrono::duration

You cannot simply add number to datetime because it’s unclear what unit is used: seconds, hours, weeks…

There is timedelta class for manipulations with date and time. datetime minus datetime gives timedelta, datetime plus timedelta gives datetime, two datetime objects cannot be added although two timedelta can.

Create timedelta object with how many seconds you want to add and add it to datetime object:

>>> from datetime import datetime, timedelta
>>> t = datetime.now() + timedelta(seconds=3000)
>>> print(t)
datetime.datetime(2018, 1, 17, 21, 47, 13, 90244)

There is same concept in C++: std::chrono::duration.


回答 6

为了完整起见,这是使用它的方式arrow(Python的更好的日期和时间):

sometime = arrow.now()
abitlater = sometime.shift(seconds=3)

For completeness’ sake, here’s the way to do it with arrow (better dates and times for Python):

sometime = arrow.now()
abitlater = sometime.shift(seconds=3)

回答 7

尝试添加datetime.datetimedatetime.timedelta。如果只需要时间部分,则可以time()在结果datetime.datetime对象上调用方法以获取它。

Try adding a datetime.datetime to a datetime.timedelta. If you only want the time portion, you can call the time() method on the resultant datetime.datetime object to get it.


回答 8

老问题了,但我想我会抛出一个处理时区的函数。关键部分是将datetime.time对象的tzinfo属性传递到Combine中,然后在结果虚拟日期时间上使用timetz()而不是time()。此答案部分受此处其他答案的启发。

def add_timedelta_to_time(t, td):
    """Add a timedelta object to a time object using a dummy datetime.

    :param t: datetime.time object.
    :param td: datetime.timedelta object.

    :returns: datetime.time object, representing the result of t + td.

    NOTE: Using a gigantic td may result in an overflow. You've been
    warned.
    """
    # Create a dummy date object.
    dummy_date = date(year=100, month=1, day=1)

    # Combine the dummy date with the given time.
    dummy_datetime = datetime.combine(date=dummy_date, time=t, tzinfo=t.tzinfo)

    # Add the timedelta to the dummy datetime.
    new_datetime = dummy_datetime + td

    # Return the resulting time, including timezone information.
    return new_datetime.timetz()

这是一个非常简单的测试用例类(使用内置unittest):

import unittest
from datetime import datetime, timezone, timedelta, time

class AddTimedeltaToTimeTestCase(unittest.TestCase):
    """Test add_timedelta_to_time."""

    def test_wraps(self):
        t = time(hour=23, minute=59)
        td = timedelta(minutes=2)
        t_expected = time(hour=0, minute=1)
        t_actual = add_timedelta_to_time(t=t, td=td)
        self.assertEqual(t_expected, t_actual)

    def test_tz(self):
        t = time(hour=4, minute=16, tzinfo=timezone.utc)
        td = timedelta(hours=10, minutes=4)
        t_expected = time(hour=14, minute=20, tzinfo=timezone.utc)
        t_actual = add_timedelta_to_time(t=t, td=td)
        self.assertEqual(t_expected, t_actual)


if __name__ == '__main__':
    unittest.main()

Old question, but I figured I’d throw in a function that handles timezones. The key parts are passing the datetime.time object’s tzinfo attribute into combine, and then using timetz() instead of time() on the resulting dummy datetime. This answer partly inspired by the other answers here.

def add_timedelta_to_time(t, td):
    """Add a timedelta object to a time object using a dummy datetime.

    :param t: datetime.time object.
    :param td: datetime.timedelta object.

    :returns: datetime.time object, representing the result of t + td.

    NOTE: Using a gigantic td may result in an overflow. You've been
    warned.
    """
    # Create a dummy date object.
    dummy_date = date(year=100, month=1, day=1)

    # Combine the dummy date with the given time.
    dummy_datetime = datetime.combine(date=dummy_date, time=t, tzinfo=t.tzinfo)

    # Add the timedelta to the dummy datetime.
    new_datetime = dummy_datetime + td

    # Return the resulting time, including timezone information.
    return new_datetime.timetz()

And here’s a really simple test case class (using built-in unittest):

import unittest
from datetime import datetime, timezone, timedelta, time

class AddTimedeltaToTimeTestCase(unittest.TestCase):
    """Test add_timedelta_to_time."""

    def test_wraps(self):
        t = time(hour=23, minute=59)
        td = timedelta(minutes=2)
        t_expected = time(hour=0, minute=1)
        t_actual = add_timedelta_to_time(t=t, td=td)
        self.assertEqual(t_expected, t_actual)

    def test_tz(self):
        t = time(hour=4, minute=16, tzinfo=timezone.utc)
        td = timedelta(hours=10, minutes=4)
        t_expected = time(hour=14, minute=20, tzinfo=timezone.utc)
        t_actual = add_timedelta_to_time(t=t, td=td)
        self.assertEqual(t_expected, t_actual)


if __name__ == '__main__':
    unittest.main()

列出N以下所有素数的最快方法

问题:列出N以下所有素数的最快方法

这是我能想到的最好的算法。

def get_primes(n):
    numbers = set(range(n, 1, -1))
    primes = []
    while numbers:
        p = numbers.pop()
        primes.append(p)
        numbers.difference_update(set(range(p*2, n+1, p)))
    return primes

>>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import   get_primes').timeit(1)
1.1499958793645562

可以使它更快吗?

此代码有一个缺陷:由于numbers是无序集合,因此不能保证numbers.pop()从集合中删除最低的数字。但是,它对某些输入数字有效(至少对我而言):

>>> sum(get_primes(2000000))
142913828922L
#That's the correct sum of all numbers below 2 million
>>> 529 in get_primes(1000)
False
>>> 529 in get_primes(530)
True

This is the best algorithm I could come up.

def get_primes(n):
    numbers = set(range(n, 1, -1))
    primes = []
    while numbers:
        p = numbers.pop()
        primes.append(p)
        numbers.difference_update(set(range(p*2, n+1, p)))
    return primes

>>> timeit.Timer(stmt='get_primes.get_primes(1000000)', setup='import   get_primes').timeit(1)
1.1499958793645562

Can it be made even faster?

This code has a flaw: Since numbers is an unordered set, there is no guarantee that numbers.pop() will remove the lowest number from the set. Nevertheless, it works (at least for me) for some input numbers:

>>> sum(get_primes(2000000))
142913828922L
#That's the correct sum of all numbers below 2 million
>>> 529 in get_primes(1000)
False
>>> 529 in get_primes(530)
True

回答 0

警告: timeit由于硬件或Python版本的差异,结果可能会有所不同。

下面是一个脚本,比较了许多实现:

非常感谢斯蒂芬为使sieve_wheel_30引起我的注意。幸得罗伯特·威廉·汉克斯为primesfrom2to,primesfrom3to,rwh_primes,rwh_primes1和rwh_primes2。

使用psyco测试的简单Python方法中,对于n = 1000000, rwh_primes1是测试最快的方法。

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| rwh_primes1         | 43.0  |
| sieveOfAtkin        | 46.4  |
| rwh_primes          | 57.4  |
| sieve_wheel_30      | 63.0  |
| rwh_primes2         | 67.8  |    
| sieveOfEratosthenes | 147.0 |
| ambi_sieve_plain    | 152.0 |
| sundaram3           | 194.0 |
+---------------------+-------+

没有psyco的情况下经过测试的普通Python方法中,对于n = 1000000, rwh_primes2是最快的。

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| rwh_primes2         | 68.1  |
| rwh_primes1         | 93.7  |
| rwh_primes          | 94.6  |
| sieve_wheel_30      | 97.4  |
| sieveOfEratosthenes | 178.0 |
| ambi_sieve_plain    | 286.0 |
| sieveOfAtkin        | 314.0 |
| sundaram3           | 416.0 |
+---------------------+-------+

在所有测试的方法中,允许numpy,对于n = 1000000, primesfrom2to是测试最快的方法。

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| primesfrom2to       | 15.9  |
| primesfrom3to       | 18.4  |
| ambi_sieve          | 29.3  |
+---------------------+-------+

使用以下命令测量时间:

python -mtimeit -s"import primes" "primes.{method}(1000000)"

{method}每个方法名称替换。

primes.py:

#!/usr/bin/env python
import psyco; psyco.full()
from math import sqrt, ceil
import numpy as np

def rwh_primes(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)/(2*i)+1)
    return [2] + [i for i in xrange(3,n,2) if sieve[i]]

def rwh_primes1(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n/2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]

def rwh_primes2(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    correction = (n%6>1)
    n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
    sieve = [True] * (n/3)
    sieve[0] = False
    for i in xrange(int(n**0.5)/3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      ((k*k)/3)      ::2*k]=[False]*((n/6-(k*k)/6-1)/k+1)
        sieve[(k*k+4*k-2*k*(i&1))/3::2*k]=[False]*((n/6-(k*k+4*k-2*k*(i&1))/6-1)/k+1)
    return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]

def sieve_wheel_30(N):
    # http://zerovolt.com/?p=88
    ''' Returns a list of primes <= N using wheel criterion 2*3*5 = 30

Copyright 2009 by zerovolt.com
This code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work.
If you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com.'''
    __smallp = ( 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
    61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139,
    149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227,
    229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311,
    313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401,
    409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491,
    499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599,
    601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683,
    691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797,
    809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887,
    907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997)

    wheel = (2, 3, 5)
    const = 30
    if N < 2:
        return []
    if N <= const:
        pos = 0
        while __smallp[pos] <= N:
            pos += 1
        return list(__smallp[:pos])
    # make the offsets list
    offsets = (7, 11, 13, 17, 19, 23, 29, 1)
    # prepare the list
    p = [2, 3, 5]
    dim = 2 + N // const
    tk1  = [True] * dim
    tk7  = [True] * dim
    tk11 = [True] * dim
    tk13 = [True] * dim
    tk17 = [True] * dim
    tk19 = [True] * dim
    tk23 = [True] * dim
    tk29 = [True] * dim
    tk1[0] = False
    # help dictionary d
    # d[a , b] = c  ==> if I want to find the smallest useful multiple of (30*pos)+a
    # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b]
    # in general. If b < a, I need [(30*pos)+a][(30*(pos+1))+b]
    d = {}
    for x in offsets:
        for y in offsets:
            res = (x*y) % const
            if res in offsets:
                d[(x, res)] = y
    # another help dictionary: gives tkx calling tmptk[x]
    tmptk = {1:tk1, 7:tk7, 11:tk11, 13:tk13, 17:tk17, 19:tk19, 23:tk23, 29:tk29}
    pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N)))
    # inner functions definition
    def del_mult(tk, start, step):
        for k in xrange(start, len(tk), step):
            tk[k] = False
    # end of inner functions definition
    cpos = const * pos
    while prime < stop:
        # 30k + 7
        if tk7[pos]:
            prime = cpos + 7
            p.append(prime)
            lastadded = 7
            for off in offsets:
                tmp = d[(7, off)]
                start = (pos + prime) if off == 7 else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 11
        if tk11[pos]:
            prime = cpos + 11
            p.append(prime)
            lastadded = 11
            for off in offsets:
                tmp = d[(11, off)]
                start = (pos + prime) if off == 11 else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 13
        if tk13[pos]:
            prime = cpos + 13
            p.append(prime)
            lastadded = 13
            for off in offsets:
                tmp = d[(13, off)]
                start = (pos + prime) if off == 13 else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 17
        if tk17[pos]:
            prime = cpos + 17
            p.append(prime)
            lastadded = 17
            for off in offsets:
                tmp = d[(17, off)]
                start = (pos + prime) if off == 17 else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 19
        if tk19[pos]:
            prime = cpos + 19
            p.append(prime)
            lastadded = 19
            for off in offsets:
                tmp = d[(19, off)]
                start = (pos + prime) if off == 19 else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 23
        if tk23[pos]:
            prime = cpos + 23
            p.append(prime)
            lastadded = 23
            for off in offsets:
                tmp = d[(23, off)]
                start = (pos + prime) if off == 23 else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 29
        if tk29[pos]:
            prime = cpos + 29
            p.append(prime)
            lastadded = 29
            for off in offsets:
                tmp = d[(29, off)]
                start = (pos + prime) if off == 29 else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # now we go back to top tk1, so we need to increase pos by 1
        pos += 1
        cpos = const * pos
        # 30k + 1
        if tk1[pos]:
            prime = cpos + 1
            p.append(prime)
            lastadded = 1
            for off in offsets:
                tmp = d[(1, off)]
                start = (pos + prime) if off == 1 else (prime * (const * pos + tmp) )//const
                del_mult(tmptk[off], start, prime)
    # time to add remaining primes
    # if lastadded == 1, remove last element and start adding them from tk1
    # this way we don't need an "if" within the last while
    if lastadded == 1:
        p.pop()
    # now complete for every other possible prime
    while pos < len(tk1):
        cpos = const * pos
        if tk1[pos]: p.append(cpos + 1)
        if tk7[pos]: p.append(cpos + 7)
        if tk11[pos]: p.append(cpos + 11)
        if tk13[pos]: p.append(cpos + 13)
        if tk17[pos]: p.append(cpos + 17)
        if tk19[pos]: p.append(cpos + 19)
        if tk23[pos]: p.append(cpos + 23)
        if tk29[pos]: p.append(cpos + 29)
        pos += 1
    # remove exceeding if present
    pos = len(p) - 1
    while p[pos] > N:
        pos -= 1
    if pos < len(p) - 1:
        del p[pos+1:]
    # return p list
    return p

def sieveOfEratosthenes(n):
    """sieveOfEratosthenes(n): return the list of the primes < n."""
    # Code from: <dickinsm@gmail.com>, Nov 30 2006
    # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d
    if n <= 2:
        return []
    sieve = range(3, n, 2)
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si*si - 3) // 2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom - top) // si)
    return [2] + [el for el in sieve if el]

def sieveOfAtkin(end):
    """sieveOfAtkin(end): return a list of all the prime numbers <end
    using the Sieve of Atkin."""
    # Code by Steve Krenzel, <Sgk284@gmail.com>, improved
    # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83
    # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin
    assert end > 0
    lng = ((end-1) // 2)
    sieve = [False] * (lng + 1)

    x_max, x2, xd = int(sqrt((end-1)/4.0)), 0, 4
    for xd in xrange(4, 8*x_max + 2, 8):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            m = n % 12
            if m == 1 or m == 5:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, x2, xd = int(sqrt((end-1) / 3.0)), 0, 3
    for xd in xrange(3, 6 * x_max + 2, 6):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not(n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            if n % 12 == 7:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, y_min, x2, xd = int((2 + sqrt(4-8*(1-end)))/4), -1, 0, 3
    for x in xrange(1, x_max + 1):
        x2 += xd
        xd += 6
        if x2 >= end: y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1
        n, n_diff = ((x*x + x) << 1) - 1, (((x-1) << 1) - 2) << 1
        for d in xrange(n_diff, y_min, -8):
            if n % 12 == 11:
                m = n >> 1
                sieve[m] = not sieve[m]
            n += d

    primes = [2, 3]
    if end <= 3:
        return primes[:max(0,end-2)]

    for n in xrange(5 >> 1, (int(sqrt(end))+1) >> 1):
        if sieve[n]:
            primes.append((n << 1) + 1)
            aux = (n << 1) + 1
            aux *= aux
            for k in xrange(aux, end, 2 * aux):
                sieve[k >> 1] = False

    s  = int(sqrt(end)) + 1
    if s  % 2 == 0:
        s += 1
    primes.extend([i for i in xrange(s, end, 2) if sieve[i >> 1]])

    return primes

def ambi_sieve_plain(n):
    s = range(3, n, 2)
    for m in xrange(3, int(n**0.5)+1, 2): 
        if s[(m-3)/2]: 
            for t in xrange((m*m-3)/2,(n>>1)-1,m):
                s[t]=0
    return [2]+[t for t in s if t>0]

def sundaram3(max_n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279
    numbers = range(3, max_n+1, 2)
    half = (max_n)//2
    initial = 4

    for step in xrange(3, max_n+1, 2):
        for i in xrange(initial, half, step):
            numbers[i-1] = 0
        initial += 2*(step+1)

        if initial > half:
            return [2] + filter(None, numbers)

################################################################################
# Using Numpy:
def ambi_sieve(n):
    # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html
    s = np.arange(3, n, 2)
    for m in xrange(3, int(n ** 0.5)+1, 2): 
        if s[(m-3)/2]: 
            s[(m*m-3)/2::m]=0
    return np.r_[2, s[s>0]]

def primesfrom3to(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns a array of primes, p < n """
    assert n>=2
    sieve = np.ones(n/2, dtype=np.bool)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = False
    return np.r_[2, 2*np.nonzero(sieve)[0][1::]+1]    

def primesfrom2to(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = np.ones(n/3 + (n%6==2), dtype=np.bool)
    sieve[0] = False
    for i in xrange(int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[      ((k*k)/3)      ::2*k] = False
            sieve[(k*k+4*k-2*k*(i&1))/3::2*k] = False
    return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)]

if __name__=='__main__':
    import itertools
    import sys

    def test(f1,f2,num):
        print('Testing {f1} and {f2} return same results'.format(
            f1=f1.func_name,
            f2=f2.func_name))
        if not all([a==b for a,b in itertools.izip_longest(f1(num),f2(num))]):
            sys.exit("Error: %s(%s) != %s(%s)"%(f1.func_name,num,f2.func_name,num))

    n=1000000
    test(sieveOfAtkin,sieveOfEratosthenes,n)
    test(sieveOfAtkin,ambi_sieve,n)
    test(sieveOfAtkin,ambi_sieve_plain,n) 
    test(sieveOfAtkin,sundaram3,n)
    test(sieveOfAtkin,sieve_wheel_30,n)
    test(sieveOfAtkin,primesfrom3to,n)
    test(sieveOfAtkin,primesfrom2to,n)
    test(sieveOfAtkin,rwh_primes,n)
    test(sieveOfAtkin,rwh_primes1,n)         
    test(sieveOfAtkin,rwh_primes2,n)

运行脚本会测试所有实现都给出相同的结果。

Warning: timeit results may vary due to differences in hardware or version of Python.

Below is a script which compares a number of implementations:

Many thanks to stephan for bringing sieve_wheel_30 to my attention. Credit goes to Robert William Hanks for primesfrom2to, primesfrom3to, rwh_primes, rwh_primes1, and rwh_primes2.

Of the plain Python methods tested, with psyco, for n=1000000, rwh_primes1 was the fastest tested.

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| rwh_primes1         | 43.0  |
| sieveOfAtkin        | 46.4  |
| rwh_primes          | 57.4  |
| sieve_wheel_30      | 63.0  |
| rwh_primes2         | 67.8  |    
| sieveOfEratosthenes | 147.0 |
| ambi_sieve_plain    | 152.0 |
| sundaram3           | 194.0 |
+---------------------+-------+

Of the plain Python methods tested, without psyco, for n=1000000, rwh_primes2 was the fastest.

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| rwh_primes2         | 68.1  |
| rwh_primes1         | 93.7  |
| rwh_primes          | 94.6  |
| sieve_wheel_30      | 97.4  |
| sieveOfEratosthenes | 178.0 |
| ambi_sieve_plain    | 286.0 |
| sieveOfAtkin        | 314.0 |
| sundaram3           | 416.0 |
+---------------------+-------+

Of all the methods tested, allowing numpy, for n=1000000, primesfrom2to was the fastest tested.

+---------------------+-------+
| Method              | ms    |
+---------------------+-------+
| primesfrom2to       | 15.9  |
| primesfrom3to       | 18.4  |
| ambi_sieve          | 29.3  |
+---------------------+-------+

Timings were measured using the command:

python -mtimeit -s"import primes" "primes.{method}(1000000)"

with {method} replaced by each of the method names.

primes.py:

#!/usr/bin/env python
import psyco; psyco.full()
from math import sqrt, ceil
import numpy as np

def rwh_primes(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)/(2*i)+1)
    return [2] + [i for i in xrange(3,n,2) if sieve[i]]

def rwh_primes1(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n/2)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = [False] * ((n-i*i-1)/(2*i)+1)
    return [2] + [2*i+1 for i in xrange(1,n/2) if sieve[i]]

def rwh_primes2(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    correction = (n%6>1)
    n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
    sieve = [True] * (n/3)
    sieve[0] = False
    for i in xrange(int(n**0.5)/3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      ((k*k)/3)      ::2*k]=[False]*((n/6-(k*k)/6-1)/k+1)
        sieve[(k*k+4*k-2*k*(i&1))/3::2*k]=[False]*((n/6-(k*k+4*k-2*k*(i&1))/6-1)/k+1)
    return [2,3] + [3*i+1|1 for i in xrange(1,n/3-correction) if sieve[i]]

def sieve_wheel_30(N):
    # http://zerovolt.com/?p=88
    ''' Returns a list of primes <= N using wheel criterion 2*3*5 = 30

Copyright 2009 by zerovolt.com
This code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work.
If you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com.'''
    __smallp = ( 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,
    61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139,
    149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227,
    229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311,
    313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401,
    409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491,
    499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599,
    601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683,
    691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797,
    809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887,
    907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997)

    wheel = (2, 3, 5)
    const = 30
    if N < 2:
        return []
    if N <= const:
        pos = 0
        while __smallp[pos] <= N:
            pos += 1
        return list(__smallp[:pos])
    # make the offsets list
    offsets = (7, 11, 13, 17, 19, 23, 29, 1)
    # prepare the list
    p = [2, 3, 5]
    dim = 2 + N // const
    tk1  = [True] * dim
    tk7  = [True] * dim
    tk11 = [True] * dim
    tk13 = [True] * dim
    tk17 = [True] * dim
    tk19 = [True] * dim
    tk23 = [True] * dim
    tk29 = [True] * dim
    tk1[0] = False
    # help dictionary d
    # d[a , b] = c  ==> if I want to find the smallest useful multiple of (30*pos)+a
    # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b]
    # in general. If b < a, I need [(30*pos)+a][(30*(pos+1))+b]
    d = {}
    for x in offsets:
        for y in offsets:
            res = (x*y) % const
            if res in offsets:
                d[(x, res)] = y
    # another help dictionary: gives tkx calling tmptk[x]
    tmptk = {1:tk1, 7:tk7, 11:tk11, 13:tk13, 17:tk17, 19:tk19, 23:tk23, 29:tk29}
    pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N)))
    # inner functions definition
    def del_mult(tk, start, step):
        for k in xrange(start, len(tk), step):
            tk[k] = False
    # end of inner functions definition
    cpos = const * pos
    while prime < stop:
        # 30k + 7
        if tk7[pos]:
            prime = cpos + 7
            p.append(prime)
            lastadded = 7
            for off in offsets:
                tmp = d[(7, off)]
                start = (pos + prime) if off == 7 else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 11
        if tk11[pos]:
            prime = cpos + 11
            p.append(prime)
            lastadded = 11
            for off in offsets:
                tmp = d[(11, off)]
                start = (pos + prime) if off == 11 else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 13
        if tk13[pos]:
            prime = cpos + 13
            p.append(prime)
            lastadded = 13
            for off in offsets:
                tmp = d[(13, off)]
                start = (pos + prime) if off == 13 else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 17
        if tk17[pos]:
            prime = cpos + 17
            p.append(prime)
            lastadded = 17
            for off in offsets:
                tmp = d[(17, off)]
                start = (pos + prime) if off == 17 else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 19
        if tk19[pos]:
            prime = cpos + 19
            p.append(prime)
            lastadded = 19
            for off in offsets:
                tmp = d[(19, off)]
                start = (pos + prime) if off == 19 else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 23
        if tk23[pos]:
            prime = cpos + 23
            p.append(prime)
            lastadded = 23
            for off in offsets:
                tmp = d[(23, off)]
                start = (pos + prime) if off == 23 else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # 30k + 29
        if tk29[pos]:
            prime = cpos + 29
            p.append(prime)
            lastadded = 29
            for off in offsets:
                tmp = d[(29, off)]
                start = (pos + prime) if off == 29 else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp) )//const
                del_mult(tmptk[off], start, prime)
        # now we go back to top tk1, so we need to increase pos by 1
        pos += 1
        cpos = const * pos
        # 30k + 1
        if tk1[pos]:
            prime = cpos + 1
            p.append(prime)
            lastadded = 1
            for off in offsets:
                tmp = d[(1, off)]
                start = (pos + prime) if off == 1 else (prime * (const * pos + tmp) )//const
                del_mult(tmptk[off], start, prime)
    # time to add remaining primes
    # if lastadded == 1, remove last element and start adding them from tk1
    # this way we don't need an "if" within the last while
    if lastadded == 1:
        p.pop()
    # now complete for every other possible prime
    while pos < len(tk1):
        cpos = const * pos
        if tk1[pos]: p.append(cpos + 1)
        if tk7[pos]: p.append(cpos + 7)
        if tk11[pos]: p.append(cpos + 11)
        if tk13[pos]: p.append(cpos + 13)
        if tk17[pos]: p.append(cpos + 17)
        if tk19[pos]: p.append(cpos + 19)
        if tk23[pos]: p.append(cpos + 23)
        if tk29[pos]: p.append(cpos + 29)
        pos += 1
    # remove exceeding if present
    pos = len(p) - 1
    while p[pos] > N:
        pos -= 1
    if pos < len(p) - 1:
        del p[pos+1:]
    # return p list
    return p

def sieveOfEratosthenes(n):
    """sieveOfEratosthenes(n): return the list of the primes < n."""
    # Code from: <dickinsm@gmail.com>, Nov 30 2006
    # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d
    if n <= 2:
        return []
    sieve = range(3, n, 2)
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si*si - 3) // 2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom - top) // si)
    return [2] + [el for el in sieve if el]

def sieveOfAtkin(end):
    """sieveOfAtkin(end): return a list of all the prime numbers <end
    using the Sieve of Atkin."""
    # Code by Steve Krenzel, <Sgk284@gmail.com>, improved
    # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83
    # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin
    assert end > 0
    lng = ((end-1) // 2)
    sieve = [False] * (lng + 1)

    x_max, x2, xd = int(sqrt((end-1)/4.0)), 0, 4
    for xd in xrange(4, 8*x_max + 2, 8):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            m = n % 12
            if m == 1 or m == 5:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, x2, xd = int(sqrt((end-1) / 3.0)), 0, 3
    for xd in xrange(3, 6 * x_max + 2, 6):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not(n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            if n % 12 == 7:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, y_min, x2, xd = int((2 + sqrt(4-8*(1-end)))/4), -1, 0, 3
    for x in xrange(1, x_max + 1):
        x2 += xd
        xd += 6
        if x2 >= end: y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1
        n, n_diff = ((x*x + x) << 1) - 1, (((x-1) << 1) - 2) << 1
        for d in xrange(n_diff, y_min, -8):
            if n % 12 == 11:
                m = n >> 1
                sieve[m] = not sieve[m]
            n += d

    primes = [2, 3]
    if end <= 3:
        return primes[:max(0,end-2)]

    for n in xrange(5 >> 1, (int(sqrt(end))+1) >> 1):
        if sieve[n]:
            primes.append((n << 1) + 1)
            aux = (n << 1) + 1
            aux *= aux
            for k in xrange(aux, end, 2 * aux):
                sieve[k >> 1] = False

    s  = int(sqrt(end)) + 1
    if s  % 2 == 0:
        s += 1
    primes.extend([i for i in xrange(s, end, 2) if sieve[i >> 1]])

    return primes

def ambi_sieve_plain(n):
    s = range(3, n, 2)
    for m in xrange(3, int(n**0.5)+1, 2): 
        if s[(m-3)/2]: 
            for t in xrange((m*m-3)/2,(n>>1)-1,m):
                s[t]=0
    return [2]+[t for t in s if t>0]

def sundaram3(max_n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279
    numbers = range(3, max_n+1, 2)
    half = (max_n)//2
    initial = 4

    for step in xrange(3, max_n+1, 2):
        for i in xrange(initial, half, step):
            numbers[i-1] = 0
        initial += 2*(step+1)

        if initial > half:
            return [2] + filter(None, numbers)

################################################################################
# Using Numpy:
def ambi_sieve(n):
    # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html
    s = np.arange(3, n, 2)
    for m in xrange(3, int(n ** 0.5)+1, 2): 
        if s[(m-3)/2]: 
            s[(m*m-3)/2::m]=0
    return np.r_[2, s[s>0]]

def primesfrom3to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns a array of primes, p < n """
    assert n>=2
    sieve = np.ones(n/2, dtype=np.bool)
    for i in xrange(3,int(n**0.5)+1,2):
        if sieve[i/2]:
            sieve[i*i/2::i] = False
    return np.r_[2, 2*np.nonzero(sieve)[0][1::]+1]    

def primesfrom2to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = np.ones(n/3 + (n%6==2), dtype=np.bool)
    sieve[0] = False
    for i in xrange(int(n**0.5)/3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[      ((k*k)/3)      ::2*k] = False
            sieve[(k*k+4*k-2*k*(i&1))/3::2*k] = False
    return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)]

if __name__=='__main__':
    import itertools
    import sys

    def test(f1,f2,num):
        print('Testing {f1} and {f2} return same results'.format(
            f1=f1.func_name,
            f2=f2.func_name))
        if not all([a==b for a,b in itertools.izip_longest(f1(num),f2(num))]):
            sys.exit("Error: %s(%s) != %s(%s)"%(f1.func_name,num,f2.func_name,num))

    n=1000000
    test(sieveOfAtkin,sieveOfEratosthenes,n)
    test(sieveOfAtkin,ambi_sieve,n)
    test(sieveOfAtkin,ambi_sieve_plain,n) 
    test(sieveOfAtkin,sundaram3,n)
    test(sieveOfAtkin,sieve_wheel_30,n)
    test(sieveOfAtkin,primesfrom3to,n)
    test(sieveOfAtkin,primesfrom2to,n)
    test(sieveOfAtkin,rwh_primes,n)
    test(sieveOfAtkin,rwh_primes1,n)         
    test(sieveOfAtkin,rwh_primes2,n)

Running the script tests that all implementations give the same result.


回答 1

更快,更明智的纯Python代码:

def primes(n):
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)//(2*i)+1)
    return [2] + [i for i in range(3,n,2) if sieve[i]]

或从半筛开始

def primes1(n):
    """ Returns  a list of primes < n """
    sieve = [True] * (n//2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = [False] * ((n-i*i-1)//(2*i)+1)
    return [2] + [2*i+1 for i in range(1,n//2) if sieve[i]]

更快,更明智的内存numpy代码:

import numpy
def primesfrom3to(n):
    """ Returns a array of primes, 3 <= p < n """
    sieve = numpy.ones(n//2, dtype=numpy.bool)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = False
    return 2*numpy.nonzero(sieve)[0][1::]+1

从三分之一的筛子开始的更快的变化:

import numpy
def primesfrom2to(n):
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = numpy.ones(n//3 + (n%6==2), dtype=numpy.bool)
    for i in range(1,int(n**0.5)//3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[       k*k//3     ::2*k] = False
            sieve[k*(k-2*(i&1)+4)//3::2*k] = False
    return numpy.r_[2,3,((3*numpy.nonzero(sieve)[0][1:]+1)|1)]

上述代码的(难编码)纯python版本为:

def primes2(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    n, correction = n-n%6+6, 2-(n%6>1)
    sieve = [True] * (n//3)
    for i in range(1,int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      k*k//3      ::2*k] = [False] * ((n//6-k*k//6-1)//k+1)
        sieve[k*(k-2*(i&1)+4)//3::2*k] = [False] * ((n//6-k*(k-2*(i&1)+4)//6-1)//k+1)
    return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]]

不幸的是,纯python并没有采用更简单,更快速的numpy方式进行赋值,并且len()像in [False]*len(sieve[((k*k)//3)::2*k])中那样在循环内调用太慢了。因此,我不得不即兴改正输入(避免更多的数学运算),并做一些极端的(令人痛苦的)数学魔术。

我个人认为numpy(已被广泛使用)不是Python标准库的一部分,而Python开发人员似乎完全忽略了语法和速度方面的改进,这是一个遗憾。

Faster & more memory-wise pure Python code:

def primes(n):
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i]:
            sieve[i*i::2*i]=[False]*((n-i*i-1)//(2*i)+1)
    return [2] + [i for i in range(3,n,2) if sieve[i]]

or starting with half sieve

def primes1(n):
    """ Returns  a list of primes < n """
    sieve = [True] * (n//2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = [False] * ((n-i*i-1)//(2*i)+1)
    return [2] + [2*i+1 for i in range(1,n//2) if sieve[i]]

Faster & more memory-wise numpy code:

import numpy
def primesfrom3to(n):
    """ Returns a array of primes, 3 <= p < n """
    sieve = numpy.ones(n//2, dtype=numpy.bool)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = False
    return 2*numpy.nonzero(sieve)[0][1::]+1

a faster variation starting with a third of a sieve:

import numpy
def primesfrom2to(n):
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = numpy.ones(n//3 + (n%6==2), dtype=numpy.bool)
    for i in range(1,int(n**0.5)//3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[       k*k//3     ::2*k] = False
            sieve[k*(k-2*(i&1)+4)//3::2*k] = False
    return numpy.r_[2,3,((3*numpy.nonzero(sieve)[0][1:]+1)|1)]

A (hard-to-code) pure-python version of the above code would be:

def primes2(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    n, correction = n-n%6+6, 2-(n%6>1)
    sieve = [True] * (n//3)
    for i in range(1,int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      k*k//3      ::2*k] = [False] * ((n//6-k*k//6-1)//k+1)
        sieve[k*(k-2*(i&1)+4)//3::2*k] = [False] * ((n//6-k*(k-2*(i&1)+4)//6-1)//k+1)
    return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]]

Unfortunately pure-python don’t adopt the simpler and faster numpy way of doing assignment, and calling len() inside the loop as in [False]*len(sieve[((k*k)//3)::2*k]) is too slow. So I had to improvise to correct input (& avoid more math) and do some extreme (& painful) math-magic.

Personally I think it is a shame that numpy (which is so widely used) is not part of Python standard library, and that the improvements in syntax and speed seem to be completely overlooked by Python developers.


回答 2

有从Python食谱一个漂亮整洁的样品这里 -建议对URL最快的版本是:

import itertools
def erat2( ):
    D = {  }
    yield 2
    for q in itertools.islice(itertools.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = p + q
            while x in D or not (x&1):
                x += p
            D[x] = p

这样就可以

def get_primes_erat(n):
  return list(itertools.takewhile(lambda p: p<n, erat2()))

使用pri.py中的这段代码在shell提示符下(我喜欢这样做)进行测量,我观察到:

$ python2.5 -mtimeit -s'import pri' 'pri.get_primes(1000000)'
10 loops, best of 3: 1.69 sec per loop
$ python2.5 -mtimeit -s'import pri' 'pri.get_primes_erat(1000000)'
10 loops, best of 3: 673 msec per loop

因此,看起来Cookbook解决方案的速度是以前的两倍。

There’s a pretty neat sample from the Python Cookbook here — the fastest version proposed on that URL is:

import itertools
def erat2( ):
    D = {  }
    yield 2
    for q in itertools.islice(itertools.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = p + q
            while x in D or not (x&1):
                x += p
            D[x] = p

so that would give

def get_primes_erat(n):
  return list(itertools.takewhile(lambda p: p<n, erat2()))

Measuring at the shell prompt (as I prefer to do) with this code in pri.py, I observe:

$ python2.5 -mtimeit -s'import pri' 'pri.get_primes(1000000)'
10 loops, best of 3: 1.69 sec per loop
$ python2.5 -mtimeit -s'import pri' 'pri.get_primes_erat(1000000)'
10 loops, best of 3: 673 msec per loop

so it looks like the Cookbook solution is over twice as fast.


回答 3

我认为使用Sundaram的Sieve打破了纯Python的记录:

def sundaram3(max_n):
    numbers = range(3, max_n+1, 2)
    half = (max_n)//2
    initial = 4

    for step in xrange(3, max_n+1, 2):
        for i in xrange(initial, half, step):
            numbers[i-1] = 0
        initial += 2*(step+1)

        if initial > half:
            return [2] + filter(None, numbers)

对比:

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.get_primes_erat(1000000)"
10 loops, best of 3: 710 msec per loop

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.daniel_sieve_2(1000000)"
10 loops, best of 3: 435 msec per loop

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.sundaram3(1000000)"
10 loops, best of 3: 327 msec per loop

Using Sundaram’s Sieve, I think I broke pure-Python’s record:

def sundaram3(max_n):
    numbers = range(3, max_n+1, 2)
    half = (max_n)//2
    initial = 4

    for step in xrange(3, max_n+1, 2):
        for i in xrange(initial, half, step):
            numbers[i-1] = 0
        initial += 2*(step+1)

        if initial > half:
            return [2] + filter(None, numbers)

Comparasion:

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.get_primes_erat(1000000)"
10 loops, best of 3: 710 msec per loop

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.daniel_sieve_2(1000000)"
10 loops, best of 3: 435 msec per loop

C:\USERS>python -m timeit -n10 -s "import get_primes" "get_primes.sundaram3(1000000)"
10 loops, best of 3: 327 msec per loop

回答 4

该算法速度很快,但存在严重缺陷:

>>> sorted(get_primes(530))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73,
79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163,
167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251,
257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349,
353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443,
449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 527, 529]
>>> 17*31
527
>>> 23*23
529

您假设这numbers.pop()将返回集合中的最小数字,但是完全不能保证。集是无序的,并且pop()删除并返回任意元素,因此不能用于从其余数字中选择下一个素数。

The algorithm is fast, but it has a serious flaw:

>>> sorted(get_primes(530))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73,
79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163,
167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251,
257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349,
353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443,
449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 527, 529]
>>> 17*31
527
>>> 23*23
529

You assume that numbers.pop() would return the smallest number in the set, but this is not guaranteed at all. Sets are unordered and pop() removes and returns an arbitrary element, so it cannot be used to select the next prime from the remaining numbers.


回答 5

对于具有足够大N的真正最快的解决方案,将是下载预先计算的素数列表,将其存储为元组,然后执行以下操作:

for pos,i in enumerate(primes):
    if i > N:
        print primes[:pos]

如果N > primes[-1] 只是这样,则可以计算更多的质数并将新列表保存在您的代码中,因此下一次同样快。

始终在框外思考。

For truly fastest solution with sufficiently large N would be to download a pre-calculated list of primes, store it as a tuple and do something like:

for pos,i in enumerate(primes):
    if i > N:
        print primes[:pos]

If N > primes[-1] only then calculate more primes and save the new list in your code, so next time it is equally as fast.

Always think outside the box.


回答 6

如果您不想重新发明轮子,可以安装符号数学库sympy(是的,它与Python 3兼容)

pip install sympy

并使用primerange功能

from sympy import sieve
primes = list(sieve.primerange(1, 10**6))

If you don’t want to reinvent the wheel, you can install the symbolic maths library sympy (yes it’s Python 3 compatible)

pip install sympy

And use the primerange function

from sympy import sieve
primes = list(sieve.primerange(1, 10**6))

回答 7

如果您接受itertools但不接受numpy,则这是针对Python 3的rwh_primes2的改编版,其在我的计算机上的运行速度约为以前的两倍。唯一的实质性更改是使用字节数组而不是布尔值列表,并使用compress而不是列表推导来构建最终列表。(如果可以的话,我将其添加为类似moarningsun的评论。)

import itertools
izip = itertools.zip_longest
chain = itertools.chain.from_iterable
compress = itertools.compress
def rwh_primes2_python3(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    zero = bytearray([False])
    size = n//3 + (n % 6 == 2)
    sieve = bytearray([True]) * size
    sieve[0] = False
    for i in range(int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        start = (k*k+4*k-2*k*(i&1))//3
        sieve[(k*k)//3::2*k]=zero*((size - (k*k)//3 - 1) // (2 * k) + 1)
        sieve[  start ::2*k]=zero*((size -   start  - 1) // (2 * k) + 1)
    ans = [2,3]
    poss = chain(izip(*[range(i, n, 6) for i in (1,5)]))
    ans.extend(compress(poss, sieve))
    return ans

比较:

>>> timeit.timeit('primes.rwh_primes2(10**6)', setup='import primes', number=1)
0.0652179726976101
>>> timeit.timeit('primes.rwh_primes2_python3(10**6)', setup='import primes', number=1)
0.03267321276325674

>>> timeit.timeit('primes.rwh_primes2(10**8)', setup='import primes', number=1)
6.394284538007014
>>> timeit.timeit('primes.rwh_primes2_python3(10**8)', setup='import primes', number=1)
3.833829450302801

If you accept itertools but not numpy, here is an adaptation of rwh_primes2 for Python 3 that runs about twice as fast on my machine. The only substantial change is using a bytearray instead of a list for the boolean, and using compress instead of a list comprehension to build the final list. (I’d add this as a comment like moarningsun if I were able.)

import itertools
izip = itertools.zip_longest
chain = itertools.chain.from_iterable
compress = itertools.compress
def rwh_primes2_python3(n):
    """ Input n>=6, Returns a list of primes, 2 <= p < n """
    zero = bytearray([False])
    size = n//3 + (n % 6 == 2)
    sieve = bytearray([True]) * size
    sieve[0] = False
    for i in range(int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        start = (k*k+4*k-2*k*(i&1))//3
        sieve[(k*k)//3::2*k]=zero*((size - (k*k)//3 - 1) // (2 * k) + 1)
        sieve[  start ::2*k]=zero*((size -   start  - 1) // (2 * k) + 1)
    ans = [2,3]
    poss = chain(izip(*[range(i, n, 6) for i in (1,5)]))
    ans.extend(compress(poss, sieve))
    return ans

Comparisons:

>>> timeit.timeit('primes.rwh_primes2(10**6)', setup='import primes', number=1)
0.0652179726976101
>>> timeit.timeit('primes.rwh_primes2_python3(10**6)', setup='import primes', number=1)
0.03267321276325674

and

>>> timeit.timeit('primes.rwh_primes2(10**8)', setup='import primes', number=1)
6.394284538007014
>>> timeit.timeit('primes.rwh_primes2_python3(10**8)', setup='import primes', number=1)
3.833829450302801

回答 8

编写自己的主要发现代码很有启发性,但是手头有一个快速可靠的库也很有用。我围绕C ++库primesieve编写了一个包装器,将其命名为primesieve-python

尝试一下 pip install primesieve

import primesieve
primes = primesieve.generate_primes(10**8)

我很好奇看到速度比较。

It’s instructive to write your own prime finding code, but it’s also useful to have a fast reliable library at hand. I wrote a wrapper around the C++ library primesieve, named it primesieve-python

Try it pip install primesieve

import primesieve
primes = primesieve.generate_primes(10**8)

I’d be curious to see the speed compared.


回答 9

这是最快的功能之一的两个更新版本(纯Python 3.6),

from itertools import compress

def rwh_primes1v1(n):
    """ Returns  a list of primes < n for n > 2 """
    sieve = bytearray([True]) * (n//2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = bytearray((n-i*i-1)//(2*i)+1)
    return [2,*compress(range(3,n,2), sieve[1:])]

def rwh_primes1v2(n):
    """ Returns a list of primes < n for n > 2 """
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return [2,*compress(range(3,n,2), sieve[1:])]

Here is two updated (pure Python 3.6) versions of one of the fastest functions,

from itertools import compress

def rwh_primes1v1(n):
    """ Returns  a list of primes < n for n > 2 """
    sieve = bytearray([True]) * (n//2)
    for i in range(3,int(n**0.5)+1,2):
        if sieve[i//2]:
            sieve[i*i//2::i] = bytearray((n-i*i-1)//(2*i)+1)
    return [2,*compress(range(3,n,2), sieve[1:])]

def rwh_primes1v2(n):
    """ Returns a list of primes < n for n > 2 """
    sieve = bytearray([True]) * (n//2+1)
    for i in range(1,int(n**0.5)//2+1):
        if sieve[i]:
            sieve[2*i*(i+1)::2*i+1] = bytearray((n//2-2*i*(i+1))//(2*i+1)+1)
    return [2,*compress(range(3,n,2), sieve[1:])]

回答 10

在N <9,080,191的假设下确定性实施Miller-Rabin素数检验

import sys
import random

def miller_rabin_pass(a, n):
    d = n - 1
    s = 0
    while d % 2 == 0:
        d >>= 1
        s += 1

    a_to_power = pow(a, d, n)
    if a_to_power == 1:
        return True
    for i in xrange(s-1):
        if a_to_power == n - 1:
            return True
        a_to_power = (a_to_power * a_to_power) % n
    return a_to_power == n - 1


def miller_rabin(n):
    for a in [2, 3, 37, 73]:
      if not miller_rabin_pass(a, n):
        return False
    return True


n = int(sys.argv[1])
primes = [2]
for p in range(3,n,2):
  if miller_rabin(p):
    primes.append(p)
print len(primes)

根据Wikipedia上的文章(http://en.wikipedia.org/wiki/Miller–Rabin_primality_test),测试N <9,080,191的a = 2,3,37和73足以确定N是否为复合值。

然后,我从此处找到的原始Miller-Rabin测试的概率实现中改编了源代码:http : //en.literateprograms.org/Miller-Rabin_primality_test_(Python)

A deterministic implementation of Miller-Rabin’s Primality test on the assumption that N < 9,080,191

import sys
import random

def miller_rabin_pass(a, n):
    d = n - 1
    s = 0
    while d % 2 == 0:
        d >>= 1
        s += 1

    a_to_power = pow(a, d, n)
    if a_to_power == 1:
        return True
    for i in xrange(s-1):
        if a_to_power == n - 1:
            return True
        a_to_power = (a_to_power * a_to_power) % n
    return a_to_power == n - 1


def miller_rabin(n):
    for a in [2, 3, 37, 73]:
      if not miller_rabin_pass(a, n):
        return False
    return True


n = int(sys.argv[1])
primes = [2]
for p in range(3,n,2):
  if miller_rabin(p):
    primes.append(p)
print len(primes)

According to the article on Wikipedia (http://en.wikipedia.org/wiki/Miller–Rabin_primality_test) testing N < 9,080,191 for a = 2,3,37, and 73 is enough to decide whether N is composite or not.

And I adapted the source code from the probabilistic implementation of original Miller-Rabin’s test found here: http://en.literateprograms.org/Miller-Rabin_primality_test_(Python)


回答 11

如果您可以控制N,则列出所有素数的最快方法是预先计算它们。说真的 预计算是一种被忽略的优化方法。

If you have control over N, the very fastest way to list all primes is to precompute them. Seriously. Precomputing is a way overlooked optimization.


回答 12

这是我通常用于在Python中生成素数的代码:

$ python -mtimeit -s'import sieve' 'sieve.sieve(1000000)' 
10 loops, best of 3: 445 msec per loop
$ cat sieve.py
from math import sqrt

def sieve(size):
 prime=[True]*size
 rng=xrange
 limit=int(sqrt(size))

 for i in rng(3,limit+1,+2):
  if prime[i]:
   prime[i*i::+i]=[False]*len(prime[i*i::+i])

 return [2]+[i for i in rng(3,size,+2) if prime[i]]

if __name__=='__main__':
 print sieve(100)

它无法与此处发布的更快的解决方案竞争,但至少它是纯python。

感谢您发布此问题。今天我真的学到了很多东西。

Here’s the code I normally use to generate primes in Python:

$ python -mtimeit -s'import sieve' 'sieve.sieve(1000000)' 
10 loops, best of 3: 445 msec per loop
$ cat sieve.py
from math import sqrt

def sieve(size):
 prime=[True]*size
 rng=xrange
 limit=int(sqrt(size))

 for i in rng(3,limit+1,+2):
  if prime[i]:
   prime[i*i::+i]=[False]*len(prime[i*i::+i])

 return [2]+[i for i in rng(3,size,+2) if prime[i]]

if __name__=='__main__':
 print sieve(100)

It can’t compete with the faster solutions posted here, but at least it is pure python.

Thanks for posting this question. I really learnt a lot today.


回答 13

对于最快的代码,numpy解决方案是最好的。不过,出于纯粹的学术原因,我要发布我的纯python版本,该版本比上面发布的食谱版本快50%。由于我将整个列表存储在内存中,因此您需要足够的空间来容纳所有内容,但它似乎可以很好地扩展。

def daniel_sieve_2(maxNumber):
    """
    Given a number, returns all numbers less than or equal to
    that number which are prime.
    """
    allNumbers = range(3, maxNumber+1, 2)
    for mIndex, number in enumerate(xrange(3, maxNumber+1, 2)):
        if allNumbers[mIndex] == 0:
            continue
        # now set all multiples to 0
        for index in xrange(mIndex+number, (maxNumber-3)/2+1, number):
            allNumbers[index] = 0
    return [2] + filter(lambda n: n!=0, allNumbers)

结果:

>>>mine = timeit.Timer("daniel_sieve_2(1000000)",
...                    "from sieves import daniel_sieve_2")
>>>prev = timeit.Timer("get_primes_erat(1000000)",
...                    "from sieves import get_primes_erat")
>>>print "Mine: {0:0.4f} ms".format(min(mine.repeat(3, 1))*1000)
Mine: 428.9446 ms
>>>print "Previous Best {0:0.4f} ms".format(min(prev.repeat(3, 1))*1000)
Previous Best 621.3581 ms

For the fastest code, the numpy solution is the best. For purely academic reasons, though, I’m posting my pure python version, which is a bit less than 50% faster than the cookbook version posted above. Since I make the entire list in memory, you need enough space to hold everything, but it seems to scale fairly well.

def daniel_sieve_2(maxNumber):
    """
    Given a number, returns all numbers less than or equal to
    that number which are prime.
    """
    allNumbers = range(3, maxNumber+1, 2)
    for mIndex, number in enumerate(xrange(3, maxNumber+1, 2)):
        if allNumbers[mIndex] == 0:
            continue
        # now set all multiples to 0
        for index in xrange(mIndex+number, (maxNumber-3)/2+1, number):
            allNumbers[index] = 0
    return [2] + filter(lambda n: n!=0, allNumbers)

And the results:

>>>mine = timeit.Timer("daniel_sieve_2(1000000)",
...                    "from sieves import daniel_sieve_2")
>>>prev = timeit.Timer("get_primes_erat(1000000)",
...                    "from sieves import get_primes_erat")
>>>print "Mine: {0:0.4f} ms".format(min(mine.repeat(3, 1))*1000)
Mine: 428.9446 ms
>>>print "Previous Best {0:0.4f} ms".format(min(prev.repeat(3, 1))*1000)
Previous Best 621.3581 ms

回答 14

使用Numpy的半筛的实现略有不同:

http://rebrained.com/?p=458

导入数学
导入numpy
def prime6(最高):
    primes = numpy.arange(3,最高+1,2)
    isprime = numpy.ones((最多1)/ 2,dtype = bool)
    对于素数[:int(math.sqrt(upto))]:
        如果isprime [(factor-2)/ 2]:isprime [(factor * 3-2)/ 2:(upto-1)/ 2:factor] = 0
    返回numpy.insert(primes [isprime],0,2)

有人可以将此与其他时间进行比较吗?在我的机器上,它看起来可以与其他Numpy半筛媲美。

A slightly different implementation of a half sieve using Numpy:

http://rebrained.com/?p=458

import math
import numpy
def prime6(upto):
    primes=numpy.arange(3,upto+1,2)
    isprime=numpy.ones((upto-1)/2,dtype=bool)
    for factor in primes[:int(math.sqrt(upto))]:
        if isprime[(factor-2)/2]: isprime[(factor*3-2)/2:(upto-1)/2:factor]=0
    return numpy.insert(primes[isprime],0,2)

Can someone compare this with the other timings? On my machine it seems pretty comparable to the other Numpy half-sieve.


回答 15

全部都经过编写和测试。因此,无需重新发明轮子。

python -m timeit -r10 -s"from sympy import sieve" "primes = list(sieve.primerange(1, 10**6))"

给我们打破了12.2毫秒的记录!

10 loops, best of 10: 12.2 msec per loop

如果这还不够快,您可以尝试PyPy:

pypy -m timeit -r10 -s"from sympy import sieve" "primes = list(sieve.primerange(1, 10**6))"

结果是:

10 loops, best of 10: 2.03 msec per loop

247次投票的答案列出了15.9毫秒的最佳解决方案。比较一下!!!

It’s all written and tested. So there is no need to reinvent the wheel.

python -m timeit -r10 -s"from sympy import sieve" "primes = list(sieve.primerange(1, 10**6))"

gives us a record breaking 12.2 msec!

10 loops, best of 10: 12.2 msec per loop

If this is not fast enough, you can try PyPy:

pypy -m timeit -r10 -s"from sympy import sieve" "primes = list(sieve.primerange(1, 10**6))"

which results in:

10 loops, best of 10: 2.03 msec per loop

The answer with 247 up-votes lists 15.9 ms for the best solution. Compare this!!!


回答 16

我测试了一些unutbu的函数,用饥饿的数百万来计算

获奖者是使用numpy库的函数,

注意:进行内存利用率测试也很有趣:)

计算时间结果

样例代码

我的github存储库上的完整代码

#!/usr/bin/env python

import lib
import timeit
import sys
import math
import datetime

import prettyplotlib as ppl
import numpy as np

import matplotlib.pyplot as plt
from prettyplotlib import brewer2mpl

primenumbers_gen = [
    'sieveOfEratosthenes',
    'ambi_sieve',
    'ambi_sieve_plain',
    'sundaram3',
    'sieve_wheel_30',
    'primesfrom3to',
    'primesfrom2to',
    'rwh_primes',
    'rwh_primes1',
    'rwh_primes2',
]

def human_format(num):
    # /programming/579310/formatting-long-numbers-as-strings-in-python?answertab=active#tab-top
    magnitude = 0
    while abs(num) >= 1000:
        magnitude += 1
        num /= 1000.0
    # add more suffixes if you need them
    return '%.2f%s' % (num, ['', 'K', 'M', 'G', 'T', 'P'][magnitude])


if __name__=='__main__':

    # Vars
    n = 10000000 # number itereration generator
    nbcol = 5 # For decompose prime number generator
    nb_benchloop = 3 # Eliminate false positive value during the test (bench average time)
    datetimeformat = '%Y-%m-%d %H:%M:%S.%f'
    config = 'from __main__ import n; import lib'
    primenumbers_gen = {
        'sieveOfEratosthenes': {'color': 'b'},
        'ambi_sieve': {'color': 'b'},
        'ambi_sieve_plain': {'color': 'b'},
         'sundaram3': {'color': 'b'},
        'sieve_wheel_30': {'color': 'b'},
# # #        'primesfrom2to': {'color': 'b'},
        'primesfrom3to': {'color': 'b'},
        # 'rwh_primes': {'color': 'b'},
        # 'rwh_primes1': {'color': 'b'},
        'rwh_primes2': {'color': 'b'},
    }


    # Get n in command line
    if len(sys.argv)>1:
        n = int(sys.argv[1])

    step = int(math.ceil(n / float(nbcol)))
    nbs = np.array([i * step for i in range(1, int(nbcol) + 1)])
    set2 = brewer2mpl.get_map('Paired', 'qualitative', 12).mpl_colors

    print datetime.datetime.now().strftime(datetimeformat)
    print("Compute prime number to %(n)s" % locals())
    print("")

    results = dict()
    for pgen in primenumbers_gen:
        results[pgen] = dict()
        benchtimes = list()
        for n in nbs:
            t = timeit.Timer("lib.%(pgen)s(n)" % locals(), setup=config)
            execute_times = t.repeat(repeat=nb_benchloop,number=1)
            benchtime = np.mean(execute_times)
            benchtimes.append(benchtime)
        results[pgen] = {'benchtimes':np.array(benchtimes)}

fig, ax = plt.subplots(1)
plt.ylabel('Computation time (in second)')
plt.xlabel('Numbers computed')
i = 0
for pgen in primenumbers_gen:

    bench = results[pgen]['benchtimes']
    avgs = np.divide(bench,nbs)
    avg = np.average(bench, weights=nbs)

    # Compute linear regression
    A = np.vstack([nbs, np.ones(len(nbs))]).T
    a, b = np.linalg.lstsq(A, nbs*avgs)[0]

    # Plot
    i += 1
    #label="%(pgen)s" % locals()
    #ppl.plot(nbs, nbs*avgs, label=label, lw=1, linestyle='--', color=set2[i % 12])
    label="%(pgen)s avg" % locals()
    ppl.plot(nbs, a * nbs + b, label=label, lw=2, color=set2[i % 12])
print datetime.datetime.now().strftime(datetimeformat)

ppl.legend(ax, loc='upper left', ncol=4)

# Change x axis label
ax.get_xaxis().get_major_formatter().set_scientific(False)
fig.canvas.draw()
labels = [human_format(int(item.get_text())) for item in ax.get_xticklabels()]

ax.set_xticklabels(labels)
ax = plt.gca()

plt.show()

I tested some unutbu’s functions, i computed it with hungred millions number

The winners are the functions that use numpy library,

Note: It would also interesting make a memory utilization test :)

Computation time result

Sample code

Complete code on my github repository

#!/usr/bin/env python

import lib
import timeit
import sys
import math
import datetime

import prettyplotlib as ppl
import numpy as np

import matplotlib.pyplot as plt
from prettyplotlib import brewer2mpl

primenumbers_gen = [
    'sieveOfEratosthenes',
    'ambi_sieve',
    'ambi_sieve_plain',
    'sundaram3',
    'sieve_wheel_30',
    'primesfrom3to',
    'primesfrom2to',
    'rwh_primes',
    'rwh_primes1',
    'rwh_primes2',
]

def human_format(num):
    # https://stackoverflow.com/questions/579310/formatting-long-numbers-as-strings-in-python?answertab=active#tab-top
    magnitude = 0
    while abs(num) >= 1000:
        magnitude += 1
        num /= 1000.0
    # add more suffixes if you need them
    return '%.2f%s' % (num, ['', 'K', 'M', 'G', 'T', 'P'][magnitude])


if __name__=='__main__':

    # Vars
    n = 10000000 # number itereration generator
    nbcol = 5 # For decompose prime number generator
    nb_benchloop = 3 # Eliminate false positive value during the test (bench average time)
    datetimeformat = '%Y-%m-%d %H:%M:%S.%f'
    config = 'from __main__ import n; import lib'
    primenumbers_gen = {
        'sieveOfEratosthenes': {'color': 'b'},
        'ambi_sieve': {'color': 'b'},
        'ambi_sieve_plain': {'color': 'b'},
         'sundaram3': {'color': 'b'},
        'sieve_wheel_30': {'color': 'b'},
# # #        'primesfrom2to': {'color': 'b'},
        'primesfrom3to': {'color': 'b'},
        # 'rwh_primes': {'color': 'b'},
        # 'rwh_primes1': {'color': 'b'},
        'rwh_primes2': {'color': 'b'},
    }


    # Get n in command line
    if len(sys.argv)>1:
        n = int(sys.argv[1])

    step = int(math.ceil(n / float(nbcol)))
    nbs = np.array([i * step for i in range(1, int(nbcol) + 1)])
    set2 = brewer2mpl.get_map('Paired', 'qualitative', 12).mpl_colors

    print datetime.datetime.now().strftime(datetimeformat)
    print("Compute prime number to %(n)s" % locals())
    print("")

    results = dict()
    for pgen in primenumbers_gen:
        results[pgen] = dict()
        benchtimes = list()
        for n in nbs:
            t = timeit.Timer("lib.%(pgen)s(n)" % locals(), setup=config)
            execute_times = t.repeat(repeat=nb_benchloop,number=1)
            benchtime = np.mean(execute_times)
            benchtimes.append(benchtime)
        results[pgen] = {'benchtimes':np.array(benchtimes)}

fig, ax = plt.subplots(1)
plt.ylabel('Computation time (in second)')
plt.xlabel('Numbers computed')
i = 0
for pgen in primenumbers_gen:

    bench = results[pgen]['benchtimes']
    avgs = np.divide(bench,nbs)
    avg = np.average(bench, weights=nbs)

    # Compute linear regression
    A = np.vstack([nbs, np.ones(len(nbs))]).T
    a, b = np.linalg.lstsq(A, nbs*avgs)[0]

    # Plot
    i += 1
    #label="%(pgen)s" % locals()
    #ppl.plot(nbs, nbs*avgs, label=label, lw=1, linestyle='--', color=set2[i % 12])
    label="%(pgen)s avg" % locals()
    ppl.plot(nbs, a * nbs + b, label=label, lw=2, color=set2[i % 12])
print datetime.datetime.now().strftime(datetimeformat)

ppl.legend(ax, loc='upper left', ncol=4)

# Change x axis label
ax.get_xaxis().get_major_formatter().set_scientific(False)
fig.canvas.draw()
labels = [human_format(int(item.get_text())) for item in ax.get_xticklabels()]

ax.set_xticklabels(labels)
ax = plt.gca()

plt.show()

回答 17

对于Python 3

def rwh_primes2(n):
    correction = (n%6>1)
    n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
    sieve = [True] * (n//3)
    sieve[0] = False
    for i in range(int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      ((k*k)//3)      ::2*k]=[False]*((n//6-(k*k)//6-1)//k+1)
        sieve[(k*k+4*k-2*k*(i&1))//3::2*k]=[False]*((n//6-(k*k+4*k-2*k*(i&1))//6-1)//k+1)
    return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]]

For Python 3

def rwh_primes2(n):
    correction = (n%6>1)
    n = {0:n,1:n-1,2:n+4,3:n+3,4:n+2,5:n+1}[n%6]
    sieve = [True] * (n//3)
    sieve[0] = False
    for i in range(int(n**0.5)//3+1):
      if sieve[i]:
        k=3*i+1|1
        sieve[      ((k*k)//3)      ::2*k]=[False]*((n//6-(k*k)//6-1)//k+1)
        sieve[(k*k+4*k-2*k*(i&1))//3::2*k]=[False]*((n//6-(k*k+4*k-2*k*(i&1))//6-1)//k+1)
    return [2,3] + [3*i+1|1 for i in range(1,n//3-correction) if sieve[i]]

回答 18

纯Python中最快的基本筛

from itertools import compress

def half_sieve(n):
    """
    Returns a list of prime numbers less than `n`.
    """
    if n <= 2:
        return []
    sieve = bytearray([True]) * (n // 2)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = bytearray((n - i * i - 1) // (2 * i) + 1)
    primes = list(compress(range(1, n, 2), sieve))
    primes[0] = 2
    return primes

我优化了Eratosthenes筛的速度和内存。

基准测试

from time import clock
import platform

def benchmark(iterations, limit):
    start = clock()
    for x in range(iterations):
        half_sieve(limit)
    end = clock() - start
    print(f'{end/iterations:.4f} seconds for primes < {limit}')

if __name__ == '__main__':
    print(platform.python_version())
    print(platform.platform())
    print(platform.processor())
    it = 10
    for pw in range(4, 9):
        benchmark(it, 10**pw)

输出量

>>> 3.6.7
>>> Windows-10-10.0.17763-SP0
>>> Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
>>> 0.0003 seconds for primes < 10000
>>> 0.0021 seconds for primes < 100000
>>> 0.0204 seconds for primes < 1000000
>>> 0.2389 seconds for primes < 10000000
>>> 2.6702 seconds for primes < 100000000

Fastest prime sieve in Pure Python:

from itertools import compress

def half_sieve(n):
    """
    Returns a list of prime numbers less than `n`.
    """
    if n <= 2:
        return []
    sieve = bytearray([True]) * (n // 2)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = bytearray((n - i * i - 1) // (2 * i) + 1)
    primes = list(compress(range(1, n, 2), sieve))
    primes[0] = 2
    return primes

I optimised Sieve of Eratosthenes for speed and memory.

Benchmark

from time import clock
import platform

def benchmark(iterations, limit):
    start = clock()
    for x in range(iterations):
        half_sieve(limit)
    end = clock() - start
    print(f'{end/iterations:.4f} seconds for primes < {limit}')

if __name__ == '__main__':
    print(platform.python_version())
    print(platform.platform())
    print(platform.processor())
    it = 10
    for pw in range(4, 9):
        benchmark(it, 10**pw)

Output

>>> 3.6.7
>>> Windows-10-10.0.17763-SP0
>>> Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
>>> 0.0003 seconds for primes < 10000
>>> 0.0021 seconds for primes < 100000
>>> 0.0204 seconds for primes < 1000000
>>> 0.2389 seconds for primes < 10000000
>>> 2.6702 seconds for primes < 100000000

回答 19

第一次使用python,因此我在其中使用的某些方法似乎有点麻烦。我只是将我的c ++代码直接转换为python,这就是我所拥有的(尽管在python中有点slowww)

#!/usr/bin/env python
import time

def GetPrimes(n):

    Sieve = [1 for x in xrange(n)]

    Done = False
    w = 3

    while not Done:

        for q in xrange (3, n, 2):
            Prod = w*q
            if Prod < n:
                Sieve[Prod] = 0
            else:
                break

        if w > (n/2):
            Done = True
        w += 2

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

pythonw Primes.py

在12.799119秒内找到664579质数!

#!/usr/bin/env python
import time

def GetPrimes2(n):

    Sieve = [1 for x in xrange(n)]

    for q in xrange (3, n, 2):
        k = q
        for y in xrange(k*3, n, k*2):
            Sieve[y] = 0

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes2(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

pythonw Primes2.py

在10.230172秒内找到664579质数!

#!/usr/bin/env python
import time

def GetPrimes3(n):

    Sieve = [1 for x in xrange(n)]

    for q in xrange (3, n, 2):
        k = q
        for y in xrange(k*k, n, k << 1):
            Sieve[y] = 0

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes3(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

Python Primes2.py

在7.113776秒内发现664579质数!

First time using python, so some of the methods I use in this might seem a bit cumbersome. I just straight converted my c++ code to python and this is what I have (albeit a tad bit slowww in python)

#!/usr/bin/env python
import time

def GetPrimes(n):

    Sieve = [1 for x in xrange(n)]

    Done = False
    w = 3

    while not Done:

        for q in xrange (3, n, 2):
            Prod = w*q
            if Prod < n:
                Sieve[Prod] = 0
            else:
                break

        if w > (n/2):
            Done = True
        w += 2

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

pythonw Primes.py

Found 664579 primes in 12.799119 seconds!

#!/usr/bin/env python
import time

def GetPrimes2(n):

    Sieve = [1 for x in xrange(n)]

    for q in xrange (3, n, 2):
        k = q
        for y in xrange(k*3, n, k*2):
            Sieve[y] = 0

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes2(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

pythonw Primes2.py

Found 664579 primes in 10.230172 seconds!

#!/usr/bin/env python
import time

def GetPrimes3(n):

    Sieve = [1 for x in xrange(n)]

    for q in xrange (3, n, 2):
        k = q
        for y in xrange(k*k, n, k << 1):
            Sieve[y] = 0

    return Sieve



start = time.clock()

d = 10000000
Primes = GetPrimes3(d)

count = 1 #This is for 2

for x in xrange (3, d, 2):
    if Primes[x]:
        count+=1

elapsed = (time.clock() - start)
print "\nFound", count, "primes in", elapsed, "seconds!\n"

python Primes2.py

Found 664579 primes in 7.113776 seconds!


回答 20

我知道比赛已经结束了几年。…

尽管如此,这是我对纯python主筛的建议,它基于在处理向前的筛子时通过使用适当的步骤来省略2、3和5的倍数。但是,对于N <10 ^ 9而言,它实际上要比@Robert William Hanks高级解决方案rwh_primes2和rwh_primes1慢。通过使用高于1.5 * 10 ^ 8的ctypes.c_ushort筛子数组,可以以某种方式适应内存限制。

10 ^ 6

$ python -mtimeit -s“导入primeSieveSpeedComp”“ primeSieveSpeedComp.primeSieveSeq(1000000)” 10个循环,每个循环最好3:46.7毫秒

比较:$ python -mtimeit -s“导入primeSieveSpeedComp”“ primeSieveSpeedComp.rwh_primes1(1000000)” 10个循环,最好是3个循环:每个循环要进行43.2毫秒比较:$ python -m timeit -s“ import primeSieveSpeedComp”“ primeSieveSpeedComp.rwh_primes2 (1000000)“ 10个循环,最好为3:每个循环34.5毫秒

10 ^ 7

$ python -mtimeit -s“导入primeSieveSpeedComp”“ primeSieveSpeedComp.primeSieveSeq(10000000)” 10个循环,每循环最好530毫秒

比较:$ python -mtimeit -s“导入primeSieveSpeedComp”“ primeSieveSpeedComp.rwh_primes1(10000000)” 10个循环,最好是3个循环:494毫秒比较:$ python -m timeit -s“ import primeSieveSpeedComp”“ primeSieveSpeedComp.rwh_primes2 (10000000)“ 10个循环,每个循环最好3:375毫秒

10 ^ 8

$ python -mtimeit -s“ import primeSieveSpeedComp”“ primeSieveSpeedComp.primeSieveSeq(100000000)” 10个循环,每循环最好3:5.55秒

比较:$ python -mtimeit -s“导入primeSieveSpeedComp”“ primeSieveSpeedComp.rwh_primes1(100000000)” 10个循环,最好每个循环进行3:5.33秒进行比较:$ python -m timeit -s“ import primeSieveSpeedComp”“ primeSieveSpeedComp.rwh_primes2 (100000000)“ 10个循环,每循环3:3.95秒的最佳时间

10 ^ 9

$ python -mtimeit -s“ import primeSieveSpeedComp”“ primeSieveSpeedComp.primeSieveSeq(1000000000)” 10个循环,最好3个循环:每个循环61.2

比较:$ python -mtimeit -n 3 -s“ import primeSieveSpeedComp”“ primeSieveSpeedComp.rwh_primes1(1000000000)” 3个循环,最佳3:97.8 每个循环秒

比较:$ python -m timeit -s“ import primeSieveSpeedComp”“ primeSieveSpeedComp.rwh_primes2(1000000000)” 10个循环,最好3个循环:每个循环41.9秒

您可以将以下代码复制到ubuntus primeSieveSpeedComp中,以查看此测试。

def primeSieveSeq(MAX_Int):
    if MAX_Int > 5*10**8:
        import ctypes
        int16Array = ctypes.c_ushort * (MAX_Int >> 1)
        sieve = int16Array()
        #print 'uses ctypes "unsigned short int Array"'
    else:
        sieve = (MAX_Int >> 1) * [False]
        #print 'uses python list() of long long int'
    if MAX_Int < 10**8:
        sieve[4::3] = [True]*((MAX_Int - 8)/6+1)
        sieve[12::5] = [True]*((MAX_Int - 24)/10+1)
    r = [2, 3, 5]
    n = 0
    for i in xrange(int(MAX_Int**0.5)/30+1):
        n += 3
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 3
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
    if MAX_Int < 10**8:
        return [2, 3, 5]+[(p << 1) + 1 for p in [n for n in xrange(3, MAX_Int >> 1) if not sieve[n]]]
    n = n >> 1
    try:
        for i in xrange((MAX_Int-2*n)/30 + 1):
            n += 3
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 3
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
    except:
        pass
    return r

I know the competition is closed for some years. …

Nonetheless this is my suggestion for a pure python prime sieve, based on omitting the multiples of 2, 3 and 5 by using appropriate steps while processing the sieve forward. Nonetheless it is actually slower for N<10^9 than @Robert William Hanks superior solutions rwh_primes2 and rwh_primes1. By using a ctypes.c_ushort sieve array above 1.5* 10^8 it is somehow adaptive to memory limits.

10^6

$ python -mtimeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.primeSieveSeq(1000000)” 10 loops, best of 3: 46.7 msec per loop

to compare:$ python -mtimeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes1(1000000)” 10 loops, best of 3: 43.2 msec per loop to compare: $ python -m timeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes2(1000000)” 10 loops, best of 3: 34.5 msec per loop

10^7

$ python -mtimeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.primeSieveSeq(10000000)” 10 loops, best of 3: 530 msec per loop

to compare:$ python -mtimeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes1(10000000)” 10 loops, best of 3: 494 msec per loop to compare: $ python -m timeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes2(10000000)” 10 loops, best of 3: 375 msec per loop

10^8

$ python -mtimeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.primeSieveSeq(100000000)” 10 loops, best of 3: 5.55 sec per loop

to compare: $ python -mtimeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes1(100000000)” 10 loops, best of 3: 5.33 sec per loop to compare: $ python -m timeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes2(100000000)” 10 loops, best of 3: 3.95 sec per loop

10^9

$ python -mtimeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.primeSieveSeq(1000000000)” 10 loops, best of 3: 61.2 sec per loop

to compare: $ python -mtimeit -n 3 -s”import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes1(1000000000)” 3 loops, best of 3: 97.8 sec per loop

to compare: $ python -m timeit -s”import primeSieveSpeedComp” “primeSieveSpeedComp.rwh_primes2(1000000000)” 10 loops, best of 3: 41.9 sec per loop

You may copy the code below into ubuntus primeSieveSpeedComp to review this tests.

def primeSieveSeq(MAX_Int):
    if MAX_Int > 5*10**8:
        import ctypes
        int16Array = ctypes.c_ushort * (MAX_Int >> 1)
        sieve = int16Array()
        #print 'uses ctypes "unsigned short int Array"'
    else:
        sieve = (MAX_Int >> 1) * [False]
        #print 'uses python list() of long long int'
    if MAX_Int < 10**8:
        sieve[4::3] = [True]*((MAX_Int - 8)/6+1)
        sieve[12::5] = [True]*((MAX_Int - 24)/10+1)
    r = [2, 3, 5]
    n = 0
    for i in xrange(int(MAX_Int**0.5)/30+1):
        n += 3
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 2
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 3
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
        n += 1
        if not sieve[n]:
            n2 = (n << 1) + 1
            r.append(n2)
            n2q = (n2**2) >> 1
            sieve[n2q::n2] = [True]*(((MAX_Int >> 1) - n2q - 1) / n2 + 1)
    if MAX_Int < 10**8:
        return [2, 3, 5]+[(p << 1) + 1 for p in [n for n in xrange(3, MAX_Int >> 1) if not sieve[n]]]
    n = n >> 1
    try:
        for i in xrange((MAX_Int-2*n)/30 + 1):
            n += 3
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 2
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 3
            if not sieve[n]:
                r.append((n << 1) + 1)
            n += 1
            if not sieve[n]:
                r.append((n << 1) + 1)
    except:
        pass
    return r

回答 21

这是Eratosthenes筛子的numpy版本,具有良好的复杂性(比对长度为n的数组排序要低)和矢量化。与@unutbu相比,此速度与使用46微微秒的软件包查找所有小于一百万的质数的速度一样快。

import numpy as np 
def generate_primes(n):
    is_prime = np.ones(n+1,dtype=bool)
    is_prime[0:2] = False
    for i in range(int(n**0.5)+1):
        if is_prime[i]:
            is_prime[i**2::i]=False
    return np.where(is_prime)[0]

时间:

import time    
for i in range(2,10):
    timer =time.time()
    generate_primes(10**i)
    print('n = 10^',i,' time =', round(time.time()-timer,6))

>> n = 10^ 2  time = 5.6e-05
>> n = 10^ 3  time = 6.4e-05
>> n = 10^ 4  time = 0.000114
>> n = 10^ 5  time = 0.000593
>> n = 10^ 6  time = 0.00467
>> n = 10^ 7  time = 0.177758
>> n = 10^ 8  time = 1.701312
>> n = 10^ 9  time = 19.322478

Here is a numpy version of Sieve of Eratosthenes having both good complexity (lower than sorting an array of length n) and vectorization. Compared to @unutbu times this just as fast as the packages with 46 microsecons to find all primes below a million.

import numpy as np 
def generate_primes(n):
    is_prime = np.ones(n+1,dtype=bool)
    is_prime[0:2] = False
    for i in range(int(n**0.5)+1):
        if is_prime[i]:
            is_prime[i**2::i]=False
    return np.where(is_prime)[0]

Timings:

import time    
for i in range(2,10):
    timer =time.time()
    generate_primes(10**i)
    print('n = 10^',i,' time =', round(time.time()-timer,6))

>> n = 10^ 2  time = 5.6e-05
>> n = 10^ 3  time = 6.4e-05
>> n = 10^ 4  time = 0.000114
>> n = 10^ 5  time = 0.000593
>> n = 10^ 6  time = 0.00467
>> n = 10^ 7  time = 0.177758
>> n = 10^ 8  time = 1.701312
>> n = 10^ 9  time = 19.322478

回答 22

我已经更新了许多Python 3代码,并将其扔到了perfplot(我的一个项目)上,以查看哪个实际上最快。事实证明,对于大蛋糕nprimesfrom{2,3}to可以选择蛋糕:

在此处输入图片说明


复制剧情的代码:

import perfplot
from math import sqrt, ceil
import numpy as np
import sympy


def rwh_primes(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i]:
            sieve[i * i::2 * i] = [False] * ((n - i * i - 1) // (2 * i) + 1)
    return [2] + [i for i in range(3, n, 2) if sieve[i]]


def rwh_primes1(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n // 2)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = [False] * ((n - i * i - 1) // (2 * i) + 1)
    return [2] + [2 * i + 1 for i in range(1, n // 2) if sieve[i]]


def rwh_primes2(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """Input n>=6, Returns a list of primes, 2 <= p < n"""
    assert n >= 6
    correction = n % 6 > 1
    n = {0: n, 1: n - 1, 2: n + 4, 3: n + 3, 4: n + 2, 5: n + 1}[n % 6]
    sieve = [True] * (n // 3)
    sieve[0] = False
    for i in range(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = [False] * (
                (n // 6 - (k * k) // 6 - 1) // k + 1
            )
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = [False] * (
                (n // 6 - (k * k + 4 * k - 2 * k * (i & 1)) // 6 - 1) // k + 1
            )
    return [2, 3] + [3 * i + 1 | 1 for i in range(1, n // 3 - correction) if sieve[i]]


def sieve_wheel_30(N):
    # http://zerovolt.com/?p=88
    """ Returns a list of primes <= N using wheel criterion 2*3*5 = 30

Copyright 2009 by zerovolt.com
This code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work.
If you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com."""
    __smallp = (
        2,
        3,
        5,
        7,
        11,
        13,
        17,
        19,
        23,
        29,
        31,
        37,
        41,
        43,
        47,
        53,
        59,
        61,
        67,
        71,
        73,
        79,
        83,
        89,
        97,
        101,
        103,
        107,
        109,
        113,
        127,
        131,
        137,
        139,
        149,
        151,
        157,
        163,
        167,
        173,
        179,
        181,
        191,
        193,
        197,
        199,
        211,
        223,
        227,
        229,
        233,
        239,
        241,
        251,
        257,
        263,
        269,
        271,
        277,
        281,
        283,
        293,
        307,
        311,
        313,
        317,
        331,
        337,
        347,
        349,
        353,
        359,
        367,
        373,
        379,
        383,
        389,
        397,
        401,
        409,
        419,
        421,
        431,
        433,
        439,
        443,
        449,
        457,
        461,
        463,
        467,
        479,
        487,
        491,
        499,
        503,
        509,
        521,
        523,
        541,
        547,
        557,
        563,
        569,
        571,
        577,
        587,
        593,
        599,
        601,
        607,
        613,
        617,
        619,
        631,
        641,
        643,
        647,
        653,
        659,
        661,
        673,
        677,
        683,
        691,
        701,
        709,
        719,
        727,
        733,
        739,
        743,
        751,
        757,
        761,
        769,
        773,
        787,
        797,
        809,
        811,
        821,
        823,
        827,
        829,
        839,
        853,
        857,
        859,
        863,
        877,
        881,
        883,
        887,
        907,
        911,
        919,
        929,
        937,
        941,
        947,
        953,
        967,
        971,
        977,
        983,
        991,
        997,
    )
    # wheel = (2, 3, 5)
    const = 30
    if N < 2:
        return []
    if N <= const:
        pos = 0
        while __smallp[pos] <= N:
            pos += 1
        return list(__smallp[:pos])
    # make the offsets list
    offsets = (7, 11, 13, 17, 19, 23, 29, 1)
    # prepare the list
    p = [2, 3, 5]
    dim = 2 + N // const
    tk1 = [True] * dim
    tk7 = [True] * dim
    tk11 = [True] * dim
    tk13 = [True] * dim
    tk17 = [True] * dim
    tk19 = [True] * dim
    tk23 = [True] * dim
    tk29 = [True] * dim
    tk1[0] = False
    # help dictionary d
    # d[a , b] = c  ==> if I want to find the smallest useful multiple of (30*pos)+a
    # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b]
    # in general. If b < a, I need [(30*pos)+a][(30*(pos+1))+b]
    d = {}
    for x in offsets:
        for y in offsets:
            res = (x * y) % const
            if res in offsets:
                d[(x, res)] = y
    # another help dictionary: gives tkx calling tmptk[x]
    tmptk = {1: tk1, 7: tk7, 11: tk11, 13: tk13, 17: tk17, 19: tk19, 23: tk23, 29: tk29}
    pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N)))

    # inner functions definition
    def del_mult(tk, start, step):
        for k in range(start, len(tk), step):
            tk[k] = False

    # end of inner functions definition
    cpos = const * pos
    while prime < stop:
        # 30k + 7
        if tk7[pos]:
            prime = cpos + 7
            p.append(prime)
            lastadded = 7
            for off in offsets:
                tmp = d[(7, off)]
                start = (
                    (pos + prime)
                    if off == 7
                    else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 11
        if tk11[pos]:
            prime = cpos + 11
            p.append(prime)
            lastadded = 11
            for off in offsets:
                tmp = d[(11, off)]
                start = (
                    (pos + prime)
                    if off == 11
                    else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 13
        if tk13[pos]:
            prime = cpos + 13
            p.append(prime)
            lastadded = 13
            for off in offsets:
                tmp = d[(13, off)]
                start = (
                    (pos + prime)
                    if off == 13
                    else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 17
        if tk17[pos]:
            prime = cpos + 17
            p.append(prime)
            lastadded = 17
            for off in offsets:
                tmp = d[(17, off)]
                start = (
                    (pos + prime)
                    if off == 17
                    else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 19
        if tk19[pos]:
            prime = cpos + 19
            p.append(prime)
            lastadded = 19
            for off in offsets:
                tmp = d[(19, off)]
                start = (
                    (pos + prime)
                    if off == 19
                    else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 23
        if tk23[pos]:
            prime = cpos + 23
            p.append(prime)
            lastadded = 23
            for off in offsets:
                tmp = d[(23, off)]
                start = (
                    (pos + prime)
                    if off == 23
                    else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 29
        if tk29[pos]:
            prime = cpos + 29
            p.append(prime)
            lastadded = 29
            for off in offsets:
                tmp = d[(29, off)]
                start = (
                    (pos + prime)
                    if off == 29
                    else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # now we go back to top tk1, so we need to increase pos by 1
        pos += 1
        cpos = const * pos
        # 30k + 1
        if tk1[pos]:
            prime = cpos + 1
            p.append(prime)
            lastadded = 1
            for off in offsets:
                tmp = d[(1, off)]
                start = (
                    (pos + prime)
                    if off == 1
                    else (prime * (const * pos + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
    # time to add remaining primes
    # if lastadded == 1, remove last element and start adding them from tk1
    # this way we don't need an "if" within the last while
    if lastadded == 1:
        p.pop()
    # now complete for every other possible prime
    while pos < len(tk1):
        cpos = const * pos
        if tk1[pos]:
            p.append(cpos + 1)
        if tk7[pos]:
            p.append(cpos + 7)
        if tk11[pos]:
            p.append(cpos + 11)
        if tk13[pos]:
            p.append(cpos + 13)
        if tk17[pos]:
            p.append(cpos + 17)
        if tk19[pos]:
            p.append(cpos + 19)
        if tk23[pos]:
            p.append(cpos + 23)
        if tk29[pos]:
            p.append(cpos + 29)
        pos += 1
    # remove exceeding if present
    pos = len(p) - 1
    while p[pos] > N:
        pos -= 1
    if pos < len(p) - 1:
        del p[pos + 1 :]
    # return p list
    return p


def sieve_of_eratosthenes(n):
    """sieveOfEratosthenes(n): return the list of the primes < n."""
    # Code from: <dickinsm@gmail.com>, Nov 30 2006
    # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d
    if n <= 2:
        return []
    sieve = list(range(3, n, 2))
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si * si - 3) // 2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom - top) // si)
    return [2] + [el for el in sieve if el]


def sieve_of_atkin(end):
    """return a list of all the prime numbers <end using the Sieve of Atkin."""
    # Code by Steve Krenzel, <Sgk284@gmail.com>, improved
    # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83
    # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin
    assert end > 0
    lng = (end - 1) // 2
    sieve = [False] * (lng + 1)

    x_max, x2, xd = int(sqrt((end - 1) / 4.0)), 0, 4
    for xd in range(4, 8 * x_max + 2, 8):
        x2 += xd
        y_max = int(sqrt(end - x2))
        n, n_diff = x2 + y_max * y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in range((n_diff - 1) << 1, -1, -8):
            m = n % 12
            if m == 1 or m == 5:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, x2, xd = int(sqrt((end - 1) / 3.0)), 0, 3
    for xd in range(3, 6 * x_max + 2, 6):
        x2 += xd
        y_max = int(sqrt(end - x2))
        n, n_diff = x2 + y_max * y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in range((n_diff - 1) << 1, -1, -8):
            if n % 12 == 7:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, y_min, x2, xd = int((2 + sqrt(4 - 8 * (1 - end))) / 4), -1, 0, 3
    for x in range(1, x_max + 1):
        x2 += xd
        xd += 6
        if x2 >= end:
            y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1
        n, n_diff = ((x * x + x) << 1) - 1, (((x - 1) << 1) - 2) << 1
        for d in range(n_diff, y_min, -8):
            if n % 12 == 11:
                m = n >> 1
                sieve[m] = not sieve[m]
            n += d

    primes = [2, 3]
    if end <= 3:
        return primes[: max(0, end - 2)]

    for n in range(5 >> 1, (int(sqrt(end)) + 1) >> 1):
        if sieve[n]:
            primes.append((n << 1) + 1)
            aux = (n << 1) + 1
            aux *= aux
            for k in range(aux, end, 2 * aux):
                sieve[k >> 1] = False

    s = int(sqrt(end)) + 1
    if s % 2 == 0:
        s += 1
    primes.extend([i for i in range(s, end, 2) if sieve[i >> 1]])

    return primes


def ambi_sieve_plain(n):
    s = list(range(3, n, 2))
    for m in range(3, int(n ** 0.5) + 1, 2):
        if s[(m - 3) // 2]:
            for t in range((m * m - 3) // 2, (n >> 1) - 1, m):
                s[t] = 0
    return [2] + [t for t in s if t > 0]


def sundaram3(max_n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279
    numbers = range(3, max_n + 1, 2)
    half = (max_n) // 2
    initial = 4

    for step in range(3, max_n + 1, 2):
        for i in range(initial, half, step):
            numbers[i - 1] = 0
        initial += 2 * (step + 1)

        if initial > half:
            return [2] + filter(None, numbers)


# Using Numpy:
def ambi_sieve(n):
    # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html
    s = np.arange(3, n, 2)
    for m in range(3, int(n ** 0.5) + 1, 2):
        if s[(m - 3) // 2]:
            s[(m * m - 3) // 2::m] = 0
    return np.r_[2, s[s > 0]]


def primesfrom3to(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns an array of primes, p < n """
    assert n >= 2
    sieve = np.ones(n // 2, dtype=np.bool)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = False
    return np.r_[2, 2 * np.nonzero(sieve)[0][1::] + 1]


def primesfrom2to(n):
    # /programming/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns an array of primes, 2 <= p < n """
    assert n >= 6
    sieve = np.ones(n // 3 + (n % 6 == 2), dtype=np.bool)
    sieve[0] = False
    for i in range(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = False
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = False
    return np.r_[2, 3, ((3 * np.nonzero(sieve)[0] + 1) | 1)]


def sympy_sieve(n):
    return list(sympy.sieve.primerange(1, n))


perfplot.save(
    "prime.png",
    setup=lambda n: n,
    kernels=[
        rwh_primes,
        rwh_primes1,
        rwh_primes2,
        sieve_wheel_30,
        sieve_of_eratosthenes,
        sieve_of_atkin,
        # ambi_sieve_plain,
        # sundaram3,
        ambi_sieve,
        primesfrom3to,
        primesfrom2to,
        sympy_sieve,
    ],
    n_range=[2 ** k for k in range(3, 25)],
    logx=True,
    logy=True,
    xlabel="n",
)

I’ve updated much of the code for Python 3 and threw it at perfplot (a project of mine) to see which is actually fastest. Turns out that, for large n, primesfrom{2,3}to take the cake:

enter image description here


Code to reproduce the plot:

import perfplot
from math import sqrt, ceil
import numpy as np
import sympy


def rwh_primes(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * n
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i]:
            sieve[i * i::2 * i] = [False] * ((n - i * i - 1) // (2 * i) + 1)
    return [2] + [i for i in range(3, n, 2) if sieve[i]]


def rwh_primes1(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns  a list of primes < n """
    sieve = [True] * (n // 2)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = [False] * ((n - i * i - 1) // (2 * i) + 1)
    return [2] + [2 * i + 1 for i in range(1, n // 2) if sieve[i]]


def rwh_primes2(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """Input n>=6, Returns a list of primes, 2 <= p < n"""
    assert n >= 6
    correction = n % 6 > 1
    n = {0: n, 1: n - 1, 2: n + 4, 3: n + 3, 4: n + 2, 5: n + 1}[n % 6]
    sieve = [True] * (n // 3)
    sieve[0] = False
    for i in range(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = [False] * (
                (n // 6 - (k * k) // 6 - 1) // k + 1
            )
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = [False] * (
                (n // 6 - (k * k + 4 * k - 2 * k * (i & 1)) // 6 - 1) // k + 1
            )
    return [2, 3] + [3 * i + 1 | 1 for i in range(1, n // 3 - correction) if sieve[i]]


def sieve_wheel_30(N):
    # http://zerovolt.com/?p=88
    """ Returns a list of primes <= N using wheel criterion 2*3*5 = 30

Copyright 2009 by zerovolt.com
This code is free for non-commercial purposes, in which case you can just leave this comment as a credit for my work.
If you need this code for commercial purposes, please contact me by sending an email to: info [at] zerovolt [dot] com."""
    __smallp = (
        2,
        3,
        5,
        7,
        11,
        13,
        17,
        19,
        23,
        29,
        31,
        37,
        41,
        43,
        47,
        53,
        59,
        61,
        67,
        71,
        73,
        79,
        83,
        89,
        97,
        101,
        103,
        107,
        109,
        113,
        127,
        131,
        137,
        139,
        149,
        151,
        157,
        163,
        167,
        173,
        179,
        181,
        191,
        193,
        197,
        199,
        211,
        223,
        227,
        229,
        233,
        239,
        241,
        251,
        257,
        263,
        269,
        271,
        277,
        281,
        283,
        293,
        307,
        311,
        313,
        317,
        331,
        337,
        347,
        349,
        353,
        359,
        367,
        373,
        379,
        383,
        389,
        397,
        401,
        409,
        419,
        421,
        431,
        433,
        439,
        443,
        449,
        457,
        461,
        463,
        467,
        479,
        487,
        491,
        499,
        503,
        509,
        521,
        523,
        541,
        547,
        557,
        563,
        569,
        571,
        577,
        587,
        593,
        599,
        601,
        607,
        613,
        617,
        619,
        631,
        641,
        643,
        647,
        653,
        659,
        661,
        673,
        677,
        683,
        691,
        701,
        709,
        719,
        727,
        733,
        739,
        743,
        751,
        757,
        761,
        769,
        773,
        787,
        797,
        809,
        811,
        821,
        823,
        827,
        829,
        839,
        853,
        857,
        859,
        863,
        877,
        881,
        883,
        887,
        907,
        911,
        919,
        929,
        937,
        941,
        947,
        953,
        967,
        971,
        977,
        983,
        991,
        997,
    )
    # wheel = (2, 3, 5)
    const = 30
    if N < 2:
        return []
    if N <= const:
        pos = 0
        while __smallp[pos] <= N:
            pos += 1
        return list(__smallp[:pos])
    # make the offsets list
    offsets = (7, 11, 13, 17, 19, 23, 29, 1)
    # prepare the list
    p = [2, 3, 5]
    dim = 2 + N // const
    tk1 = [True] * dim
    tk7 = [True] * dim
    tk11 = [True] * dim
    tk13 = [True] * dim
    tk17 = [True] * dim
    tk19 = [True] * dim
    tk23 = [True] * dim
    tk29 = [True] * dim
    tk1[0] = False
    # help dictionary d
    # d[a , b] = c  ==> if I want to find the smallest useful multiple of (30*pos)+a
    # on tkc, then I need the index given by the product of [(30*pos)+a][(30*pos)+b]
    # in general. If b < a, I need [(30*pos)+a][(30*(pos+1))+b]
    d = {}
    for x in offsets:
        for y in offsets:
            res = (x * y) % const
            if res in offsets:
                d[(x, res)] = y
    # another help dictionary: gives tkx calling tmptk[x]
    tmptk = {1: tk1, 7: tk7, 11: tk11, 13: tk13, 17: tk17, 19: tk19, 23: tk23, 29: tk29}
    pos, prime, lastadded, stop = 0, 0, 0, int(ceil(sqrt(N)))

    # inner functions definition
    def del_mult(tk, start, step):
        for k in range(start, len(tk), step):
            tk[k] = False

    # end of inner functions definition
    cpos = const * pos
    while prime < stop:
        # 30k + 7
        if tk7[pos]:
            prime = cpos + 7
            p.append(prime)
            lastadded = 7
            for off in offsets:
                tmp = d[(7, off)]
                start = (
                    (pos + prime)
                    if off == 7
                    else (prime * (const * (pos + 1 if tmp < 7 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 11
        if tk11[pos]:
            prime = cpos + 11
            p.append(prime)
            lastadded = 11
            for off in offsets:
                tmp = d[(11, off)]
                start = (
                    (pos + prime)
                    if off == 11
                    else (prime * (const * (pos + 1 if tmp < 11 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 13
        if tk13[pos]:
            prime = cpos + 13
            p.append(prime)
            lastadded = 13
            for off in offsets:
                tmp = d[(13, off)]
                start = (
                    (pos + prime)
                    if off == 13
                    else (prime * (const * (pos + 1 if tmp < 13 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 17
        if tk17[pos]:
            prime = cpos + 17
            p.append(prime)
            lastadded = 17
            for off in offsets:
                tmp = d[(17, off)]
                start = (
                    (pos + prime)
                    if off == 17
                    else (prime * (const * (pos + 1 if tmp < 17 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 19
        if tk19[pos]:
            prime = cpos + 19
            p.append(prime)
            lastadded = 19
            for off in offsets:
                tmp = d[(19, off)]
                start = (
                    (pos + prime)
                    if off == 19
                    else (prime * (const * (pos + 1 if tmp < 19 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 23
        if tk23[pos]:
            prime = cpos + 23
            p.append(prime)
            lastadded = 23
            for off in offsets:
                tmp = d[(23, off)]
                start = (
                    (pos + prime)
                    if off == 23
                    else (prime * (const * (pos + 1 if tmp < 23 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # 30k + 29
        if tk29[pos]:
            prime = cpos + 29
            p.append(prime)
            lastadded = 29
            for off in offsets:
                tmp = d[(29, off)]
                start = (
                    (pos + prime)
                    if off == 29
                    else (prime * (const * (pos + 1 if tmp < 29 else 0) + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
        # now we go back to top tk1, so we need to increase pos by 1
        pos += 1
        cpos = const * pos
        # 30k + 1
        if tk1[pos]:
            prime = cpos + 1
            p.append(prime)
            lastadded = 1
            for off in offsets:
                tmp = d[(1, off)]
                start = (
                    (pos + prime)
                    if off == 1
                    else (prime * (const * pos + tmp)) // const
                )
                del_mult(tmptk[off], start, prime)
    # time to add remaining primes
    # if lastadded == 1, remove last element and start adding them from tk1
    # this way we don't need an "if" within the last while
    if lastadded == 1:
        p.pop()
    # now complete for every other possible prime
    while pos < len(tk1):
        cpos = const * pos
        if tk1[pos]:
            p.append(cpos + 1)
        if tk7[pos]:
            p.append(cpos + 7)
        if tk11[pos]:
            p.append(cpos + 11)
        if tk13[pos]:
            p.append(cpos + 13)
        if tk17[pos]:
            p.append(cpos + 17)
        if tk19[pos]:
            p.append(cpos + 19)
        if tk23[pos]:
            p.append(cpos + 23)
        if tk29[pos]:
            p.append(cpos + 29)
        pos += 1
    # remove exceeding if present
    pos = len(p) - 1
    while p[pos] > N:
        pos -= 1
    if pos < len(p) - 1:
        del p[pos + 1 :]
    # return p list
    return p


def sieve_of_eratosthenes(n):
    """sieveOfEratosthenes(n): return the list of the primes < n."""
    # Code from: <dickinsm@gmail.com>, Nov 30 2006
    # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d
    if n <= 2:
        return []
    sieve = list(range(3, n, 2))
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si * si - 3) // 2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom - top) // si)
    return [2] + [el for el in sieve if el]


def sieve_of_atkin(end):
    """return a list of all the prime numbers <end using the Sieve of Atkin."""
    # Code by Steve Krenzel, <Sgk284@gmail.com>, improved
    # Code: https://web.archive.org/web/20080324064651/http://krenzel.info/?p=83
    # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin
    assert end > 0
    lng = (end - 1) // 2
    sieve = [False] * (lng + 1)

    x_max, x2, xd = int(sqrt((end - 1) / 4.0)), 0, 4
    for xd in range(4, 8 * x_max + 2, 8):
        x2 += xd
        y_max = int(sqrt(end - x2))
        n, n_diff = x2 + y_max * y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in range((n_diff - 1) << 1, -1, -8):
            m = n % 12
            if m == 1 or m == 5:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, x2, xd = int(sqrt((end - 1) / 3.0)), 0, 3
    for xd in range(3, 6 * x_max + 2, 6):
        x2 += xd
        y_max = int(sqrt(end - x2))
        n, n_diff = x2 + y_max * y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in range((n_diff - 1) << 1, -1, -8):
            if n % 12 == 7:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, y_min, x2, xd = int((2 + sqrt(4 - 8 * (1 - end))) / 4), -1, 0, 3
    for x in range(1, x_max + 1):
        x2 += xd
        xd += 6
        if x2 >= end:
            y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1
        n, n_diff = ((x * x + x) << 1) - 1, (((x - 1) << 1) - 2) << 1
        for d in range(n_diff, y_min, -8):
            if n % 12 == 11:
                m = n >> 1
                sieve[m] = not sieve[m]
            n += d

    primes = [2, 3]
    if end <= 3:
        return primes[: max(0, end - 2)]

    for n in range(5 >> 1, (int(sqrt(end)) + 1) >> 1):
        if sieve[n]:
            primes.append((n << 1) + 1)
            aux = (n << 1) + 1
            aux *= aux
            for k in range(aux, end, 2 * aux):
                sieve[k >> 1] = False

    s = int(sqrt(end)) + 1
    if s % 2 == 0:
        s += 1
    primes.extend([i for i in range(s, end, 2) if sieve[i >> 1]])

    return primes


def ambi_sieve_plain(n):
    s = list(range(3, n, 2))
    for m in range(3, int(n ** 0.5) + 1, 2):
        if s[(m - 3) // 2]:
            for t in range((m * m - 3) // 2, (n >> 1) - 1, m):
                s[t] = 0
    return [2] + [t for t in s if t > 0]


def sundaram3(max_n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/2073279#2073279
    numbers = range(3, max_n + 1, 2)
    half = (max_n) // 2
    initial = 4

    for step in range(3, max_n + 1, 2):
        for i in range(initial, half, step):
            numbers[i - 1] = 0
        initial += 2 * (step + 1)

        if initial > half:
            return [2] + filter(None, numbers)


# Using Numpy:
def ambi_sieve(n):
    # http://tommih.blogspot.com/2009/04/fast-prime-number-generator.html
    s = np.arange(3, n, 2)
    for m in range(3, int(n ** 0.5) + 1, 2):
        if s[(m - 3) // 2]:
            s[(m * m - 3) // 2::m] = 0
    return np.r_[2, s[s > 0]]


def primesfrom3to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Returns an array of primes, p < n """
    assert n >= 2
    sieve = np.ones(n // 2, dtype=np.bool)
    for i in range(3, int(n ** 0.5) + 1, 2):
        if sieve[i // 2]:
            sieve[i * i // 2::i] = False
    return np.r_[2, 2 * np.nonzero(sieve)[0][1::] + 1]


def primesfrom2to(n):
    # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
    """ Input n>=6, Returns an array of primes, 2 <= p < n """
    assert n >= 6
    sieve = np.ones(n // 3 + (n % 6 == 2), dtype=np.bool)
    sieve[0] = False
    for i in range(int(n ** 0.5) // 3 + 1):
        if sieve[i]:
            k = 3 * i + 1 | 1
            sieve[((k * k) // 3)::2 * k] = False
            sieve[(k * k + 4 * k - 2 * k * (i & 1)) // 3::2 * k] = False
    return np.r_[2, 3, ((3 * np.nonzero(sieve)[0] + 1) | 1)]


def sympy_sieve(n):
    return list(sympy.sieve.primerange(1, n))


perfplot.save(
    "prime.png",
    setup=lambda n: n,
    kernels=[
        rwh_primes,
        rwh_primes1,
        rwh_primes2,
        sieve_wheel_30,
        sieve_of_eratosthenes,
        sieve_of_atkin,
        # ambi_sieve_plain,
        # sundaram3,
        ambi_sieve,
        primesfrom3to,
        primesfrom2to,
        sympy_sieve,
    ],
    n_range=[2 ** k for k in range(3, 25)],
    logx=True,
    logy=True,
    xlabel="n",
)

回答 23

我的猜测是最快所有方式中的就是对代码中的素数进行硬编码。

因此,为什么不编写一个缓慢的脚本来生成另一个包含所有数字的源文件,然后在运行实际程序时导入该源文件。

当然,仅当您在编译时知道N的上限时,此方法才有效,但是(几乎)所有项目Euler问题都是如此。

 

PS: 我可能错了,尽管使用硬连线素数解析源代码比首先计算它们要慢,但是据我所知Python运行于编译.pyc文件中,因此读取所有素数不超过N的二进制数组应该很血腥在这种情况下很快。

My guess is that the fastest of all ways is to hard code the primes in your code.

So why not just write a slow script that generates another source file that has all numbers hardwired in it, and then import that source file when you run your actual program.

Of course, this works only if you know the upper bound of N at compile time, but thus is the case for (almost) all project Euler problems.

 

PS: I might be wrong though iff parsing the source with hard-wired primes is slower than computing them in the first place, but as far I know Python runs from compiled .pyc files so reading a binary array with all primes up to N should be bloody fast in that case.


回答 24

很抱歉打扰,但是erat2()在算法中存在严重缺陷。

在搜索下一个复合词时,我们仅需要测试奇数。q,p都是奇数;那么q + p是偶数,不需要测试,但是q + 2 * p总是奇数。这消除了while循环条件下的“ if even”测试,并节省了大约30%的运行时间。

当我们讨论它时:使用优雅的’D.pop(q,None)’获取和删除方法,而不是’if q in D:p = D [q],del D [q]’,它快两倍!至少在我的机器上(P3-1Ghz)。所以我建议这种聪明算法的实现:

def erat3( ):
    from itertools import islice, count

    # q is the running integer that's checked for primeness.
    # yield 2 and no other even number thereafter
    yield 2
    D = {}
    # no need to mark D[4] as we will test odd numbers only
    for q in islice(count(3),0,None,2):
        if q in D:                  #  is composite
            p = D[q]
            del D[q]
            # q is composite. p=D[q] is the first prime that
            # divides it. Since we've reached q, we no longer
            # need it in the map, but we'll mark the next
            # multiple of its witnesses to prepare for larger
            # numbers.
            x = q + p+p        # next odd(!) multiple
            while x in D:      # skip composites
                x += p+p
            D[x] = p
        else:                  # is prime
            # q is a new prime.
            # Yield it and mark its first multiple that isn't
            # already marked in previous iterations.
            D[q*q] = q
            yield q

Sorry to bother but erat2() has a serious flaw in the algorithm.

While searching for the next composite, we need to test odd numbers only. q,p both are odd; then q+p is even and doesn’t need to be tested, but q+2*p is always odd. This eliminates the “if even” test in the while loop condition and saves about 30% of the runtime.

While we’re at it: instead of the elegant ‘D.pop(q,None)’ get and delete method use ‘if q in D: p=D[q],del D[q]’ which is twice as fast! At least on my machine (P3-1Ghz). So I suggest this implementation of this clever algorithm:

def erat3( ):
    from itertools import islice, count

    # q is the running integer that's checked for primeness.
    # yield 2 and no other even number thereafter
    yield 2
    D = {}
    # no need to mark D[4] as we will test odd numbers only
    for q in islice(count(3),0,None,2):
        if q in D:                  #  is composite
            p = D[q]
            del D[q]
            # q is composite. p=D[q] is the first prime that
            # divides it. Since we've reached q, we no longer
            # need it in the map, but we'll mark the next
            # multiple of its witnesses to prepare for larger
            # numbers.
            x = q + p+p        # next odd(!) multiple
            while x in D:      # skip composites
                x += p+p
            D[x] = p
        else:                  # is prime
            # q is a new prime.
            # Yield it and mark its first multiple that isn't
            # already marked in previous iterations.
            D[q*q] = q
            yield q

回答 25

到目前为止,我尝试过的最快方法是基于Python Cookbookerat2函数:

import itertools as it
def erat2a( ):
    D = {  }
    yield 2
    for q in it.islice(it.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = q + 2*p
            while x in D:
                x += 2*p
            D[x] = p

有关加速的说明,请参见此答案。

The fastest method I’ve tried so far is based on the Python cookbook erat2 function:

import itertools as it
def erat2a( ):
    D = {  }
    yield 2
    for q in it.islice(it.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = q + 2*p
            while x in D:
                x += 2*p
            D[x] = p

See this answer for an explanation of the speeding-up.


回答 26

我可能聚会晚了,但是必须为此添加我自己的代码。它占用的空间大约为n / 2,因为我们不需要存储偶数,而且我还使用了bitarray python模块,从而进一步大大减少了内存消耗,并使所有素数的计算能力高达1,000,000,000

from bitarray import bitarray
def primes_to(n):
    size = n//2
    sieve = bitarray(size)
    sieve.setall(1)
    limit = int(n**0.5)
    for i in range(1,limit):
        if sieve[i]:
            val = 2*i+1
            sieve[(i+i*val)::val] = 0
    return [2] + [2*i+1 for i, v in enumerate(sieve) if v and i > 0]

python -m timeit -n10 -s "import euler" "euler.primes_to(1000000000)"
10 loops, best of 3: 46.5 sec per loop

这是在64位2.4GHZ MAC OSX 10.8.3上运行的

I may be late to the party but will have to add my own code for this. It uses approximately n/2 in space because we don’t need to store even numbers and I also make use of the bitarray python module, further draStically cutting down on memory consumption and enabling computing all primes up to 1,000,000,000

from bitarray import bitarray
def primes_to(n):
    size = n//2
    sieve = bitarray(size)
    sieve.setall(1)
    limit = int(n**0.5)
    for i in range(1,limit):
        if sieve[i]:
            val = 2*i+1
            sieve[(i+i*val)::val] = 0
    return [2] + [2*i+1 for i, v in enumerate(sieve) if v and i > 0]

python -m timeit -n10 -s "import euler" "euler.primes_to(1000000000)"
10 loops, best of 3: 46.5 sec per loop

This was run on a 64bit 2.4GHZ MAC OSX 10.8.3


回答 27

随着时间的推移,我收集了几个质数筛。我的计算机上最快的是:

from time import time
# 175 ms for all the primes up to the value 10**6
def primes_sieve(limit):
    a = [True] * limit
    a[0] = a[1] = False
    #a[2] = True
    for n in xrange(4, limit, 2):
        a[n] = False
    root_limit = int(limit**.5)+1
    for i in xrange(3,root_limit):
        if a[i]:
            for n in xrange(i*i, limit, 2*i):
                a[n] = False
    return a

LIMIT = 10**6
s=time()
primes = primes_sieve(LIMIT)
print time()-s

I collected several prime number sieves over time. The fastest on my computer is this:

from time import time
# 175 ms for all the primes up to the value 10**6
def primes_sieve(limit):
    a = [True] * limit
    a[0] = a[1] = False
    #a[2] = True
    for n in xrange(4, limit, 2):
        a[n] = False
    root_limit = int(limit**.5)+1
    for i in xrange(3,root_limit):
        if a[i]:
            for n in xrange(i*i, limit, 2*i):
                a[n] = False
    return a

LIMIT = 10**6
s=time()
primes = primes_sieve(LIMIT)
print time()-s

回答 28

我对这个问题的回答很慢,但这似乎很有趣。我正在使用numpy,这可能会作弊,我怀疑这种方法是最快的,但应该清楚。它筛选仅引用其索引的布尔数组,并从所有True值的索引中引出质数。无需取模。

import numpy as np
def ajs_primes3a(upto):
    mat = np.ones((upto), dtype=bool)
    mat[0] = False
    mat[1] = False
    mat[4::2] = False
    for idx in range(3, int(upto ** 0.5)+1, 2):
        mat[idx*2::idx] = False
    return np.where(mat == True)[0]

I’m slow responding to this question but it seemed like a fun exercise. I’m using numpy which might be cheating and I doubt this method is the fastest but it should be clear. It sieves a Boolean array referring to its indices only and elicits prime numbers from the indices of all True values. No modulo needed.

import numpy as np
def ajs_primes3a(upto):
    mat = np.ones((upto), dtype=bool)
    mat[0] = False
    mat[1] = False
    mat[4::2] = False
    for idx in range(3, int(upto ** 0.5)+1, 2):
        mat[idx*2::idx] = False
    return np.where(mat == True)[0]

回答 29

这是一种有趣的技术,可以使用python的列表推导来生成质数(但不是最有效的):

noprimes = [j for i in range(2, 8) for j in range(i*2, 50, i)]
primes = [x for x in range(2, 50) if x not in noprimes]

您可以在此处找到示例和一些说明

Here is an interesting technique to generate prime numbers (yet not the most efficient) using python’s list comprehensions:

noprimes = [j for i in range(2, 8) for j in range(i*2, 50, i)]
primes = [x for x in range(2, 50) if x not in noprimes]

You can find the example and some explanations right here


用于除法时,“ /”和“ //”之间有什么区别?

问题:用于除法时,“ /”和“ //”之间有什么区别?

将一个使用在另一个上是否有好处?在Python 2中,它们似乎都返回相同的结果:

>>> 6/3
2
>>> 6//3
2

Is there a benefit to using one over the other? In Python 2, they both seem to return the same results:

>>> 6/3
2
>>> 6//3
2

回答 0

在Python 3.x中,5 / 2将返回2.5并且5 // 2将返回2。前者是浮点除法,后者是地板除法,有时也称为整数除法

在Python 2.2或更高版本的2.x行中,除非执行from __future__ import division,否则整数没有区别,这会使Python 2.x采取3.x行为。

不管将来的进口是什么,5.0 // 2都会归还,2.0因为这是操作的地板分割结果。

您可以在https://docs.python.org/whatsnew/2.2.html#pep-238-changing-the-division-operator中找到详细说明

In Python 3.x, 5 / 2 will return 2.5 and 5 // 2 will return 2. The former is floating point division, and the latter is floor division, sometimes also called integer division.

In Python 2.2 or later in the 2.x line, there is no difference for integers unless you perform a from __future__ import division, which causes Python 2.x to adopt the 3.x behavior.

Regardless of the future import, 5.0 // 2 will return 2.0 since that’s the floor division result of the operation.

You can find a detailed description at https://docs.python.org/whatsnew/2.2.html#pep-238-changing-the-division-operator


回答 1

Python 2.x澄清:

为了阐明Python 2.x系列,/既不是楼层划分也不是真正的划分。当前接受的答案尚不清楚。

/是地板除法时 ARG游戏int,但真正当师之一或两者的参数的个数是float

上面讲了更多的事实,并且比接受的答案中的第二段更清楚。

Python 2.x Clarification:

To clarify for the Python 2.x line, / is neither floor division nor true division. The current accepted answer is not clear on this.

/ is floor division when both args are int, but is true division when either or both of the args are float.

The above tells more truth, and is more clear than the 2nd paragraph in the accepted answer.


回答 2

//不论您使用哪种类型,都可以实施“楼层划分”。所以 1.0/2.0会给0.5,但两者1/21//2并且1.0//2.0会给0

有关详细信息,请参见https://docs.python.org/whatsnew/2.2.html#pep-238-changing-the-division-operator

// implements “floor division”, regardless of your type. So 1.0/2.0 will give 0.5, but both 1/2, 1//2 and 1.0//2.0 will give 0.

See https://docs.python.org/whatsnew/2.2.html#pep-238-changing-the-division-operator for details


回答 3

// -浮点除法

// ->楼层划分

让我们在python 2.7和python 3.5中查看一些示例。

Python 2.7.10和Python 3.5

print (2/3)  ----> 0                   Python 2.7
print (2/3)  ----> 0.6666666666666666  Python 3.5

Python 2.7.10和Python 3.5

  print (4/2)  ----> 2         Python 2.7
  print (4/2)  ----> 2.0       Python 3.5

现在,如果您希望在python 2.7中具有与python 3.5相同的输出,则可以执行以下操作:

Python 2.7.10

from __future__ import division
print (2/3)  ----> 0.6666666666666666   #Python 2.7
print (4/2)  ----> 2.0                  #Python 2.7

在python 2.7和python 3.5中,Floor划分之间没有区别

138.93//3 ---> 46.0        #Python 2.7
138.93//3 ---> 46.0        #Python 3.5
4//3      ---> 1           #Python 2.7
4//3      ---> 1           #Python 3.5

/ –> Floating point division

// –> Floor division

Lets see some examples in both python 2.7 and in Python 3.5.

Python 2.7.10 vs. Python 3.5

print (2/3)  ----> 0                   Python 2.7
print (2/3)  ----> 0.6666666666666666  Python 3.5

Python 2.7.10 vs. Python 3.5

  print (4/2)  ----> 2         Python 2.7
  print (4/2)  ----> 2.0       Python 3.5

Now if you want to have (in python 2.7) same output as in python 3.5, you can do the following:

Python 2.7.10

from __future__ import division
print (2/3)  ----> 0.6666666666666666   #Python 2.7
print (4/2)  ----> 2.0                  #Python 2.7

Where as there is no differece between Floor division in both python 2.7 and in Python 3.5

138.93//3 ---> 46.0        #Python 2.7
138.93//3 ---> 46.0        #Python 3.5
4//3      ---> 1           #Python 2.7
4//3      ---> 1           #Python 3.5

回答 4

正如大家已经回答的那样,//是楼层划分。

之所以如此重要,是因为//在所有2.2版本的Python版本中,包括Python 3.x版本,都明确地进行了地域划分。

的行为/可能会有所不同,具体取决于:

  • 是否有效__future__导入(模块本地)
  • Python命令行选项,-Q old或者-Q new

As everyone has already answered, // is floor division.

Why this is important is that // is unambiguously floor division, in all Python versions from 2.2, including Python 3.x versions.

The behavior of / can change depending on:

  • Active __future__ import or not (module-local)
  • Python command line option, either -Q old or -Q new

回答 5

>>> print 5.0 / 2
2.5

>>> print 5.0 // 2
2.0
>>> print 5.0 / 2
2.5

>>> print 5.0 // 2
2.0

回答 6

Python 2.7和其他即将发布的python版本:

  • 部门(/

将左操作数除以右操作数

例: 4 / 2 = 2

  • 楼层部门(//

操作数的除法,结果是除掉小数点后的数字的商。但是,如果其中一个操作数是负数,则结果是下限的,即从零开始舍入(朝负无穷大):

例如:9//2 = 49.0//2.0 = 4.0-11//3 = -4-11.0//3 = -4.0

这两个/师和//地板除法运算的操作以类似的方式。

Python 2.7 and other upcoming version of python:

  • Division (/)

Divides left hand operand by right hand operand

Example: 4 / 2 = 2

  • Floor Division (//)

The division of operands where the result is the quotient in which the digits after the decimal point are removed. But if one of the operands is negative, the result is floored, i.e., rounded away from zero (towards negative infinity):

Examples: 9//2 = 4 and 9.0//2.0 = 4.0, -11//3 = -4, -11.0//3 = -4.0

Both / Division and // floor division operator are operating in similar fashion.


回答 7

双斜线//是地板分割:

>>> 7//3
2

The double slash, //, is floor division:

>>> 7//3
2

回答 8

//是底数除法,它将始终为您提供结果的整数底数。另一个是“常规”划分。

// is floor division, it will always give you the integer floor of the result. The other is ‘regular’ division.


回答 9

等式的答案四舍五入为下一个较小的整数,或者以.0作为小数点进行浮点运算。

>>>print 5//2
2
>>> print 5.0//2
2.0
>>>print 5//2.0
2.0
>>>print 5.0//2.0
2.0

The answer of the equation is rounded to the next smaller integer or float with .0 as decimal point.

>>>print 5//2
2
>>> print 5.0//2
2.0
>>>print 5//2.0
2.0
>>>print 5.0//2.0
2.0

回答 10

上面的答案是好的。我想补充一点。最多两个值都导致相同的商。在该楼层之后,除法运算符(//)正常工作,但除法运算符()无效/

 - > int(755349677599789174/2)
 - > 377674838799894592      #wrong answer
 - > 755349677599789174 //2
 - > 377674838799894587      #correct answer

The above answers are good. I want to add another point. Up to some values both of them result in the same quotient. After that floor division operator (//) works fine but not division (/) operator.

 - > int(755349677599789174/2)
 - > 377674838799894592      #wrong answer
 - > 755349677599789174 //2
 - > 377674838799894587      #correct answer

回答 11

5.0//2会导致2.0,而不是2因为运算符返回值的返回类型//遵循python强制(类型转换)规则。

Python促进将较低数据类型(整数)转换为较高数据类型(浮点数),以避免数据丢失。

5.0//2 results in 2.0, and not 2 because the return type of the return value from // operator follows python coercion (type casting) rules.

Python promotes conversion of lower datatype (integer) to higher data type (float) to avoid data loss.


回答 12

  • // 是底数划分,它将始终为您提供结果的底数。
  • 另一个/是浮点除法。

以下是之间的差///; 我已经在Python 3.7.2中运行了这些算术运算

>>> print (11 / 3)
3.6666666666666665

>>> print (11 // 3)
3

>>> print (11.3 / 3)
3.7666666666666667

>>> print (11.3 // 3)
3.0
  • // is floor division, it will always give you the floor value of the result.
  • And the other one / is the floating-point division.

Followings are the difference between / and //; I have run these arithmetic operations in Python 3.7.2

>>> print (11 / 3)
3.6666666666666665

>>> print (11 // 3)
3

>>> print (11.3 / 3)
3.7666666666666667

>>> print (11.3 // 3)
3.0

如何检查NaN值?

问题:如何检查NaN值?

float('nan')结果为Nan(不是数字)。但是,如何检查呢?应该很容易,但是我找不到。

float('nan') results in Nan (not a number). But how do I check for it? Should be very easy, but I cannot find it.


回答 0

math.isnan(x)

返回True如果x为NaN(非数字),以及False其他。

>>> import math
>>> x = float('nan')
>>> math.isnan(x)
True

math.isnan(x)

Return True if x is a NaN (not a number), and False otherwise.

>>> import math
>>> x = float('nan')
>>> math.isnan(x)
True

回答 1

测试NaN的通常方法是查看其是否与自身相等:

def isNaN(num):
    return num != num

The usual way to test for a NaN is to see if it’s equal to itself:

def isNaN(num):
    return num != num

回答 2

numpy.isnan(number)告诉您是否NaN存在。

numpy.isnan(number) tells you if it’s NaN or not.


回答 3

您可以通过以下三种方法测试变量是否为“ NaN”。

import pandas as pd
import numpy as np
import math

#For single variable all three libraries return single boolean
x1 = float("nan")

print(f"It's pd.isna  : {pd.isna(x1)}")
print(f"It's np.isnan  : {np.isnan(x1)}")
print(f"It's math.isnan : {math.isnan(x1)}")

输出量

It's pd.isna  : True
It's np.isnan  : True
It's math.isnan  : True

Here are three ways where you can test a variable is “NaN” or not.

import pandas as pd
import numpy as np
import math

#For single variable all three libraries return single boolean
x1 = float("nan")

print(f"It's pd.isna  : {pd.isna(x1)}")
print(f"It's np.isnan  : {np.isnan(x1)}")
print(f"It's math.isnan : {math.isnan(x1)}")

Output

It's pd.isna  : True
It's np.isnan  : True
It's math.isnan  : True

回答 4

这是使用的答案:

  • NaN实施符合IEEE 754标准
    • 即:python的NaN:float('nan')numpy.nan
  • 任何其他对象:字符串或任何对象(如果遇到,则不会引发异常)

遵循该标准实现的NaN是唯一不相等比较与其自身返回True的值:

def is_nan(x):
    return (x != x)

还有一些例子:

import numpy as np
values = [float('nan'), np.nan, 55, "string", lambda x : x]
for value in values:
    print(f"{repr(value):<8} : {is_nan(value)}")

输出:

nan      : True
nan      : True
55       : False
'string' : False
<function <lambda> at 0x000000000927BF28> : False

here is an answer working with:

  • NaN implementations respecting IEEE 754 standard
    • ie: python’s NaN: float('nan'), numpy.nan
  • any other objects: string or whatever (does not raise exceptions if encountered)

A NaN implemented following the standard, is the only value for which the inequality comparison with itself should return True:

def is_nan(x):
    return (x != x)

And some examples:

import numpy as np
values = [float('nan'), np.nan, 55, "string", lambda x : x]
for value in values:
    print(f"{repr(value):<8} : {is_nan(value)}")

Output:

nan      : True
nan      : True
55       : False
'string' : False
<function <lambda> at 0x000000000927BF28> : False

回答 5

我实际上只是碰到了这个,但是对我来说,它正在检查nan,-inf或inf。我刚用过

if float('-inf') < float(num) < float('inf'):

这对于数字是正确的,对于nan和两个inf都是错误的,并且会为字符串或其他类型的东西引发异常(这可能是一件好事)。而且,这不需要导入任何库,例如math或numpy(numpy是如此之大,它会使任何已编译应用程序的大小增加一倍)。

I actually just ran into this, but for me it was checking for nan, -inf, or inf. I just used

if float('-inf') < float(num) < float('inf'):

This is true for numbers, false for nan and both inf, and will raise an exception for things like strings or other types (which is probably a good thing). Also this does not require importing any libraries like math or numpy (numpy is so damn big it doubles the size of any compiled application).


回答 6

math.isnan()

或将数字与其本身进行比较。NaN始终为!= NaN,否则(例如,如果数字)比较将成功。

math.isnan()

or compare the number to itself. NaN is always != NaN, otherwise (e.g. if it is a number) the comparison should succeed.


回答 7

如果卡在<2.6上,这是另一种方法,则没有numpy,也没有IEEE 754支持:

def isNaN(x):
    return str(x) == str(1e400*0)

Another method if you’re stuck on <2.6, you don’t have numpy, and you don’t have IEEE 754 support:

def isNaN(x):
    return str(x) == str(1e400*0)

回答 8

好吧,我进入了这篇文章,因为我在功能上遇到了一些问题:

math.isnan()

运行此代码时出现问题:

a = "hello"
math.isnan(a)

它引发异常。我的解决方案是再次进行检查:

def is_nan(x):
    return isinstance(x, float) and math.isnan(x)

Well I entered this post, because i’ve had some issues with the function:

math.isnan()

There are problem when you run this code:

a = "hello"
math.isnan(a)

It raises exception. My solution for that is to make another check:

def is_nan(x):
    return isinstance(x, float) and math.isnan(x)

回答 9

随着python <2.6我最终得到了

def isNaN(x):
    return str(float(x)).lower() == 'nan'

这适用于Solaris 5.9框上的python 2.5.1和Ubuntu 10上的python 2.6.5

With python < 2.6 I ended up with

def isNaN(x):
    return str(float(x)).lower() == 'nan'

This works for me with python 2.5.1 on a Solaris 5.9 box and with python 2.6.5 on Ubuntu 10


回答 10

我正在从NaN以字符串形式发送的Web服务接收数据'Nan'。但是我的数据中也可能存在其他类型的字符串,因此简单的字符串float(value)可能会引发异常。我使用了以下可接受答案的变体:

def isnan(value):
  try:
      import math
      return math.isnan(float(value))
  except:
      return False

需求:

isnan('hello') == False
isnan('NaN') == True
isnan(100) == False
isnan(float('nan')) = True

I am receiving the data from a web-service that sends NaN as a string 'Nan'. But there could be other sorts of string in my data as well, so a simple float(value) could throw an exception. I used the following variant of the accepted answer:

def isnan(value):
  try:
      import math
      return math.isnan(float(value))
  except:
      return False

Requirement:

isnan('hello') == False
isnan('NaN') == True
isnan(100) == False
isnan(float('nan')) = True

回答 11

判断变量是NaN还是None的所有方法:

无类型

In [1]: from numpy import math

In [2]: a = None
In [3]: not a
Out[3]: True

In [4]: len(a or ()) == 0
Out[4]: True

In [5]: a == None
Out[5]: True

In [6]: a is None
Out[6]: True

In [7]: a != a
Out[7]: False

In [9]: math.isnan(a)
Traceback (most recent call last):
  File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
    math.isnan(a)
TypeError: a float is required

In [10]: len(a) == 0
Traceback (most recent call last):
  File "<ipython-input-10-65b72372873e>", line 1, in <module>
    len(a) == 0
TypeError: object of type 'NoneType' has no len()

NaN型

In [11]: b = float('nan')
In [12]: b
Out[12]: nan

In [13]: not b
Out[13]: False

In [14]: b != b
Out[14]: True

In [15]: math.isnan(b)
Out[15]: True

All the methods to tell if the variable is NaN or None:

None type

In [1]: from numpy import math

In [2]: a = None
In [3]: not a
Out[3]: True

In [4]: len(a or ()) == 0
Out[4]: True

In [5]: a == None
Out[5]: True

In [6]: a is None
Out[6]: True

In [7]: a != a
Out[7]: False

In [9]: math.isnan(a)
Traceback (most recent call last):
  File "<ipython-input-9-6d4d8c26d370>", line 1, in <module>
    math.isnan(a)
TypeError: a float is required

In [10]: len(a) == 0
Traceback (most recent call last):
  File "<ipython-input-10-65b72372873e>", line 1, in <module>
    len(a) == 0
TypeError: object of type 'NoneType' has no len()

NaN type

In [11]: b = float('nan')
In [12]: b
Out[12]: nan

In [13]: not b
Out[13]: False

In [14]: b != b
Out[14]: True

In [15]: math.isnan(b)
Out[15]: True

回答 12

如何从混合数据类型列表中删除NaN(浮动)项目

如果您在迭代器中包含混合类型,则以下是不使用numpy的解决方案:

from math import isnan

Z = ['a','b', float('NaN'), 'd', float('1.1024')]

[x for x in Z if not (
                      type(x) == float # let's drop all float values…
                      and isnan(x) # … but only if they are nan
                      )]
['a','b','d',1.1024]

短路评估意味着isnan不会调用非浮点类型的值,因为可以False and (…)快速评估而False无需评估右侧。

How to remove NaN (float) item(s) from a list of mixed data types

If you have mixed types in an iterable, here is a solution that does not use numpy:

from math import isnan

Z = ['a','b', float('NaN'), 'd', float('1.1024')]

[x for x in Z if not (
                      type(x) == float # let's drop all float values…
                      and isnan(x) # … but only if they are nan
                      )]
['a', 'b', 'd', 1.1024]

Short-circuit evaluation means that isnan will not be called on values that are not of type ‘float’, as False and (…) quickly evaluates to False without having to evaluate the right-hand side.


回答 13

在Python 3.6中,检查字符串值x math.isnan(x)和np.isnan(x)会引发错误。所以我无法检查给定的值是否为NaN,如果我事先不知道它是一个数字。以下似乎解决了这个问题

if str(x)=='nan' and type(x)!='str':
    print ('NaN')
else:
    print ('non NaN')

In Python 3.6 checking on a string value x math.isnan(x) and np.isnan(x) raises an error. So I can’t check if the given value is NaN or not if I don’t know beforehand it’s a number. The following seems to solve this issue

if str(x)=='nan' and type(x)!='str':
    print ('NaN')
else:
    print ('non NaN')

回答 14

对于float类型的nan

>>> import pandas as pd
>>> value = float(nan)
>>> type(value)
>>> <class 'float'>
>>> pd.isnull(value)
True
>>>
>>> value = 'nan'
>>> type(value)
>>> <class 'str'>
>>> pd.isnull(value)
False

For nan of type float

>>> import pandas as pd
>>> value = float(nan)
>>> type(value)
>>> <class 'float'>
>>> pd.isnull(value)
True
>>>
>>> value = 'nan'
>>> type(value)
>>> <class 'str'>
>>> pd.isnull(value)
False

回答 15

对于熊猫字符串,请使用pd.isnull:

if not pd.isnull(atext):
  for word in nltk.word_tokenize(atext):

作为NLTK的特征提取功能

def act_features(atext):
features = {}
if not pd.isnull(atext):
  for word in nltk.word_tokenize(atext):
    if word not in default_stopwords:
      features['cont({})'.format(word.lower())]=True
return features

for strings in panda take pd.isnull:

if not pd.isnull(atext):
  for word in nltk.word_tokenize(atext):

the function as feature extraction for NLTK

def act_features(atext):
features = {}
if not pd.isnull(atext):
  for word in nltk.word_tokenize(atext):
    if word not in default_stopwords:
      features['cont({})'.format(word.lower())]=True
return features

Manim-社区维护的用于创建数学动画的Python框架

用于解释数学视频的动画引擎


Manim是用于解释数学视频的动画引擎。它用于以编程方式创建精确的动画,如3Blue1Brown

注意:这个存储库是由Manim社区维护的,与Grant Sanderson或3Blue1Brown没有任何关联(尽管我们非常感谢他将他的工作提供给世界)。如果你想了解格兰特是如何制作视频的,请访问他的存储库(3b1b/manim)。此分叉比他的更新更频繁,如果您想将Manim用于您自己的项目,建议使用此分叉

目录:

安装

Manim需要一些依赖项,在使用它之前必须安装这些依赖项。如果您想在本地安装之前先试用,可以这样做in our online Jupyter environment

有关本地安装,请访问Documentation并按照适用于您的操作系统的说明进行操作

安装依赖项后,在终端窗口中运行以下命令:

pip install manim

用法

Manim是一个非常通用的软件包。以下是一个示例Scene您可以构建:

from manim import *


class SquareToCircle(Scene):
    def construct(self):
        circle = Circle()
        square = Square()
        square.flip(RIGHT)
        square.rotate(-3 * TAU / 8)
        circle.set_fill(PINK, opacity=0.5)

        self.play(Create(square))
        self.play(Transform(square, circle))
        self.play(FadeOut(square))

要查看此场景的输出,请将代码保存在一个名为example.py然后,在终端窗口中运行以下命令:

manim -p -ql example.py SquareToCircle

您应该会看到您的原生视频播放器程序弹出,并播放一个简单的场景,在该场景中,一个正方形被转换为一个圆。您可以在下面的内容中找到一些更简单的示例GitHub repository您也可以访问official gallery有关更高级的示例,请参阅

Manim还附带一个%%manimIPython魔术,允许在JupyterLab(以及经典的Jupyter)笔记本中方便地使用它。请参阅

corresponding documentation需要一些指导和帮助try it out online

命令行参数

Manim的一般用法如下:

manim-illustration

这个-p上述命令中的标志用于预览,即视频文件渲染完成后将自动打开。这个-qlFLAG用于以较低质量进行更快的渲染

其他一些有用的标志包括:

  • -s跳到末尾,只显示最后一帧
  • -n <number>要跳到前面的n‘场景的第8个动画
  • -f在文件浏览器中显示文件

有关命令行参数的完整列表,请访问documentation

文档

文档正在进行中,地址为ReadTheDocs

码头工人

社区还维护码头形象(manimcommunity/manim),可以找到on DockerHub支持以下标签:

运行坞站映像的说明

快速示例

渲染场景的步骤CircleToSquare在文件中test_scenes.py包含在当前工作目录中,同时保留您的用户和组ID,请使用

docker run --rm -it  --user="$(id -u):$(id -g)" -v "$(pwd)":/manim manimcommunity/manim manim test_scenes.py CircleToSquare -qm

在后台运行图像

除了使用上面描述的“一次性容器”方法,您还可以创建一个命名容器,您也可以根据自己的喜好进行修改。首先,跑步

docker run -it --name my-manim-container -v "$(pwd):/manim" manimcommunity/manim /bin/bash

要在容器中获得交互式shell,例如,可以安装更多依赖项(比如使用以下命令安装texlive包tlmgr)。满意后立即退出容器。然后,在使用它之前,通过运行以下命令启动容器

docker start my-manim-container

然后,要渲染场景CircleToSquare在文件中test_scenes.py,呼叫

docker exec -it --user="$(id -u):$(id -g)" my-manim-container manim test.py CircleToSquare -qm

木星实验室

另一种替代方法是使用docker映像启动运行JupyterLab的本地Web服务器,JupyterLab中安装了Python内核Manim,可以通过%%manim细胞魔法。要使用JupyterLab,请运行

docker run -it -p 8888:8888 manimcommunity/manim jupyter lab --ip=0.0.0.0

然后按照航站楼里的说明操作

重要说明

在执行时manim在Docker容器中,有几个命令行标志(特别是-p(预览文件)和-f(在文件浏览器中显示输出文件))不受支持

帮助处理Manim

如果您需要安装或使用Manim的帮助,请随时联系我们的Discord
Server
Reddit Community如果您要提交错误报告或功能请求,请打开问题

贡献

对Manim的贡献总是受欢迎的。特别是,对测试和文档的需求非常迫切。有关投稿指引,请参阅documentation

该项目中的大多数开发人员使用Poetry对于管理层来说。您需要在您的环境中安装并使用POLITE。您可以了解更多关于poetry以及如何在ITS中使用它documentation

如何引用Manim

我们认识到好的软件对支持研究的重要性,我们注意到,当研究得到有效沟通时,它就会变得更有价值。为了证明曼尼姆的价值,我们要求你在你的作品中引用曼尼姆。目前,引用Manim的最佳方式是引用Manim home page使用此BibTeX条目(该条目用于发布v0.9.0,但很容易修改):

@Manual{Manim:v0.9.0,
  key =          {Manim},
  author =       {{The Manim Community Developers}},
  title =        {{Manim} -- {M}athematical {A}nimation {F}ramework ({V}ersion v0.9.0)},
  note =         {\url{https://www.manim.community}},
  year =         2021,
}

这应该会呈现大致如下所示的引用:

  1. 曼尼姆社区开发人员,Manim – Mathematical Animation Framework (Version v0.9.0)2021年

行为规范

我们完整的行为准则,以及我们如何执行它,都可以继续阅读。our website

许可证

该软件在麻省理工学院的许可下是双重许可的,版权归3Blu1Brown LLC所有(见许可),版权归Manim社区开发者所有(见LICENSE.Community)

Mlcourse.ai-开放机器学习课程

ODS stickers

mlcourse.ai是一门开放的机器学习课程,由OpenDataScience (ods.ai),由Yury Kashnitsky (yorko)尤里拥有应用数学博士学位和卡格尔竞赛大师学位,他的目标是设计一门理论与实践完美平衡的ML课程。因此,你可以在课堂上复习数学公式,并与Kaggle Inclass竞赛一起练习。目前,该课程正处于自定步模式检查一下详细的Roadmap引导您完成自定进度的课程。ai

奖金:此外,您还可以购买带有最佳非演示版本的奖励作业包mlcourse.ai任务。选择“Bonus Assignments” tier请参阅主页上的交易详情mlcourse.ai

镜子(🇬🇧-仅限):mlcourse.ai(主站点)、Kaggle Dataset(与Kaggle笔记本相同的笔记本)

自定

这个Roadmap将指导您度过11周的mlCourse.ai课程。每周,从熊猫到梯度助推,都会给出阅读什么文章、看什么讲座、完成什么作业的指示。

内容

这是medium.com上发表的文章列表🇬🇧,habr.com🇷🇺还提到了中文笔记本。🇨🇳并给出了指向Kaggle笔记本(英文)的链接。图标是可点击的

  1. 用PANDA软件进行探索性数据分析🇬🇧🇷🇺🇨🇳Kaggle Notebook
  2. 用Python进行可视化数据分析🇬🇧🇷🇺🇨🇳,Kaggle笔记本电脑:part1part2
  3. 分类、决策树和k近邻🇬🇧🇷🇺🇨🇳Kaggle Notebook
  4. 线性分类与回归🇬🇧🇷🇺🇨🇳,Kaggle笔记本电脑:part1part2part3part4part5
  5. 套袋与随机林🇬🇧🇷🇺🇨🇳,Kaggle笔记本电脑:part1part2part3
  6. 特征工程与特征选择🇬🇧🇷🇺🇨🇳Kaggle Notebook
  7. 无监督学习:主成分分析与聚类🇬🇧🇷🇺🇨🇳Kaggle Notebook
  8. Vowpal Wabbit:用千兆字节的数据学习🇬🇧🇷🇺🇨🇳Kaggle Notebook
  9. 用Python进行时间序列分析,第一部分🇬🇧🇷🇺🇨🇳使用Facebook Prophet预测未来,第2部分🇬🇧🇨🇳卡格尔笔记本:part1part2
  10. 梯度增压🇬🇧🇷🇺🇨🇳Kaggle Notebook

讲座

视频上传到thisYouTube播放列表。引言,videoslides

  1. 用熊猫进行探索性数据分析,video
  2. 可视化,EDA的主要情节,video
  3. 诊断树:theorypractical part
  4. Logistic回归:theoretical foundationspractical part(《爱丽丝》比赛中的基线)
  5. 合奏和随机森林-part 1分类指标-part 2预测客户付款的业务任务示例-part 3
  6. 线性回归和正则化-theory,Lasso&Ridge,LTV预测-practice
  7. 无监督学习-Principal Component AnalysisClustering
  8. 用于分类和回归的随机梯度下降-part 1,第2部分TBA
  9. 用Python(ARIMA,PERPHET)进行时间序列分析-video
  10. 梯度增压:基本思路-part 1、XgBoost、LightGBM和CatBoost+Practice背后的关键理念-part 2

作业

以下是演示作业。此外,在“Bonus Assignments” tier您可以访问非演示作业

  1. 用熊猫进行探索性数据分析,nbviewerKaggle Notebooksolution
  2. 分析心血管疾病数据,nbviewerKaggle Notebooksolution
  3. 带有玩具任务和UCI成人数据集的决策树,nbviewerKaggle Notebooksolution
  4. 讽刺检测,Kaggle Notebooksolution线性回归作为一个最优化问题,nbviewerKaggle Notebook
  5. Logistic回归和随机森林在信用评分问题中的应用nbviewerKaggle Notebooksolution
  6. 在回归任务中探索OLS、LASSO和随机森林nbviewerKaggle Notebooksolution
  7. 无监督学习,nbviewerKaggle Notebooksolution
  8. 实现在线回归,nbviewerKaggle Notebooksolution
  9. 时间序列分析,nbviewerKaggle Notebooksolution
  10. 在比赛中超越底线,Kaggle Notebook

卡格尔竞赛

  1. 如果可以,请抓住我:通过网页会话跟踪检测入侵者。Kaggle Inclass
  2. Dota 2获胜者预测。Kaggle Inclass

引用mlCourse.ai

如果你碰巧引用了mlcourse.ai在您的工作中,您可以使用此BibTeX记录:

@misc{mlcourse_ai,
    author = {Kashnitsky, Yury},
    title = {mlcourse.ai – Open Machine Learning Course},
    year = {2020},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/Yorko/mlcourse.ai}},
}

社区

讨论在#mlCourse_ai世界上最重要的一条航道OpenDataScience (ods.ai)松懈团队

课程是免费的,但你可以通过承诺以下内容来支持组织者Patreon(每月支持)或一次性付款Ko-fi

Donate
Donate