Tag archive: ieee-754

Why does (inf + 0j) * 1 evaluate to inf + nanj?

Question: Why does (inf + 0j) * 1 evaluate to inf + nanj?

>>> (float('inf')+0j)*1
(inf+nanj)

Why? This caused a nasty bug in my code.

Why isn’t 1 the multiplicative identity, giving (inf + 0j)?


Answer 0

The 1 is converted to a complex number first, 1 + 0j, which then leads to an inf * 0 multiplication, resulting in a nan.

(inf + 0j) * 1
(inf + 0j) * (1 + 0j)
inf * 1  + inf * 0j  + 0j * 1 + 0j * 0j
#          ^ this is where it comes from
inf + nanj + 0j - 0
inf + nanj
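
A minimal workaround sketch, assuming the scalar is known to be real: scale the two parts separately instead of letting Python promote it to a complex number (the helper scale_by_real below is hypothetical, not a standard API):

def scale_by_real(z, s):
    # Multiply complex z by the real scalar s component-wise,
    # so s is never promoted to (s+0j) and no inf*0 term appears.
    return complex(z.real * s, z.imag * s)

z = complex(float('inf'), 0.0)
print(z * 1)                # (inf+nanj): 1 is promoted to (1+0j)
print(scale_by_real(z, 1))  # (inf+0j)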

Answer 1

Mechanistically, the accepted answer is, of course, correct, but I would argue that a deeper answer can be given.

First, it is useful to clarify the question as @PeterCordes does in a comment: “Is there a multiplicative identity for complex numbers that does work on inf + 0j?” or, in other words, is what the OP sees a weakness in the computer implementation of complex multiplication, or is there something conceptually unsound about inf+0j?

Short answer:

Using polar coordinates, we can view complex multiplication as a scaling and a rotation. Rotating an infinite “arm” even by 0 degrees, as in the case of multiplying by one, we cannot expect to place its tip with finite precision. So indeed, there is something fundamentally not right with inf+0j, namely, that as soon as we are at infinity, a finite offset becomes meaningless.

Long answer:

Background: The “big thing” around which this question revolves is the matter of extending a system of numbers (think reals or complex numbers). One reason one might want to do that is to add some concept of infinity, or to “compactify” if one happens to be a mathematician. There are other reasons, too (https://en.wikipedia.org/wiki/Galois_theory, https://en.wikipedia.org/wiki/Non-standard_analysis), but we are not interested in those here.

One point compactification

The tricky bit about such an extension is, of course, that we want these new numbers to fit into the existing arithmetic. The simplest way is to add a single element at infinity (https://en.wikipedia.org/wiki/Alexandroff_extension) and make it equal anything but zero divided by zero. This works for the reals (https://en.wikipedia.org/wiki/Projectively_extended_real_line) and the complex numbers (https://en.wikipedia.org/wiki/Riemann_sphere).

Other extensions …

While the one point compactification is simple and mathematically sound, “richer” extensions comprising multiple infinities have been sought. The IEEE 754 standard for real floating point numbers has +inf and -inf (https://en.wikipedia.org/wiki/Extended_real_number_line). This looks natural and straightforward, but already forces us to jump through hoops and invent stuff like -0 (https://en.wikipedia.org/wiki/Signed_zero).

… of the complex plane

What about more-than-one-inf extensions of the complex plane?

In computers, complex numbers are typically implemented by sticking two fp reals together, one for the real and one for the imaginary part. That is perfectly fine as long as everything is finite. As soon as infinities are considered, however, things become tricky.

The complex plane has a natural rotational symmetry, which ties in nicely with complex arithmetic, since multiplying the entire plane by e^(phi*j) is the same as a rotation by phi radians around 0.
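
For instance, a quick illustration of that symmetry in Python (exact only up to floating point rounding):

import cmath

z = 1 + 1j
phi = cmath.pi / 2
# Multiplying by e^(phi*j) rotates z by phi radians around 0:
print(z * cmath.exp(1j * phi))  # approximately (-1+1j)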

That annex G thing

Now, to keep things simple, complex fp simply uses the extensions (+/-inf, nan etc.) of the underlying real number implementation. This choice may seem so natural it isn’t even perceived as a choice, but let’s take a closer look at what it implies. A simple visualization of this extension of the complex plane looks like (I = infinite, f = finite, 0 = 0)

I IIIIIIIII I
             
I fffffffff I
I fffffffff I
I fffffffff I
I fffffffff I
I ffff0ffff I
I fffffffff I
I fffffffff I
I fffffffff I
I fffffffff I
             
I IIIIIIIII I

But since a true complex plane is one that respects complex multiplication, a more informative projection would be

     III    
 I         I  
    fffff    
   fffffff   
  fffffffff  
I fffffffff I
I ffff0ffff I
I fffffffff I
  fffffffff  
   fffffff   
    fffff    
 I         I 
     III    

In this projection we see the “uneven distribution” of infinities, which is not only ugly but also the root of problems of the kind the OP has suffered: most infinities (those of the forms (+/-inf, finite) and (finite, +/-inf)) are lumped together at the four principal directions, while all other directions are represented by just four infinities (+/-inf, +/-inf). It shouldn’t come as a surprise that extending complex multiplication to this geometry is a nightmare.

Annex G of the C99 spec tries its best to make it work, including bending the rules on how inf and nan interact (essentially inf trumps nan). OP’s problem is sidestepped by not promoting reals and a proposed purely imaginary type to complex, but having the real 1 behave differently from the complex 1 doesn’t strike me as a solution. Tellingly, Annex G stops short of fully specifying what the product of two infinities should be.

Can we do better?

It is tempting to try to fix these problems by choosing a better geometry of infinities. In analogy to the extended real line, we could add one infinity for each direction. This construction is similar to the projective plane, but it doesn’t lump together opposite directions. Infinities would be represented in polar coordinates as inf * e^(2 pi omega i), and defining products would be straightforward. In particular, OP’s problem would be solved quite naturally.
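
To make this concrete, here is a hypothetical sketch (the DirectedInf class is invented purely for illustration, not an existing type) of such directed infinities, where multiplication simply adds angles:

import cmath
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class DirectedInf:
    # Hypothetical directed infinity: inf * e^(2 pi omega i).
    omega: float  # direction as a fraction of a full turn, in [0, 1)

    def __mul__(self, other):
        if isinstance(other, DirectedInf):
            # Product of two directed infinities: the angles add.
            return DirectedInf((self.omega + other.omega) % 1.0)
        # Finite nonzero scalar: rotate by its argument.
        return DirectedInf((self.omega + cmath.phase(complex(other)) / (2 * math.pi)) % 1.0)

print(DirectedInf(0.0) * 1)  # DirectedInf(omega=0.0): multiplying by 1 keeps the direction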

But this is where the good news ends. In a way, we can be hurled back to square one by (not unreasonably) requiring that our new-style infinities support functions that extract their real or imaginary parts. Addition is another problem: adding two non-antipodal infinities, we’d have to set the angle to undefined, i.e. nan (one could argue the angle must lie between the two input angles, but there is no simple way of representing that “partial nan-ness”).

Riemann to the rescue

In view of all this maybe the good old one point compactification is the safest thing to do. Maybe the authors of Annex G felt the same when mandating a function cproj that lumps all the infinities together.
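
Python’s cmath has no equivalent of cproj, but the C99 function’s behaviour (every complex infinity projects to a single positive real infinity, keeping only the sign of the imaginary zero) can be sketched roughly as:

import math

def cproj(z):
    # Sketch of C99's cproj: lump all complex infinities into the one
    # point at infinity of the Riemann sphere.
    if math.isinf(z.real) or math.isinf(z.imag):
        return complex(float('inf'), math.copysign(0.0, z.imag))
    return z

print(cproj(complex(float('-inf'), 5.0)))  # (inf+0j)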


Here is a related question answered by people more competent on the subject matter than I am.


Answer 2

This is an implementation detail of how complex multiplication is implemented in CPython. Unlike other languages (e.g. C or C++), CPython takes a somewhat simplistic approach:

  1. ints/floats are promoted to complex numbers in multiplication
  2. the simple school-formula is used, which doesn’t provide desired/expected results as soon as infinite numbers are involved:
Py_complex
_Py_c_prod(Py_complex a, Py_complex b)
{
    Py_complex r;
    r.real = a.real*b.real - a.imag*b.imag;
    r.imag = a.real*b.imag + a.imag*b.real;
    return r;
}

One problematic case with the above code would be:

(0.0+1.0*j)*(inf+inf*j) = (0.0*inf-1*inf)+(0.0*inf+1.0*inf)j
                        =  nan + nan*j

However, one would like to have -inf + inf*j as result.
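
For reference, this case is easy to reproduce from the Python side (a minimal check):

z = complex(0.0, 1.0) * complex(float('inf'), float('inf'))
# CPython's school formula yields nans where an Annex-G-style
# multiplication would give (-inf+infj):
print(z)  # (nan+nanj)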

In this respect other languages are not far ahead: complex number multiplication was for a long time not part of the C standard, and was included only in C99, as Annex G, which describes how a complex multiplication should be performed; it is not as simple as the school formula above! The C++ standard doesn’t specify how complex multiplication should work, so most compiler implementations fall back to the C implementation, which may be C99-conforming (gcc, clang) or not (MSVC).

For the above “problematic” example, C99-compliant implementations (which are more complicated than the school formula) would give (see live) the expected result:

(0.0+1.0*j)*(inf+inf*j) = -inf + inf*j 

Even with the C99 standard, an unambiguous result is not defined for all inputs, and results may differ even between C99-compliant implementations.

Another side effect of float not being promoted to complex in C99 is that multiplying inf+0.0j with 1.0 or with 1.0+0.0j can lead to different results (see here live):

  • (inf+0.0j)*1.0 = inf+0.0j
  • (inf+0.0j)*(1.0+0.0j) = inf-nanj; the imaginary part being -nan rather than the nan CPython produces doesn’t play a role here, because all quiet NaNs are equivalent (see this), even though some of them have the sign bit set (and are thus printed with a “-”, see this) and some do not.

Which is at least counter-intuitive.


My key take-away from it is: there is nothing simple about “simple” complex number multiplication (or division), and when switching between languages or even compilers one must brace oneself for subtle bugs/differences.


Answer 3

Funny definition from Python. If we were solving this with pen and paper, I would say the expected result would be (inf + 0j), as you pointed out, because we know that we mean the norm of 1, so (float('inf')+0j)*1 should equal (inf+0j):

But that is not the case, as you can see; when we run it we get:

>>> complex(float('inf'), 0) * 1
(inf+nanj)

Python understands this *1 as a complex number and not as the norm of 1, so it interprets it as *(1+0j), and the error appears when we try to do inf * 0j = nanj, since inf*0 can’t be resolved.

What you actually want to do (assuming 1 is the norm of 1):

Recall that if z = x + iy is a complex number with real part x and imaginary part y, the complex conjugate of z is defined as z* = x − iy, and the absolute value, also called the norm of z, is defined as |z| = sqrt(x^2 + y^2).

Assuming 1 is the norm of 1, we should do something like:

>>> c_num = complex(float('inf'), 0)
>>> value = 1
>>> realPart = c_num.real * value
>>> imagPart = c_num.imag * value
>>> complex(realPart, imagPart)
(inf+0j)

Not very intuitive, I know… but sometimes coding languages are defined in a way that differs from what we are used to day to day.


Why does the floating-point value of 4 * 0.1 look nice in Python 3, but 3 * 0.1 doesn’t?

Question: Why does the floating-point value of 4 * 0.1 look nice in Python 3, but 3 * 0.1 doesn’t?

I know that most decimals don’t have an exact floating point representation (Is floating point math broken?).

But I don’t see why 4*0.1 is printed nicely as 0.4, but 3*0.1 isn’t, when both values actually have ugly decimal representations:

>>> 3*0.1
0.30000000000000004
>>> 4*0.1
0.4
>>> from decimal import Decimal
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')

Answer 0

The simple answer is because 3*0.1 != 0.3 due to quantization (roundoff) error (whereas 4*0.1 == 0.4 because multiplying by a power of two is usually an “exact” operation). Python tries to find the shortest string that would round to the desired value, so it can display 4*0.1 as 0.4 as these are equal, but it cannot display 3*0.1 as 0.3 because these are not equal.

You can use the .hex method in Python to view the internal representation of a number (basically, the exact binary floating point value, rather than the base-10 approximation). This can help to explain what’s going on under the hood.

>>> (0.1).hex()
'0x1.999999999999ap-4'
>>> (0.3).hex()
'0x1.3333333333333p-2'
>>> (0.1*3).hex()
'0x1.3333333333334p-2'
>>> (0.4).hex()
'0x1.999999999999ap-2'
>>> (0.1*4).hex()
'0x1.999999999999ap-2'

0.1 is 0x1.999999999999a times 2^-4. The “a” at the end means the digit 10 – in other words, 0.1 in binary floating point is very slightly larger than the “exact” value of 0.1 (because the final 0x0.99 is rounded up to 0x0.a). When you multiply this by 4, a power of two, the exponent shifts up (from 2^-4 to 2^-2) but the number is otherwise unchanged, so 4*0.1 == 0.4.

However, when you multiply by 3, the tiny little difference between 0x0.99 and 0x0.a0 (0x0.07) magnifies into a 0x0.15 error, which shows up as a one-digit error in the last position. This causes 0.1*3 to be very slightly larger than the rounded value of 0.3.

Python 3’s float repr is designed to be round-trippable, that is, the value shown should be exactly convertible into the original value (float(repr(f)) == f for all floats f). Therefore, it cannot display 0.3 and 0.1*3 exactly the same way, or the two different numbers would end up the same after round-tripping. Consequently, Python 3’s repr engine chooses to display one with a slight apparent error.
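
A short demonstration of the round-trip constraint (a minimal sketch):

x = 3 * 0.1
print(x == 0.3)             # False: x and 0.3 are two distinct floats
print(repr(x))              # '0.30000000000000004'
print(float(repr(x)) == x)  # True: the shown digits convert back exactly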


Answer 1

repr (and str in Python 3) will put out as many digits as required to make the value unambiguous. In this case the result of the multiplication 3*0.1 isn’t the closest value to 0.3 (0x1.3333333333333p-2 in hex), it’s actually one LSB higher (0x1.3333333333334p-2) so it needs more digits to distinguish it from 0.3.
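
The “one LSB higher” claim can be checked directly (assuming Python 3.9+, which added math.nextafter):

import math

# 3*0.1 is exactly one ulp above the float nearest to 0.3:
print(math.nextafter(0.3, 1.0) == 3 * 0.1)  # True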

On the other hand, the multiplication 4*0.1 does get the closest value to 0.4 (0x1.999999999999ap-2 in hex), so it doesn’t need any additional digits.

You can verify this quite easily:

>>> 3*0.1 == 0.3
False
>>> 4*0.1 == 0.4
True

I used hex notation above because it’s nice and compact and shows the bit difference between the two values. You can do this yourself using e.g. (3*0.1).hex(). If you’d rather see them in all their decimal glory, here you go:

>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(0.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
>>> Decimal(0.4)
Decimal('0.40000000000000002220446049250313080847263336181640625')

Answer 2

Here’s a simplified conclusion from other answers.

If you check a float on Python’s command line or print it, it goes through function repr which creates its string representation.

Starting with version 3.2, Python’s str and repr use a complex rounding scheme, which prefers nice-looking decimals if possible, but uses more digits where necessary to guarantee bijective (one-to-one) mapping between floats and their string representations.

This scheme guarantees that the value of repr(float(s)) looks nice for simple decimals, even if they can’t be represented precisely as floats (e.g. when s = "0.1").

At the same time it guarantees that float(repr(x)) == x holds for every float x.
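
Both guarantees are easy to observe (a minimal sketch):

# Nice-looking output for a simple decimal that has no exact float:
print(repr(float("0.1")))  # '0.1'

# Round-tripping is nevertheless exact for every float:
x = 3 * 0.1
assert float(repr(x)) == x
print(repr(x))             # '0.30000000000000004'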


Answer 3

Not really specific to Python’s implementation, but this should apply to any float-to-decimal-string function.

A floating point number is essentially a binary number, but in scientific notation with a fixed limit of significant figures.

The reciprocal of any number that has a prime factor not shared with the base will always have a recurring positional representation. For example, 1/7 has a prime factor, 7, that is not shared with 10, and therefore has a recurring decimal representation; similarly, 1/10 has the prime factors 2 and 5 in its denominator, the latter not shared with 2, so it has a recurring binary representation. This means that 0.1 cannot be exactly represented by a finite number of bits after the binary point.
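
The exact value actually stored for 0.1 can be inspected with the fractions module; its denominator is a power of two, not of ten (this matches the tutorial excerpt quoted below):

from fractions import Fraction

# The float 0.1 is really the nearest binary fraction, with a 2**55 denominator:
print(Fraction(0.1))  # 3602879701896397/36028797018963968
print(2 ** 55)        # 36028797018963968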

Since 0.1 has no exact representation, a function that converts the approximation to a decimal point string will usually try to approximate certain values so that they don’t get unintuitive results like 0.1000000000004121.

Since the floating point is in scientific notation, any multiplication by a power of the base only affects the exponent part of the number. For example 1.231e+2 * 100 = 1.231e+4 for decimal notation, and likewise, 1.00101010e11 * 100 = 1.00101010e101 in binary notation. If I multiply by a non-power of the base, the significant digits will also be affected. For example 1.2e1 * 3 = 3.6e1
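
math.frexp splits a float into its significand and exponent and makes this visible (a minimal check):

import math

# Multiplying by a power of the base (two) changes only the exponent:
print(math.frexp(0.1))      # (0.8, -3)
print(math.frexp(0.1 * 4))  # (0.8, -1): same significand, shifted exponent
# Multiplying by a non-power changes the significand too:
print(math.frexp(0.1 * 3))  # (0.6000000000000001, -1)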

Depending on the algorithm used, it may try to guess common decimals based on the significant figures only. Both 0.1 and 0.4 have the same significant figures in binary, because their floats are essentially truncations of (8/5)(2^-4) and (8/5)(2^-2) respectively. If the algorithm identifies the 8/5 sigfig pattern as the decimal 1.6, then it will work on 0.1, 0.2, 0.4, 0.8, etc. It may also have magic sigfig patterns for other combinations, such as the float 3 divided by the float 10, and other magic patterns statistically likely to be formed by division by 10.

In the case of 3*0.1, the last few significant figures will likely be different from dividing a float 3 by float 10, causing the algorithm to fail to recognize the magic number for the 0.3 constant depending on its tolerance for precision loss.

Edit: https://docs.python.org/3.1/tutorial/floatingpoint.html

Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.

There is no tolerance for precision loss: if float x (0.3) is not exactly equal to float y (0.1*3), then repr(x) is not exactly equal to repr(y).