Safest way to convert float to integer in Python?

Question: Safest way to convert float to integer in Python?

Python’s math module contains handy functions like floor and ceil. These functions take a floating point number and return the nearest integer below or above it. However, these functions return the answer as a floating point number (this is the Python 2 behavior; in Python 3 they return an int). For example:

import math
f=math.floor(2.3)

Now f returns:

2.0

What is the safest way to get an integer out of this float without running the risk of rounding errors (for example, if the float is the equivalent of 1.99999)? Or should I use another function altogether?


Answer 0

All integers that can be represented by floating point numbers have an exact representation. So you can safely use int on the result. Inexact representations occur only if you are trying to represent a rational number with a denominator that is not a power of two.

That this works is not trivial at all! It’s a property of the IEEE floating point representation that int∘floor = ⌊⋅⌋ if the magnitude of the numbers in question is small enough, but different representations are possible where int(floor(2.3)) might be 1.

To quote from Wikipedia,

Any integer with absolute value less than or equal to $2^{24}$ can be exactly represented in the single precision format, and any integer with absolute value less than or equal to $2^{53}$ can be exactly represented in the double precision format.
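The two limits in the quote can be checked directly. This is only a small sketch: `to_single` is a hypothetical helper (not part of the original answer) that round-trips a value through a 4-byte float using the standard `struct` module.

```python
import struct

def to_single(x: float) -> float:
    """Round-trip x through IEEE-754 single precision (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", x))[0]

# Double precision: integers up to 2**53 survive the conversion unchanged.
print(float(2**53) == 2**53)          # True
print(float(2**53 + 1) == 2**53 + 1)  # False: the float had to be rounded

# Single precision: the same thing happens at 2**24.
print(to_single(float(2**24)) == 2**24)          # True
print(to_single(float(2**24 + 1)) == 2**24 + 1)  # False
```

Python compares int and float values exactly, which is why the second comparison in each pair exposes the rounding.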


Answer 1

Using int(your_non_integer_number) will nail it.

import math

print(int(2.3))             # 2
print(int(math.sqrt(5)))    # 2
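One detail worth knowing when using int() this way: it truncates toward zero, so it only agrees with math.floor for non-negative inputs. A short illustration (the sample values are arbitrary):

```python
import math

for x in (2.7, -2.7):
    # int() drops the fractional part; floor and ceil always go down and up.
    print(x, int(x), math.floor(x), math.ceil(x))
# 2.7 2 2 3
# -2.7 -2 -3 -2
```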

Answer 2

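You could use the round function. If you omit the second parameter (the number of decimal places to round to), then I think you will get the behavior you want. Note that Python 3 rounds exact halves to the nearest even integer, so round(2.5) gives 2 there.

IDLE output (Python 3).

>>> round(2.99999999999)
3
>>> round(2.6)
3
>>> round(2.5)
2
>>> round(2.4)
2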
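On Python 3, round() with a single argument returns an int and rounds exact halves to the nearest even integer, so a Python 3 session shows round(2.5) as 2. If "halves always round up" is what you need, one possible approach is the decimal module. `round_half_up` below is a hypothetical helper, and it is only a sketch for values of moderate magnitude.

```python
from decimal import Decimal, ROUND_HALF_UP

# Built-in round(): ties go to the nearest even integer on Python 3.
print([round(v) for v in (0.5, 1.5, 2.5, 3.5)])  # [0, 2, 2, 4]

def round_half_up(x: float) -> int:
    """Round to the nearest integer, with exact halves going away from zero."""
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_UP))

print(round_half_up(2.5))  # 3
print(round_half_up(2.4))  # 2
```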

Answer 3

Combining two of the previous results, we have:

int(round(some_float))

This converts a float to an integer fairly dependably.
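"Fairly dependably" leaves out the non-finite values: int(round(x)) raises an exception for NaN and infinity. A minimal sketch that makes this explicit, assuming you want a single error type for both cases (`to_nearest_int` is a hypothetical name):

```python
import math

def to_nearest_int(x: float) -> int:
    """Round x to the nearest integer, rejecting NaN and infinities."""
    if not math.isfinite(x):
        raise ValueError(f"cannot convert {x!r} to an integer")
    return int(round(x))

print(to_nearest_int(2.99999999999))  # 3
print(to_nearest_int(-1.2))           # -1
```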


Answer 4

That this works is not trivial at all! It’s a property of the IEEE floating point representation that int∘floor = ⌊⋅⌋ if the magnitude of the numbers in question is small enough, but different representations are possible where int(floor(2.3)) might be 1.

This post explains why it works in that range.

In a double, you can represent 32-bit integers without any problems. There cannot be any rounding issues. More precisely, doubles can represent all integers between $-2^{53}$ and $2^{53}$, inclusive.

Short explanation: A double can store up to 53 binary digits. When you require more, the number is padded with zeroes on the right.

It follows that 53 ones is the largest number that can be stored without padding. Naturally, all (integer) numbers requiring fewer digits can be stored accurately.

Adding one to 111(omitted)111 (53 ones) yields 100…000 (a 1 followed by 53 zeroes). Since we can store only 53 digits, the rightmost zero is padding.

This is where $2^{53}$ comes from.


More detail: We need to consider how IEEE-754 floating point works.

  1 bit    11 / 8     52 / 23      # bits double/single precision
[ sign |  exponent | mantissa ]

The number is then calculated as follows (excluding special cases that are irrelevant here):

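$(-1)^{\text{sign}} \times 1.\text{mantissa} \times 2^{\text{exponent} - \text{bias}}$

where $\text{bias} = 2^{e-1} - 1$ and $e$ is the number of exponent bits (11 or 8), i.e. 1023 and 127 for double/single precision respectively.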
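To make the layout concrete, here is a small sketch that pulls the three fields out of a double and rebuilds the value from them. The helper names `decode_double` and `rebuild` are made up for this illustration, and the code only handles normal numbers (no zero, subnormals, infinities or NaN).

```python
import struct

def decode_double(x: float):
    """Split a normal double into (sign, biased exponent, 52-bit mantissa)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]  # raw 64 bits as an int
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    return sign, exponent, mantissa

def rebuild(sign: int, exponent: int, mantissa: int) -> float:
    """Apply (-1)^sign * 1.mantissa * 2^(exponent - bias) with bias = 1023."""
    return (-1) ** sign * (1 + mantissa / 2**52) * 2.0 ** (exponent - 1023)

fields = decode_double(6.0)
print(fields)            # (0, 1025, 2251799813685248)
print(rebuild(*fields))  # 6.0
```

For 6.0 the stored exponent is 1025 (that is, 2 after subtracting the bias) and the mantissa encodes the ".5" of 1.5, since 6 = 1.5 × 2².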

Knowing that multiplying by $2^X$ simply shifts all bits X places to the left, it’s easy to see that any integer must have all mantissa bits that end up to the right of the decimal point set to zero.

Any integer except zero has the following form in binary:

1x…x where the x-es represent the bits to the right of the MSB (most significant bit).

Because we excluded zero, there will always be an MSB that is one, which is why it’s not stored. To store the integer, we must bring it into the aforementioned form: $(-1)^{\text{sign}} \times 1.\text{mantissa} \times 2^{\text{exponent} - \text{bias}}$.

That amounts to shifting the bits across the decimal point until only the MSB remains to the left of it. All the bits to the right of the decimal point are then stored in the mantissa.

From this, we can see that we can store at most 52 binary digits apart from the MSB.

It follows that the highest number where all bits are explicitly stored is

111(omitted)111.   that's 53 ones (52 + implicit 1) in the case of doubles.

For this, we need to set the exponent such that the decimal point is shifted 52 places. If we were to increase the exponent by one, we could no longer know the digit immediately to the left of the decimal point.

111(omitted)111x.

By convention, it’s 0. Setting the entire mantissa to zero, we get the following number:

100(omitted)00x. = 100(omitted)000.

That’s a 1 followed by 53 zeroes, 52 stored and 1 added due to the exponent.

It represents $2^{53}$, which marks the boundary (both negative and positive) within which we can accurately represent all integers. If we wanted to add one to $2^{53}$, we would have to set the implicit zero (denoted by the x) to one, but that’s impossible.
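The effect of that implicit zero is easy to observe: just above the boundary only every second integer exists as a double, so adding small amounts snaps to the nearest representable neighbor. A quick sketch:

```python
base = 2.0 ** 53

# Offset that actually results from adding k = 0..4 to 2**53 in float arithmetic.
offsets = [int(base + k) - 2**53 for k in range(5)]
print(offsets)  # [0, 0, 2, 4, 4]
```

Adding 1 and adding 3 both land exactly halfway between two representable values, and the default IEEE-754 rounding mode resolves such ties toward the neighbor whose last mantissa bit is zero.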


Answer 5

math.floor will always return an integer number and thus int(math.floor(some_float)) will never introduce rounding errors.

The rounding error might already be introduced in math.floor(some_large_float), though, or even when storing a large number in a float in the first place. (Large numbers may lose precision when stored in floats.)
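A short sketch of both points, assuming Python 3 (where math.floor already returns an int) and using 10**17 + 1 as an arbitrary example of an integer that a double cannot hold:

```python
import math

# On Python 3 there is no intermediate float to worry about.
print(type(math.floor(2.3)))  # <class 'int'>

# The precision is lost when the value is stored as a float, before floor runs.
big = float(10**17 + 1)
print(int(math.floor(big)))   # 100000000000000000
print(int(math.floor(big)) == 10**17 + 1)  # False
```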


Answer 6

If you need to convert a string float to an int you can use this method.

Example: '38.0' to 38

In order to convert this to an int you can cast it as a float then an int. This will also work for float strings or integer strings.

>>> int(float('38.0'))
38
>>> int(float('38'))
38

Note: This will strip any digits after the decimal point.

>>> int(float('38.2'))
38
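For very large numeric strings, going through float can silently change the integer part as well. If that matters, one option (an alternative to the method above, not part of it) is to parse with decimal.Decimal, which keeps every digit:

```python
from decimal import Decimal

s = "9007199254740993.0"  # 2**53 + 1, not representable as a double

print(int(float(s)))    # 9007199254740992
print(int(Decimal(s)))  # 9007199254740993
```

Like int(float(...)), int(Decimal(...)) discards the fractional part.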

Answer 7

Another code sample that converts a real/float to an integer using variables. “vel” is a real/float number and is converted to the next highest integer, “newvel”.

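import math, os, sys, arcpy.da
.
.
# densifybkp, floseg, vel and Length are presumably defined in the elided lines above
with arcpy.da.SearchCursor(densifybkp, [floseg, vel, Length]) as cursor:
    for row in cursor:
        curvel = float(row[1])            # value of the "vel" field as a float
        newvel = int(math.ceil(curvel))   # round up to the next integer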
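The same rounding step can be tried without arcpy. In this sketch the `rows` list is invented sample data standing in for the cursor, with the velocity in position 1 of each row:

```python
import math

# Hypothetical stand-in for the cursor rows: (segment id, velocity, length).
rows = [(1, 12.1, 100.0), (2, 7.0, 50.0), (3, -3.5, 20.0)]

newvels = [int(math.ceil(float(row[1]))) for row in rows]
print(newvels)  # [13, 7, -3]
```

Values that are already whole stay unchanged, and negative values move toward zero because ceil always rounds up.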

Answer 8

Since you’re asking for the ‘safest’ way, I’ll provide another answer other than the top answer.

An easy way to make sure you don’t lose any precision is to check if the values would be equal after you convert them.

if int(some_value) == some_value:
    some_value = int(some_value)

If the float is 1.0, for example, int(1.0) evaluates to 1 and 1.0 == 1, so the conversion to int will execute. If the float is 1.1, int(1.1) evaluates to 1 and 1.1 != 1, so the value will remain a float and you won’t lose any precision.
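A variant of this check uses float.is_integer(), which avoids calling int() on infinity or NaN (where int() raises an exception). `int_if_exact` is a hypothetical helper name used for this sketch:

```python
def int_if_exact(value: float):
    """Return int(value) when nothing is lost, otherwise the float unchanged."""
    if value.is_integer():  # False for inf and nan, so int() is never called on them
        return int(value)
    return value

print(int_if_exact(1.0))           # 1
print(int_if_exact(1.1))           # 1.1
print(int_if_exact(float("inf")))  # inf
```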


Answer 9

df['Column_Name'] = df['Column_Name'].astype(int)