Tag archive: type-conversion

How to convert a boolean array to an int array

Question: How to convert a boolean array to an int array

I use Scilab, and want to convert an array of booleans into an array of integers:

>>> x = np.array([4, 3, 2, 1])
>>> y = 2 >= x
>>> y
array([False, False,  True,  True], dtype=bool)

In Scilab I can use:

>>> bool2s(y)
0.    0.    1.    1.  

or even just multiply it by 1:

>>> 1*y
0.    0.    1.    1.  

Is there a simple command for this in Python, or would I have to use a loop?


Answer 0

Numpy arrays have an astype method. Just do y.astype(int).

Note that it might not even be necessary to do this, depending on what you’re using the array for. Bool will be autopromoted to int in many cases, so you can add it to int arrays without having to explicitly convert it:

>>> x
array([ True, False,  True], dtype=bool)
>>> x + [1, 2, 3]
array([2, 2, 4])
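
The explicit conversion and the automatic promotion described above can be compared side by side. This is a minimal sketch, assuming NumPy is installed; the variable names are illustrative and not part of the original answer.

```python
import numpy as np

x = np.array([4, 3, 2, 1])
y = 2 >= x                      # boolean mask: [False, False, True, True]

as_int = y.astype(int)          # explicit conversion to the default integer dtype
as_int8 = y.astype(np.int8)     # or request a specific integer width
total = y.sum()                 # booleans count as 0/1 in arithmetic, no conversion needed

print(as_int)                   # [0 0 1 1]
print(as_int8.dtype)            # int8
print(total)                    # 2
```

astype returns a new array and leaves y untouched, so the boolean mask stays usable for indexing afterwards.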

Answer 1

The 1*y method works in Numpy too:

>>> import numpy as np
>>> x = np.array([4, 3, 2, 1])
>>> y = 2 >= x
>>> y
array([False, False,  True,  True], dtype=bool)
>>> 1*y                      # Method 1
array([0, 0, 1, 1])
>>> y.astype(int)            # Method 2
array([0, 0, 1, 1]) 

If you are asking for a way to convert Python lists from Boolean to int, you can use map to do it (in Python 3, map returns an iterator, so wrap it in list):

>>> testList = [False, False,  True,  True]
>>> list(map(lambda x: 1 if x else 0, testList))
[0, 0, 1, 1]
>>> list(map(int, testList))
[0, 0, 1, 1]

Or using list comprehensions:

>>> testList
[False, False, True, True]
>>> [int(elem) for elem in testList]
[0, 0, 1, 1]

Answer 2

Using numpy, you can do:

y = x.astype(int)

If you were using a non-numpy array, you could use a list comprehension:

y = [int(val) for val in x]

Answer 3

Most of the time you don’t need conversion:

>>> array([True,True,False,False]) + array([1,2,3,4])
array([2, 3, 3, 4])

The right way to do it is:

yourArray.astype(int)

or

yourArray.astype(float)

Answer 4

I know you asked for non-looping solutions, but the only solutions I can come up with probably loop internally anyway:

map(int,y)

or:

[i*1 for i in y]

or:

import numpy
y=numpy.array(y)
y*1

Answer 5

A fun way to do this is to add 0:

>>> np.array([True, False, False]) + 0
array([1, 0, 0])

How to convert int to Enum in Python?

Question: How to convert int to Enum in Python?

Using the new Enum feature (via backport enum34) with python 2.7.6.

Given the following definition, how can I convert an int to the corresponding Enum value?

from enum import Enum

class Fruit(Enum):
    Apple = 4
    Orange = 5
    Pear = 6

I know I can hand craft a series of if-statements to do the conversion but is there an easy pythonic way to convert? Basically, I’d like a function ConvertIntToFruit(int) that returns an enum value.

My use case is I have a csv file of records where I’m reading each record into an object. One of the file fields is an integer field that represents an enumeration. As I’m populating the object I’d like to convert that integer field from the file into the corresponding Enum value in the object.


Answer 0

You ‘call’ the Enum class:

Fruit(5)

to turn 5 into Fruit.Orange:

>>> from enum import Enum
>>> class Fruit(Enum):
...     Apple = 4
...     Orange = 5
...     Pear = 6
... 
>>> Fruit(5)
<Fruit.Orange: 5>

From the Programmatic access to enumeration members and their attributes section of the documentation:

Sometimes it’s useful to access members in enumerations programmatically (i.e. situations where Color.red won’t do because the exact color is not known at program-writing time). Enum allows such access:

>>> Color(1)
<Color.red: 1>
>>> Color(3)
<Color.blue: 3>

In a related note: to map a string value containing the name of an enum member, use subscription:

>>> s = 'Apple'
>>> Fruit[s]
<Fruit.Apple: 4>
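
For the CSV use case in the question, the call syntax can be wrapped in a small helper that tolerates bad input. This is only a sketch: the function name fruit_from_int and its default parameter are hypothetical, and it relies on Fruit(value) raising ValueError when no member has that value.

```python
from enum import Enum


class Fruit(Enum):
    Apple = 4
    Orange = 5
    Pear = 6


def fruit_from_int(raw, default=None):
    """Return the Fruit whose value is `raw`, or `default` if there is none."""
    try:
        return Fruit(int(raw))   # int() also accepts strings such as "5" read from a CSV
    except ValueError:           # raised by int() for bad text and by Fruit() for unknown values
        return default


print(fruit_from_int("5"))       # Fruit.Orange
print(fruit_from_int(7))         # None
```

Returning a default instead of raising is a design choice for bulk loading; when a bad value should stop the import, calling Fruit(int(raw)) directly and letting the ValueError propagate is the simpler option.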

Answer 1

Put simply, convert the int value into an Enum member by calling EnumType(int_value), then access the name of the resulting member:

my_fruit_from_int = Fruit(5)  # convert the int to an Enum member
fruit_name = my_fruit_from_int.name  # get the name
print(fruit_name)  # Orange will be printed here

Or as a function:

def convert_int_to_fruit(int_value):
    try:
        my_fruit_from_int = Fruit(int_value)
        return my_fruit_from_int.name
    except ValueError:  # raised when no member has this value
        return None

Answer 2

I wanted something similar so that I could access either part of the value pair from a single reference. The vanilla version:

#!/usr/bin/env python3


from enum import IntEnum


class EnumDemo(IntEnum):
    ENUM_ZERO       = 0
    ENUM_ONE        = 1
    ENUM_TWO        = 2
    ENUM_THREE      = 3
    ENUM_INVALID    = 4


#endclass.


print('Passes')
print('1) %d'%(EnumDemo['ENUM_TWO']))
print('2) %s'%(EnumDemo['ENUM_TWO']))
print('3) %s'%(EnumDemo.ENUM_TWO.name))
print('4) %d'%(EnumDemo.ENUM_TWO))
print()


print('Fails')
print('1) %d'%(EnumDemo.ENUM_TWOa))

The failure throws an exception as would be expected.

A more robust version:

#!/usr/bin/env python3


class EnumDemo():


    enumeration =   (
                        'ENUM_ZERO',    # 0.
                        'ENUM_ONE',     # 1.
                        'ENUM_TWO',     # 2.
                        'ENUM_THREE',   # 3.
                        'ENUM_INVALID'  # 4.
                    )


    def name(self, val):
        try:

            name = self.enumeration[val]
        except IndexError:

            # Always return last tuple.
            name = self.enumeration[len(self.enumeration) - 1]

        return name


    def number(self, val):
        try:

            index = self.enumeration.index(val)
        except (TypeError, ValueError):

            # Always return last tuple.
            index = (len(self.enumeration) - 1)

        return index


#endclass.


print('Passes')
print('1) %d'%(EnumDemo().number('ENUM_TWO')))
print('2) %s'%(EnumDemo().number('ENUM_TWO')))
print('3) %s'%(EnumDemo().name(1)))
print('4) %s'%(EnumDemo().enumeration[1]))
print()
print('Fails')
print('1) %d'%(EnumDemo().number('ENUM_THREEa')))
print('2) %s'%(EnumDemo().number('ENUM_THREEa')))
print('3) %s'%(EnumDemo().name(11)))
print('4) %s'%(EnumDemo().enumeration[-1]))

When not used correctly this avoids creating an exception and, instead, passes back a fault indication. A more Pythonic way to do this would be to pass back “None” but my particular application uses the text directly.


How to convert an OrderedDict into a regular dict in python3

Question: How to convert an OrderedDict into a regular dict in python3

I am struggling with the following problem: I want to convert an OrderedDict like this:

OrderedDict([('method', 'constant'), ('data', '1.225')])

into a regular dict like this:

{'method': 'constant', 'data':1.225}

because I have to store it as string in a database. After the conversion the order is not important anymore, so I can spare the ordered feature anyway.

Thanks for any hint or solutions,

Ben


Answer 0

>>> from collections import OrderedDict
>>> OrderedDict([('method', 'constant'), ('data', '1.225')])
OrderedDict([('method', 'constant'), ('data', '1.225')])
>>> dict(OrderedDict([('method', 'constant'), ('data', '1.225')]))
{'data': '1.225', 'method': 'constant'}
>>>

However, to store it in a database it’d be much better to convert it to a format such as JSON or Pickle. With Pickle you even preserve the order!


Answer 1

Even though this question is a year old, I would like to point out that calling dict will not help if an ordered dict is nested inside the ordered dict. The simplest way to convert such nested ordered dicts is a round trip through JSON:

import json
from collections import OrderedDict
input_dict = OrderedDict([('method', 'constant'), ('recursive', OrderedDict([('m', 'c')]))])
output_dict = json.loads(json.dumps(input_dict))
print(output_dict)

Answer 2

It is easy to convert your OrderedDict to a regular Dict like this:

dict(OrderedDict([('method', 'constant'), ('data', '1.225')]))

If you have to store it as a string in your database, using JSON is the way to go. That is also quite simple, and you don’t even have to worry about converting to a regular dict:

import json
from collections import OrderedDict
d = OrderedDict([('method', 'constant'), ('data', '1.225')])
dString = json.dumps(d)

Or dump the data directly to a file:

with open('outFile.txt','w') as o:
    json.dump(d, o)
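
If the order matters again after reading the string back from the database, the JSON text can be parsed straight into an OrderedDict. A minimal sketch, assuming all keys and values are JSON-serializable; the names stored and restored are illustrative only.

```python
import json
from collections import OrderedDict

d = OrderedDict([('method', 'constant'), ('data', '1.225')])
stored = json.dumps(d)                                        # text suitable for a string column
restored = json.loads(stored, object_pairs_hook=OrderedDict)  # rebuild with the original key order

print(stored)            # {"method": "constant", "data": "1.225"}
print(restored == d)     # True
```

Without object_pairs_hook, json.loads returns a plain dict, which is exactly the conversion the question asks for.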

Answer 3

If you are looking for a recursive version without using the json module:

def ordereddict_to_dict(value):
    for k, v in value.items():
        if isinstance(v, dict):
            value[k] = ordereddict_to_dict(v)
    return dict(value)
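
Note that the function above assigns into value while walking it, so the nested entries of the caller's OrderedDict are replaced as a side effect. A possible variant that builds a fresh structure instead, and also descends into lists, is sketched below; the name to_plain is hypothetical.

```python
from collections import OrderedDict


def to_plain(value):
    """Recursively copy nested mappings and lists into plain dict/list objects."""
    if isinstance(value, dict):           # OrderedDict is a dict subclass
        return {k: to_plain(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_plain(v) for v in value]
    return value                          # leave scalars untouched


nested = OrderedDict([
    ('method', 'constant'),
    ('recursive', OrderedDict([('m', 'c')])),
    ('items', [OrderedDict([('a', 1)])]),
])
plain = to_plain(nested)

print(type(plain['recursive']))           # <class 'dict'>
print(type(nested['recursive']))          # the input still holds an OrderedDict
```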

Answer 4

Here is what seems simplest and works in python 3.7

from collections import OrderedDict

d = OrderedDict([('method', 'constant'), ('data', '1.225')])
d2 = dict(d)  # Now a normal dict

Now to check this:

>>> type(d2)
<class 'dict'>
>>> isinstance(d2, OrderedDict)
False
>>> isinstance(d2, dict)
True

NOTE: This also works, and gives same result –

>>> {**d}
{'method': 'constant', 'data': '1.225'}
>>> {**d} == d2
True

As well as this –

>>> dict(d)
{'method': 'constant', 'data': '1.225'}
>>> dict(d) == {**d}
True

Cheers


Answer 5

If somehow you want a simple, yet different solution, you can use the {**dict} syntax:

from collections import OrderedDict

ordered = OrderedDict([('method', 'constant'), ('data', '1.225')])
regular = {**ordered}

Answer 6

A simple way:

>>> import json
>>> from collections import OrderedDict

>>> json.dumps(dict(OrderedDict([('method', 'constant'), ('data', '1.225')])))

NumPy or Pandas: Keeping array type as integer while having a NaN value

Question: NumPy or Pandas: Keeping array type as integer while having a NaN value

Is there a preferred way to keep the data type of a numpy array fixed as int (or int64 or whatever), while still having an element inside listed as numpy.NaN?

In particular, I am converting an in-house data structure to a Pandas DataFrame. In our structure, we have integer-type columns that still have NaN’s (but the dtype of the column is int). It seems to recast everything as a float if we make this a DataFrame, but we’d really like to be int.

Thoughts?

Things tried:

I tried using the from_records() function under pandas.DataFrame, with coerce_float=False and this did not help. I also tried using NumPy masked arrays, with NaN fill_value, which also did not work. All of these caused the column data type to become a float.


Answer 0

This capability has been added to pandas (beginning with version 0.24): https://pandas.pydata.org/pandas-docs/version/0.24/whatsnew/v0.24.0.html#optional-integer-na-support

At this point, it requires the use of extension dtype Int64 (capitalized), rather than the default dtype int64 (lowercase).
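
The difference between the two dtype spellings can be seen in a short example. This is a sketch that assumes pandas 0.24 or later is installed; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

floats = pd.Series([1, 2, np.nan])      # NaN forces the default dtype to float64
nullable = floats.astype("Int64")       # capital "I": the nullable extension dtype

print(floats.dtype)                     # float64
print(nullable.dtype)                   # Int64
print(nullable.isna().tolist())         # [False, False, True]
```

The missing entry is kept as a missing value rather than being replaced by a sentinel, while the remaining entries are stored as integers.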


Answer 1

NaN can’t be stored in an integer array. This is a known limitation of pandas at the moment; I have been waiting for progress to be made with NA values in NumPy (similar to NAs in R), but it will be at least 6 months to a year before NumPy gets these features, it seems:

http://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na

(This feature has been added beginning with version 0.24 of pandas, but note it requires the use of extension dtype Int64 (capitalized), rather than the default dtype int64 (lower case): https://pandas.pydata.org/pandas-docs/version/0.24/whatsnew/v0.24.0.html#optional-integer-na-support )


Answer 2

If performance is not the main issue, you can store strings instead.

df.col = df.col.dropna().apply(lambda x: str(int(x)) )

Then you can mix them with NaN as much as you want. If you really want to have integers, depending on your application, you can use -1, or 0, or 1234567890, or some other dedicated value to represent NaN.

You can also temporarily duplicate the columns: one as you have it, with floats; the other one experimental, with ints or strings. Then insert asserts in every reasonable place to check that the two are in sync. After enough testing you can let go of the floats.


Answer 3

This is not a solution for all cases, but in mine (genomic coordinates) I’ve resorted to using 0 in place of NaN:

a3['MapInfo'] = a3['MapInfo'].fillna(0).astype(int)

This at least allows the proper ‘native’ column type to be used, so operations like subtraction, comparison, etc. work as expected.


Answer 4

Pandas v0.24+

Functionality to support NaN in integer series will be available in v0.24 upwards. There’s information on this in the v0.24 “What’s New” section, and more details under Nullable Integer Data Type.

Pandas v0.23 and earlier

In general, it’s best to work with float series where possible, even when the series is upcast from int to float due to inclusion of NaN values. This enables vectorised NumPy-based calculations where, otherwise, Python-level loops would be processed.

The docs do suggest : “One possibility is to use dtype=object arrays instead.” For example:

s = pd.Series([1, 2, 3, np.nan])

print(s.astype(object))

0      1
1      2
2      3
3    NaN
dtype: object

For cosmetic reasons, e.g. output to a file, this may be preferable.

Pandas v0.23 and earlier: background

NaN is considered a float. The docs currently (as of v0.23) specify the reason why integer series are upcasted to float:

In the absence of high performance NA support being built into NumPy from the ground up, the primary casualty is the ability to represent NAs in integer arrays.

This trade-off is made largely for memory and performance reasons, and also so that the resulting Series continues to be “numeric”.

The docs also provide rules for upcasting due to NaN inclusion:

Typeclass   Promotion dtype for storing NAs
floating    no change
object      no change
integer     cast to float64
boolean     cast to object

Answer 5

This is now possible since pandas v0.24.0.

From the pandas 0.24.x release notes: “Pandas has gained the ability to hold integer dtypes with missing values.”


Answer 6

Just wanted to add that if you are trying to convert a float vector (e.g. 1.143) that contains NA to integer (1), converting directly to the new 'Int64' dtype will give you an error. To solve this, round the numbers first and then call .astype('Int64'):

s1 = pd.Series([1.434, 2.343, np.nan])
#without round() the next line returns an error 
s1.astype('Int64')
#cannot safely cast non-equivalent float64 to int64
##with round() it works
s1.round().astype('Int64')
0      1
1      2
2    NaN
dtype: Int64

My use case is that I have a float series that I want to round to int, but when you do .round() a ‘*.0’ at the end of the number remains, so you can drop that 0 from the end by converting to int.


Answer 7

If there are blanks in the text data, columns that would normally be integers will be cast to float64, because the int64 dtype cannot handle nulls. This can cause an inconsistent schema if you are loading multiple files, some with blanks (which will end up as float64) and others without (which will end up as int64).

This code will attempt to convert any numeric columns to Int64 (as opposed to int64), since Int64 can handle nulls:

import pandas as pd
import numpy as np

#show datatypes before transformation
mydf.dtypes

for c in mydf.select_dtypes(np.number).columns:
    try:
        mydf[c] = mydf[c].astype('Int64')
        print('casted {} as Int64'.format(c))
    except:
        print('could not cast {} to Int64'.format(c))

#show datatypes after transformation
mydf.dtypes

Convert string to Enum in Python

Question: Convert string to Enum in Python

I wonder what’s the correct way of converting (deserializing) a string to a Python’s Enum class. Seems like getattr(YourEnumType, str) does the job, but I’m not sure if it’s safe enough.

Just to be more specific, I would like to convert a 'debug' string to an Enum object like this:

class BuildType(Enum):
    debug = 200
    release = 400

Answer 0

This functionality is already built in to Enum [1]:

>>> from enum import Enum
>>> class Build(Enum):
...   debug = 200
...   build = 400
... 
>>> Build['debug']
<Build.debug: 200>

[1] Official docs: Enum programmatic access
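
On the safety concern raised in the question: subscription only consults member names, while getattr can also return ordinary class attributes. A minimal sketch of a lookup helper follows; the function name build_type_from_name and its default parameter are hypothetical.

```python
from enum import Enum


class BuildType(Enum):
    debug = 200
    release = 400


def build_type_from_name(name, default=None):
    """Look up a BuildType by member name, returning `default` when it is unknown."""
    try:
        return BuildType[name]   # subscription consults member names only
    except KeyError:
        return default


print(build_type_from_name('debug'))        # BuildType.debug
print(build_type_from_name('__module__'))   # None; getattr would return a non-member attribute here
```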


Answer 1

Another alternative (especially useful if your strings don’t map 1-1 to your enum cases) is to add a staticmethod to your Enum, e.g.:

class QuestionType(enum.Enum):
    MULTI_SELECT = "multi"
    SINGLE_SELECT = "single"

    @staticmethod
    def from_str(label):
        if label in ('single', 'singleSelect'):
            return QuestionType.SINGLE_SELECT
        elif label in ('multi', 'multiSelect'):
            return QuestionType.MULTI_SELECT
        else:
            raise NotImplementedError

Then you can do question_type = QuestionType.from_str('singleSelect')


Answer 2

def custom_enum(typename, items_dict):
    class_definition = """
from enum import Enum

class {}(Enum):
    {}""".format(typename, '\n    '.join(['{} = {}'.format(k, v) for k, v in items_dict.items()]))

    namespace = dict(__name__='enum_%s' % typename)
    exec(class_definition, namespace)
    result = namespace[typename]
    result._source = class_definition
    return result

MyEnum = custom_enum('MyEnum', {'a': 123, 'b': 321})
print(MyEnum.a, MyEnum.b)

Or you need to convert string to known Enum?

class MyEnum(Enum):
    a = 'aaa'
    b = 123

print(MyEnum('aaa'), MyEnum(123))

Or:

class BuildType(Enum):
    debug = 200
    release = 400

print(BuildType.__dict__['debug'])

print(eval('BuildType.debug'))
print(type(eval('BuildType.debug')))    
print(eval(BuildType.__name__ + '.debug'))  # for work with code refactoring

Answer 3

My Java-like solution to the problem. Hope it helps someone…

    from enum import Enum, auto


    class SignInMethod(Enum):
        EMAIL = auto()
        GOOGLE = auto()

        @staticmethod
        def value_of(value) -> Enum:
            for m, mm in SignInMethod.__members__.items():
                if m == value.upper():
                    return mm


    sim = SignInMethod.value_of('EMAIL')
    print("""TEST
    1). {0}
    2). {1}
    3). {2}
    """.format(sim, sim.name, isinstance(sim, SignInMethod)))

Answer 4

An improvement to the answer of @rogueleaderr :

class QuestionType(enum.Enum):
    MULTI_SELECT = "multi"
    SINGLE_SELECT = "single"

    @classmethod
    def from_str(cls, label):
        if label in ('single', 'singleSelect'):
            return cls.SINGLE_SELECT
        elif label in ('multi', 'multiSelect'):
            return cls.MULTI_SELECT
        else:
            raise NotImplementedError

Answer 5

I just want to notify this does not work in python 3.6

class MyEnum(Enum):
    a = 'aaa'
    b = 123

print(MyEnum('aaa'), MyEnum(123))

You will have to give the data as a tuple like this

MyEnum(('aaa',))

EDIT: This turns out to be false. Credits to a commenter for pointing out my mistake


Checking if a string can be converted to float in Python

Question: Checking if a string can be converted to float in Python

I’ve got some Python code that runs through a list of strings and converts them to integers or floating point numbers if possible. Doing this for integers is pretty easy

if element.isdigit():
  newelement = int(element)

Floating point numbers are more difficult. Right now I’m using partition('.') to split the string and checking to make sure that one or both sides are digits.

partition = element.partition('.')
if ((partition[0].isdigit() and partition[1] == '.' and partition[2].isdigit())
    or (partition[0] == '' and partition[1] == '.' and partition[2].isdigit())
    or (partition[0].isdigit() and partition[1] == '.' and partition[2] == '')):
  newelement = float(element)

This works, but obviously the if statement for that is a bit of a bear. The other solution I considered is to just wrap the conversion in a try/catch block and see if it succeeds, as described in this question.

Anyone have any other ideas? Opinions on the relative merits of the partition and try/catch approaches?


Answer 0

I would just use..

try:
    float(element)
except ValueError:
    print("Not a float")

..it’s simple, and it works.

Another option would be a regular expression:

import re
if re.match(r'^-?\d+(?:\.\d+)?$', element) is None:
    print("Not float")
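
The two checks above do not accept the same set of strings. Here is a minimal sketch that compares them; the helper names accepted_by_float and accepted_by_regex are made up for illustration and are not part of the answer:

```python
import re

# Same simple pattern as in the regular-expression option above
SIMPLE_FLOAT = re.compile(r'^-?\d+(?:\.\d+)?$')

def accepted_by_float(element):
    """True if float() can parse the string."""
    try:
        float(element)
        return True
    except ValueError:
        return False

def accepted_by_regex(element):
    """True if the simple pattern matches the string."""
    return SIMPLE_FLOAT.match(element) is not None

# Strings that float() parses but the simple pattern rejects
for text in ["1e5", ".5", "+3", "inf"]:
    print(text, accepted_by_float(text), accepted_by_regex(text))
```

Which behaviour you want depends on the input: the pattern is stricter, while float() accepts everything Python itself treats as a float literal string.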

Answer 1

Python method to check for float:

def isfloat(value):
  try:
    float(value)
    return True
  except ValueError:
    return False

Don’t get bit by the goblins hiding in the float boat! DO UNIT TESTING!

What is, and is not a float may surprise you:

Command to parse                        Is it a float?  Comment
--------------------------------------  --------------- ------------
print(isfloat(""))                      False
print(isfloat("1234567"))               True 
print(isfloat("NaN"))                   True            nan is also float
print(isfloat("NaNananana BATMAN"))     False
print(isfloat("123.456"))               True
print(isfloat("123.E4"))                True
print(isfloat(".1"))                    True
print(isfloat("1,234"))                 False
print(isfloat("NULL"))                  False           case insensitive
print(isfloat(",1"))                    False           
print(isfloat("123.EE4"))               False           
print(isfloat("6.523537535629999e-07")) True
print(isfloat("6e777777"))              True            This is same as Inf
print(isfloat("-iNF"))                  True
print(isfloat("1.797693e+308"))         True
print(isfloat("infinity"))              True
print(isfloat("infinity and BEYOND"))   False
print(isfloat("12.34.56"))              False           Two dots not allowed.
print(isfloat("#56"))                   False
print(isfloat("56%"))                   False
print(isfloat("0E0"))                   True
print(isfloat("x86E0"))                 False
print(isfloat("86-5"))                  False
print(isfloat("True"))                  False           Boolean is not a float.   
print(isfloat(True))                    True            Boolean is a float
print(isfloat("+1e1^5"))                False
print(isfloat("+1e1"))                  True
print(isfloat("+1e1.3"))                False
print(isfloat("+1.3P1"))                False
print(isfloat("-+1"))                   False
print(isfloat("(1)"))                   False           brackets not interpreted
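
As a rough sketch of what "do unit testing" could look like here, a few rows of the table can be turned into test cases. The class name IsFloatTests and the choice of rows are assumptions made for this example:

```python
import unittest

def isfloat(value):
    try:
        float(value)
        return True
    except ValueError:
        return False

class IsFloatTests(unittest.TestCase):
    def test_accepted(self):
        # Rows marked True in the table above
        for text in ["1234567", "NaN", "123.E4", ".1", "-iNF", "infinity", "+1e1"]:
            self.assertTrue(isfloat(text), text)

    def test_rejected(self):
        # Rows marked False in the table above
        for text in ["", "1,234", "12.34.56", "56%", "-+1", "(1)"]:
            self.assertFalse(isfloat(text), text)

# Run the suite directly so the result object can be inspected
suite = unittest.defaultTestLoader.loadTestsFromTestCase(IsFloatTests)
result = unittest.TextTestRunner().run(suite)
```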

Answer 2

'1.43'.replace('.','',1).isdigit()

will return True only if there is at most one '.' in the string of digits.

'1.4.3'.replace('.','',1).isdigit()

will return False

'1.ww'.replace('.','',1).isdigit()

will return False
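
One thing to keep in mind: this check only covers plain unsigned decimals. A small sketch (the wrapper name looks_like_plain_decimal is hypothetical) showing inputs where it disagrees with float():

```python
def looks_like_plain_decimal(text):
    # Hypothetical wrapper around the one-liner from this answer
    return text.replace('.', '', 1).isdigit()

def parses_as_float(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

# Signed numbers and exponents parse with float() but fail the isdigit() check
for text in ["1.43", "-1.43", "1e3", "1.4.3"]:
    print(repr(text), looks_like_plain_decimal(text), parses_as_float(text))
```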


Answer 3

TL;DR:

  • If your input is mostly strings that can be converted to floats, the try: except: method is the best native Python method.
  • If your input is mostly strings that cannot be converted to floats, regular expressions or the partition method will be better.
  • If you are 1) unsure of your input or need more speed and 2) don’t mind and can install a third-party C-extension, fastnumbers works very well.

There is another method available via a third-party module called fastnumbers (disclosure, I am the author); it provides a function called isfloat. I have taken the unittest example outlined by Jacob Gabrielson in this answer, but added the fastnumbers.isfloat method. I should also note that Jacob’s example did not do justice to the regex option because most of the time in that example was spent in global lookups because of the dot operator… I have modified that function to give a fairer comparison to try: except:.


def is_float_try(str):
    try:
        float(str)
        return True
    except ValueError:
        return False

import re
_float_regexp = re.compile(r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$").match
def is_float_re(str):
    return True if _float_regexp(str) else False

def is_float_partition(element):
    partition=element.partition('.')
    if (partition[0].isdigit() and partition[1]=='.' and partition[2].isdigit()) or (partition[0]=='' and partition[1]=='.' and partition[2].isdigit()) or (partition[0].isdigit() and partition[1]=='.' and partition[2]==''):
        return True
    else:
        return False

from fastnumbers import isfloat


if __name__ == '__main__':
    import unittest
    import timeit

    class ConvertTests(unittest.TestCase):

        def test_re_perf(self):
            print
            print 're sad:', timeit.Timer('ttest.is_float_re("12.2x")', "import ttest").timeit()
            print 're happy:', timeit.Timer('ttest.is_float_re("12.2")', "import ttest").timeit()

        def test_try_perf(self):
            print
            print 'try sad:', timeit.Timer('ttest.is_float_try("12.2x")', "import ttest").timeit()
            print 'try happy:', timeit.Timer('ttest.is_float_try("12.2")', "import ttest").timeit()

        def test_fn_perf(self):
            print
            print 'fn sad:', timeit.Timer('ttest.isfloat("12.2x")', "import ttest").timeit()
            print 'fn happy:', timeit.Timer('ttest.isfloat("12.2")', "import ttest").timeit()


        def test_part_perf(self):
            print
            print 'part sad:', timeit.Timer('ttest.is_float_partition("12.2x")', "import ttest").timeit()
            print 'part happy:', timeit.Timer('ttest.is_float_partition("12.2")', "import ttest").timeit()

    unittest.main()

On my machine, the output is:

fn sad: 0.220988988876
fn happy: 0.212214946747
.
part sad: 1.2219619751
part happy: 0.754667043686
.
re sad: 1.50515985489
re happy: 1.01107215881
.
try sad: 2.40243887901
try happy: 0.425730228424
.
----------------------------------------------------------------------
Ran 4 tests in 7.761s

OK

As you can see, regex is actually not as bad as it originally seemed, and if you have a real need for speed, the fastnumbers method is quite good.
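
The benchmark above is written for Python 2. For anyone who wants to repeat the comparison on Python 3, here is a rough sketch limited to the two standard-library approaches (fastnumbers is left out so the example has no third-party dependency). The function name time_checks and the iteration count are assumptions, and the numbers will differ by machine and Python version:

```python
import re
import timeit

_float_match = re.compile(
    r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$"
).match

def is_float_try(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def is_float_re(text):
    return _float_match(text) is not None

def time_checks(number=100_000):
    """Return {(name, case): seconds} for the 'sad' and 'happy' inputs."""
    results = {}
    for name, func in [("try", is_float_try), ("re", is_float_re)]:
        for case, text in [("sad", "12.2x"), ("happy", "12.2")]:
            results[(name, case)] = timeit.timeit(lambda: func(text), number=number)
    return results

if __name__ == "__main__":
    for key, seconds in time_checks().items():
        print(key, seconds)
```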


Answer 4

If you cared about performance (and I’m not suggesting you should), the try-based approach is the clear winner (compared with your partition-based approach or the regexp approach), as long as you don’t expect a lot of invalid strings, in which case it’s potentially slower (presumably due to the cost of exception handling).

Again, I’m not suggesting you care about performance, just giving you the data in case you’re doing this 10 billion times a second, or something. Also, the partition-based code doesn’t handle at least one valid string.

$ ./floatstr.py
F..
partition sad: 3.1102449894
partition happy: 2.09208488464
..
re sad: 7.76906108856
re happy: 7.09421992302
..
try sad: 12.1525540352
try happy: 1.44165301323
.
======================================================================
FAIL: test_partition (__main__.ConvertTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./floatstr.py", line 48, in test_partition
    self.failUnless(is_float_partition("20e2"))
AssertionError

----------------------------------------------------------------------
Ran 8 tests in 33.670s

FAILED (failures=1)

Here’s the code (Python 2.6, regexp taken from John Gietzen’s answer):

def is_float_try(str):
    try:
        float(str)
        return True
    except ValueError:
        return False

import re
_float_regexp = re.compile(r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$")
def is_float_re(str):
    return re.match(_float_regexp, str)


def is_float_partition(element):
    partition=element.partition('.')
    if (partition[0].isdigit() and partition[1]=='.' and partition[2].isdigit()) or (partition[0]=='' and partition[1]=='.' and partition[2].isdigit()) or (partition[0].isdigit() and partition[1]=='.' and partition[2]==''):
        return True

if __name__ == '__main__':
    import unittest
    import timeit

    class ConvertTests(unittest.TestCase):
        def test_re(self):
            self.failUnless(is_float_re("20e2"))

        def test_try(self):
            self.failUnless(is_float_try("20e2"))

        def test_re_perf(self):
            print
            print 're sad:', timeit.Timer('floatstr.is_float_re("12.2x")', "import floatstr").timeit()
            print 're happy:', timeit.Timer('floatstr.is_float_re("12.2")', "import floatstr").timeit()

        def test_try_perf(self):
            print
            print 'try sad:', timeit.Timer('floatstr.is_float_try("12.2x")', "import floatstr").timeit()
            print 'try happy:', timeit.Timer('floatstr.is_float_try("12.2")', "import floatstr").timeit()

        def test_partition_perf(self):
            print
            print 'partition sad:', timeit.Timer('floatstr.is_float_partition("12.2x")', "import floatstr").timeit()
            print 'partition happy:', timeit.Timer('floatstr.is_float_partition("12.2")', "import floatstr").timeit()

        def test_partition(self):
            self.failUnless(is_float_partition("20e2"))

        def test_partition2(self):
            self.failUnless(is_float_partition(".2"))

        def test_partition3(self):
            self.failIf(is_float_partition("1234x.2"))

    unittest.main()

Answer 5

Just for variety here is another method to do it.

>>> all([i.isnumeric() for i in '1.2'.split('.',1)])
True
>>> all([i.isnumeric() for i in '2'.split('.',1)])
True
>>> all([i.isnumeric() for i in '2.f'.split('.',1)])
False

Edit: I’m sure it will not hold up for all float cases, though, especially when there is an exponent. To handle that, it looks like this. This returns True only if val is a float and False for an int, but it is probably less performant than a regex.

>>> def isfloat(val):
...     return all(i.isnumeric() or i in ('.', 'e') for i in val) and len(val.split('.')) == 2
...
>>> isfloat('1')
False
>>> isfloat('1.2')
True
>>> isfloat('1.2e3')
True
>>> isfloat('12e3')
False

Answer 6

This regex will check for scientific floating point numbers:

^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$

However, I believe that your best bet is to use the parser in a try.
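
As an illustration of the difference between the two, here is a sketch that applies the regex from this answer in Python next to a plain try around float(). The names matches_sci_float and parse_or_none are made up for the example:

```python
import re

# The pattern from this answer, compiled once
SCI_FLOAT = re.compile(
    r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$"
)

def matches_sci_float(text):
    """True if the text matches the scientific-notation pattern."""
    return SCI_FLOAT.match(text) is not None

def parse_or_none(text):
    """Let the parser decide: return the float, or None if parsing fails."""
    try:
        return float(text)
    except ValueError:
        return None

for text in ["1.5e3", ".5", "-1", "nan", "abc"]:
    print(repr(text), matches_sci_float(text), parse_or_none(text))
```

The pattern rejects special values such as "nan" that float() accepts, which may or may not be what you want.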


Answer 7

If you don’t need to worry about scientific or other expressions of numbers and are only working with strings that could be numbers with or without a period:

Function

def is_float(s):
    result = False
    if s.count(".") == 1:
        if s.replace(".", "").isdigit():
            result = True
    return result

Lambda version

is_float = lambda x: x.replace('.','',1).isdigit() and "." in x

Example

if is_float(some_string):
    some_string = float(some_string)
elif some_string.isdigit():
    some_string = int(some_string)
else:
    print("Does not convert to int or float.")

This way you aren’t accidentally converting what should be an int, into a float.


Answer 8

Simplified version of the function is_digit(str), which suffices in most cases (it doesn’t consider exponential notation or the “NaN” value):

def is_digit(str):
    # Strip leading minus signs and at most one decimal point before testing
    return str.lstrip('-').replace('.', '', 1).isdigit()

Answer 9

I used the function already mentioned, but soon noticed that strings such as “Nan”, “Inf” and their variations are considered numbers. So I propose an improved version of the function that returns False for that type of input and does not fail on “1e3” variants:

def is_float(text):
    # check for nan/infinity etc.
    if text.isalpha():
        return False
    try:
        float(text)
        return True
    except ValueError:
        return False
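
Note that text.isalpha() only catches the unsigned spellings: a string such as "-inf" contains a non-alphabetic character, so it still reaches float() and is accepted. If the goal is to reject every non-finite value, one possible variation (a sketch, not the author's code; the name is_finite_float is an assumption) is to test the parsed result with math.isfinite:

```python
import math

def is_finite_float(text):
    """True only if text parses as a float that is neither NaN nor infinite."""
    try:
        return math.isfinite(float(text))
    except ValueError:
        return False

for text in ["1e3", "nan", "-inf", "+nan", "abc"]:
    print(repr(text), is_finite_float(text))
```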

Answer 10

Try to convert to float. If there is an error, print the ValueError exception.

try:
    x = float('1.23')
    print('val=',x)
    y = float('abc')
    print('val=',y)
except ValueError as err:
    print('floatErr:',err)

Output:

val= 1.23
floatErr: could not convert string to float: 'abc'

Answer 11

Given a dictionary as its argument, this converts the string values that can be converted to float and leaves the others unchanged:

def covertDict_float(data):
    for i in data:
        if data[i].split(".")[0].isdigit():
            try:
                data[i] = float(data[i])
            except ValueError:
                # Not convertible after all; keep the original string
                continue
    return data

Answer 12

I was looking for some similar code, but it looks like using try/excepts is the best way. Here is the code I’m using. It includes a retry function if the input is invalid. I needed to check if the input was greater than 0 and if so convert it to a float.

def cleanInput(question,retry=False): 
    inputValue = input("\n\nOnly positive numbers can be entered, please re-enter the value.\n\n{}".format(question)) if retry else input(question)
    try:
        if float(inputValue) <= 0 : raise ValueError()
        else : return(float(inputValue))
    except ValueError : return(cleanInput(question,retry=True))


willbefloat = cleanInput("Give me the number: ")

Answer 13

def try_parse_float(item):
  result = None
  try:
    float(item)
  except:
    pass
  else:
    result = float(item)
  return result

Answer 14

I tried some of the above simple options, using a try test around converting to a float, and found that there is a problem in most of the replies.

Simple test (along the lines of above answers):

entry = ttk.Entry(self, validate='key')
entry['validatecommand'] = (entry.register(_test_num), '%P')

def _test_num(P):
    try: 
        float(P)
        return True
    except ValueError:
        return False

The problem comes when:

  • You enter ‘-‘ to start a negative number:

You are then trying float('-') which fails

  • You enter a number, but then try to delete all the digits

You are then trying float('') which likewise also fails

The quick solution I had is:

def _test_num(P):
    if P == '' or P == '-': return True
    try: 
        float(P)
        return True
    except ValueError:
        return False

Answer 15

str(strval).isdigit()

seems to be simple.

It accepts values stored as a string, int or float, but it only returns True for non-negative whole numbers: a value such as 1.5 or -3 gives False.


How can I convert a character to an integer in Python, and vice versa?

Question: How can I convert a character to an integer in Python, and vice versa?

I want to get, given a character, its ASCII value.

For example, for the character a, I want to get 97, and vice versa.


Answer 0

Use chr() and ord():

>>> chr(97)
'a'
>>> ord('a')
97
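
A short sketch of the round trip in Python 3, where chr() and ord() work on any Unicode code point and not just ASCII. The helper shift_letter is a made-up example of using the two together:

```python
def shift_letter(ch, offset):
    """Shift a lowercase ASCII letter by offset positions, wrapping around."""
    return chr((ord(ch) - ord('a') + offset) % 26 + ord('a'))

print(ord('a'))              # 97
print(chr(97))               # a
print(ord('€'))              # 8364
print(shift_letter('z', 1))  # a
```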

Answer 1

>>> ord('a')
97
>>> chr(97)
'a'

Answer 2

ord and chr


Convert a Unicode string to a string in Python (containing extra symbols)

Question: Convert a Unicode string to a string in Python (containing extra symbols)

How do you convert a Unicode string (containing extra characters like £ $, etc.) into a Python string?


Answer 0

title = u"Klüft skräms inför på fédéral électoral große"
import unicodedata
unicodedata.normalize('NFKD', title).encode('ascii','ignore')
'Kluft skrams infor pa federal electoral groe'

Answer 1

You can use encode to ASCII if you don’t need to translate the non-ASCII characters:

>>> a=u"aaaàçççñññ"
>>> type(a)
<type 'unicode'>
>>> a.encode('ascii','ignore')
'aaa'
>>> a.encode('ascii','replace')
'aaa???????'
>>>

Answer 2

>>> text=u'abcd'
>>> str(text)
'abcd'

If the string only contains ascii characters.


Answer 3

If you have a Unicode string, and you want to write this to a file, or other serialised form, you must first encode it into a particular representation that can be stored. There are several common Unicode encodings, such as UTF-16 (uses two bytes for most Unicode characters) or UTF-8 (1-4 bytes / codepoint depending on the character), etc. To convert that string into a particular encoding, you can use:

>>> s= u'£10'
>>> s.encode('utf8')
'\xc2\xa310'
>>> s.encode('utf16')
'\xff\xfe\xa3\x001\x000\x00'

This raw string of bytes can be written to a file. However, note that when reading it back, you must know what encoding it is in and decode it using that same encoding.

When writing to files, you can get rid of this manual encode/decode process by using the codecs module. So, to open a file that encodes all Unicode strings into UTF-8, use:

import codecs
f = codecs.open('path/to/file.txt','w','utf8')
f.write(my_unicode_string)  # Stored on disk as UTF-8

Do note that anything else that is using these files must understand what encoding the file is in if they want to read them. If you are the only one doing the reading/writing this isn’t a problem, otherwise make sure that you write in a form understandable by whatever else uses the files.

In Python 3, this form of file access is the default, and the built-in open function will take an encoding parameter and always translate to/from Unicode strings (the default string object in Python 3) for files opened in text mode.
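
For illustration, a minimal Python 3 sketch of that default behaviour, using the built-in open with an explicit encoding. The temporary directory and the file name example.txt are assumptions for the example:

```python
import os
import tempfile

text = "£10"

# Write the string with an explicit encoding; open() encodes it for us
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Binary mode shows the bytes as stored on disk
with open(path, "rb") as f:
    raw = f.read()

# Text mode with the same encoding decodes back to a str
with open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()

print(raw)                 # b'\xc2\xa310'
print(round_trip == text)  # True
```

Passing the encoding explicitly avoids depending on the platform default, which is not the same everywhere.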


Answer 4

Here is an example:

>>> u = u'€€€'
>>> s = u.encode('utf8')
>>> s
'\xe2\x82\xac\xe2\x82\xac\xe2\x82\xac'

Answer 5

Well, if you’re willing/ready to switch to Python 3 (which you may not be due to the backwards incompatibility with some Python 2 code), you don’t have to do any converting; all text in Python 3 is represented with Unicode strings, which also means that there’s no more usage of the u'<text>' syntax. You also have what are, in effect, strings of bytes, which are used to represent data (which may be an encoded string).

http://docs.python.org/3.1/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit

(Of course, if you’re currently using Python 3, then the problem is likely something to do with how you’re attempting to save the text to a file.)


Answer 6

Here is some example code:

import unicodedata    
raw_text = u"here $%6757 dfgdfg"
convert_text = unicodedata.normalize('NFKD', raw_text).encode('ascii','ignore')

Answer 7

The file contains a unicode-escaped string:

\"message\": \"\\u0410\\u0432\\u0442\\u043e\\u0437\\u0430\\u0446\\u0438\\u044f .....\",

This worked for me:

 f = open("56ad62-json.log", encoding="utf-8")
 qq=f.readline() 

 print(qq)                          
 {"log":\"message\": \"\\u0410\\u0432\\u0442\\u043e\\u0440\\u0438\\u0437\\u0430\\u0446\\u0438\\u044f \\u043f\\u043e\\u043b\\u044c\\u0437\\u043e\\u0432\\u0430\\u0442\\u0435\\u043b\\u044f\"}

(qq.encode().decode("unicode-escape").encode().decode("unicode-escape")) 
# '{"log":"message": "Авторизация пользователя"}\n'

Answer 8

No answer worked for my case, where I had a string variable containing unicode chars, and none of the encode/decode approaches explained here did the job.

If I do in a Terminal

echo "no me llama mucho la atenci\u00f3n"

or

python3
>>> print("no me llama mucho la atenci\u00f3n")

The output is correct:

output: no me llama mucho la atención

But working with scripts loading this string variable didn’t work.

This is what worked in my case, in case it helps anybody:

string_to_convert = "no me llama mucho la atenci\u00f3n"
print(json.dumps(json.loads(r'"%s"' % string_to_convert), ensure_ascii=False))
output: no me llama mucho la atención

How do I parse a string to a float or int?

Question: How do I parse a string to a float or int?

In Python, how can I parse a numeric string like "545.2222" to its corresponding float value, 545.2222? Or parse the string "31" to an integer, 31?

I just want to know how to parse a float str to a float, and (separately) an int str to an int.


Answer 0

>>> a = "545.2222"
>>> float(a)
545.2222
>>> int(float(a))
545
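
To make the difference between the two calls explicit, here is a small sketch (the variable name text is just for illustration). It shows that int() alone rejects a string with a decimal point, and that going through float() truncates toward zero rather than rounding.

```python
text = "545.2222"

# int() does not accept a string that contains a decimal point
try:
    int(text)
except ValueError as exc:
    print("int() failed:", exc)

# Converting through float() first works, but truncates toward zero
print(int(float(text)))       # 545
print(int(float("-545.9")))   # -545
```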

Answer 1

def num(s):
    try:
        return int(s)
    except ValueError:
        return float(s)
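
A quick way to see what this helper returns for a few inputs. The sample strings are my own, not from the answer.

```python
def num(s):
    try:
        return int(s)
    except ValueError:
        return float(s)

# Print the input, the parsed value and the resulting type
for text in ["31", "545.2222", "1e3", " 42 "]:
    value = num(text)
    print(repr(text), "->", repr(value), type(value).__name__)
```

Note that a string that is neither an int nor a float (for example "abc") still raises ValueError, this time from float().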

Answer 2

Python method to check if a string is a float:

def is_float(value):
  try:
    float(value)
    return True
  except:
    return False

A longer and more accurate name for this function could be: is_convertible_to_float(value)

What is, and is not a float in Python may surprise you:

val                   is_float(val) Note
--------------------  ----------   --------------------------------
""                    False        Blank string
"127"                 True         Passed string
True                  True         Pure sweet Truth
"True"                False        Vile contemptible lie
False                 True         So false it becomes true
"123.456"             True         Decimal
"      -127    "      True         Spaces trimmed
"\t\n12\r\n"          True         whitespace ignored
"NaN"                 True         Not a number
"NaNanananaBATMAN"    False        I am Batman
"-iNF"                True         Negative infinity
"123.E4"              True         Exponential notation
".1"                  True         mantissa only
"1,234"               False        Commas gtfo
u'\x30'               True         Unicode is fine.
"NULL"                False        Null is not special
0x3fade               True         Hexadecimal
"6e7777777777777"     True         Shrunk to infinity
"1.797693e+308"       True         This is max value
"infinity"            True         Same as inf
"infinityandBEYOND"   False        Extra characters wreck it
"12.34.56"            False        Only one dot allowed
u'四'                 False        Japanese '4' is not a float.
"#56"                 False        Pound sign
"56%"                 False        Percent of what?
"0E0"                 True         Exponential, move dot 0 places
0**0                  True         0___0  Exponentiation
"-5e-5"               True         Raise to a negative number
"+1e1"                True         Plus is OK with exponent
"+1e1^5"              False        Fancy exponent not interpreted
"+1e1.3"              False        No decimals in exponent
"-+1"                 False        Make up your mind
"(1)"                 False        Parenthesis is bad

You think you know what numbers are? You are not as good as you think! No big surprise.

Don’t use this code on life-critical software!

Catching broad exceptions this way, killing canaries and gobbling the exception, creates a tiny chance that a valid float given as a string will return False. The float(...) line of code can fail for any of a thousand reasons that have nothing to do with the contents of the string. But if you’re writing life-critical software in a duck-typed prototyping language like Python, then you’ve got much larger problems.
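
If the bare except is a concern, one possible variant catches only the exceptions that float() is documented to raise for bad input. This is a sketch, not the answer's code. It uses the longer name suggested above and assumes that TypeError, ValueError and OverflowError are the only failures you want to treat as "not a float".

```python
def is_convertible_to_float(value):
    """Return True if float(value) succeeds, catching only the expected errors."""
    try:
        float(value)
    except (TypeError, ValueError, OverflowError):
        # TypeError: unsupported type such as None
        # ValueError: text that is not a number
        # OverflowError: an int too large to fit in a float
        return False
    return True

for candidate in ["127", "  -127  ", "NaN", "1,234", "", None, 10 ** 400]:
    print(repr(candidate)[:12], is_convertible_to_float(candidate))
```

Anything else (KeyboardInterrupt, a bug inside a custom __float__ method, and so on) now propagates instead of silently turning into False.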


Answer 3

This is another method which deserves to be mentioned here, ast.literal_eval:

This can be used for safely evaluating strings containing Python expressions from untrusted sources without the need to parse the values oneself.

That is, a safe ‘eval’

>>> import ast
>>> ast.literal_eval("545.2222")
545.2222
>>> ast.literal_eval("31")
31
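
Keep in mind that ast.literal_eval accepts any Python literal, so a string such as "[1, 2]" or "'31'" also evaluates without error. If only numbers should get through, a thin wrapper can check the result type. The wrapper below is a sketch with a made-up name (parse_number). Malformed text may still raise SyntaxError or ValueError from literal_eval itself.

```python
import ast

def parse_number(text):
    """Parse text as an int or float literal and reject every other literal."""
    value = ast.literal_eval(text.strip())
    # bool is a subclass of int, so it has to be excluded explicitly
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("not a numeric literal: %r" % text)
    return value

print(parse_number("545.2222"))  # 545.2222
print(parse_number("31"))        # 31

try:
    parse_number("[1, 2]")
except ValueError as exc:
    print("rejected:", exc)
```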

Answer 4

float(x) if '.' in x else int(x)
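
Wrapped in a function (the name parse is only for illustration), the one-liner behaves as follows. One thing to be aware of: a float written in exponent notation has no '.', so it is sent to int() and fails.

```python
def parse(x):
    return float(x) if '.' in x else int(x)

print(parse("545.2222"))  # 545.2222
print(parse("31"))        # 31

# "1e3" is a valid float string, but it contains no '.', so int() is tried
try:
    parse("1e3")
except ValueError as exc:
    print("failed:", exc)
```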

Answer 5

Localization and commas

You should consider the possibility of commas in the string representation of a number, for cases like float("545,545.2222") which throws an exception. Instead, use methods in locale to convert the strings to numbers and interpret commas correctly. The locale.atof method converts to a float in one step once the locale has been set for the desired number convention.

Example 1 — United States number conventions

In the United States and the UK, commas can be used as a thousands separator. In this example with American locale, the comma is handled properly as a separator:

>>> import locale
>>> a = u'545,545.2222'
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
'en_US.UTF-8'
>>> locale.atof(a)
545545.2222
>>> int(locale.atof(a))
545545
>>>

Example 2 — European number conventions

In the majority of countries of the world, commas are used for decimal marks instead of periods. In this example with French locale, the comma is correctly handled as a decimal mark:

>>> import locale
>>> b = u'545,2222'
>>> locale.setlocale(locale.LC_ALL, 'fr_FR')
'fr_FR'
>>> locale.atof(b)
545.2222

The method locale.atoi is also available, but the argument should be an integer.
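
Because locale.setlocale changes the setting for the whole process, it can be worth restoring the previous locale after parsing. The following is a sketch of one way to do that. The names numeric_locale and parse_localized_float are hypothetical, only LC_NUMERIC is switched, and the locale you ask for must be installed on the system or locale.Error is raised.

```python
import locale
from contextlib import contextmanager

@contextmanager
def numeric_locale(name):
    """Temporarily switch LC_NUMERIC to the given locale, then restore it."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # no second argument: query only
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
        yield
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)

def parse_localized_float(text, locale_name):
    with numeric_locale(locale_name):
        return locale.atof(text)

try:
    print(parse_localized_float("545,545.2222", "en_US.UTF-8"))
except locale.Error:
    print("locale en_US.UTF-8 is not installed on this system")
```

This is not thread-safe, since the locale is still global while the block runs.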


Answer 6

If you aren’t averse to third-party modules, you could check out the fastnumbers module. It provides a function called fast_real that does exactly what this question is asking for and does it faster than a pure-Python implementation:

>>> from fastnumbers import fast_real
>>> fast_real("545.2222")
545.2222
>>> type(fast_real("545.2222"))
float
>>> fast_real("31")
31
>>> type(fast_real("31"))
int

Answer 7

Users codelogic and harley are correct, but keep in mind that if you know the string is an integer (for example, 545), you can call int("545") without first casting to float.

If your strings are in a list, you could use the map function as well.

>>> x = ["545.0", "545.6", "999.2"]
>>> list(map(float, x))
[545.0, 545.6, 999.2]

It only works well if they’re all the same type.


Answer 8

In Python, how can I parse a numeric string like "545.2222" to its corresponding float value, 545.2222? Or parse the string "31" to an integer, 31? I just want to know how to parse a float string to a float, and (separately) an int string to an int.

It’s good that you ask to do these separately. If you’re mixing them, you may be setting yourself up for problems later. The simple answer is:

"545.2222" to float:

>>> float("545.2222")
545.2222

"31" to an integer:

>>> int("31")
31

Other conversions, ints to and from strings and literals:

To convert from other bases, you should know the base in advance (10 is the default). Note that you can prefix the string with what Python expects for its literals (see below) or leave the prefix off:

>>> int("0b11111", 2)
31
>>> int("11111", 2)
31
>>> int('0o37', 8)
31
>>> int('37', 8)
31
>>> int('0x1f', 16)
31
>>> int('1f', 16)
31

If you don’t know the base in advance, but you do know they will have the correct prefix, Python can infer this for you if you pass 0 as the base:

>>> int("0b11111", 0)
31
>>> int('0o37', 0)
31
>>> int('0x1f', 0)
31
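
As a small sketch of the base-0 behaviour (parse_prefixed_int is a name I made up), the helper below accepts prefixed and plain decimal strings. It also shows that a bare leading zero is rejected under base 0 in Python 3, since it is not a valid literal there.

```python
def parse_prefixed_int(text):
    """Parse an integer whose base is given by a 0b/0o/0x prefix (decimal if none)."""
    return int(text.strip(), 0)

for text in ["0b11111", "0o37", "0x1f", "31"]:
    print(text, "->", parse_prefixed_int(text))

# With base 0 the string must be a valid Python integer literal,
# so a leading zero without a prefix is rejected
try:
    parse_prefixed_int("037")
except ValueError as exc:
    print("rejected:", exc)
```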

Non-Decimal (i.e. Integer) Literals from other Bases

If your motivation is to have your own code clearly represent hard-coded specific values, however, you may not need to convert from the bases – you can let Python do it for you automatically with the correct syntax.

You can use the appropriate prefixes to get automatic conversion to integers with the following literals. These are valid in Python 2.6+ and Python 3:

Binary, prefix 0b

>>> 0b11111
31

Octal, prefix 0o

>>> 0o37
31

Hexadecimal, prefix 0x

>>> 0x1f
31

This can be useful when describing binary flags, file permissions in code, or hex values for colors – for example, note no quotes:

>>> 0b10101 # binary flags
21
>>> 0o755 # read, write, execute perms for owner, read & ex for group & others
493
>>> 0xffffff # the color, white, max values for red, green, and blue
16777215

Making ambiguous Python 2 octals compatible with Python 3

If you see an integer that starts with a 0, in Python 2, this is (deprecated) octal syntax.

>>> 037
31

It is bad because it looks like the value should be 37. So in Python 3, it now raises a SyntaxError:

>>> 037
  File "<stdin>", line 1
    037
      ^
SyntaxError: invalid token

Convert your Python 2 octals to octals that work in both 2 and 3 with the 0o prefix:

>>> 0o37
31
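
For the opposite direction (ints back to strings in another base), the built-ins bin, oct, hex and format can be used. A short sketch, with 31 as the sample value to match the examples above:

```python
value = 31

# Built-ins that include the literal prefix
print(bin(value), oct(value), hex(value))   # 0b11111 0o37 0x1f

# format() without a prefix, or with one via '#'
print(format(value, "b"), format(value, "o"), format(value, "x"))     # 11111 37 1f
print(format(value, "#b"), format(value, "#o"), format(value, "#x"))  # 0b11111 0o37 0x1f

# Each prefixed string round-trips through int(..., 0)
for text in (bin(value), oct(value), hex(value)):
    assert int(text, 0) == value
```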

Answer 9

The question seems a little bit old. But let me suggest a function, parseStr, which does something similar: it returns an integer or a float, and if a given ASCII string cannot be converted to either of them, it returns the string untouched. The code can of course be adjusted to do only what you want:

   >>> import string
   >>> parseStr = lambda x: x.isalpha() and x or x.isdigit() and \
   ...                      int(x) or x.isalnum() and x or \
   ...                      len(set(string.punctuation).intersection(x)) == 1 and \
   ...                      x.count('.') == 1 and float(x) or x
   >>> parseStr('123')
   123
   >>> parseStr('123.3')
   123.3
   >>> parseStr('3HC1')
   '3HC1'
   >>> parseStr('12.e5')
   1200000.0
   >>> parseStr('12$5')
   '12$5'
   >>> parseStr('12.2.2')
   '12.2.2'
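
One caveat with the and/or chain above: a falsy result such as 0 or 0.0 falls through to the next branch, so parseStr('0') returns the string '0'. A try/except version avoids that. This is a sketch of an alternative (named parse_str here), not a drop-in equivalent, because float() also accepts strings like 'nan' and 'inf' that the original leaves as text.

```python
def parse_str(text):
    """Return an int or float when text is numeric, otherwise text unchanged."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return text

for sample in ['123', '123.3', '3HC1', '12.e5', '12$5', '12.2.2', '0']:
    print(repr(sample), '->', repr(parse_str(sample)))
```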

Answer 10

float("545.2222") and int(float("545.2222"))


Answer 11

I use this function for that

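import ast

def parse_str(s):
    # literal_eval raises ValueError or SyntaxError for text that is not a literal
    try:
        return ast.literal_eval(str(s))
    except (ValueError, SyntaxError):
        return None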

It will convert the string to its type

value = parse_str('1')  # Returns Integer
value = parse_str('1.5')  # Returns Float

Answer 12

The YAML parser can help you figure out what datatype your string is. Use yaml.safe_load(), and then you can use type(result) to test for type:

>>> import yaml

>>> a = "545.2222"
>>> result = yaml.safe_load(a)
>>> result
545.2222
>>> type(result)
<class 'float'>

>>> b = "31"
>>> result = yaml.safe_load(b)
>>> result
31
>>> type(result)
<class 'int'>

>>> c = "HI"
>>> result = yaml.safe_load(c)
>>> result
'HI'
>>> type(result)
<class 'str'>

Answer 13

def get_int_or_float(v):
    number_as_float = float(v)
    number_as_int = int(number_as_float)
    return number_as_int if number_as_float == number_as_int else number_as_float
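
One edge case worth knowing: float() accepts "inf" and "nan", but int() cannot convert those values (it raises OverflowError and ValueError respectively). A possible guard, sketched here with math.isfinite as my own addition:

```python
import math

def get_int_or_float(v):
    number_as_float = float(v)
    # int() cannot represent nan or infinity, so hand those back as floats
    if not math.isfinite(number_as_float):
        return number_as_float
    number_as_int = int(number_as_float)
    return number_as_int if number_as_float == number_as_int else number_as_float

for text in ["10.0", "10.5", "1e3", "inf"]:
    print(repr(text), "->", repr(get_int_or_float(text)))
```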

Answer 14

def num(s):
    """num(s)
    num(3),num(3.7)-->3
    num('3')-->3, num('3.7')-->3.7
    num('3,700')-->ValueError
    num('3a'),num('a3'),-->ValueError
    num('3e4') --> 30000.0
    """
    try:
        return int(s)
    except ValueError:
        try:
            return float(s)
        except ValueError:
            raise ValueError('argument is not a string of number')

Answer 15

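You need to take rounding into account to do this properly.

I.e. int(5.1) => 5 and int(5.6) => 5, which is wrong if you want the nearest integer (it should be 6), so we do int(5.6 + 0.5) => 6.

def convert(n):
    try:
        return int(n)
    except ValueError:
        # n is a string such as "5.6": convert it first, then round half up
        # (adding 0.5 only rounds correctly for non-negative values)
        return int(float(n) + 0.5)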
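
The built-in round() is another option and also handles negative values. A minimal sketch (to_nearest_int is a name chosen for illustration). Be aware that Python 3 rounds exact halves to the nearest even integer, so round(2.5) is 2.

```python
def to_nearest_int(text):
    """Parse numeric text and round it to the nearest integer."""
    # In Python 3, round() with a single argument returns an int
    return round(float(text))

print(to_nearest_int("5.1"))   # 5
print(to_nearest_int("5.6"))   # 6
print(to_nearest_int("-5.6"))  # -6
print(to_nearest_int("2.5"))   # 2 (ties go to the even neighbour)
```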

Answer 16

I am surprised nobody mentioned regex, because sometimes a string must be cleaned up and normalized before casting it to a number:

import re
def parseNumber(value, as_int=False):
    try:
        # Drop every character except digits, '.' and '-'
        number = float(re.sub(r'[^.\-\d]', '', value))
        if as_int:
            return int(number + 0.5)
        else:
            return number
    except ValueError:
        return float('nan')  # or None if you wish

Usage:

parseNumber('13,345')
> 13345.0

parseNumber('- 123 000')
> -123000.0

parseNumber('99999\n')
> 99999.0

And by the way, here is something to verify that you have a number:

import numbers
def is_number(value):
    return isinstance(value, numbers.Number)
    # works with int, float, complex, Fraction and Decimal (and with bool)

Answer 17

To typecast in Python, use the constructor function of the type, passing the string (or whatever value you are trying to cast) as a parameter.

For example:

>>> float("23.333")
23.333

For a string argument, float parses the text directly. For other objects, Python calls the object’s __float__ method behind the scenes, which should return a float representation of the object. This is especially powerful, as you can define your own types (using classes) with a __float__ method so that they can be cast to a float using float(myobject).
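
As an illustration of a user-defined type that supports float() and int(), here is a small sketch. The class name Celsius and its attribute are invented for this example.

```python
class Celsius:
    """Hypothetical example type that can present itself as a float or an int."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __float__(self):
        # Called by float(obj); must return a float
        return float(self.degrees)

    def __int__(self):
        # Called by int(obj); must return an int
        return int(self.degrees)

reading = Celsius(21.5)
print(float(reading))  # 21.5
print(int(reading))    # 21
```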


Answer 18

This is a corrected version of https://stackoverflow.com/a/33017514/5973334

This will try to parse a string and return either an int or a float, depending on what the string represents. It might raise parsing exceptions or have some unexpected behaviour.

def get_int_or_float(v):
    number_as_float = float(v)
    number_as_int = int(number_as_float)
    return number_as_int if number_as_float == number_as_int else number_as_float

Answer 19

Pass your string to this function:

def string_to_number(text):
    # The parameter is named text to avoid shadowing the built-in str
    if "." in text:
        try:
            res = float(text)
        except ValueError:
            res = text
    elif text.isdigit():
        res = int(text)
    else:
        res = text
    return res

It will return int, float or string depending on what was passed.

string that is an int

print(type(string_to_number("124")))
<class 'int'>

string that is a float

print(type(string_to_number("12.4")))
<class 'float'>

string that is a string

print(type(string_to_number("hello")))
<class 'str'>

string that looks like a float

print(type(string_to_number("hel.lo")))
<class 'str'>
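
Two inputs that may be surprising with this approach: isdigit() is False for a string with a sign, and exponent notation contains no '.', so both come back as strings. The sketch below repeats the function (with the parameter renamed to text so the built-in str is not shadowed) to show this.

```python
def string_to_number(text):
    if "." in text:
        try:
            res = float(text)
        except ValueError:
            res = text
    elif text.isdigit():
        res = int(text)
    else:
        res = text
    return res

print(repr(string_to_number("124")))    # 124
print(repr(string_to_number("-124")))   # '-124'
print(repr(string_to_number("-12.4")))  # -12.4
print(repr(string_to_number("1e3")))    # '1e3'
```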

Answer 20

Use:

def num(s):
    for each in s:
        # Try int first and fall back to float for each item separately
        try:
            yield int(each)
        except ValueError:
            yield float(each)

a = num(["123.55", "345", "44"])
print(next(a))
print(next(a))

This is the most Pythonic way I could come up with.


Answer 21

Handles hex, octal, binary, decimal, and float

This solution will handle all of the string conventions for numbers (all that I know about).

def to_number(n):
    ''' Convert any number representation to a number 
    This covers: float, decimal, hex, binary, and octal numbers.
    '''

    try:
        return int(str(n), 0)
    except ValueError:
        try:
            # python 3 doesn't accept "010" as a valid octal.  You must use the
            # '0o' prefix
            return int('0o' + n, 0)
        except (TypeError, ValueError):
            # TypeError: n is not a string (e.g. 3.14), so '0o' + n fails
            return float(n)

This test case output illustrates what I’m talking about.

======================== CAPTURED OUTPUT =========================
to_number(3735928559)   = 3735928559 == 3735928559
to_number("0xFEEDFACE") = 4277009102 == 4277009102
to_number("0x0")        =          0 ==          0
to_number(100)          =        100 ==        100
to_number("42")         =         42 ==         42
to_number(8)            =          8 ==          8
to_number("0o20")       =         16 ==         16
to_number("020")        =         16 ==         16
to_number(3.14)         =       3.14 ==       3.14
to_number("2.72")       =       2.72 ==       2.72
to_number("1e3")        =     1000.0 ==       1000
to_number(0.001)        =      0.001 ==      0.001
to_number("0xA")        =         10 ==         10
to_number("012")        =         10 ==         10
to_number("0o12")       =         10 ==         10
to_number("0b01010")    =         10 ==         10
to_number("10")         =         10 ==         10
to_number("10.0")       =       10.0 ==         10
to_number("1e1")        =       10.0 ==         10

Here is the test:

import unittest

class test_to_number(unittest.TestCase):

    def test_hex(self):
        # All of the following should be converted to an integer
        #
        values = [

                 #          HEX
                 # ----------------------
                 # Input     |   Expected
                 # ----------------------
                (0xDEADBEEF  , 3735928559), # Hex
                ("0xFEEDFACE", 4277009102), # Hex
                ("0x0"       ,          0), # Hex

                 #        Decimals
                 # ----------------------
                 # Input     |   Expected
                 # ----------------------
                (100         ,        100), # Decimal
                ("42"        ,         42), # Decimal
            ]



        values += [
                 #        Octals
                 # ----------------------
                 # Input     |   Expected
                 # ----------------------
                (0o10        ,          8), # Octal
                ("0o20"      ,         16), # Octal
                ("020"       ,         16), # Octal
            ]


        values += [
                 #        Floats
                 # ----------------------
                 # Input     |   Expected
                 # ----------------------
                (3.14        ,       3.14), # Float
                ("2.72"      ,       2.72), # Float
                ("1e3"       ,       1000), # Float
                (1e-3        ,      0.001), # Float
            ]

        values += [
                 #        All ints
                 # ----------------------
                 # Input     |   Expected
                 # ----------------------
                ("0xA"       ,         10), 
                ("012"       ,         10), 
                ("0o12"      ,         10), 
                ("0b01010"   ,         10), 
                ("10"        ,         10), 
                ("10.0"      ,         10), 
                ("1e1"       ,         10), 
            ]

        for _input, expected in values:
            value = to_number(_input)

            if isinstance(_input, str):
                cmd = 'to_number("{}")'.format(_input)
            else:
                cmd = 'to_number({})'.format(_input)

            print("{:23} = {:10} == {:10}".format(cmd, value, expected))
            self.assertEqual(value, expected)

Answer 22

Use:

>>> str_float = "545.2222"
>>> float(str_float)
545.2222
>>> type(_) # Check its type
<class 'float'>

>>> str_int = "31"
>>> int(str_int)
31
>>> type(_) # Check its type
<class 'int'>

Answer 23

This is a function which will convert any object (not just str) to int or float, based on whether the actual string supplied looks like an int or a float. Further, if it’s an object which has both __float__ and __int__ methods, it defaults to using __float__:

def conv_to_num(x, num_type='asis'):
    '''Converts an object to a number if possible.
    num_type: int, float, 'asis'
    Defaults to floating point in case of ambiguity.
    '''
    import numbers

    if isinstance(x, numbers.Number):
        # Already a number: return it unchanged.
        res = x
    elif isinstance(x, str):
        try:
            res = float(x)
            if '.' not in x:
                # Looks like an int; keep the float if int() rejects it (e.g. '1e5').
                try:
                    res = int(x)
                except ValueError:
                    pass
        except ValueError:
            # Not a numeric string: return it unchanged.
            res = x
    else:
        if num_type == 'asis':
            # float first, so objects defining both __float__ and __int__ use __float__.
            funcs = [float, int]
        else:
            funcs = [num_type]

        for func in funcs:
            try:
                res = func(x)
                break
            except TypeError:
                continue
        else:
            res = x

    return res

Answer 24

Using the built-in int() and float() functions, we can convert a string to an integer or a float.

s="45.8"
print(float(s))

y='67'
print(int(y))
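
One detail worth knowing when combining the two calls above: int() does not accept a string containing a decimal point, so a value like "45.8" has to pass through float() first. A minimal sketch of that fallback follows; the helper name to_int is made up for illustration and is not part of the original answer.

```python
def to_int(s):
    """Convert a numeric string to int, truncating any fractional part."""
    try:
        return int(s)          # works for "67", raises ValueError for "45.8"
    except ValueError:
        return int(float(s))   # "45.8" -> 45.8 -> 45 (truncates toward zero)

print(to_int("67"))
print(to_int("45.8"))
```

A string that is not numeric at all still raises ValueError from the float() call, so callers can handle bad input in one place.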

Answer 25

eval() is a very good solution to this question. It doesn’t need to check if the number is int or float, it just gives the corresponding equivalent. If other methods are required, try

if '.' in string:
    print(float(string))
else:
    print(int(string))

try-except can also be used as an alternative. Try converting the string to int inside the try block. If the string is a float value, int() raises a ValueError, which is caught in the except block, like this

try:
    print(int(string))
except ValueError:
    print(float(string))

Answer 26

Here’s another interpretation of your question (hint: it’s vague). It’s possible you’re looking for something like this:

def parseIntOrFloat( aString ):
    return eval( aString )

It works like this…

>>> parseIntOrFloat("545.2222")
545.22220000000004
>>> parseIntOrFloat("545")
545

Theoretically, there’s an injection vulnerability. Because eval() evaluates any expression, the string could, for example, be "__import__('os').abort()". Without any background on where the string comes from, however, the possibility is theoretical speculation. Since the question is vague, it’s not at all clear whether this vulnerability actually exists.
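
If the string might come from an untrusted source, one commonly used alternative is ast.literal_eval from the standard library, which parses literals without executing code. The sketch below is an assumption about how it could be applied here; the function name parse_number and the type check are illustrative, not part of the original answer.

```python
import ast

def parse_number(text):
    """Parse text as an int or float literal without executing any code."""
    value = ast.literal_eval(text.strip())
    # literal_eval also accepts strings, lists, booleans, etc., so check the type.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("not a number: %r" % (text,))
    return value

print(parse_number("545"))
print(parse_number("545.2222"))
```

An input such as "__import__('os').abort()" is rejected with ValueError instead of being run.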


How do I check if a string is a number (float)?

Question: How do I check if a string is a number (float)?

What is the best possible way to check if a string can be represented as a number in Python?

The function I currently have right now is:

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

Which, not only is ugly and slow, seems clunky. However I haven’t found a better method because calling float in the main function is even worse.


Answer 0

Which, not only is ugly and slow

I’d dispute both.

A regex or other string parsing method would be uglier and slower.

I’m not sure that anything much could be faster than the above. It calls the function and returns. Try/Catch doesn’t introduce much overhead because the most common exception is caught without an extensive search of stack frames.

The issue is that any numeric conversion function has two kinds of results

  • A number, if the number is valid
  • A status code (e.g., via errno) or exception to show that no valid number could be parsed.

C (as an example) hacks around this a number of ways. Python lays it out clearly and explicitly.

I think your code for doing this is perfect.


Answer 1

In case you are looking for parsing (positive, unsigned) integers instead of floats, you can use the isdigit() function for string objects.

>>> a = "03523"
>>> a.isdigit()
True
>>> b = "963spam"
>>> b.isdigit()
False

String Methods – isdigit(): Python2, Python3

There is also a similar method for Unicode strings, which I’m not too familiar with: Unicode – Is decimal/decimal
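
For readers on Python 3, where every str is Unicode, the three related string methods differ on non-ASCII characters. The sketch below compares isdecimal(), isdigit() and isnumeric() against what int() actually accepts; the sample characters and the helper can_int are chosen here for illustration.

```python
def can_int(s):
    """True if int() accepts the string."""
    try:
        int(s)
        return True
    except ValueError:
        return False

# "\u00b2" is SUPERSCRIPT TWO, "\u00bd" is VULGAR FRACTION ONE HALF,
# "\u0663" is ARABIC-INDIC DIGIT THREE.
samples = ["03523", "963spam", "\u00b2", "\u00bd", "\u0663"]

for s in samples:
    print(repr(s), s.isdecimal(), s.isdigit(), s.isnumeric(), can_int(s))
```

In this comparison isdecimal() is the method that lines up with int(): isdigit() also accepts superscript digits and isnumeric() also accepts fraction characters, neither of which int() can parse.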


Answer 2

TL;DR The best solution is s.replace('.','',1).isdigit()

I did some benchmarks comparing the different approaches

def is_number_tryexcept(s):
    """ Returns True if string is a number. """
    try:
        float(s)
        return True
    except ValueError:
        return False

import re
def is_number_regex(s):
    """ Returns True if string is a number. """
    # Raw string so that \d and \. are passed to the regex engine unchanged.
    if re.match(r"^\d+?\.\d+?$", s) is None:
        return s.isdigit()
    return True


def is_number_repl_isdigit(s):
    """ Returns True if string is a number. """
    return s.replace('.','',1).isdigit()

If the string is not a number, the except-block is quite slow. But more importantly, the try-except method is the only approach that handles scientific notations correctly.

funcs = [
          is_number_tryexcept, 
          is_number_regex,
          is_number_repl_isdigit
          ]

a_float = '.1234'

print('Float notation ".1234" is not supported by:')
for f in funcs:
    if not f(a_float):
        print('\t -', f.__name__)

Float notation “.1234” is not supported by:
– is_number_regex

scientific1 = '1.000000e+50'
scientific2 = '1e50'


print('Scientific notation "1.000000e+50" is not supported by:')
for f in funcs:
    if not f(scientific1):
        print('\t -', f.__name__)




print('Scientific notation "1e50" is not supported by:')
for f in funcs:
    if not f(scientific2):
        print('\t -', f.__name__)

Scientific notation “1.000000e+50” is not supported by:
– is_number_regex
– is_number_repl_isdigit
Scientific notation “1e50” is not supported by:
– is_number_regex
– is_number_repl_isdigit

EDIT: The benchmark results

import timeit

test_cases = ['1.12345', '1.12.345', 'abc12345', '12345']
times_n = {f.__name__:[] for f in funcs}

for t in test_cases:
    for f in funcs:
        f = f.__name__
        times_n[f].append(min(timeit.Timer('%s(t)' %f, 
                      'from __main__ import %s, t' %f)
                              .repeat(repeat=3, number=1000000)))

where the following functions were tested

from re import match as re_match
from re import compile as re_compile

def is_number_tryexcept(s):
    """ Returns True if string is a number. """
    try:
        float(s)
        return True
    except ValueError:
        return False

def is_number_regex(s):
    """ Returns True if string is a number. """
    if re_match(r"^\d+?\.\d+?$", s) is None:
        return s.isdigit()
    return True


# Compile once so the pattern is not re-parsed on every call.
comp = re_compile(r"^\d+?\.\d+?$")

def compiled_regex(s):
    """ Returns True if string is a number. """
    if comp.match(s) is None:
        return s.isdigit()
    return True


def is_number_repl_isdigit(s):
    """ Returns True if string is a number. """
    return s.replace('.','',1).isdigit()
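
To make the trade-off between speed and correctness concrete, here is a small sketch that runs the try-except and replace/isdigit checks over a few inputs the benchmark test cases do not include. The sample strings are chosen here for illustration only.

```python
def is_number_tryexcept(s):
    """True if float() accepts the string."""
    try:
        float(s)
        return True
    except ValueError:
        return False

def is_number_repl_isdigit(s):
    """True if the string is digits with at most one '.' removed."""
    return s.replace('.', '', 1).isdigit()

# Signed values, surrounding whitespace, exponents and 'inf' are all
# accepted by float() but contain characters that isdigit() rejects.
for s in ['-1.5', '+3', ' 42 ', '1e50', '1.', 'inf', '.']:
    print(repr(s), is_number_tryexcept(s), is_number_repl_isdigit(s))
```

The two functions agree on plain unsigned decimals, so the faster replace/isdigit check is only a safe substitute when the input is known to be limited to that form.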


Answer 3

There is one exception that you may want to take into account: the string ‘NaN’

If you want is_number to return False for ‘NaN’, this code will not work, as Python converts it to its representation of a number that is not a number (talk about identity issues):

>>> float('NaN')
nan

Otherwise, I should actually thank you for the piece of code I now use extensively. :)

G.
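
If NaN should be rejected, the same probably applies to the infinities, which float() also accepts. A minimal sketch of one way to exclude both, using math.isfinite from the standard library; the function name is_finite_number is an illustrative choice, not from the original code.

```python
import math

def is_finite_number(s):
    """True only for strings that parse to a finite float."""
    try:
        return math.isfinite(float(s))   # False for nan, inf and -inf
    except ValueError:
        return False

for s in ['3.14', 'NaN', 'inf', '-Infinity', 'abc']:
    print(repr(s), is_finite_number(s))
```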


Answer 4

how about this:

'3.14'.replace('.','',1).isdigit()

which will return True only if there is one or no ‘.’ in the string of digits.

'3.14.5'.replace('.','',1).isdigit()

will return False

edit: just saw another comment … adding a .replace(badstuff,'',maxnum_badstuff) for other cases can be done. if you are passing salt and not arbitrary condiments (ref:xkcd#974) this will do fine :P


Answer 5

Which, not only is ugly and slow, seems clunky.

It may take some getting used to, but this is the pythonic way of doing it. As has been already pointed out, the alternatives are worse. But there is one other advantage of doing things this way: polymorphism.

The central idea behind duck typing is that “if it walks and talks like a duck, then it’s a duck.” What if you decide that you need to subclass string so that you can change how you determine if something can be converted into a float? Or what if you decide to test some other object entirely? You can do these things without having to change the above code.

Other languages solve these problems by using interfaces. I’ll save the analysis of which solution is better for another thread. The point, though, is that python is decidedly on the duck typing side of the equation, and you’re probably going to have to get used to syntax like this if you plan on doing much programming in Python (but that doesn’t mean you have to like it of course).

One other thing you might want to take into consideration: Python is pretty fast in throwing and catching exceptions compared to a lot of other languages (30x faster than .Net for instance). Heck, the language itself even throws exceptions to communicate non-exceptional, normal program conditions (every time you use a for loop). Thus, I wouldn’t worry too much about the performance aspects of this code until you notice a significant problem.
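
The remark about for loops refers to the iterator protocol: a loop ends when the iterator raises StopIteration. The sketch below spells out roughly what a for loop does; the helper manual_for is hypothetical and exists only to illustrate the mechanism.

```python
def manual_for(iterable, action):
    """Roughly what `for item in iterable: action(item)` does internally."""
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:   # the normal, non-error end of the loop
            break
        action(item)

collected = []
manual_for("abc", collected.append)
print(collected)
```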


Answer 6

Updated after Alfe pointed out you don’t need to check for float separately as complex handles both:

def is_number(s):
    try:
        complex(s) # for int, long, float and complex
    except ValueError:
        return False

    return True

Previously said: In some rare cases you might also need to check for complex numbers (e.g. 1+2j; Python writes the imaginary unit as j rather than i), which cannot be represented by a float:

def is_number(s):
    try:
        float(s) # for int, long and float
    except ValueError:
        try:
            complex(s) # for complex
        except ValueError:
            return False

    return True
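
A short sketch of what complex() accepts may be useful here, since its string format is stricter than it looks. The sample inputs are chosen for illustration.

```python
def is_number(s):
    """True if complex() accepts the string (covers int and float forms too)."""
    try:
        complex(s)
    except ValueError:
        return False
    return True

# The imaginary unit must be written as 'j', and the string must not
# contain spaces around the '+' or '-' in the middle.
for s in ['12', '1.5e3', '1+2j', '1+2i', '1 + 2j']:
    print(repr(s), is_number(s))
```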

Answer 7

For int use this:

>>> "1221323".isdigit()
True

But for float we need some tricks ;-). Every float number has one point…

>>> "12.34".isdigit()
False
>>> "12.34".replace('.','',1).isdigit()
True
>>> "12.3.4".replace('.','',1).isdigit()
False

Also for negative numbers just add lstrip():

>>> '-12'.lstrip('-')
'12'

And now we get a universal way:

>>> '-12.34'.lstrip('-').replace('.','',1).isdigit()
True
>>> '.-234'.lstrip('-').replace('.','',1).isdigit()
False
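
One caveat with lstrip('-') is that it removes every leading dash, so a string like '--12' also passes. A slightly stricter sketch removes at most one sign character; the function name is_plain_number is made up for this example.

```python
def is_plain_number(s):
    """Optional single leading '-', then digits with at most one '.'."""
    if s.startswith('-'):
        s = s[1:]              # remove exactly one sign character
    return s.replace('.', '', 1).isdigit()

for s in ['-12.34', '--12', '.-234', '12.3.4']:
    print(repr(s),
          s.lstrip('-').replace('.', '', 1).isdigit(),
          is_plain_number(s))
```

Like the original one-liner, this still does not handle exponents, a leading '+', or surrounding whitespace.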

Answer 8

Just Mimic C#

In C# there are two different functions that handle parsing of scalar values:

  • Float.Parse()
  • Float.TryParse()

float.parse():

def parse(string):
    try:
        return float(string)
    except Exception:
        raise TypeError

Note: If you’re wondering why I changed the exception to a TypeError, here’s the documentation.

float.try_parse():

def try_parse(string, fail=None):
    try:
        return float(string)
    except Exception:
        return fail

Note: You don’t want to return the boolean ‘False’ because that’s still a value type. None is better because it indicates failure. Of course, if you want something different you can change the fail parameter to whatever you want.

To extend float to include ‘parse()’ and ‘try_parse()’ you would normally monkeypatch the ‘float’ class to add these methods. CPython does not allow new attributes to be set on built-in types such as float (the assignment raises TypeError), so the methods are attached to a subclass instead.

If you want to respect pre-existing functions, the code should be something like:

class Float(float):
    pass

def monkey_patch():
    if not hasattr(Float, 'parse'):
        Float.parse = staticmethod(parse)
    if not hasattr(Float, 'try_parse'):
        Float.try_parse = staticmethod(try_parse)

SideNote: I personally prefer to call it Monkey Punching because it feels like I’m abusing the language when I do this but YMMV.

Usage:

monkey_patch()
Float.parse('giggity')    # raises TypeError
Float.parse('54.3')       # returns the scalar value 54.3
Float.try_parse('twank')  # returns None
Float.try_parse('32.2')   # returns the scalar value 32.2

And the great Sage Pythonas said to the Holy See Sharpisus, “Anything you can do I can do better; I can do anything better than you.”


Answer 9

For strings of non-numbers, try: except: is actually slower than regular expressions. For strings of valid numbers, regex is slower. So, the appropriate method depends on your input.

If you find that you are in a performance bind, you can use a new third-party module called fastnumbers that provides a function called isfloat. Full disclosure, I am the author. I have included its results in the timings below.


from __future__ import print_function
import timeit

prep_base = '''\
x = 'invalid'
y = '5402'
z = '4.754e3'
'''

prep_try_method = '''\
def is_number_try(val):
    try:
        float(val)
        return True
    except ValueError:
        return False

'''

prep_re_method = '''\
import re
float_match = re.compile(r'[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?$').match
def is_number_re(val):
    return bool(float_match(val))

'''

fn_method = '''\
from fastnumbers import isfloat

'''

print('Try with non-number strings', timeit.timeit('is_number_try(x)',
    prep_base + prep_try_method), 'seconds')
print('Try with integer strings', timeit.timeit('is_number_try(y)',
    prep_base + prep_try_method), 'seconds')
print('Try with float strings', timeit.timeit('is_number_try(z)',
    prep_base + prep_try_method), 'seconds')
print()
print('Regex with non-number strings', timeit.timeit('is_number_re(x)',
    prep_base + prep_re_method), 'seconds')
print('Regex with integer strings', timeit.timeit('is_number_re(y)',
    prep_base + prep_re_method), 'seconds')
print('Regex with float strings', timeit.timeit('is_number_re(z)',
    prep_base + prep_re_method), 'seconds')
print()
print('fastnumbers with non-number strings', timeit.timeit('isfloat(x)',
    prep_base + 'from fastnumbers import isfloat'), 'seconds')
print('fastnumbers with integer strings', timeit.timeit('isfloat(y)',
    prep_base + 'from fastnumbers import isfloat'), 'seconds')
print('fastnumbers with float strings', timeit.timeit('isfloat(z)',
    prep_base + 'from fastnumbers import isfloat'), 'seconds')
print()

Try with non-number strings 2.39108395576 seconds
Try with integer strings 0.375686168671 seconds
Try with float strings 0.369210958481 seconds

Regex with non-number strings 0.748660802841 seconds
Regex with integer strings 1.02021503448 seconds
Regex with float strings 1.08564686775 seconds

fastnumbers with non-number strings 0.174362897873 seconds
fastnumbers with integer strings 0.179651021957 seconds
fastnumbers with float strings 0.20222902298 seconds

As you can see

  • try: except: was fast for numeric input but very slow for an invalid input
  • regex is very efficient when the input is invalid
  • fastnumbers wins in both cases
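
Speed aside, the regex and float() do not accept exactly the same set of strings. The sketch below reuses the pattern from the timing code and compares it with the try/except version on a few inputs picked for illustration.

```python
import re

float_match = re.compile(r'[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?$').match

def is_number_re(val):
    return bool(float_match(val))

def is_number_try(val):
    try:
        float(val)
        return True
    except ValueError:
        return False

# A trailing '.', the word 'inf' and surrounding whitespace are valid
# for float() but are not described by the pattern.
for s in ['4.754e3', '.5', '1.', 'inf', ' 7 ']:
    print(repr(s), is_number_re(s), is_number_try(s))
```

Whether those differences matter depends on the input, so it is worth checking them before swapping one method for the other on performance grounds.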

Answer 10

I know this is particularly old, but I would add an answer that I believe covers information missing from the highest-voted answer and that could be very valuable to anyone who finds this:

For each of the following methods, combine it with a count of the replaced character if you need arbitrary input to be handled. (Assuming we are using vocal definitions of integers rather than 0-255, etc.)

x.isdigit() works well for checking if x is an integer.

x.replace('-','').isdigit() works well for checking if x is negative (also check that the '-' is in the first position).

x.replace('.','').isdigit() works well for checking if x is a decimal.

x.replace(':','').isdigit() works well for checking if x is a ratio.

x.replace('/','',1).isdigit() works well for checking if x is a fraction.
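
Because replace() without a count removes every occurrence, a string such as '1-2-3' passes the '-' check above. For fractions specifically, the standard library's fractions.Fraction can do the parsing; the sketch below is one possible approach, with is_fraction as an illustrative name.

```python
from fractions import Fraction

def is_fraction(s):
    """True if Fraction can parse s, e.g. '3/4', '-7' or '1.5'."""
    try:
        Fraction(s)
        return True
    except (ValueError, ZeroDivisionError):   # '1/0' raises ZeroDivisionError
        return False

for s in ['3/4', '-7', '1.5', '1/2/3', '1/0']:
    print(repr(s), is_fraction(s))

# The unrestricted replace() check accepts a malformed string:
print('1-2-3'.replace('-', '').isdigit())
```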


Answer 11

This answer provides a step-by-step guide, with example functions, for checking whether a string is:

  • A positive integer
  • A positive/negative integer or float
  • A number, while discarding “NaN” (not a number) strings

Check if string is positive integer

You can use str.isdigit() to check whether a given string is a positive integer.

Sample Results:

# For digit
>>> '1'.isdigit()
True
>>> '1'.isalpha()
False

Check for string as positive/negative – integer/float

str.isdigit() returns False if the string is a negative number or a float number. For example:

# returns `False` for float
>>> '123.3'.isdigit()
False
# returns `False` for negative number
>>> '-123'.isdigit()
False

If you also want to check for negative integers and floats, you can write a custom function such as:

def is_number(n):
    try:
        float(n)   # Type-casting the string to `float`.
                   # If string is not a valid `float`, 
                   # it'll raise `ValueError` exception
    except ValueError:
        return False
    return True

Sample Run:

>>> is_number('123')    # positive integer number
True

>>> is_number('123.4')  # positive float number
True

>>> is_number('-123')   # negative integer number
True

>>> is_number('-123.4') # negative `float` number
True

>>> is_number('abc')    # `False` for "some random" string
False

Discard “NaN” (not a number) strings while checking for number

The above function returns True for the string “NaN” (not a number), because Python treats it as a valid float representing a value that is not a number. For example:

>>> is_number('NaN')
True

In order to check whether the number is “NaN”, you may use math.isnan() as:

>>> import math
>>> nan_num = float('nan')

>>> math.isnan(nan_num)
True

Or, if you don’t want to import an additional module for this check, you can simply compare the value with itself using ==. Python returns False when a nan float is compared with itself. For example:

# `nan_num` variable is taken from above example
>>> nan_num == nan_num
False

Hence, the above is_number function can be updated to return False for "NaN" as:

def is_number(n):
    is_number = True
    try:
        num = float(n)
        # check for "nan" floats
        is_number = num == num   # or use `math.isnan(num)`
    except ValueError:
        is_number = False
    return is_number

Sample Run:

>>> is_number('Nan')   # not a number "Nan" string
False

>>> is_number('nan')   # not a number string "nan" with all lower cased
False

>>> is_number('123')   # positive integer
True

>>> is_number('-123')  # negative integer
True

>>> is_number('-1.12') # negative `float`
True

>>> is_number('abc')   # "some random" string
False

PS: Each additional check adds some overhead, depending on the type of number. Choose the version of the is_number function that fits your requirements.


Answer 12

Casting to float and catching ValueError is probably the fastest way, since float() is specifically meant for just that. Anything else that requires string parsing (regex, etc) will likely be slower due to the fact that it’s not tuned for this operation. My $0.02.


Answer 13

You can use Unicode strings, they have a method to do just what you want:

>>> s = u"345"
>>> s.isnumeric()
True

Or:

>>> s = "345"
>>> u = unicode(s)   # Python 2 only; in Python 3 every str is already Unicode
>>> u.isnumeric()
True

http://www.tutorialspoint.com/python/string_isnumeric.htm

http://docs.python.org/2/howto/unicode.html
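
Note that isnumeric() answers a different question from "can float() parse this?": it is True for characters such as fraction symbols, which float() rejects, and False for strings containing '.' or '-'. A small Python 3 sketch using the standard unicodedata module shows how such characters map to values; the helper name numeric_value is illustrative.

```python
import unicodedata

def numeric_value(ch):
    """Return the numeric value of a single character, or None if it has none."""
    try:
        return unicodedata.numeric(ch)
    except (TypeError, ValueError):
        return None

half = "\u00bd"   # VULGAR FRACTION ONE HALF
print(half.isnumeric(), numeric_value(half))
print("12.5".isnumeric())
```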


Answer 14

I wanted to see which method is fastest. Overall the best and most consistent results were given by the check_replace function. The fastest results were given by the check_exception function, but only if there was no exception fired – meaning its code is the most efficient, but the overhead of throwing an exception is quite large.

Please note that checking for a successful cast is the only accurate method. For example, the following value works with check_exception, but the other two test functions return False for it even though it is a valid float:

huge_number = float('1e+100')
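A minimal sketch of that difference, reusing the three check functions from the benchmark code. The sample strings are chosen here for illustration and are not part of the original benchmark:

```python
import re

def check_regexp(x):
    return re.compile(r"^\d*\.?\d*$").match(x) is not None

def check_replace(x):
    return x.replace(".", "", 1).isdigit()

def check_exception(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

# Compare the three checks on inputs where they disagree.
for text in ["1e+100", "-1.5", "", "."]:
    print(repr(text), check_regexp(text), check_replace(text), check_exception(text))
```

Only check_exception accepts the exponent and the signed value. The regular expression also accepts the empty string and a lone dot, neither of which float() can parse.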

Here is the benchmark code:

import time, re, random, string

ITERATIONS = 10000000

class Timer:
    def __enter__(self):
        # Note: time.clock() was removed in Python 3.8; on newer
        # interpreters use time.perf_counter() instead.
        self.start = time.clock()
        return self
    def __exit__(self, *args):
        self.end = time.clock()
        self.interval = self.end - self.start

def check_regexp(x):
    return re.compile(r"^\d*\.?\d*$").match(x) is not None

def check_replace(x):
    return x.replace('.','',1).isdigit()

def check_exception(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

to_check = [check_regexp, check_replace, check_exception]

print('preparing data...')
good_numbers = [
    str(random.random() / random.random()) 
    for x in range(ITERATIONS)]

bad_numbers = ['.' + x for x in good_numbers]

strings = [
    ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(random.randint(1,10)))
    for x in range(ITERATIONS)]

print('running test...')
for func in to_check:
    with Timer() as t:
        for x in good_numbers:
            res = func(x)
    print('%s with good floats: %s' % (func.__name__, t.interval))
    with Timer() as t:
        for x in bad_numbers:
            res = func(x)
    print('%s with bad floats: %s' % (func.__name__, t.interval))
    with Timer() as t:
        for x in strings:
            res = func(x)
    print('%s with strings: %s' % (func.__name__, t.interval))

Here are the results with Python 2.7.10 on a 2017 MacBook Pro 13:

check_regexp with good floats: 12.688639
check_regexp with bad floats: 11.624862
check_regexp with strings: 11.349414
check_replace with good floats: 4.419841
check_replace with bad floats: 4.294909
check_replace with strings: 4.086358
check_exception with good floats: 3.276668
check_exception with bad floats: 13.843092
check_exception with strings: 15.786169

Here are the results with Python 3.6.5 on a 2017 MacBook Pro 13:

check_regexp with good floats: 13.472906000000009
check_regexp with bad floats: 12.977665000000016
check_regexp with strings: 12.417542999999995
check_replace with good floats: 6.011045999999993
check_replace with bad floats: 4.849356
check_replace with strings: 4.282754000000011
check_exception with good floats: 6.039081999999979
check_exception with bad floats: 9.322753000000006
check_exception with strings: 9.952595000000002

Here are the results with PyPy 2.7.13 on a 2017 MacBook Pro 13:

check_regexp with good floats: 2.693217
check_regexp with bad floats: 2.744819
check_regexp with strings: 2.532414
check_replace with good floats: 0.604367
check_replace with bad floats: 0.538169
check_replace with strings: 0.598664
check_exception with good floats: 1.944103
check_exception with bad floats: 2.449182
check_exception with strings: 2.200056
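To repeat the comparison on a current interpreter, the standard timeit module can replace the hand-written Timer class. The sketch below is an assumption-laden variant, not the original benchmark: the sample data is tiny and fixed, the regular expression is compiled once outside the timed call, and run_benchmark is a hypothetical helper name. Absolute numbers will differ from the tables above.

```python
import re
import timeit

FLOAT_LIKE = re.compile(r"^\d*\.?\d*$")  # compiled once, outside the timed call

def check_regexp(x):
    return FLOAT_LIKE.match(x) is not None

def check_replace(x):
    return x.replace(".", "", 1).isdigit()

def check_exception(x):
    try:
        float(x)
        return True
    except ValueError:
        return False

# Small fixed samples standing in for the randomly generated data.
samples = {
    "good floats": ["3.14159", "0.5", "12345.678"],
    "bad floats": [".3.14159", "..5", "1.2.3"],
    "strings": ["ABC123", "X9", "HELLO"],
}

def run_benchmark(number=10000):
    """Return {(function name, sample label): seconds} for every combination."""
    results = {}
    for func in (check_regexp, check_replace, check_exception):
        for label, values in samples.items():
            elapsed = timeit.timeit(lambda: [func(v) for v in values], number=number)
            results[(func.__name__, label)] = elapsed
    return results

if __name__ == "__main__":
    for (name, label), seconds in run_benchmark().items():
        print("%s with %s: %.6f" % (name, label, seconds))
```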

Answer 15

So to put it all together, checking for NaN, infinity and complex numbers (it would seem they are specified with j, not i, i.e. 1+2j), the result is:

def is_number(s):
    try:
        n=str(float(s))
        if n == "nan" or n=="inf" or n=="-inf" : return False
    except ValueError:
        try:
            complex(s) # for complex
        except ValueError:
            return False
    return True
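The remark about j versus i can be checked directly. The helper name parses_as_complex below is hypothetical and only isolates the complex() fallback used in is_number:

```python
def parses_as_complex(s):
    """Return True if complex() accepts the string s."""
    try:
        complex(s)
        return True
    except ValueError:
        return False

print(parses_as_complex("1+2j"))    # j suffix is accepted
print(parses_as_complex("1+2i"))    # i suffix is rejected
print(parses_as_complex("1 + 2j"))  # spaces around the operator are rejected
```

Note that complex() does not allow whitespace around the central + or -, so "1 + 2j" is rejected even though the same text is a valid Python expression.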

Answer 16

The input may be as follows:

a="50" b=50 c=50.1 d="50.1"


1-General input:

The input of this function can be anything!

Finds whether the given variable is numeric. Numeric strings consist of an optional sign, any number of digits, an optional decimal part and an optional exponential part. Thus +0123.45e6 is a valid numeric value. That description excludes hexadecimal (e.g. 0xf4c3b00c) and binary (e.g. 0b10100111001) notation; note, however, that the implementation below parses the string as a Python literal and therefore accepts such input, as the 0x45 test shows.

is_numeric function

import ast
import numbers              
def is_numeric(obj):
    if isinstance(obj, numbers.Number):
        return True
    elif isinstance(obj, str):
        nodes = list(ast.walk(ast.parse(obj)))[1:]
        if not isinstance(nodes[0], ast.Expr):
            return False
        if not isinstance(nodes[-1], ast.Num):
            return False
        nodes = nodes[1:-1]
        for i in range(len(nodes)):
            #if used + or - in digit :
            if i % 2 == 0:
                if not isinstance(nodes[i], ast.UnaryOp):
                    return False
            else:
                if not isinstance(nodes[i], (ast.USub, ast.UAdd)):
                    return False
        return True
    else:
        return False

test:

>>> is_numeric("54")
True
>>> is_numeric("54.545")
True
>>> is_numeric("0x45")
True

is_float function

Finds whether the given variable is a float. Float strings consist of an optional sign, any number of digits, …

import ast

def is_float(obj):
    if isinstance(obj, float):
        return True
    if isinstance(obj, int):
        return False
    elif isinstance(obj, str):
        nodes = list(ast.walk(ast.parse(obj)))[1:]
        if not isinstance(nodes[0], ast.Expr):
            return False
        if not isinstance(nodes[-1], ast.Num):
            return False
        if not isinstance(nodes[-1].n, float):
            return False
        nodes = nodes[1:-1]
        for i in range(len(nodes)):
            if i % 2 == 0:
                if not isinstance(nodes[i], ast.UnaryOp):
                    return False
            else:
                if not isinstance(nodes[i], (ast.USub, ast.UAdd)):
                    return False
        return True
    else:
        return False

test:

>>> is_float("5.4")
True
>>> is_float("5")
False
>>> is_float(5)
False
>>> is_float("5")
False
>>> is_float("+5.4")
True
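A shorter way to get similar behaviour, offered here as an alternative sketch rather than the author's method, is to let ast.literal_eval do the parsing and then inspect the resulting value. The name is_numeric_literal is hypothetical, and excluding bool is a choice made for this example (the original is_numeric treats True as a number because bool is a subclass of int):

```python
import ast
import numbers

def is_numeric_literal(obj):
    """Return True for numbers and for strings holding a Python numeric literal."""
    if isinstance(obj, bool):
        return False  # assumption: booleans are not treated as numbers here
    if isinstance(obj, numbers.Number):
        return True
    if not isinstance(obj, str):
        return False
    try:
        value = ast.literal_eval(obj.strip())
    except (ValueError, SyntaxError):
        return False
    return isinstance(value, numbers.Number) and not isinstance(value, bool)

print(is_numeric_literal("54"))
print(is_numeric_literal("+5.4"))
print(is_numeric_literal("0x45"))
print(is_numeric_literal("abc"))
```

Like the ast.walk version, this follows Python literal syntax rather than the rules of float(), so hexadecimal strings are accepted.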

what is ast?


2- If you are confident that the variable content is a string:

Use the str.isdigit() method:

>>> a=454
>>> a.isdigit()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute 'isdigit'
>>> a="454"
>>> a.isdigit()
True

3-Numerical input:

detect int value:

>>> isinstance("54", int)
False
>>> isinstance(54, int)
True
>>> 

detect float:

>>> isinstance("45.1", float)
False
>>> isinstance(45.1, float)
True

Answer 17

I did some speed tests. If the string is likely to be a number, the try/except strategy is the fastest option. If the string is unlikely to be a number and you only need an integer check, it is worth doing a cheap test first (isdigit plus a check for a leading ‘-’). If you need to check for floats, there is no way around the try/except code.
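The two strategies might look like this. It is only a sketch: the function names are hypothetical, and accepting a leading '+' as well as '-' is an assumption that goes slightly beyond the text.

```python
def is_int_string(s):
    """Cheap integer check: an optional leading sign followed by digits only."""
    if s[:1] in ("+", "-"):
        s = s[1:]
    return s.isdigit()

def is_float_string(s):
    """Float check: there is no cheap shortcut, so let float() decide."""
    try:
        float(s)
        return True
    except ValueError:
        return False

print(is_int_string("-42"), is_int_string("4.2"))
print(is_float_string("-4.2e1"), is_float_string("abc"))
```

One caveat: str.isdigit() also returns True for a few characters, such as the superscript '²', that int() rejects, so the cheap check is best kept for plain ASCII input.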


Answer 18

I needed to determine whether a string can be cast to one of the basic types (float, int, str, bool). After not finding anything on the internet, I created this:

def str_to_type (s):
    """ Get possible cast type for a string

    Parameters
    ----------
    s : string

    Returns
    -------
    float,int,str,bool : type
        Depending on what it can be cast to

    """    
    try:
        int(s)
        return int
    except ValueError:
        pass
    try:
        float(s)  # also covers forms without a ".", such as "1e3"
        return float
    except ValueError:
        value = s.upper()
        if value == "TRUE" or value == "FALSE":
            return bool
        return type(s)

Example

str_to_type("true") # bool
str_to_type("6.0") # float
str_to_type("6") # int
str_to_type("6abc") # str
str_to_type(u"6abc") # unicode       

You can capture the type and use it

s = "6.0"
type_ = str_to_type(s) # float
f = type_(s) 
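One caveat when calling the returned type directly: bool("false") is True, because any non-empty string is truthy. A sketch of a variant that returns the converted value instead of the type, so the boolean case can be handled explicitly, is shown below. The name convert is hypothetical and not part of the original answer.

```python
def convert(s):
    """Convert a string to int, float or bool, or return it unchanged."""
    try:
        return int(s)
    except ValueError:
        pass
    try:
        return float(s)
    except ValueError:
        pass
    lowered = s.strip().lower()
    if lowered in ("true", "false"):
        # Compare the text; bool("false") would be True.
        return lowered == "true"
    return s

print(convert("6"), convert("6.0"), convert("false"), convert("6abc"))
```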

Answer 19

RyanN suggests

If you want to return False for a NaN and Inf, change line to x = float(s); return (x == x) and (x - 1 != x). This should return True for all floats except Inf and NaN

But this doesn’t quite work, because for sufficiently large floats, x-1 == x returns true. For example, 2.0**54 - 1 == 2.0**54
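The failure is easy to reproduce. In the sketch below, finite_by_arithmetic is a hypothetical name for the quoted check, and math.isfinite from the standard library is suggested as a replacement; it is not something the original answer mentions.

```python
import math

def finite_by_arithmetic(x):
    """The quoted check: intended to reject NaN and Inf only."""
    return (x == x) and (x - 1 != x)

big = 2.0 ** 54
print(big - 1 == big)             # True: floats this large are spaced more than 1 apart
print(finite_by_arithmetic(big))  # False, although big is an ordinary finite float
print(math.isfinite(big))         # True
```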


Answer 20

I think your solution is fine, but there is a correct regexp implementation.

There does seem to be a lot of regexp hate in these answers, which I think is unjustified: regexps can be reasonably clean, correct and fast. It really depends on what you’re trying to do. The original question was how you can “check if a string can be represented as a number (float)” (as per your title). Presumably you would want to use the numeric/float value once you’ve checked that it’s valid, in which case your try/except makes a lot of sense. But if, for some reason, you just want to validate that a string is a number, then a regex also works fine, though it’s hard to get right. I think most of the regex answers so far, for example, do not properly parse strings without an integer part (such as “.7”), which is a float as far as Python is concerned. That is slightly tricky to check for in a single regex where the fractional portion is not required. I’ve included two regexes to show this.

It does raise the interesting question of what a “number” is. Do you include “inf”, which is valid as a float in Python? Or do you include numbers that are “numbers” but maybe can’t be represented in Python (such as numbers that are larger than the float max)?

There are also ambiguities in how you parse numbers. For example, what about “--20”? Is this a “number”? Is this a legal way to represent “20”? Python will let you do “var = --20” and set it to 20 (though really this is because it treats it as an expression), but float("--20") does not work.

Anyways, without more info, here’s a regex that I believe covers all the ints and floats as python parses them.

import re

# Doesn't properly handle floats missing the integer part, such as ".7"
# (it also needs at least two digits, so a single digit such as "5" fails)
SIMPLE_FLOAT_REGEXP = re.compile(r'^[-+]?[0-9]+\.?[0-9]+([eE][-+]?[0-9]+)?$')
# Example "-12.34E+56"      # sign (-)
                            #     integer (12)
                            #           mantissa (34)
                            #                    exponent (E+56)

# Should handle all floats
# (note: it still rejects a trailing dot such as "42." and special values like "inf")
FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')
# Example "-12.34E+56"      # sign (-)
                            #     integer (12)
                            #           OR
                            #             int/mantissa (12.34)
                            #                            exponent (E+56)

def is_float(text):
  # bool() turns the match object (or None) into True/False
  return bool(FLOAT_REGEXP.match(text))

Some example test values:

True  <- +42
True  <- +42.42
False <- +42.42.22
True  <- +42.42e22
True  <- +42.42E-22
False <- +42.42e-22.8
True  <- .42
False <- 42nope
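To see where the regex and Python's own parser agree, the same values can be run through both. This is a sketch: the extra candidates "42.", "inf" and " 42 " are added here for illustration and are not among the original test values.

```python
import re

FLOAT_REGEXP = re.compile(r'^[-+]?([0-9]+|[0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?$')

def regex_says_float(text):
    return FLOAT_REGEXP.match(text) is not None

def python_says_float(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

candidates = ["+42", "+42.42", "+42.42.22", "+42.42e22", "+42.42E-22",
              "+42.42e-22.8", ".42", "42nope", "42.", "inf", " 42 "]

for text in candidates:
    print("%-5s %-5s <- %r" % (regex_says_float(text), python_says_float(text), text))
```

The two agree on the original test values. They differ on the last three: float() accepts a trailing dot, the special value inf and surrounding whitespace, while the regex rejects all of them. Whether that is a bug or a feature depends on which definition of “number” you want.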

Running the benchmarking code in @ron-reiter’s answer shows that this regex is actually faster than the normal regex and is much faster at handling bad values than the exception, which makes some sense. Results:

check_regexp with good floats: 18.001921
check_regexp with bad floats: 17.861423
check_regexp with strings: 17.558862
check_correct_regexp with good floats: 11.04428
check_correct_regexp with bad floats: 8.71211
check_correct_regexp with strings: 8.144161
check_replace with good floats: 6.020597
check_replace with bad floats: 5.343049
check_replace with strings: 5.091642
check_exception with good floats: 5.201605
check_exception with bad floats: 23.921864
check_exception with strings: 23.755481

Answer 21

import re

def is_number(num):
    # Group the alternatives so that ^ and $ apply to both of them;
    # otherwise input such as '1.5abc' matches the first alternative.
    pattern = re.compile(r'^[-+]?(\d+\.\d*|\.?\d+)$')
    result = pattern.match(num)
    if result:
        return True
    else:
        return False


>>> is_number('1')
True

>>> is_number('111')
True

>>> is_number('11.1')
True

>>> is_number('-11.1')
True

>>> is_number('inf')
False

>>> is_number('-inf')
False

Answer 22

Here’s my simple way of doing it. Let’s say that I’m looping through some strings and I want to add them to an array if they turn out to be numbers.

try:
    myvar.append( float(string_to_check) )
except ValueError:  # catch only the conversion failure, not every error
    continue

Replace the myvar.append with whatever operation you want to do with the string if it turns out to be a number. The idea is to try a float() conversion and use the raised exception to determine whether or not the string is a number.
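Since continue only makes sense inside a loop, here is the same idea as a complete sketch. The function name collect_numbers and the sample list are illustrative assumptions.

```python
def collect_numbers(strings):
    """Return the float value of every string that float() can parse."""
    numbers = []
    for text in strings:
        try:
            numbers.append(float(text))
        except ValueError:
            continue  # not a number: skip it
    return numbers

print(collect_numbers(["1", "abc", "2.5", "", "1e3"]))  # [1.0, 2.5, 1000.0]
```

Catching ValueError specifically means that a non-string item such as None still raises TypeError instead of being skipped silently.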


Answer 23

I also used the function you mentioned, but soon noticed that strings such as “Nan”, “Inf” and their variations are considered numbers. So I propose an improved version of your function that returns False for those inputs and does not fail on “1e3” variants:

def is_float(text):
    try:
        float(text)
        # check for nan/infinity etc.
        if text.isalpha():
            return False
        return True
    except ValueError:
        return False
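One gap in the isalpha() test is worth noting: signed forms such as "-inf" contain a non-letter character, so they still pass. The sketch below reproduces the function above under the hypothetical name is_float_isalpha and compares it with a variant based on math.isfinite, which is an alternative suggested here and not part of the original answer.

```python
import math

def is_float_isalpha(text):
    """The version above: rejects purely alphabetic input such as 'nan'."""
    try:
        float(text)
        if text.isalpha():
            return False
        return True
    except ValueError:
        return False

def is_finite_float(text):
    """Variant: test the parsed value instead of the spelling."""
    try:
        return math.isfinite(float(text))
    except ValueError:
        return False

for text in ["1e3", "nan", "inf", "-inf", "+nan"]:
    print(repr(text), is_float_isalpha(text), is_finite_float(text))
```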

Answer 24

This code handles exponents, floats, and integers without using regex.

return True if str1.lstrip('-').replace('.','',1).isdigit() or float(str1) else False
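As written, the one-liner has two edge cases: float(str1) raises ValueError for non-numeric input instead of yielding False, and a value such as "0e0" parses to 0.0, which is falsy. A hedged sketch that keeps the same fast path but wraps the fallback is shown below. The name is_number_str is hypothetical, and str.isdecimal() is swapped in for isdigit() because isdigit() also accepts characters such as '²' that float() rejects.

```python
def is_number_str(str1):
    """Fast path for plain decimals, float() as the fallback for everything else."""
    # Strip at most one leading minus sign (lstrip('-') would strip them all).
    unsigned = str1[1:] if str1[:1] == "-" else str1
    if unsigned.replace(".", "", 1).isdecimal():
        return True
    try:
        float(str1)
        return True
    except ValueError:
        return False

for text in ["-12.5", "1e3", "0e0", "--5", "abc"]:
    print(repr(text), is_number_str(text))
```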

Answer 25

Use a helper function:

def if_ok(fn, string):
  try:
    return fn(string)
  except Exception as e:
    return None

then

if_ok(int, my_str) or if_ok(float, my_str) or if_ok(complex, my_str)
is_number = lambda s: any([if_ok(fn, s) for fn in (int, float, complex)])
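One pitfall with this pattern is that if_ok returns the converted value, and zero is falsy, so the any(...) form reports "0" as not a number. A sketch of a variant that tests for None explicitly follows; the names is_number_naive and is_number are used here only to contrast the two forms.

```python
def if_ok(fn, string):
    """Return fn(string), or None if the conversion fails."""
    try:
        return fn(string)
    except Exception:
        return None

# The form above: relies on the truthiness of the converted value.
is_number_naive = lambda s: any([if_ok(fn, s) for fn in (int, float, complex)])

def is_number(s):
    """Variant: a conversion counts as successful whenever it did not return None."""
    return any(if_ok(fn, s) is not None for fn in (int, float, complex))

print(is_number_naive("0"), is_number("0"))  # False True
```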

Answer 26

You can generalize the exception technique in a useful way by returning more useful values than True and False. For example, this function puts quotes round strings but leaves numbers alone, which is just what I needed for a quick and dirty filter to make some variable definitions for R.

import sys

def fix_quotes(s):
    try:
        float(s)
        return s
    except ValueError:
        return '"{0}"'.format(s)

for line in sys.stdin:
    fields = line.split()  # renamed from "input" to avoid shadowing the builtin
    print(fields[0], '<- c(', ','.join(fix_quotes(c) for c in fields[1:]), ')')

Answer 27

I was working on a problem that led me to this thread, namely how to convert a collection of data to strings and numbers in the most intuitive way. I realized after reading the original code that what I needed was different in two ways:

1 – I wanted an integer result if the string represented an integer

2 – I wanted a number or a string result to stick into a data structure

so I adapted the original code to produce this derivative:

def string_or_number(s):
    try:
        z = int(s)
        return z
    except ValueError:
        try:
            z = float(s)
            return z
        except ValueError:
            return s

Answer 28

Try this.

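def is_number(var):
    try:
        # True only when var already equals its integer value (e.g. 5 or 5.0);
        # a string such as "5" never equals int("5"), so it yields False.
        if var == int(var):
            return True
    except Exception:
        return False
    return False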

Answer 29

def is_float(s):
    if s is None:
        return False

    if len(s) == 0:
        return False

    digits_count = 0
    dots_count = 0
    signs_count = 0

    for c in s:
        if '0' <= c <= '9':
            digits_count += 1
        elif c == '.':
            dots_count += 1
        elif c == '-' or c == '+':
            signs_count += 1
        else:
            return False

    if digits_count == 0:
        return False

    if dots_count > 1:
        return False

    if signs_count > 1:
        return False

    return True
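
Because the function above only counts characters, it accepts a sign anywhere in the string: "1-2" and "5-" both return True although float() rejects them. A compact sketch that also checks the position of the sign is shown below. The name is_simple_float is hypothetical and, like the original, it deliberately ignores exponents.

```python
def is_simple_float(s):
    """Accept an optional leading sign, digits, and at most one dot."""
    if not s:
        return False
    # A sign is only allowed as the very first character.
    body = s[1:] if s[0] in "+-" else s
    if body.count(".") > 1:
        return False
    digits = body.replace(".", "", 1)
    return digits != "" and all("0" <= c <= "9" for c in digits)

for text in ["-1.2", "+.5", "1-2", "5-", ".", ""]:
    print(repr(text), is_simple_float(text))
```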