Tag archive: int

What is the maximum float in Python?

Question: What is the maximum float in Python?

I think the maximum integer in python is available by calling sys.maxint.

What is the maximum float or long in Python?


Answer 0

For float have a look at sys.float_info:

>>> import sys
>>> sys.float_info
sys.floatinfo(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308,
min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15,
mant_dig=53, epsilon=2.2204460492503131e-16, radix=2, rounds=1)

Specifically, sys.float_info.max:

>>> sys.float_info.max
1.7976931348623157e+308

If that’s not big enough, there’s always positive infinity:

>>> infinity = float("inf")
>>> infinity
inf
>>> infinity / 10000
inf

The long type has unlimited precision, so I think you’re only limited by available memory.
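
As a quick check of the points above, here is a minimal sketch. It assumes Python 3, where the old long type has been merged into int, and an IEEE 754 double for float; the variable names are only illustrative.

```python
import math
import sys

largest = sys.float_info.max      # largest finite float on this platform
print(largest)

# Multiplying past the largest finite value overflows to positive infinity.
overflowed = largest * 10
print(math.isinf(overflowed))

# Integers have arbitrary precision, so this value is stored exactly.
big_int = 10 ** 400
print(len(str(big_int)))          # number of decimal digits
```

Note that float arithmetic saturates at inf, while integer arithmetic simply keeps growing until memory runs out.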


Answer 1

sys.maxint is not the largest integer supported by python. It’s the largest integer supported by python’s regular integer type.


Answer 2

If you are using numpy, you can use the dtype float128 and get a max float of roughly 10e+4931:

>>> import numpy as np
>>> np.finfo(np.float128)
finfo(resolution=1e-18, min=-1.18973149536e+4932, max=1.18973149536e+4932, dtype=float128)

How to convert 'false' to 0 and 'true' to 1 in Python

Question: How to convert 'false' to 0 and 'true' to 1 in Python

Is there a way to convert true of type unicode to 1 and false of type unicode to 0 (in Python)?

For example: x == 'true' and type(x) == unicode

I want x = 1

PS: I don’t want to use if-else.


Answer 0

Use int() on a boolean test:

x = int(x == 'true')

int() turns the boolean into 1 or 0. Note that any value not equal to 'true' will result in 0 being returned.
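
To illustrate the caveat about values other than 'true', here is a small sketch. The helper true_false_to_int is hypothetical (it is not part of the answer); it uses a dict lookup, which also avoids if-else but rejects unexpected input instead of silently returning 0.

```python
def true_false_to_int(text):
    """Hypothetical helper: map 'true' to 1 and 'false' to 0 without if/else."""
    return {'true': 1, 'false': 0}[text]   # raises KeyError for anything else

print(int('true' == 'true'))        # comparison is True, so int() gives 1
print(int('nope' == 'true'))        # comparison is False, so int() gives 0
print(true_false_to_int('false'))
```

Which behaviour is preferable depends on whether unexpected strings should count as false or be treated as an error.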


Answer 1

If B is a Boolean array, write

B = B*1

(A bit code golfy.)


Answer 2

You can use x.astype('uint8') where x is your Boolean array.


Answer 3

Here’s yet another solution to your problem:

def to_bool(s):
    return 1 - sum(map(ord, s)) % 2
    # return 1 - sum(s.encode('ascii')) % 2  # Alternative for Python 3

It works because the sum of the ASCII codes of 'true' is 448, which is even, while the sum of the ASCII codes of 'false' is 523 which is odd.


The funny thing about this solution is that its result is pretty random if the input is not one of 'true' or 'false'. Half of the time it will return 0, and the other half 1. The variant using encode will raise an encoding error if the input is not ASCII (thus increasing the undefined-ness of the behaviour).


Seriously, I believe the most readable, and faster, solution is to use an if:

def to_bool(s):
    return 1 if s == 'true' else 0

See some microbenchmarks:

In [14]: def most_readable(s):
    ...:     return 1 if s == 'true' else 0

In [15]: def int_cast(s):
    ...:     return int(s == 'true')

In [16]: def str2bool(s):
    ...:     try:
    ...:         return ['false', 'true'].index(s)
    ...:     except (ValueError, AttributeError):
    ...:         raise ValueError()

In [17]: def str2bool2(s):
    ...:     try:
    ...:         return ('false', 'true').index(s)
    ...:     except (ValueError, AttributeError):
    ...:         raise ValueError()

In [18]: def to_bool(s):
    ...:     return 1 - sum(s.encode('ascii')) % 2

In [19]: %timeit most_readable('true')
10000000 loops, best of 3: 112 ns per loop

In [20]: %timeit most_readable('false')
10000000 loops, best of 3: 109 ns per loop

In [21]: %timeit int_cast('true')
1000000 loops, best of 3: 259 ns per loop

In [22]: %timeit int_cast('false')
1000000 loops, best of 3: 262 ns per loop

In [23]: %timeit str2bool('true')
1000000 loops, best of 3: 343 ns per loop

In [24]: %timeit str2bool('false')
1000000 loops, best of 3: 325 ns per loop

In [25]: %timeit str2bool2('true')
1000000 loops, best of 3: 295 ns per loop

In [26]: %timeit str2bool2('false')
1000000 loops, best of 3: 277 ns per loop

In [27]: %timeit to_bool('true')
1000000 loops, best of 3: 607 ns per loop

In [28]: %timeit to_bool('false')
1000000 loops, best of 3: 612 ns per loop

Notice how the if solution is more than twice as fast as all the other solutions. It does not make sense to require avoiding ifs unless this is some kind of homework (in which case you shouldn’t have asked this in the first place).
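
The timings above come from IPython's %timeit. Outside IPython, a rough equivalent can be sketched with the standard timeit module. The loop count below is an arbitrary assumption, the lambda adds call overhead of its own, and the absolute numbers will differ between machines and Python versions.

```python
import timeit

def most_readable(s):
    return 1 if s == 'true' else 0

def int_cast(s):
    return int(s == 'true')

LOOPS = 100_000  # arbitrary loop count chosen for this sketch

for func in (most_readable, int_cast):
    seconds = timeit.timeit(lambda: func('true'), number=LOOPS)
    print(f"{func.__name__}: {seconds / LOOPS * 1e9:.0f} ns per call")
```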


Answer 4

If you need a general-purpose conversion from a string which per se is not a bool, you had better write a routine similar to the one shown below. In keeping with the spirit of duck typing, I have not silently passed the error but converted it as appropriate for the current scenario.

>>> def str2bool(st):
        try:
            return ['false', 'true'].index(st.lower())
        except (ValueError, AttributeError):
            raise ValueError('no Valid Conversion Possible')


>>> str2bool('garbaze')

Traceback (most recent call last):
  File "<pyshell#106>", line 1, in <module>
    str2bool('garbaze')
  File "<pyshell#105>", line 5, in str2bool
    raise ValueError('no Valid Conversion Possible')
ValueError: no Valid Conversion Possible
>>> str2bool('false')
0
>>> str2bool('True')
1

Answer 5

bool to int: x = (x == 'true') + 0

Now x contains 1 if x == 'true', else 0.

Note: x == 'true' returns a bool, which is converted to an int (1 if the bool is True, else 0) when it is added to 0.


Answer 6

Only with this (note that this is JavaScript, not Python):

const a = true;
const b = false;

console.log(+a); // 1
console.log(+b); // 0


How to convert int to Enum in Python?

Question: How to convert int to Enum in Python?

Using the new Enum feature (via backport enum34) with python 2.7.6.

Given the following definition, how can I convert an int to the corresponding Enum value?

from enum import Enum

class Fruit(Enum):
    Apple = 4
    Orange = 5
    Pear = 6

I know I can hand craft a series of if-statements to do the conversion but is there an easy pythonic way to convert? Basically, I’d like a function ConvertIntToFruit(int) that returns an enum value.

My use case is I have a csv file of records where I’m reading each record into an object. One of the file fields is an integer field that represents an enumeration. As I’m populating the object I’d like to convert that integer field from the file into the corresponding Enum value in the object.


Answer 0

You ‘call’ the Enum class:

Fruit(5)

to turn 5 into Fruit.Orange:

>>> from enum import Enum
>>> class Fruit(Enum):
...     Apple = 4
...     Orange = 5
...     Pear = 6
... 
>>> Fruit(5)
<Fruit.Orange: 5>

From the Programmatic access to enumeration members and their attributes section of the documentation:

Sometimes it’s useful to access members in enumerations programmatically (i.e. situations where Color.red won’t do because the exact color is not known at program-writing time). Enum allows such access:

>>> Color(1)
<Color.red: 1>
>>> Color(3)
<Color.blue: 3>

On a related note: to map a string value containing the name of an enum member, use subscription:

>>> s = 'Apple'
>>> Fruit[s]
<Fruit.Apple: 4>
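
For the CSV use case in the question, the call syntax can be wrapped in a small function. This is only a sketch: int_to_fruit, its default parameter, and the sample row are hypothetical names invented for the illustration.

```python
from enum import Enum

class Fruit(Enum):
    Apple = 4
    Orange = 5
    Pear = 6

def int_to_fruit(value, default=None):
    """Hypothetical helper: return the Fruit with this value, or default if none matches."""
    try:
        return Fruit(value)
    except ValueError:      # raised when no member has this value
        return default

# CSV fields arrive as strings, so convert to int before the lookup.
row = ['pie', '5']          # made-up record
print(int_to_fruit(int(row[1])))
print(int_to_fruit(99))
```

Returning a default rather than letting ValueError propagate is a design choice; for data validation it may be better to let the exception surface.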

Answer 1

I think, in simple words, the way is to convert the int value into an Enum by calling EnumType(int_value), and after that access the name of the Enum object:

my_fruit_from_int = Fruit(5)  # convert the int to an Enum member
fruit_name = my_fruit_from_int.name  # get the name
print(fruit_name)  # Orange will be printed here

Or as a function:

def convert_int_to_fruit(int_value):
    try:
        my_fruit_from_int = Fruit(int_value)
        return my_fruit_from_int.name
    except ValueError:  # no Fruit member has this value
        return None

Answer 2

I wanted something similar so that I could access either part of the value pair from a single reference. The vanilla version:

#!/usr/bin/env python3


from enum import IntEnum


class EnumDemo(IntEnum):
    ENUM_ZERO       = 0
    ENUM_ONE        = 1
    ENUM_TWO        = 2
    ENUM_THREE      = 3
    ENUM_INVALID    = 4


#endclass.


print('Passes')
print('1) %d'%(EnumDemo['ENUM_TWO']))
print('2) %s'%(EnumDemo['ENUM_TWO']))
print('3) %s'%(EnumDemo.ENUM_TWO.name))
print('4) %d'%(EnumDemo.ENUM_TWO))
print()


print('Fails')
print('1) %d'%(EnumDemo.ENUM_TWOa))

The failure throws an exception as would be expected.

A more robust version:

#!/usr/bin/env python3


class EnumDemo():


    enumeration =   (
                        'ENUM_ZERO',    # 0.
                        'ENUM_ONE',     # 1.
                        'ENUM_TWO',     # 2.
                        'ENUM_THREE',   # 3.
                        'ENUM_INVALID'  # 4.
                    )


    def name(self, val):
        try:

            name = self.enumeration[val]
        except IndexError:

            # Always return last tuple.
            name = self.enumeration[len(self.enumeration) - 1]

        return name


    def number(self, val):
        try:

            index = self.enumeration.index(val)
        except (TypeError, ValueError):

            # Always return last tuple.
            index = (len(self.enumeration) - 1)

        return index


#endclass.


print('Passes')
print('1) %d'%(EnumDemo().number('ENUM_TWO')))
print('2) %s'%(EnumDemo().number('ENUM_TWO')))
print('3) %s'%(EnumDemo().name(1)))
print('4) %s'%(EnumDemo().enumeration[1]))
print()
print('Fails')
print('1) %d'%(EnumDemo().number('ENUM_THREEa')))
print('2) %s'%(EnumDemo().number('ENUM_THREEa')))
print('3) %s'%(EnumDemo().name(11)))
print('4) %s'%(EnumDemo().enumeration[-1]))

When not used correctly this avoids creating an exception and, instead, passes back a fault indication. A more Pythonic way to do this would be to pass back “None” but my particular application uses the text directly.


Convert columns in a pandas dataframe from int to string

Question: Convert columns in a pandas dataframe from int to string

I have a dataframe in pandas with mixed int and str data columns. I first want to concatenate the columns within the dataframe. To do that I have to convert an int column to str. I’ve tried to do as follows:

mtrx['X.3'] = mtrx.to_string(columns = ['X.3'])

or

mtrx['X.3'] = mtrx['X.3'].astype(str)

but in both cases it’s not working and I’m getting an error saying “cannot concatenate ‘str’ and ‘int’ objects”. Concatenating two str columns is working perfectly fine.


Answer 0

In [15]: import numpy as np; from pandas import DataFrame

In [16]: df = DataFrame(np.arange(10).reshape(5,2),columns=list('AB'))

In [17]: df
Out[17]: 
   A  B
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9

In [18]: df.dtypes
Out[18]: 
A    int64
B    int64
dtype: object

Convert a series

In [19]: df['A'].apply(str)
Out[19]: 
0    0
1    2
2    4
3    6
4    8
Name: A, dtype: object

In [20]: df['A'].apply(str)[0]
Out[20]: '0'

Don’t forget to assign the result back:

df['A'] = df['A'].apply(str)

Convert the whole frame

In [21]: df.applymap(str)
Out[21]: 
   A  B
0  0  1
1  2  3
2  4  5
3  6  7
4  8  9

In [22]: df.applymap(str).iloc[0,0]
Out[22]: '0'

df = df.applymap(str)

Answer 1

Change data type of DataFrame column:

To int:

df.column_name = df.column_name.astype(np.int64)

To str:

df.column_name = df.column_name.astype(str)
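
Applied to the concatenation problem from the question, the conversion might look like the sketch below. The frame contents are made up for the illustration, and the column X.2 and the result column joined are assumed names; only X.3 and mtrx come from the question.

```python
import pandas as pd

# Made-up data standing in for the question's mtrx frame.
mtrx = pd.DataFrame({'X.2': ['a', 'b', 'c'], 'X.3': [1, 2, 3]})

mtrx['X.3'] = mtrx['X.3'].astype(str)        # integers become strings
mtrx['joined'] = mtrx['X.2'] + mtrx['X.3']   # element-wise string concatenation

print(mtrx['joined'].tolist())
```

The key point is that the converted column has to be assigned back before concatenating; astype() returns a new Series and does not modify the frame in place.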


Answer 2

Warning: Both solutions given (astype() and apply()) do not preserve NULL values in either the nan or the None form.

import pandas as pd
import numpy as np

df = pd.DataFrame([None,'string',np.nan,42], index=[0,1,2,3], columns=['A'])

df1 = df['A'].astype(str)
df2 =  df['A'].apply(str)

print(df.isnull())
print(df1.isnull())
print(df2.isnull())

I believe this is fixed by the implementation of to_string()


Answer 3

Use the following code:

df.column_name = df.column_name.astype('str')

Answer 4

Just for an additional reference.

All of the above answers will work in the case of a data frame. But if you are using a lambda (or a row-wise function) while creating or modifying a column, this won’t work, because there the value is treated as a plain int rather than a pandas Series. You have to use str(target_attribute) to make it a string. Please refer to the example below.

def add_zero_in_prefix(row):
    # row is a single DataFrame row, so row['Hour'] is a plain integer
    if row['Hour'] < 10:
        return '0' + str(row['Hour'])
    return str(row['Hour'])  # hours with two digits need no padding

data['str_hr'] = data.apply(add_zero_in_prefix, axis=1)

NumPy or Pandas: Keeping array type as integer while having a NaN value

Question: NumPy or Pandas: Keeping array type as integer while having a NaN value

Is there a preferred way to keep the data type of a numpy array fixed as int (or int64 or whatever), while still having an element inside listed as numpy.NaN?

In particular, I am converting an in-house data structure to a Pandas DataFrame. In our structure, we have integer-type columns that still have NaN’s (but the dtype of the column is int). It seems to recast everything as a float if we make this a DataFrame, but we’d really like to be int.

Thoughts?

Things tried:

I tried using the from_records() function under pandas.DataFrame, with coerce_float=False and this did not help. I also tried using NumPy masked arrays, with NaN fill_value, which also did not work. All of these caused the column data type to become a float.


Answer 0

This capability has been added to pandas (beginning with version 0.24): https://pandas.pydata.org/pandas-docs/version/0.24/whatsnew/v0.24.0.html#optional-integer-na-support

At this point, it requires the use of extension dtype Int64 (capitalized), rather than the default dtype int64 (lowercase).
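
A minimal sketch of the extension dtype, assuming pandas 0.24 or later is installed. The exact way a missing value is displayed differs between pandas versions, so only the dtype and the missing-value mask are inspected here.

```python
import pandas as pd

# Capital "I": the nullable extension dtype, not NumPy's int64.
s = pd.Series([1, 2, None], dtype='Int64')

print(s.dtype)              # stays an integer dtype despite the missing value
print(s.isna().tolist())    # which entries are missing
print((s + 1).dtype)        # arithmetic keeps the nullable integer dtype
```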


Answer 1

NaN can’t be stored in an integer array. This is a known limitation of pandas at the moment; I have been waiting for progress to be made with NA values in NumPy (similar to NAs in R), but it will be at least 6 months to a year before NumPy gets these features, it seems:

http://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na

(This feature has been added beginning with version 0.24 of pandas, but note it requires the use of extension dtype Int64 (capitalized), rather than the default dtype int64 (lower case): https://pandas.pydata.org/pandas-docs/version/0.24/whatsnew/v0.24.0.html#optional-integer-na-support )


Answer 2

If performance is not the main issue, you can store strings instead.

df.col = df.col.dropna().apply(lambda x: str(int(x)) )

Then you can mix them with NaN as much as you want. If you really want to have integers, depending on your application, you can use -1, or 0, or 1234567890, or some other dedicated value to represent NaN.

You can also temporarily duplicate the columns: one as you have, with floats; the other one experimental, with ints or strings. Then insert asserts in every reasonable place, checking that the two are in sync. After enough testing you can let go of the floats.


Answer 3

This is not a solution for all cases, but in mine (genomic coordinates) I’ve resorted to using 0 as NaN:

a3['MapInfo'] = a3['MapInfo'].fillna(0).astype(int)

This at least allows the proper ‘native’ column type to be used, so operations like subtraction, comparison, etc. work as expected.


Answer 4

Pandas v0.24+

Functionality to support NaN in integer series will be available in v0.24 upwards. There’s information on this in the v0.24 “What’s New” section, and more details under Nullable Integer Data Type.

Pandas v0.23 and earlier

In general, it’s best to work with float series where possible, even when the series is upcast from int to float due to inclusion of NaN values. This enables vectorised NumPy-based calculations where, otherwise, Python-level loops would be processed.

The docs do suggest: “One possibility is to use dtype=object arrays instead.” For example:

s = pd.Series([1, 2, 3, np.nan])

print(s.astype(object))

0      1
1      2
2      3
3    NaN
dtype: object

For cosmetic reasons, e.g. output to a file, this may be preferable.

Pandas v0.23 and earlier: background

NaN is considered a float. The docs currently (as of v0.23) specify the reason why integer series are upcast to float:

In the absence of high performance NA support being built into NumPy from the ground up, the primary casualty is the ability to represent NAs in integer arrays.

This trade-off is made largely for memory and performance reasons, and also so that the resulting Series continues to be “numeric”.

The docs also provide rules for upcasting due to NaN inclusion:

Typeclass   Promotion dtype for storing NAs
floating    no change
object      no change
integer     cast to float64
boolean     cast to object

Answer 5

This is now possible since pandas v0.24.0.

Quote from the pandas 0.24.x release notes: “Pandas has gained the ability to hold integer dtypes with missing values.”


Answer 6

Just wanted to add that if you are trying to convert a float vector (e.g. 1.143) that contains NA to integer (1), converting directly to the new ‘Int64’ dtype will give you an error. To solve this, you have to round the numbers first and then do “.astype(‘Int64’)”:

s1 = pd.Series([1.434, 2.343, np.nan])
# without round() the next line raises an error:
# cannot safely cast non-equivalent float64 to int64
s1.astype('Int64')
# with round() it works
s1.round().astype('Int64')
0      1
1      2
2    NaN
dtype: Int64

My use case is that I have a float series that I want to round to int, but when you do .round() a ‘*.0’ at the end of the number remains, so you can drop that 0 from the end by converting to int.


Answer 7

If there are blanks in the text data, columns that would normally be integers will be cast to float64 dtype because int64 dtype cannot handle nulls. This can cause an inconsistent schema if you are loading multiple files, some with blanks (which will end up as float64) and others without (which will end up as int64).

This code will attempt to convert any number-type columns to Int64 (as opposed to int64), since Int64 can handle nulls:

import pandas as pd
import numpy as np

#show datatypes before transformation
mydf.dtypes

for c in mydf.select_dtypes(np.number).columns:
    try:
        mydf[c] = mydf[c].astype('Int64')
        print('cast {} as Int64'.format(c))
    except (TypeError, ValueError):  # e.g. floats with a fractional part
        print('could not cast {} to Int64'.format(c))

#show datatypes after transformation
mydf.dtypes

How to convert an int to a hex string?

Question: How to convert an int to a hex string?

I want to convert an integer (that will be <= 255) to a hex string representation.

e.g.: I want to pass in 65 and get out '\x41', or 255 and get '\xff'.

I’ve tried doing this with the struct.pack('c',65), but that chokes on anything above 9 since it wants to take in a single character string.


Answer 0

You are looking for the chr function.

You seem to be mixing decimal representations of integers and hex representations of integers, so it’s not entirely clear what you need. Based on the description you gave, I think one of these snippets shows what you want.

>>> chr(0x65) == '\x65'
True


>>> hex(65)
'0x41'
>>> chr(65) == '\x41'
True

Note that this is quite different from a string containing an integer as hex. If that is what you want, use the hex builtin.
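
The distinction between the character, the raw byte, and the hex digits as text can be sketched as follows. This assumes Python 3, where chr() returns a str and a raw byte needs bytes(); the variable names are illustrative.

```python
n = 65

as_text = chr(n)                   # one-character str with code point 65
as_byte = bytes([n])               # a single raw byte (Python 3)
as_hex_digits = format(n, '02x')   # the two hex digits as ordinary text

print(as_text == '\x41')
print(as_byte == b'\x41')
print(as_hex_digits)
```

Which of the three is wanted depends on whether the result is meant to be written to a binary stream or shown to a person.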


Answer 1

This will convert an integer to a 2 digit hex string with the 0x prefix:

strHex = "0x%0.2X" % 255

Answer 2

What about hex()?

hex(255)  # 0xff

If you really want to have \ in front you can do:

print('\\' + hex(255)[1:])  # \xff

Answer 3

Try:

"0x%x" % 255 # => 0xff

or

"0x%X" % 255 # => 0xFF

Python Documentation says: “keep this under Your pillow”: http://docs.python.org/library/index.html


Answer 4

Let me add this one, because sometimes you just want the single-digit representation (the format code can be lowercase ‘x’ or uppercase ‘X’; the choice determines whether the output letters are lower or upper case):

'{:x}'.format(15)
> f

And now with the new f'' format strings you can do:

f'{15:x}'
> f

To add 0 padding you can use 0>n:

f'{2034:0>4X}'
> 07F2

NOTE: the initial ‘f’ in f'{15:x}' is to signify a format string


Answer 5

If you want to pack a struct with a value <= 255 (one unsigned byte, uint8_t) and end up with a string of one character, you’re probably looking for the format B instead of c. The c format converts a character to a string (not too useful by itself), while B converts an integer.

struct.pack('B', 65)

(And yes, 65 is \x41, not \x65.)

The struct module will also conveniently handle endianness for communication or other uses.
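
A short sketch of both points, the B format and the endianness prefixes, assuming Python 3 (where struct.pack returns bytes). The value 258 is chosen only because its two bytes differ (0x01 and 0x02).

```python
import struct

one_byte = struct.pack('B', 65)      # unsigned char, accepts 0..255
print(one_byte)

# '>' selects big-endian and '<' little-endian; 'H' is an unsigned 16-bit integer.
big = struct.pack('>H', 258)
little = struct.pack('<H', 258)
print(big, little)
```

Values outside 0..255 make struct.pack('B', ...) raise struct.error, which is useful as a range check.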


Answer 6

Note that for large values, hex() still works (some other answers don’t):

x = hex(349593196107334030177678842158399357)
print(x)

Python 2: 0x4354467b746f6f5f736d616c6c3f7dL
Python 3: 0x4354467b746f6f5f736d616c6c3f7d

For a decrypted RSA message, one could do the following:

import binascii

hexadecimals = hex(349593196107334030177678842158399357)

print(binascii.unhexlify(hexadecimals[2:-1])) # python 2
print(binascii.unhexlify(hexadecimals[2:])) # python 3
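
The slicing above breaks when hex() yields an odd number of digits (for example hex(256) is '0x100'), because unhexlify needs whole bytes. A possible alternative on Python 3 is int.to_bytes, sketched below; the helper name int_to_bytes is an assumption for illustration.

```python
def int_to_bytes(number):
    """Convert a non-negative int to its big-endian bytes (Python 3)."""
    length = max(1, (number.bit_length() + 7) // 8)  # whole bytes needed
    return number.to_bytes(length, "big")


number = int("4354467b746f6f5f736d616c6c3f7d", 16)  # the hex digits shown above
print(int_to_bytes(number))   # b'CTF{too_small?}'
print(int_to_bytes(256))      # b'\x01\x00'
```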

回答 7

这对我来说最好

"0x%02X" % 5  # => 0x05
"0x%02X" % 17 # => 0x11

如果您想要一个更大的宽度(2是2个十六进制打印字符),请更改(2),这样3将为您提供以下内容

"0x%03X" % 5  # => 0x005
"0x%03X" % 17 # => 0x011

This worked best for me

"0x%02X" % 5  # => 0x05
"0x%02X" % 17 # => 0x11

Change the 2 if you want a bigger width (2 means 2 printed hex characters), so 3 will give you the following

"0x%03X" % 5  # => 0x005
"0x%03X" % 17 # => 0x011

回答 8

我希望将一个随机整数转换为以#开头的六位十六进制字符串。为了得到这个我用了

"#%06x" % random.randint(0, 0xFFFFFF)

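I wanted a random integer converted into a six-digit hex string with a # at the beginning. To get this I used the following (random.randint needs both bounds, and the 0 in %06x pads with zeros instead of spaces)

"#%06x" % random.randint(0, 0xFFFFFF)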

回答 9

随着format(),按照格式的例子,我们可以这样做:

>>> # format also supports binary numbers
>>> "int: {0:d};  hex: {0:x};  oct: {0:o};  bin: {0:b}".format(42)
'int: 42;  hex: 2a;  oct: 52;  bin: 101010'
>>> # with 0x, 0o, or 0b as prefix:
>>> "int: {0:d};  hex: {0:#x};  oct: {0:#o};  bin: {0:#b}".format(42)
'int: 42;  hex: 0x2a;  oct: 0o52;  bin: 0b101010'

With format(), as per format-examples, we can do:

>>> # format also supports binary numbers
>>> "int: {0:d};  hex: {0:x};  oct: {0:o};  bin: {0:b}".format(42)
'int: 42;  hex: 2a;  oct: 52;  bin: 101010'
>>> # with 0x, 0o, or 0b as prefix:
>>> "int: {0:d};  hex: {0:#x};  oct: {0:#o};  bin: {0:#b}".format(42)
'int: 42;  hex: 0x2a;  oct: 0o52;  bin: 0b101010'

回答 10

(int_variable).to_bytes(bytes_length, byteorder='big').hex()  # byteorder may be 'big' or 'little'

例如:

>>> (434).to_bytes(4, byteorder='big').hex()
'000001b2'
>>> (434).to_bytes(4, byteorder='little').hex()
'b2010000'

(int_variable).to_bytes(bytes_length, byteorder='big').hex()  # byteorder may be 'big' or 'little'

For example:

>>> (434).to_bytes(4, byteorder='big').hex()
'000001b2'
>>> (434).to_bytes(4, byteorder='little').hex()
'b2010000'

回答 11

您也可以将任何基数的任何数字转换为十六进制。在这里使用这一行代码很容易使用:

hex(int(n,x)).replace("0x","")

您有一个字符串n,该字符串是您的数字以及x 该数字的基数。首先,将其更改为整数,然后更改为十六进制,但是十六进制首先更改为十六进制0x,因此replace我们将其删除。

You can also convert a number written in any base to hex. This one-liner is simple to use:

hex(int(n,x)).replace("0x","")

Here n is a string holding your number and x is the base of that number. First convert it to an integer, then to hex; hex puts 0x at the front of its result, so we remove it with replace.
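
The same idea can be sketched with format(), which never emits the 0x prefix, so there is nothing to strip afterwards. The function name to_hex_from_base is hypothetical.

```python
def to_hex_from_base(text, base):
    """Parse `text` as a number written in `base`; return hex digits without '0x'."""
    return format(int(text, base), "x")


print(to_hex_from_base("777", 8))    # 1ff
print(to_hex_from_base("1010", 2))   # a
print(to_hex_from_base("255", 10))   # ff
```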


回答 12

作为替代表示,您可以使用

[in] '%s' % hex(15)
[out]'0xf'

As an alternative representation you could use

[in] '%s' % hex(15)
[out]'0xf'

如何将输入读取为数字?

问题:如何将输入读取为数字?

为什么在下面的代码中使用xy字符串而不是整数?

(注意:在Python 2.x中使用raw_input()。在Python 3.x中使用input()。在Python 3.x中raw_input()被重命名为input()

play = True

while play:

    x = input("Enter a number: ")
    y = input("Enter a number: ")

    print(x + y)
    print(x - y)
    print(x * y)
    print(x / y)
    print(x % y)

    if input("Play again? ") == "no":
        play = False

Why are x and y strings instead of ints in the below code?

(Note: in Python 2.x use raw_input(). In Python 3.x use input(). raw_input() was renamed to input() in Python 3.x)

play = True

while play:

    x = input("Enter a number: ")
    y = input("Enter a number: ")

    print(x + y)
    print(x - y)
    print(x * y)
    print(x / y)
    print(x % y)

    if input("Play again? ") == "no":
        play = False

回答 0

TLDR

  • Python 3不会评估input函数接收到的数据,但是Python 2的input函数会评估(阅读下一节以了解含义)。
  • inputraw_input函数相当于Python 2与Python 3 。

Python 2.x

有两个函数用于获取用户输入,分别称为inputraw_input。它们之间的区别是,raw_input不评估数据并以字符串形式按原样返回。但是,input将对您输入的内容进行评估,评估结果将返回。例如,

>>> import sys
>>> sys.version
'2.7.6 (default, Mar 22 2014, 22:59:56) \n[GCC 4.8.2]'
>>> data = input("Enter a number: ")
Enter a number: 5 + 17
>>> data, type(data)
(22, <type 'int'>)

5 + 17评估数据,结果为22。当它对表达式求值时5 + 17,它将检测到您要添加两个数字,因此结果也将是同一int类型。因此,类型转换是免费完成的,并22作为的结果返回input并存储在data变量中。您可以将其input视为raw_input带有eval调用的组合。

>>> data = eval(raw_input("Enter a number: "))
Enter a number: 5 + 17
>>> data, type(data)
(22, <type 'int'>)

注意:input在Python 2.x 中使用时应小心。我在这个答案中解释了为什么在使用它时要小心。

但是,raw_input不评估输入并以字符串形式原样返回。

>>> import sys
>>> sys.version
'2.7.6 (default, Mar 22 2014, 22:59:56) \n[GCC 4.8.2]'
>>> data = raw_input("Enter a number: ")
Enter a number: 5 + 17
>>> data, type(data)
('5 + 17', <type 'str'>)

Python 3.x

Python 3.x input和Python 2.x raw_input类似,raw_input在Python 3.x中不可用。

>>> import sys
>>> sys.version
'3.4.0 (default, Apr 11 2014, 13:05:11) \n[GCC 4.8.2]'
>>> data = input("Enter a number: ")
Enter a number: 5 + 17
>>> data, type(data)
('5 + 17', <class 'str'>)

要回答您的问题,由于Python 3.x不会评估和转换数据类型,因此必须使用显式转换为ints int,如下所示

x = int(input("Enter a number: "))
y = int(input("Enter a number: "))

您可以接受任意基数的数字,并使用int函数将其直接转换为10基数

>>> data = int(input("Enter a number: "), 8)
Enter a number: 777
>>> data
511
>>> data = int(input("Enter a number: "), 16)
Enter a number: FFFF
>>> data
65535
>>> data = int(input("Enter a number: "), 2)
Enter a number: 10101010101
>>> data
1365

第二个参数告诉输入的数字的基础是什么,然后在内部对其进行理解和转换。如果输入的数据有误,将抛出ValueError

>>> data = int(input("Enter a number: "), 2)
Enter a number: 1234
Traceback (most recent call last):
  File "<input>", line 1, in <module>
ValueError: invalid literal for int() with base 2: '1234'

对于可以包含小数部分的值,类型应为float而不是int

x = float(input("Enter a number:"))

除此之外,您的程序可以像这样进行一些更改

while True:
    ...
    ...
    if input("Play again? ") == "no":
        break

您可以play使用break和摆脱变量while True

TLDR

  • Python 3 doesn’t evaluate the data received with input function, but Python 2’s input function does (read the next section to understand the implication).
  • Python 2’s equivalent of Python 3’s input is the raw_input function.

Python 2.x

There were two functions to get user input, called input and raw_input. The difference between them is, raw_input doesn’t evaluate the data and returns as it is, in string form. But, input will evaluate whatever you entered and the result of evaluation will be returned. For example,

>>> import sys
>>> sys.version
'2.7.6 (default, Mar 22 2014, 22:59:56) \n[GCC 4.8.2]'
>>> data = input("Enter a number: ")
Enter a number: 5 + 17
>>> data, type(data)
(22, <type 'int'>)

The data 5 + 17 is evaluated and the result is 22. When it evaluates the expression 5 + 17, it detects that you are adding two numbers and so the result will also be of the same int type. So, the type conversion is done for free and 22 is returned as the result of input and stored in data variable. You can think of input as the raw_input composed with an eval call.

>>> data = eval(raw_input("Enter a number: "))
Enter a number: 5 + 17
>>> data, type(data)
(22, <type 'int'>)

Note: you should be careful when you are using input in Python 2.x. I explained why one should be careful when using it, in this answer.

But, raw_input doesn’t evaluate the input and returns as it is, as a string.

>>> import sys
>>> sys.version
'2.7.6 (default, Mar 22 2014, 22:59:56) \n[GCC 4.8.2]'
>>> data = raw_input("Enter a number: ")
Enter a number: 5 + 17
>>> data, type(data)
('5 + 17', <type 'str'>)

Python 3.x

Python 3.x’s input and Python 2.x’s raw_input are similar and raw_input is not available in Python 3.x.

>>> import sys
>>> sys.version
'3.4.0 (default, Apr 11 2014, 13:05:11) \n[GCC 4.8.2]'
>>> data = input("Enter a number: ")
Enter a number: 5 + 17
>>> data, type(data)
('5 + 17', <class 'str'>)

Solution

To answer your question, since Python 3.x doesn’t evaluate and convert the data type, you have to explicitly convert to ints, with int, like this

x = int(input("Enter a number: "))
y = int(input("Enter a number: "))

You can accept numbers of any base and convert them directly to base-10 with the int function, like this

>>> data = int(input("Enter a number: "), 8)
Enter a number: 777
>>> data
511
>>> data = int(input("Enter a number: "), 16)
Enter a number: FFFF
>>> data
65535
>>> data = int(input("Enter a number: "), 2)
Enter a number: 10101010101
>>> data
1365

The second parameter gives the base of the number entered; int then parses the digits in that base and returns an ordinary integer. If the entered text is not valid in that base, it raises a ValueError.

>>> data = int(input("Enter a number: "), 2)
Enter a number: 1234
Traceback (most recent call last):
  File "<input>", line 1, in <module>
ValueError: invalid literal for int() with base 2: '1234'

For values that can have a fractional component, the type would be float rather than int:

x = float(input("Enter a number:"))

Apart from that, your program can be changed a little bit, like this

while True:
    ...
    ...
    if input("Play again? ") == "no":
        break

You can get rid of the play variable by using break and while True.
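
Since int raises ValueError on bad input, a program that should survive typos can retry instead of crashing. The following is a minimal sketch for Python 3; read_int and its ask parameter are assumptions made for illustration (ask defaults to input and exists only so the function can be exercised without a keyboard).

```python
def read_int(prompt, ask=input):
    """Keep asking until the reply parses as an integer."""
    while True:
        reply = ask(prompt)
        try:
            return int(reply)   # int() tolerates surrounding whitespace
        except ValueError:
            print("Not a whole number: {!r}".format(reply))


# Typical interactive use:
#     x = read_int("Enter a number: ")
#     y = read_int("Enter a number: ")
```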


回答 1

在Python 3.x中,raw_input已重命名为,inputinput删除了Python2.x 。

这意味着,就像Python 3.x中的一样raw_inputinput总是返回一个字符串对象。

要解决此问题,您需要通过将它们输入以下内容来将这些输入明确地变成整数int

x = int(input("Enter a number: "))
y = int(input("Enter a number: "))

In Python 3.x, raw_input was renamed to input and the Python 2.x input was removed.

This means that, just like raw_input, input in Python 3.x always returns a string object.

To fix the problem, you need to explicitly make those inputs into integers by putting them in int:

x = int(input("Enter a number: "))
y = int(input("Enter a number: "))

回答 2

对于单行中的多个整数,map可能会更好。

arr = map(int, raw_input().split())

如果数字已知(例如2个整数),则可以使用

num1, num2 = map(int, raw_input().split())

For multiple integers on a single line, map might be better.

arr = map(int, raw_input().split())

If the count of numbers is already known (for example, 2 integers), you can use

num1, num2 = map(int, raw_input().split())
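
The snippets above are Python 2. In Python 3, raw_input is gone and map returns a lazy iterator rather than a list, so a rough equivalent might look like this. The helper parse_ints is a made-up name, and the sample strings stand in for what input() would return.

```python
def parse_ints(line):
    """Split a whitespace-separated line and convert each piece to int."""
    return [int(token) for token in line.split()]


numbers = parse_ints("3 14 15")     # in a real program: parse_ints(input())
first, second = parse_ints("7 9")   # unpacking works when the count is known

print(numbers)         # [3, 14, 15]
print(first, second)   # 7 9
```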

回答 3

input()(Python 3)和raw_input()(Python 2)始终返回字符串。使用显式将结果转换为整数int()

x = int(input("Enter a number: "))
y = int(input("Enter a number: "))

input() (Python 3) and raw_input() (Python 2) always return strings. Convert the result to integer explicitly with int().

x = int(input("Enter a number: "))
y = int(input("Enter a number: "))

回答 4

多个问题需要在单行上输入多个整数。最好的方法是一行输入整个数字字符串,然后将它们拆分为整数。这是Python 3版本:

a = []
p = input()
p = p.split()      
for i in p:
    a.append(int(i))

也可以使用列表理解

p = input().split("whatever the separator is")

并将所有输入从字符串转换为整数,我们执行以下操作

x = [int(i) for i in p]
print(*x)

应以直线打印列表元素。

Many problems require reading several integers from a single line. The best way is to read the whole string of numbers on one line and then split it into integers. Here is a Python 3 version:

a = []
p = input()
p = p.split()      
for i in p:
    a.append(int(i))

Also a list comprehension can be used

p = input().split("whatever the separator is")

And to convert all the inputs from string to int we do the following

x = [int(i) for i in p]
print(*x)

which prints the list elements on a single line, separated by spaces.


回答 5

转换为整数:

my_number = int(input("enter the number"))

对于浮点数类似:

my_decimalnumber = float(input("enter the number"))

Convert to integers:

my_number = int(input("enter the number"))

Similarly for floating point numbers:

my_decimalnumber = float(input("enter the number"))
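
When it is not known in advance whether the user will type a whole number or a decimal, one option is to try int first and fall back to float. This is only a sketch; parse_number is a hypothetical helper, and text that is neither still raises ValueError.

```python
def parse_number(text):
    """Return an int when possible, otherwise a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)   # raises ValueError if text is not numeric at all


print(parse_number("3"))     # 3
print(parse_number("2.5"))   # 2.5
```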

回答 6

n=int(input())
for i in range(n):
    n=input()
    n=int(n)
    arr1=list(map(int,input().split()))

for循环应运行’n’次。第二个“ n”是数组的长度。最后一条语句将整数映射到列表,并以空格分隔的形式接受输入。您还可以在for循环的末尾返回数组。

n=int(input())
for i in range(n):
    n=input()
    n=int(n)
    arr1=list(map(int,input().split()))

The for loop runs n times. The second n, read inside the loop, is the length of the array. The last statement takes space-separated input and maps it to a list of integers. You can also return the array at the end of the for loop.


回答 7

我在解决CodeChef上的问题时遇到了输入整数的问题,该问题应从一行读取两个以空格分隔的整数。

虽然int(input())对于单个整数就足够了,但是我没有找到直接输入两个整数的方法。我尝试了这个:

num = input()
num1 = 0
num2 = 0

for i in range(len(num)):
    if num[i] == ' ':
        break

num1 = int(num[:i])
num2 = int(num[i+1:])

现在,我将num1和num2用作整数。希望这可以帮助。

I encountered a problem of taking integer input while solving a problem on CodeChef, where two integers – separated by space – should be read from one line.

While int(input()) is sufficient for a single integer, I did not find a direct way to input two integers. I tried this:

num = input()
num1 = 0
num2 = 0

for i in range(len(num)):
    if num[i] == ' ':
        break

num1 = int(num[:i])
num2 = int(num[i+1:])

Now I use num1 and num2 as integers. Hope this helps.


回答 8

def dbz():
    try:
        r = raw_input("Enter number:")
        if r.isdigit():
            i = int(raw_input("Enter divident:"))
            d = int(r)/i
            print "O/p is -:",d
        else:
            print "Not a number"
    except Exception ,e:
        print "Program halted incorrect data entered",type(e)
dbz()

Or 

num = input("Enter Number:")  # Python 2 only: input() evaluates the typed text, so 5 comes back as an int

回答 9

尽管在你的榜样,int(input(...))做的伎俩在任何情况下,python-futurebuiltins.input是值得考虑的,因为这可以确保你的代码同时适用于Python 2和3 ,并禁用Python2的违约行为,input努力成为‘聪明的’关于输入数据类型(builtins.input基本上只是的行为类似于raw_input)。

While in your example int(input(...)) does the trick in any case, python-future’s builtins.input is worth considering, since it makes sure your code works on both Python 2 and 3 and disables Python 2’s default behaviour of input trying to be “clever” about the input data type (builtins.input basically just behaves like raw_input).


将列表中的所有字符串转换为int

问题:将列表中的所有字符串转换为int

在Python中,我想将列表中的所有字符串转换为整数。

所以,如果我有:

results = ['1', '2', '3']

我该如何做:

results = [1, 2, 3]

In Python, I want to convert all strings in a list to integers.

So if I have:

results = ['1', '2', '3']

How do I make it:

results = [1, 2, 3]

回答 0

使用map功能(在Python 2.x中):

results = map(int, results)

在Python 3中,您需要将结果从map转换为列表:

results = list(map(int, results))

Use the map function (in Python 2.x):

results = map(int, results)

In Python 3, you will need to convert the result from map to a list:

results = list(map(int, results))

回答 1

使用列表理解

results = [int(i) for i in results]

例如

>>> results = ["1", "2", "3"]
>>> results = [int(i) for i in results]
>>> results
[1, 2, 3]

Use a list comprehension:

results = [int(i) for i in results]

e.g.

>>> results = ["1", "2", "3"]
>>> results = [int(i) for i in results]
>>> results
[1, 2, 3]

回答 2

比列表理解要扩展一点,但同样有用:

def str_list_to_int_list(str_list):
    n = 0
    while n < len(str_list):
        str_list[n] = int(str_list[n])
        n += 1
    return(str_list)

例如

>>> results = ["1", "2", "3"]
>>> str_list_to_int_list(results)
[1, 2, 3]

也:

def str_list_to_int_list(str_list):
    int_list = [int(n) for n in str_list]
    return int_list

A little bit more expanded than list comprehension but likewise useful:

def str_list_to_int_list(str_list):
    n = 0
    while n < len(str_list):
        str_list[n] = int(str_list[n])
        n += 1
    return(str_list)

e.g.

>>> results = ["1", "2", "3"]
>>> str_list_to_int_list(results)
[1, 2, 3]

Also:

def str_list_to_int_list(str_list):
    int_list = [int(n) for n in str_list]
    return int_list
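
The two versions above differ in one way that is easy to miss: the while-loop version overwrites the caller's list, while the comprehension version builds a new one. A small sketch of the difference, using the hypothetical names convert_in_place and convert_copy:

```python
def convert_in_place(items):
    """Overwrite each element of the caller's list with its int value."""
    for index, text in enumerate(items):
        items[index] = int(text)
    return items


def convert_copy(items):
    """Leave the caller's list alone and return a new list of ints."""
    return [int(text) for text in items]


original = ["1", "2", "3"]
copied = convert_copy(original)
print(original)   # ['1', '2', '3']
print(copied)     # [1, 2, 3]

convert_in_place(original)
print(original)   # [1, 2, 3]
```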

“ is”运算符对整数的行为异常

问题:“ is”运算符对整数的行为异常

为什么以下内容在Python中表现异常?

>>> a = 256
>>> b = 256
>>> a is b
True           # This is an expected result
>>> a = 257
>>> b = 257
>>> a is b
False          # What happened here? Why is this False?
>>> 257 is 257
True           # Yet the literal numbers compare properly

我正在使用Python 2.5.2。尝试使用某些不同版本的Python,Python 2.3.3似乎在99到100之间显示了上述行为。

基于以上所述,我可以假设Python是内部实现的,因此“小”整数的存储方式与大整数的存储方式不同,并且is运算符可以分辨出这种差异。为什么要泄漏抽象?当我事先不知道它们是否为数字时,比较两个任意对象以查看它们是否相同的更好方法是什么?

Why does the following behave unexpectedly in Python?

>>> a = 256
>>> b = 256
>>> a is b
True           # This is an expected result
>>> a = 257
>>> b = 257
>>> a is b
False          # What happened here? Why is this False?
>>> 257 is 257
True           # Yet the literal numbers compare properly

I am using Python 2.5.2. Trying some different versions of Python, it appears that Python 2.3.3 shows the above behaviour between 99 and 100.

Based on the above, I can hypothesize that Python is internally implemented such that “small” integers are stored in a different way than larger integers and the is operator can tell the difference. Why the leaky abstraction? What is a better way of comparing two arbitrary objects to see whether they are the same when I don’t know in advance whether they are numbers or not?


回答 0

看看这个:

>>> a = 256
>>> b = 256
>>> id(a)
9987148
>>> id(b)
9987148
>>> a = 257
>>> b = 257
>>> id(a)
11662816
>>> id(b)
11662828

这是我在Python 2文档“普通整数对象”中发现的内容(对于Python 3也是一样):

当前的实现为-5到256之间的所有整数保留一个整数对象数组,当您在该范围内创建int时,实际上实际上是返回对现有对象的引用。因此应该可以更改1的值。我怀疑在这种情况下Python的行为是不确定的。:-)

Take a look at this:

>>> a = 256
>>> b = 256
>>> id(a)
9987148
>>> id(b)
9987148
>>> a = 257
>>> b = 257
>>> id(a)
11662816
>>> id(b)
11662828

Here’s what I found in the Python 2 documentation, “Plain Integer Objects” (It’s the same for Python 3):

The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object. So it should be possible to change the value of 1. I suspect the behaviour of Python in this case is undefined. :-)
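
A quick way to explore this is to build the integers at run time, so that the compiler cannot share a constant between them. The sketch below deliberately shows no expected output for the identity check, because that result is an implementation detail; the variable names are illustrative.

```python
a = int("257")   # built at run time, so no compile-time constant sharing
b = int("257")

same_value = (a == b)    # True on every Python implementation
same_object = (a is b)   # depends on the implementation; do not rely on it

print(same_value)
print(same_object)
```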


回答 1

Python的“ is”运算符在使用整数时表现异常吗?

总结-让我强调一下:不要is用于比较整数。

这不是您应该有任何期望的行为。

相反,分别使用==!=比较相等和不平等。例如:

>>> a = 1000
>>> a == 1000       # Test integers like this,
True
>>> a != 5000       # or this!
True
>>> a is 1000       # Don't do this! - Don't use `is` to test integers!!
False

说明

要知道这一点,您需要了解以下内容。

首先,该怎么is办?它是一个比较运算符。从文档中

运算符isis not测试对象标识:x is y当且仅当x和y是同一对象时才为true。x is not y产生反真值。

因此,以下内容是等效的。

>>> a is b
>>> id(a) == id(b)

文档中

id 返回对象的“身份”。这是一个整数(或长整数),在该对象的生存期内,此整数保证是唯一且恒定的。具有不重叠生存期的两个对象可能具有相同的id()值。

请注意,CPython(Python的参考实现)中对象的ID是内存中的位置这一事实是实现细节。Python的其他实现(例如Jython或IronPython)可以轻松地使用的不同实现id

那么用例是is什么呢? PEP8描述

与单例之类的比较None应始终使用isis not,而不应使用相等运算符。

问题

您询问并陈述以下问题(带有代码):

为什么以下内容在Python中表现异常?

>>> a = 256
>>> b = 256
>>> a is b
True           # This is an expected result

不是预期的结果。为什么会这样?这仅意味着256两者a和引用的整数值是整数b的相同实例。整数在Python中是不可变的,因此它们不能更改。这对任何代码都没有影响。不应期望。这仅仅是一个实现细节。

但是也许我们应该为每次声明一个等于256的值而在内存中没有新的单独实例感到高兴。

>>> a = 257
>>> b = 257
>>> a is b
False          # What happened here? Why is this False?

看起来我们现在有两个单独的整数实例,它们的值257在内存中。由于整数是不可变的,因此会浪费内存。希望我们不要浪费很多。我们可能不是。但是不能保证这种行为。

>>> 257 is 257
True           # Yet the literal numbers compare properly

好吧,这看起来好像您的Python特定实现正在尝试变得聪明,除非必须这样做,否则不会在内存中创建冗余值的整数。您似乎表明您正在使用Python的引用实现,即CPython。对CPython有好处。

如果CPython可以在全球范围内做到这一点甚至更好,如果它可以便宜地做到这一点(因为查找会花费一定的成本),也许还有另一种实现方式。

但是对于对代码的影响,您不必在乎整数是否是整数的特定实例。您只需要关心该实例的值是什么,就可以使用普通的比较运算符,即==

是什么is

is检查id两个对象的相同。在CPython中,id是内存中的位置,但是在另一个实现中,它可能是其他一些唯一标识的数字。要用代码重新声明:

>>> a is b

是相同的

>>> id(a) == id(b)

那我们为什么要使用is呢?

相对于说,这是一个非常快速的检查,检查两个很长的字符串的值是否相等。但是由于它适用于对象的唯一性,因此我们的用例有限。实际上,我们主要是想用它来检查None,这是一个单例(内存中一个地方存在的唯一实例)。如果有可能将其他单例合并is,我们可以创建其他单例,我们可能会与进行检查,但这相对较少。这是一个示例(将在Python 2和3中运行),例如

SENTINEL_SINGLETON = object() # this will only be created one time.

def foo(keyword_argument=None):
    if keyword_argument is None:
        print('no argument given to foo')
    bar()
    bar(keyword_argument)
    bar('baz')

def bar(keyword_argument=SENTINEL_SINGLETON):
    # SENTINEL_SINGLETON tells us if we were not passed anything
    # as None is a legitimate potential argument we could get.
    if keyword_argument is SENTINEL_SINGLETON:
        print('no argument given to bar')
    else:
        print('argument to bar: {0}'.format(keyword_argument))

foo()

哪些打印:

no argument given to foo
no argument given to bar
argument to bar: None
argument to bar: baz

因此,我们看到,使用is和和哨兵,我们可以区分何时bar不带参数调用和何时带调用None。这些是主要的用例的is-不要没有用它来测试整数,字符串,元组,或者其他喜欢这些东西的平等。

Python’s “is” operator behaves unexpectedly with integers?

In summary – let me emphasize: Do not use is to compare integers.

This isn’t behavior you should have any expectations about.

Instead, use == and != to compare for equality and inequality, respectively. For example:

>>> a = 1000
>>> a == 1000       # Test integers like this,
True
>>> a != 5000       # or this!
True
>>> a is 1000       # Don't do this! - Don't use `is` to test integers!!
False

Explanation

To know this, you need to know the following.

First, what does is do? It is a comparison operator. From the documentation:

The operators is and is not test for object identity: x is y is true if and only if x and y are the same object. x is not y yields the inverse truth value.

And so the following are equivalent.

>>> a is b
>>> id(a) == id(b)

From the documentation:

id Return the “identity” of an object. This is an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same id() value.

Note that the fact that the id of an object in CPython (the reference implementation of Python) is the location in memory is an implementation detail. Other implementations of Python (such as Jython or IronPython) could easily have a different implementation for id.

So what is the use-case for is? PEP8 describes:

Comparisons to singletons like None should always be done with is or is not, never the equality operators.

The Question

You ask, and state, the following question (with code):

Why does the following behave unexpectedly in Python?

>>> a = 256
>>> b = 256
>>> a is b
True           # This is an expected result

It is not an expected result. Why is it expected? It only means that the integers valued at 256 referenced by both a and b are the same instance of integer. Integers are immutable in Python, thus they cannot change. This should have no impact on any code. It should not be expected. It is merely an implementation detail.

But perhaps we should be glad that there is not a new separate instance in memory every time we state a value equals 256.

>>> a = 257
>>> b = 257
>>> a is b
False          # What happened here? Why is this False?

Looks like we now have two separate instances of integers with the value of 257 in memory. Since integers are immutable, this wastes memory. Let’s hope we’re not wasting a lot of it. We’re probably not. But this behavior is not guaranteed.

>>> 257 is 257
True           # Yet the literal numbers compare properly

Well, this looks like your particular implementation of Python is trying to be smart and not creating redundantly valued integers in memory unless it has to. You seem to indicate you are using the reference implementation of Python, which is CPython. Good for CPython.

It might be even better if CPython could do this globally, if it could do so cheaply (as there would be a cost in the lookup); perhaps another implementation might.

But as for impact on code, you should not care if an integer is a particular instance of an integer. You should only care what the value of that instance is, and you would use the normal comparison operators for that, i.e. ==.

What is does

is checks that the id of two objects are the same. In CPython, the id is the location in memory, but it could be some other uniquely identifying number in another implementation. To restate this with code:

>>> a is b

is the same as

>>> id(a) == id(b)

Why would we want to use is then?

This can be a very fast check relative to, say, checking if two very long strings are equal in value. But since it applies to the uniqueness of the object, we have limited use-cases for it. In fact, we mostly want to use it to check for None, which is a singleton (a sole instance existing in one place in memory). We might create other singletons if there is potential to conflate them, which we might check with is, but these are relatively rare. Here’s an example (it will work in Python 2 and 3):

SENTINEL_SINGLETON = object() # this will only be created one time.

def foo(keyword_argument=None):
    if keyword_argument is None:
        print('no argument given to foo')
    bar()
    bar(keyword_argument)
    bar('baz')

def bar(keyword_argument=SENTINEL_SINGLETON):
    # SENTINEL_SINGLETON tells us if we were not passed anything
    # as None is a legitimate potential argument we could get.
    if keyword_argument is SENTINEL_SINGLETON:
        print('no argument given to bar')
    else:
        print('argument to bar: {0}'.format(keyword_argument))

foo()

Which prints:

no argument given to foo
no argument given to bar
argument to bar: None
argument to bar: baz

And so we see, with is and a sentinel, we are able to differentiate between when bar is called with no arguments and when it is called with None. These are the primary use-cases for is – do not use it to test for equality of integers, strings, tuples, or other things like these.


回答 2

这取决于您是否要看两个事物是否相等或相同的对象。

is检查它们是否是相同的对象,而不仅仅是相等。小整数可能指向相同的内存位置以提高空间效率

In [29]: a = 3
In [30]: b = 3
In [31]: id(a)
Out[31]: 500729144
In [32]: id(b)
Out[32]: 500729144

您应该==用来比较任意对象的相等性。您可以使用__eq____ne__属性指定行为。

It depends on whether you’re looking to see if 2 things are equal, or the same object.

is checks to see if they are the same object, not just equal. The small ints are probably pointing to the same memory location for space efficiency

In [29]: a = 3
In [30]: b = 3
In [31]: id(a)
Out[31]: 500729144
In [32]: id(b)
Out[32]: 500729144

You should use == to compare equality of arbitrary objects. You can specify the behavior with the __eq__ and __ne__ methods.
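
As an illustration of customizing equality, here is a minimal sketch with a made-up Point class. It assumes Python 3, where != is derived from __eq__ automatically, so __ne__ rarely needs to be written by hand.

```python
class Point:
    """Two points are equal when their coordinates match."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented   # let Python try the other operand
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash((self.x, self.y))


p = Point(1, 2)
q = Point(1, 2)
print(p == q)   # True
print(p is q)   # False
```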


回答 3

我来晚了,但是,您想从中获得答案吗?我将尝试以介绍性的方式对此进行说明,以便更多的人可以跟进。


关于CPython的一件好事是您实际上可以看到其来源。我将使用3.5版本的链接,但是找到相应的2.x链接是微不足道的。

在CPython中,用于创建新对象的C-API函数intPyLong_FromLong(long v)。此功能的说明是:

当前的实现为-5到256之间的所有整数保留一个整数对象数组,当您在该范围内创建int时,实际上实际上是返回对现有对象的引用。因此应该可以更改1的值。我怀疑在这种情况下Python的行为是不确定的。:-)

(我的斜体)

不了解您,但我看到了并想:让我们找到那个数组!

如果您还不熟悉实现CPython的C代码,则应 ; 一切都井井有条,可读性强。对于我们而言,我们需要在看Objects子目录中的主源代码目录树

PyLong_FromLong处理long对象,因此不难推断我们需要窥视内部longobject.c。看完内部,您可能会觉得事情很混乱。它们是,但不要担心,我们正在寻找的功能在第230行令人不寒而栗,等待我们检查出来。这是一个很小的函数,因此主体(不包括声明)可以轻松粘贴到此处:

PyObject *
PyLong_FromLong(long ival)
{
    // omitting declarations

    CHECK_SMALL_INT(ival);

    if (ival < 0) {
        /* negate: cant write this as abs_ival = -ival since that
           invokes undefined behaviour when ival is LONG_MIN */
        abs_ival = 0U-(unsigned long)ival;
        sign = -1;
    }
    else {
        abs_ival = (unsigned long)ival;
    }

    /* Fast path for single-digit ints */
    if (!(abs_ival >> PyLong_SHIFT)) {
        v = _PyLong_New(1);
        if (v) {
            Py_SIZE(v) = sign;
            v->ob_digit[0] = Py_SAFE_DOWNCAST(
                abs_ival, unsigned long, digit);
        }
        return (PyObject*)v; 
}

现在,我们不是C 主代码-haxxorz,但我们也不傻,我们可以看到这CHECK_SMALL_INT(ival);一切诱人地窥视着我们。我们可以理解,这与此有关。让我们来看看:

#define CHECK_SMALL_INT(ival) \
    do if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) { \
        return get_small_int((sdigit)ival); \
    } while(0)

因此,get_small_int如果值ival满足条件,则它是一个调用函数的宏:

if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS)

那么什么是NSMALLNEGINTSNSMALLPOSINTS?宏!他们在这里

#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS           257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS           5
#endif

所以我们的条件是if (-5 <= ival && ival < 257)通话get_small_int

接下来,让我们看一下get_small_int它的所有荣耀(好吧,我们只看它的身体,因为那是有趣的地方):

PyObject *v;
assert(-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS);
v = (PyObject *)&small_ints[ival + NSMALLNEGINTS];
Py_INCREF(v);

好的,声明一个PyObject,断言先前的条件成立并执行赋值:

v = (PyObject *)&small_ints[ival + NSMALLNEGINTS];

small_ints看起来很像我们一直在寻找的那个数组,它是!我们只要阅读该死的文档,我们就永远知道!

/* Small integers are preallocated in this array so that they
   can be shared.
   The integers that are preallocated are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

是的,这是我们的家伙。当您要int在该范围内创建一个新[NSMALLNEGINTS, NSMALLPOSINTS)对象时,您只需返回对已预先分配的现有对象的引用。

由于引用引用的是同一对象,因此id()直接发布或检查其上的身份is将返回完全相同的内容。

但是,什么时候分配它们?

_PyLong_Init Python 初始化期间,将很乐意进入for循环为您执行此操作:

for (ival = -NSMALLNEGINTS; ival <  NSMALLPOSINTS; ival++, v++) {

查看源代码以阅读循环体!

希望我的解释使您现在对C的认识清楚(很明显是故意的)。


但是,257 is 257?这是怎么回事?

这实际上更容易解释,我已经尝试过这样做;这是由于Python将这个交互式语句作为一个单独的块执行:

>>> 257 is 257

在编译此语句期间,CPython将看到您有两个匹配的文字,并将使用相同的PyLongObject表示形式257。如果您自己进行编译并检查其内容,则可以看到以下内容:

>>> codeObj = compile("257 is 257", "blah!", "exec")
>>> codeObj.co_consts
(257, None)

当CPython进行操作时,现在将要加载完全相同的对象:

>>> import dis
>>> dis.dis(codeObj)
  1           0 LOAD_CONST               0 (257)   # dis
              3 LOAD_CONST               0 (257)   # dis again
              6 COMPARE_OP               8 (is)

所以is会回来的True

I’m late but, you want some source with your answer? I’ll try and word this in an introductory manner so more folks can follow along.


A good thing about CPython is that you can actually see the source for this. I’m going to use links for the 3.5 release, but finding the corresponding 2.x ones is trivial.

In CPython, the C-API function that handles creating a new int object is PyLong_FromLong(long v). The description for this function is:

The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object. So it should be possible to change the value of 1. I suspect the behaviour of Python in this case is undefined. :-)

(My italics)

Don’t know about you but I see this and think: Let’s find that array!

If you haven’t fiddled with the C code implementing CPython you should; everything is pretty organized and readable. For our case, we need to look in the Objects subdirectory of the main source code directory tree.

PyLong_FromLong deals with long objects so it shouldn’t be hard to deduce that we need to peek inside longobject.c. After looking inside you might think things are chaotic; they are, but fear not, the function we’re looking for is chilling at line 230 waiting for us to check it out. It’s a smallish function so the main body (excluding declarations) is easily pasted here:

PyObject *
PyLong_FromLong(long ival)
{
    // omitting declarations

    CHECK_SMALL_INT(ival);

    if (ival < 0) {
        /* negate: cant write this as abs_ival = -ival since that
           invokes undefined behaviour when ival is LONG_MIN */
        abs_ival = 0U-(unsigned long)ival;
        sign = -1;
    }
    else {
        abs_ival = (unsigned long)ival;
    }

    /* Fast path for single-digit ints */
    if (!(abs_ival >> PyLong_SHIFT)) {
        v = _PyLong_New(1);
        if (v) {
            Py_SIZE(v) = sign;
            v->ob_digit[0] = Py_SAFE_DOWNCAST(
                abs_ival, unsigned long, digit);
        }
        return (PyObject*)v; 
}

Now, we’re no C master-code-haxxorz but we’re also not dumb, we can see that CHECK_SMALL_INT(ival); peeking at us all seductively; we can understand it has something to do with this. Let’s check it out:

#define CHECK_SMALL_INT(ival) \
    do if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) { \
        return get_small_int((sdigit)ival); \
    } while(0)

So it’s a macro that calls function get_small_int if the value ival satisfies the condition:

if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS)

So what are NSMALLNEGINTS and NSMALLPOSINTS? Macros! Here they are:

#ifndef NSMALLPOSINTS
#define NSMALLPOSINTS           257
#endif
#ifndef NSMALLNEGINTS
#define NSMALLNEGINTS           5
#endif

So our condition is if (-5 <= ival && ival < 257) call get_small_int.

Next let’s look at get_small_int in all its glory (well, we’ll just look at its body because that’s where the interesting things are):

PyObject *v;
assert(-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS);
v = (PyObject *)&small_ints[ival + NSMALLNEGINTS];
Py_INCREF(v);

Okay, declare a PyObject, assert that the previous condition holds and execute the assignment:

v = (PyObject *)&small_ints[ival + NSMALLNEGINTS];

small_ints looks a lot like that array we’ve been searching for, and it is! We could’ve just read the damn documentation and we would’ve known all along:

/* Small integers are preallocated in this array so that they
   can be shared.
   The integers that are preallocated are those in the range
   -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
*/
static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

So yup, this is our guy. When you want to create a new int in the range [-NSMALLNEGINTS, NSMALLPOSINTS), that is [-5, 257), you’ll just get back a reference to an already existing object that has been preallocated.

Since the reference refers to the same object, issuing id() directly or checking for identity with is on it will return exactly the same thing.
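
The effect can be observed from Python itself. The following is a minimal sketch, assuming CPython; the helper name same_object is made up for this illustration. It builds each integer from a string at run time so that the compiler cannot share a literal between the two operands. With the limits shown above, values inside the preallocated range should come back as the same object, while values well outside it should not.

```python
def same_object(n):
    """Return True if two ints built independently at run time are one object."""
    a = int(str(n))  # constructed at run time, not loaded from a code constant
    b = int(str(n))
    return a is b


# Boundary values from the macros above, plus one value far outside the range.
for n in (-6, -5, 0, 256, 257, 10**9):
    print(n, same_object(n))
```

The exact boundaries are an implementation detail and may differ in other interpreters or versions, so this is something to experiment with rather than rely on.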

But, when are they allocated??

During initialization, in _PyLong_Init, Python will gladly enter a for loop to do this for you:

for (ival = -NSMALLNEGINTS; ival <  NSMALLPOSINTS; ival++, v++) {

Check out the source to read the loop body!

I hope my explanation has made you C things clearly now (pun obviously intended).


But, 257 is 257? What’s up?

This is actually easier to explain, and I have attempted to do so already; it’s due to the fact that Python will execute this interactive statement as a single block:

>>> 257 is 257

During compilation of this statement, CPython will see that you have two matching literals and will use the same PyLongObject representing 257. You can see this if you do the compilation yourself and examine its contents:

>>> codeObj = compile("257 is 257", "blah!", "exec")
>>> codeObj.co_consts
(257, None)

When CPython does the operation, it’s now just going to load the exact same object:

>>> import dis
>>> dis.dis(codeObj)
  1           0 LOAD_CONST               0 (257)   # dis
              3 LOAD_CONST               0 (257)   # dis again
              6 COMPARE_OP               8 (is)

So is will return True.
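
The same sharing happens for two separate assignments, as long as they are compiled together. This is a small sketch, assuming CPython; the file name "<demo>" and the variable names are arbitrary. It compiles two statements as one block and checks that the literal 257 is stored once and that both names end up bound to that single object.

```python
# Two assignments compiled together as a single block of code.
source = "a = 257\nb = 257\n"
code = compile(source, "<demo>", "exec")

namespace = {}
exec(code, namespace)

# The literal appears once in the constants of the code object...
literal_count = code.co_consts.count(257)
# ...so both names refer to that one object.
shared = namespace["a"] is namespace["b"]
print(literal_count, shared)
```

Typing the two assignments one at a time at the interactive prompt compiles them separately, which is why the result there can differ.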


Answer 4

As you can check in the source file intobject.c, Python caches small integers for efficiency. Every time you create a reference to a small integer, you are referring to the cached small integer, not a new object. 257 is not a small integer, so it is created as a different object.

It is better to use == for that purpose.


Answer 5

I think your hypothesis is correct. Experiment with id (identity of object):

In [1]: id(255)
Out[1]: 146349024

In [2]: id(255)
Out[2]: 146349024

In [3]: id(257)
Out[3]: 146802752

In [4]: id(257)
Out[4]: 148993740

In [5]: a=255

In [6]: b=255

In [7]: c=257

In [8]: d=257

In [9]: id(a), id(b), id(c), id(d)
Out[9]: (146349024, 146349024, 146783024, 146804020)

It appears that numbers <= 255 are treated as literals and anything above is treated differently!


Answer 6

For immutable value objects, like ints, strings or datetimes, object identity is not especially useful. It’s better to think about equality. Identity is essentially an implementation detail for value objects – since they’re immutable, there’s no effective difference between having multiple refs to the same object or multiple objects.


Answer 7

There’s another issue that isn’t pointed out in any of the existing answers. Python is allowed to merge any two immutable values, and pre-created small int values are not the only way this can happen. A Python implementation is never guaranteed to do this, but they all do it for more than just small ints.


For one thing, there are some other pre-created values, such as the empty tuple, str, and bytes, and some short strings (in CPython 3.6, it’s the 256 single-character Latin-1 strings). For example:

>>> a = ()
>>> b = ()
>>> a is b
True

But also, even non-pre-created values can be identical. Consider these examples:

>>> c = 257
>>> d = 257
>>> c is d
False
>>> e, f = 258, 258
>>> e is f
True

And this isn’t limited to int values:

>>> g, h = 42.23e100, 42.23e100
>>> g is h
True

Obviously, CPython doesn’t come with a pre-created float value for 42.23e100. So, what’s going on here?

The CPython compiler will merge constant values of some known-immutable types like int, float, str, bytes, in the same compilation unit. For a module, the whole module is a compilation unit, but at the interactive interpreter, each statement is a separate compilation unit. Since c and d are defined in separate statements, their values aren’t merged. Since e and f are defined in the same statement, their values are merged.


You can see what’s going on by disassembling the bytecode. Try defining a function that does i, j = 258, 258 and then calling dis.dis on it, and you’ll see that there’s a single constant value (258, 258):

>>> import dis
>>> def f(): i, j = 258, 258
>>> dis.dis(f)
  1           0 LOAD_CONST               2 ((258, 258))
              2 UNPACK_SEQUENCE          2
              4 STORE_FAST               0 (i)
              6 STORE_FAST               1 (j)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>> f.__code__.co_consts
(None, 258, (258, 258))
>>> id(f.__code__.co_consts[1]), id(f.__code__.co_consts[2][0]), id(f.__code__.co_consts[2][1])
(4305296480, 4305296480, 4305296480)

You may notice that the compiler has stored 258 as a constant even though it’s not actually used by the bytecode, which gives you an idea of how little optimization CPython’s compiler does. It also means that (non-empty) tuples don’t actually end up merged:

>>> k, l = (1, 2), (1, 2)
>>> k is l
False

Put that in a function, dis it, and look at the co_consts—there’s a 1 and a 2, two (1, 2) tuples that share the same 1 and 2 but are not identical, and a ((1, 2), (1, 2)) tuple that has the two distinct equal tuples.


There’s one more optimization that CPython does: string interning. Unlike compiler constant folding, this isn’t restricted to source code literals:

>>> m = 'abc'
>>> n = 'abc'
>>> m is n
True

On the other hand, it is limited to the str type, and to strings of internal storage kind “ascii compact”, “compact”, or “legacy ready”, and in many cases only “ascii compact” will get interned.
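
Interning can also be requested explicitly through sys.intern, which makes the mechanism easier to see. The sketch below is an illustration only: it builds two equal strings at run time (so no compiler merging is involved) and shows that, whatever their identity was before, the interned results are the same object. The variable names are arbitrary.

```python
import sys

# Build two equal strings at run time so the compiler cannot merge them.
first = "".join(["some", " ", "string"])
second = "".join(["some", " ", "string"])

print(first == second)  # equal by value
print(first is second)  # identity here is an implementation detail

# sys.intern returns the canonical copy held in the interpreter's intern table.
print(sys.intern(first) is sys.intern(second))
```

Only the last line is something the standard library documents; the second print may vary between implementations.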


At any rate, the rules for what values must be, might be, or cannot be distinct vary from implementation to implementation, and between versions of the same implementation, and maybe even between runs of the same code on the same copy of the same implementation.

It can be worth learning the rules for one specific Python for the fun of it. But it’s not worth relying on them in your code. The only safe rules are:

  • Do not write code that assumes two equal but separately-created immutable values are identical (don’t use x is y, use x == y)
  • Do not write code that assumes two equal but separately-created immutable values are distinct (don’t use x is not y, use x != y)

Or, in other words, only use is to test for the documented singletons (like None) or for objects that are only created in one place in the code (like the _sentinel = object() idiom).
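
The sentinel idiom mentioned above might look like the following sketch. The function get_setting and the settings dictionary are hypothetical names invented for this example; the point is only that _sentinel is created in exactly one place, so testing it with is is reliable, while ordinary values are never compared by identity.

```python
_sentinel = object()  # created in exactly one place, so `is` is reliable


def get_setting(settings, name, default=_sentinel):
    """Look up name; None is a legal stored value, so it cannot mean 'missing'."""
    value = settings.get(name, _sentinel)
    if value is not _sentinel:
        return value
    if default is _sentinel:
        raise KeyError(name)
    return default


settings = {"timeout": None, "retries": 257}
print(get_setting(settings, "timeout"))         # a stored None is returned as-is
print(get_setting(settings, "retries") == 257)  # values are compared with ==
print(get_setting(settings, "missing", 3))      # falls back to the default
```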


Answer 8

is is the identity equality operator (functioning like id(a) == id(b)); it’s just that two equal numbers aren’t necessarily the same object. For performance reasons some small integers happen to be memoized so they will tend to be the same (this can be done since they are immutable).

PHP’s === operator, on the other hand, is described as checking equality and type: x == y and type(x) == type(y) as per Paulo Freitas’ comment. This will suffice for common numbers, but differ from is for classes that define __eq__ in an absurd manner:

class Unequal:
    def __eq__(self, other):
        return False

PHP apparently allows the same thing for “built-in” classes (which I take to mean implemented at C level, not in PHP). A slightly less absurd use might be a timer object, which has a different value every time it’s used as a number. Quite why you’d want to emulate Visual Basic’s Now instead of showing that it is an evaluation with time.time() I don’t know.

Greg Hewgill (OP) made one clarifying comment “My goal is to compare object identity, rather than equality of value. Except for numbers, where I want to treat object identity the same as equality of value.”

This would have yet another answer, as we have to categorize things as numbers or not, to select whether we compare with == or is. CPython defines the number protocol, including PyNumber_Check, but this is not accessible from Python itself.

We could try to use isinstance with all the number types we know of, but this would inevitably be incomplete. The types module contains a StringTypes list but no NumberTypes. Since Python 2.6, the built-in number classes have a base class numbers.Number, but it has the same problem:

import numpy, numbers
assert not issubclass(numpy.int16,numbers.Number)
assert issubclass(int,numbers.Number)

By the way, NumPy will produce separate instances of low numbers.

I don’t actually know an answer to this variant of the question. I suppose one could theoretically use ctypes to call PyNumber_Check, but even that function has been debated, and it’s certainly not portable. We’ll just have to be less particular about what we test for now.
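
Being less particular could look something like the sketch below. The helper same_thing is a made-up name, and it is only an approximation of the behaviour asked for in the comment: it relies on numbers.Number, so it has exactly the incompleteness described above for any numeric type that is not registered with that base class.

```python
import numbers


def same_thing(a, b):
    """Compare numbers by value and everything else by identity (approximation)."""
    if isinstance(a, numbers.Number) and isinstance(b, numbers.Number):
        return a == b
    return a is b


x = []
print(same_thing(int("1000"), int("1000")))  # numbers: compared with ==
print(same_thing(x, x))                      # same list object
print(same_thing([], []))                    # equal but distinct lists
```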

In the end, this issue stems from Python not originally having a type tree with predicates like Scheme’s number?, or Haskell’s type class Num. is checks object identity, not value equality. PHP has a colorful history as well, where === apparently behaves as is only on objects in PHP5, but not PHP4. Such are the growing pains of moving across languages (including versions of one).


Answer 9

It also happens with strings:

>>> s = b = 'somestr'
>>> s == b, s is b, id(s), id(b)
(True, True, 4555519392, 4555519392)

Now everything seems fine.

>>> s = 'somestr'
>>> b = 'somestr'
>>> s == b, s is b, id(s), id(b)
(True, True, 4555519392, 4555519392)

That’s expected too.

>>> s1 = b1 = 'somestrdaasd ad ad asd as dasddsg,dlfg ,;dflg, dfg a'
>>> s1 == b1, s1 is b1, id(s1), id(b1)
(True, True, 4555308080, 4555308080)

>>> s1 = 'somestrdaasd ad ad asd as dasddsg,dlfg ,;dflg, dfg a'
>>> b1 = 'somestrdaasd ad ad asd as dasddsg,dlfg ,;dflg, dfg a'
>>> s1 == b1, s1 is b1, id(s1), id(b1)
(True, False, 4555308176, 4555308272)

Now that’s unexpected.


Answer 10

What’s New In Python 3.8: Changes in Python behavior:

The compiler now produces a SyntaxWarning when identity checks (is and is not) are used with certain types of literals (e.g. strings, ints). These can often work by accident in CPython, but are not guaranteed by the language spec. The warning advises users to use equality tests (== and !=) instead.
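
The warning can be captured without running any comparison at all, since it is issued at compile time. This is a small sketch assuming Python 3.8 or later; the expression and the "<demo>" file name are arbitrary, and the exact wording of the message differs between versions.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Compile only; nothing is executed, so the name x is never looked up.
    compile("x is 257", "<demo>", "eval")

syntax_warnings = [w for w in caught if issubclass(w.category, SyntaxWarning)]
for w in syntax_warnings:
    print(w.message)
```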