Question: Why does comparing strings using either '==' or 'is' sometimes produce a different result?
I've got a Python program where two variables are set to the value 'public'. In a conditional expression I have the comparison var1 is var2, which fails, but if I change it to var1 == var2 it returns True.
Now if I open my Python interpreter and do the same “is” comparison, it succeeds.
>>> s1 = 'public'
>>> s2 = 'public'
>>> s2 is s1
True
What am I missing here?
Answer 0
is is identity testing, == is equality testing. What happens in your code would be emulated in the interpreter like this:
>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False
So, no wonder they're not the same, right?
In other words: a is b is equivalent to id(a) == id(b).
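To make that equivalence concrete, here is a small sketch. The helper name same_object is made up for illustration and is not a standard function; the point is only that the id() comparison and the is operator agree.

```python
def same_object(x, y):
    """Hypothetical helper: mirrors what `x is y` checks, via id()."""
    return id(x) == id(y)

a = 'pub'
b = ''.join(['p', 'u', 'b'])   # built at runtime, so typically a separate object

print(a == b)                          # True: same characters
print(same_object(a, b) == (a is b))   # True: the two checks always agree
print(same_object(a, a))               # True: an object is always itself
```

Whether a and b happen to be the same object is up to the implementation, which is why the sketch compares the two checks with each other rather than asserting a particular result.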
Answer 1
Other answers here are correct: is is used for identity comparison, while == is used for equality comparison. Since what you care about is equality (the two strings should contain the same characters), in this case the is operator is simply wrong and you should be using == instead.
The reason is works interactively is that (most) string literals are interned by default. From Wikipedia:
Interned strings speed up string
comparisons, which are sometimes a
performance bottleneck in applications
(such as compilers and dynamic
programming language runtimes) that
rely heavily on hash tables with
string keys. Without interning,
checking that two different strings
are equal involves examining every
character of both strings. This is
slow for several reasons: it is
inherently O(n) in the length of the
strings; it typically requires reads
from several regions of memory, which
take time; and the reads fills up the
processor cache, meaning there is less
cache available for other needs. With
interned strings, a simple object
identity test suffices after the
original intern operation; this is
typically implemented as a pointer
equality test, normally just a single
machine instruction with no memory
reference at all.
So, when you have two string literals (words that are literally typed into your program source code, surrounded by quotation marks) in your program that have the same value, the Python compiler will automatically intern the strings, making them both stored at the same memory location. (Note that this doesn’t always happen, and the rules for when this happens are quite convoluted, so please don’t rely on this behavior in production code!)
Since in your interactive session both strings are actually stored in the same memory location, they have the same identity, so the is operator works as expected. But if you construct a string by some other method (even if that string contains exactly the same characters), then the string may be equal, but it is not the same string; that is, it has a different identity, because it is stored in a different place in memory.
Answer 2
The is keyword is a test for object identity while == is a value comparison.
If you use is, the result will be true if and only if both operands are the same object. However, == will be true any time the values of the objects are the same.
Answer 3
One last thing to note: you may use the sys.intern function to ensure that you're getting a reference to the same string:
>>> from sys import intern
>>> a = intern('a')
>>> a2 = intern('a')
>>> a is a2
True
As pointed out above, you should not be using is to determine equality of strings. But this may be helpful to know if you have some kind of weird requirement to use is.
Note that the intern function used to be a builtin on Python 2 but was moved to the sys module in Python 3.
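The single-character example above may be shared by the interpreter anyway, so here is a sketch with a string that is assembled at runtime. The variable names are illustrative only.

```python
import sys

parts = ['pub', 'lic']
built = ''.join(parts)             # runtime-built string, equal to 'public'

interned_a = sys.intern(built)
interned_b = sys.intern('public')

print(built == 'public')           # True: same characters
print(interned_a is interned_b)    # True: sys.intern returns the one shared copy
```

Even with interning available, == remains the right tool for comparing string contents; interning mostly matters as a memory or lookup optimization.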
Answer 4
is is identity testing, == is equality testing. What this means is that is is a way to check whether two things are the same thing, or just equivalent.
Say you've got a simple person object. If it is named 'Jack' and is 23 years old, it's equivalent to another 23-year-old Jack, but it's not the same person.
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        # Only compare against other Person instances.
        if not isinstance(other, Person):
            return NotImplemented
        return self.name == other.name and self.age == other.age

jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)
jack1 == jack2  # True: same name and age
jack1 is jack2  # False: two separate instances
They have the same name and age, but they're not the same instance of Person. A string might be equivalent to another, but it's not necessarily the same object.
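As an alternative sketch (not what the answer itself uses), the standard dataclasses module can generate the field-by-field __eq__ automatically, which shows the same split between equality and identity with less code:

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

jack1 = Person('Jack', 23)
jack2 = Person('Jack', 23)
jack3 = jack1                 # a second name for the first object

print(jack1 == jack2)         # True: generated __eq__ compares the fields
print(jack1 is jack2)         # False: two separate instances
print(jack3 is jack1)         # True: same instance under two names
```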
Answer 5
Answer 6
If you're not sure what you're doing, use '=='.
If you have a little more knowledge about it you can use 'is' for known objects like None.
Otherwise you'll end up wondering why things don't work and why this happens:
>>> a = 1
>>> b = 1
>>> b is a
True
>>> a = 6000
>>> b = 6000
>>> b is a
False
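I'm not even sure if some things are guaranteed to stay the same between different Python versions/implementations. (In CPython, small integers from -5 to 256 are cached, which is why the first comparison succeeds; that is an implementation detail, not a language guarantee.)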
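The one place where is is the expected choice is a check against a singleton such as None. A minimal sketch, with a made-up function name:

```python
def greet(name=None):
    # None is a singleton, so an identity check is the idiomatic test here.
    if name is None:
        return 'Hello, stranger'
    # For actual string contents, compare with == instead.
    if name == 'public':
        return 'Hello, everyone'
    return 'Hello, ' + name

print(greet())           # Hello, stranger
print(greet('public'))   # Hello, everyone
print(greet('Jack'))     # Hello, Jack
```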
Answer 7
From my limited experience with Python, is is used to compare two objects to see if they are the same object as opposed to two different objects with the same value. == is used to determine if the values are identical.
Here is a good example:
>>> s1 = u'public'
>>> s2 = 'public'
>>> s1 is s2
False
>>> s1 == s2
True
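s1 is a unicode string, and s2 is a normal (byte) string. That is the case in Python 2; in Python 3 both literals are plain str objects, so this session may print True for s1 is s2. They are not the same type, but have the same value.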
Answer 8
I think it has to do with the fact that, when the 'is' comparison evaluates to false, two distinct objects are used. If it evaluates to true, that means internally it's using the exact same object rather than creating a new one. This does not depend on how close together in time the strings were created; it happens when the interpreter reuses an existing object, for example an interned string literal.
This is why you should be using the equality operator ==, not is, to compare the value of a string object.
>>> s = 'one'
>>> s2 = 'two'
>>> s is s2
False
>>> s2 = s2.replace('two', 'one')
>>> s2
'one'
>>> s2 is s
False
>>>
In this example, s2 started out as a different string object ('two'). After the replace() call it is equal to 'one', but it is still not the same object as s, because the interpreter built a new string at runtime instead of reusing the existing one. If I had assigned the literal 'one' to s2 in the first place, both names would most likely have referred to the same object.
Answer 9
I believe that this is known as “interned” strings. Python does this, so does Java, and so do C and C++ when compiling in optimized modes.
If you use two identical strings, instead of wasting memory by creating two string objects, all interned strings with the same contents point to the same memory.
This results in the Python “is” operator returning True because two strings with the same contents are pointing at the same string object. This will also happen in Java and in C.
This is only useful for memory savings though. You cannot rely on it to test for string equality, because the various interpreters and compilers and JIT engines cannot always do it.
Answer 10
I am answering the question even though it is old, because none of the answers above quotes the language reference.
The is operator checks for identity and the == operator checks for equality.
From the Language Reference:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
So from the above statement we can infer that strings, being an immutable type, may fail an "is" check in some cases and pass it in others, even when their values are equal.
The same applies to int and tuple, which are also immutable types.
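The part of that quote which is guaranteed (the behavior of mutable objects) can be checked directly. This sketch only uses the names from the quoted passage plus two extra ones, e and f, for the chained assignment:

```python
c = []
d = []
print(c == d)    # True: both lists are empty
print(c is d)    # False: two separately created lists

e = f = []       # chained assignment binds both names to one list
print(e is f)    # True

e.append(1)
print(f)         # [1]: the change is visible through both names
print(d)         # []: the other list is unaffected
```

For immutable values such as strings, ints and tuples, no such guarantee exists in either direction, which is why is should not be used to compare them by value.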
Answer 11
The == operator tests value equivalence. The is operator tests object identity: Python tests whether the two are really the same object (i.e., live at the same address in memory).
>>> a = 'banana'
>>> b = 'banana'
>>> a is b
True
In this example, Python only created one string object, and both a and b refer to it. The reason is that Python internally caches and reuses some strings as an optimization; there really is just one string 'banana' in memory, shared by a and b. To see the non-shared behavior, use a string that the interpreter does not reuse. In many CPython versions that is the case for literals that do not look like identifiers, such as a longer string containing spaces:
>>> a = 'a longer banana'
>>> b = 'a longer banana'
>>> a == b, a is b
(True, False)
When you create two lists, you get two objects:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a is b
False
In this case we would say that the two lists are equivalent, because they have the same elements, but not identical, because they are not the same object. If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
If a refers to an object and you assign b = a, then both variables refer to the same object:
>>> a = [1, 2, 3]
>>> b = a
>>> b is a
True
Answer 12
is will compare the memory location. It is used for object-level comparison.
== will compare the values of the variables in the program. It is used for checking at a value level.
is checks for address-level equivalence.
== checks for value-level equivalence.
Answer 13
is is identity testing, == is equality testing (see Python Documentation).
In most cases, if a is b, then a == b. But there are exceptions, for example:
>>> nan = float('nan')
>>> nan is nan
True
>>> nan == nan
False
So, you can only use is for identity tests, never equality tests.
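Since neither is nor == answers the question "is this value NaN?", a short sketch of the usual approach with the standard math.isnan function may help:

```python
import math

nan = float('nan')

print(nan is nan)         # True: one object compared with itself
print(nan == nan)         # False: NaN never compares equal, even to itself
print(math.isnan(nan))    # True: the reliable way to test for NaN
```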