Does Python have a string "contains" substring method?

Question: Does Python have a string "contains" substring method?

I’m looking for a string.contains or string.indexof method in Python.

I want to do:

if not somestring.contains("blah"):
   continue

Answer 0

You can use the in operator:

if "blah" not in somestring: 
    continue
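
As a minimal sketch of how that check behaves inside a loop (the sample list `lines` and the result list `kept` are made up for illustration; only the "blah" test comes from the answer):

```python
# Hypothetical sample data; only the "blah" check comes from the answer above.
lines = ["blah one", "two", "three blah", "four"]

kept = []
for somestring in lines:
    if "blah" not in somestring:
        continue  # skip strings that lack the substring
    kept.append(somestring)

print(kept)  # ['blah one', 'three blah']
```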

Answer 1

If it’s just a substring search you can use string.find("substring").

You do have to be a little careful with find, index, and in though, as they are substring searches. In other words, this:

s = "This be a string"
if s.find("is") == -1:
    print("No 'is' here!")
else:
    print("Found 'is' in the string.")

It would print Found 'is' in the string. Similarly, if "is" in s: would evaluate to True. This may or may not be what you want.
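
If a whole-word match is what you actually want, one possible approach is a regular expression with word boundaries. This is only a sketch: `contains_word` is a hypothetical helper name, not something from the standard library.

```python
import re

def contains_word(text, word):
    """Return True if `word` occurs in `text` as a whole word (hypothetical helper)."""
    # re.escape keeps any regex metacharacters in `word` literal;
    # \b matches at a boundary between word and non-word characters.
    return re.search(r"\b" + re.escape(word) + r"\b", text) is not None

s = "This be a string"
print("is" in s)               # True: matches inside "This"
print(contains_word(s, "is"))  # False: there is no standalone word "is"
print(contains_word(s, "be"))  # True
```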


Answer 2

Does Python have a string contains substring method?

Yes, but Python has a comparison operator that you should use instead, because the language intends its usage, and other programmers will expect you to use it. That keyword is in, which is used as a comparison operator:

>>> 'foo' in '**foo**'
True

The opposite (complement), which the original question asks for, is not in:

>>> 'foo' not in '**foo**' # returns False
False

This is semantically the same as not 'foo' in '**foo**' but it’s much more readable and explicitly provided for in the language as a readability improvement.

Avoid using __contains__, find, and index

As promised, here’s the contains method:

str.__contains__('**foo**', 'foo')

returns True. You could also call this function from the instance of the superstring:

'**foo**'.__contains__('foo')

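But don’t. Methods whose names begin and end with double underscores are special methods that the interpreter calls on your behalf, so calling them directly is considered poor style. The only reason to use this is when implementing or extending the in and not in functionality (e.g. if subclassing str):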

class NoisyString(str):
    def __contains__(self, other):
        print('testing if "{0}" in "{1}"'.format(other, self))
        return super(NoisyString, self).__contains__(other)

ns = NoisyString('a string with a substring inside')

and now:

>>> 'substring' in ns
testing if "substring" in "a string with a substring inside"
True

Also, avoid the following string methods:

>>> '**foo**'.index('foo')
2
>>> '**foo**'.find('foo')
2

>>> '**oo**'.find('foo')
-1
>>> '**oo**'.index('foo')

Traceback (most recent call last):
  File "<pyshell#40>", line 1, in <module>
    '**oo**'.index('foo')
ValueError: substring not found

Other languages may have no methods to directly test for substrings, and so you would have to use these types of methods, but with Python, it is much more efficient to use the in comparison operator.
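
The advice above concerns yes/no membership tests. When you need the position of the match, find still fits; a small sketch, where the variable names are chosen only for illustration:

```python
text = '**foo**'
needle = 'foo'

pos = text.find(needle)  # index of the first match, or -1 if absent
if pos != -1:
    before = text[:pos]
    after = text[pos + len(needle):]
    print(repr(before), repr(after))  # '**' '**'
```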

Performance comparisons

We can compare various ways of accomplishing the same goal.

import timeit

def in_(s, other):
    return other in s

def contains(s, other):
    return s.__contains__(other)

def find(s, other):
    return s.find(other) != -1

def index(s, other):
    try:
        s.index(other)
    except ValueError:
        return False
    else:
        return True



perf_dict = {
    'in:True': min(timeit.repeat(lambda: in_('superstring', 'str'))),
    'in:False': min(timeit.repeat(lambda: in_('superstring', 'not'))),
    '__contains__:True': min(timeit.repeat(lambda: contains('superstring', 'str'))),
    '__contains__:False': min(timeit.repeat(lambda: contains('superstring', 'not'))),
    'find:True': min(timeit.repeat(lambda: find('superstring', 'str'))),
    'find:False': min(timeit.repeat(lambda: find('superstring', 'not'))),
    'index:True': min(timeit.repeat(lambda: index('superstring', 'str'))),
    'index:False': min(timeit.repeat(lambda: index('superstring', 'not'))),
}

And now we see that using in is much faster than the others. Less time to do an equivalent operation is better:

>>> perf_dict
{'in:True': 0.16450627865128808,
 'in:False': 0.1609668098178645,
 '__contains__:True': 0.24355481654697542,
 '__contains__:False': 0.24382793854783813,
 'find:True': 0.3067379407923454,
 'find:False': 0.29860888058124146,
 'index:True': 0.29647137792585454,
 'index:False': 0.5502287584545229}

Answer 3

if needle in haystack: is the normal use, as @Michael says — it relies on the in operator, more readable and faster than a method call.

If you truly need a method instead of an operator (e.g. to do some weird key= for a very peculiar sort…?), that would be 'haystack'.__contains__. But since your example is for use in an if, I guess you don’t really mean what you say;-). It’s not good form (nor readable, nor efficient) to use special methods directly — they’re meant to be used, instead, through the operators and builtins that delegate to them.
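
For the rare case where a callable really is needed, here is a sketch of passing the bound method to filter. The sample haystack and needles are invented for illustration.

```python
haystack = "a string with a substring inside"
needles = ["string", "missing", "inside"]

# The bound method is a callable, so it can go where a function is expected.
found = list(filter(haystack.__contains__, needles))
print(found)  # ['string', 'inside']

# A comprehension using the operator does the same and is usually easier to read.
found_with_in = [n for n in needles if n in haystack]
print(found_with_in == found)  # True
```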


Answer 4

in Python strings and lists

Here are a few useful examples of the in operator that speak for themselves:

"foo" in "foobar"
True

"foo" in "Foobar"
False

"foo" in "Foobar".lower()
True

"foo".capitalize() in "Foobar"
True

"foo" in ["bar", "foo", "foobar"]
True

"foo" in ["fo", "o", "foobar"]
False

["foo" in a for a in ["fo", "o", "foobar"]]
[False, False, True]

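Caveat: the in operator also works on lists and other iterables, not just strings. On a list it tests whether any element equals the left operand; it does not search inside the elements for a substring.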
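
A short sketch of that difference, plus a case-insensitive check. It reuses the sample values from the examples above; the variable name `words` is introduced here only for illustration.

```python
words = ["fo", "o", "foobar"]

# On a list, `in` compares whole elements for equality.
print("foo" in words)                  # False

# To look for a substring inside any element, test each element.
print(any("foo" in w for w in words))  # True

# Case-insensitive substring test; str.casefold() is a more aggressive lower().
print("foo" in "Foobar".casefold())    # True
```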


Answer 5

If you are happy with "blah" in somestring but want it to be a function/method call, you can use operator.contains:

import operator

if not operator.contains(somestring, "blah"):
    continue

Most of Python's operators, including in, have function equivalents in the operator module.
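
Note the argument order: the container comes first. A brief sketch, where the sample value of somestring and the name in_somestring are assumptions made for illustration:

```python
import operator
from functools import partial

somestring = "some blah text"  # hypothetical sample value

# operator.contains(a, b) is equivalent to `b in a`: container first.
print(operator.contains(somestring, "blah"))  # True

# Fixing the container gives a one-argument predicate.
in_somestring = partial(operator.contains, somestring)
print([w for w in ["blah", "nope"] if in_somestring(w)])  # ['blah']
```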


Answer 6

So apparently there is nothing similar for vector-wise comparison. An obvious Python way to do so would be:

names = ['bob', 'john', 'mike']
any(st in 'bob and john' for st in names)   # True

any(st in 'mary and jane' for st in names)  # False
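
The same pattern extends to finding which names matched, or to requiring all of them. A sketch using the sample data above; `text` and `matched` are names introduced here for illustration:

```python
names = ['bob', 'john', 'mike']
text = 'bob and john'

# Collect the names that occur as substrings of the text.
matched = [st for st in names if st in text]
print(matched)                          # ['bob', 'john']

# all() requires every name to be present.
print(all(st in text for st in names))  # False: 'mike' is missing
```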

Answer 7

You can use y.count().

It returns the number of times a substring appears in a string, as an integer.

For example:

string = "Hello world"  # assumed sample value, not from the original answer
string.count("bah")    # 0
string.count("Hello")  # 1
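
Two things are worth keeping in mind when using count as a containment test, sketched below with an assumed sample string: the result has to be compared with zero to get a boolean, and occurrences are counted without overlap.

```python
string = "Hello world"  # assumed sample value

# Compare with zero to turn the count into a yes/no answer.
has_hello = string.count("Hello") > 0
print(has_hello)           # True

# count() counts non-overlapping occurrences.
print("aaaa".count("aa"))  # 2
```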

Answer 8

Here is your answer:

if "insert_char_or_string_here" in "insert_string_to_search_here":
    pass  # do stuff

For checking if it is false:

if not "insert_char_or_string_here" in "insert_string_to_search_here":
    pass  # do stuff

Or, more idiomatically:

if "insert_char_or_string_here" not in "insert_string_to_search_here":
    pass  # do stuff

Answer 9

You can use regular expressions to get the occurrences:

>>> import re
>>> print(re.findall(r'( |t)', to_search_in)) # searches for t or space
['t', ' ', 't', ' ', ' ']
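
For a plain yes/no answer with regular expressions, re.search may be a closer fit than re.findall. A sketch under assumptions: the value of to_search_in and the name needle are invented for illustration.

```python
import re

to_search_in = "this is a test"  # hypothetical sample value

# re.search returns a match object or None, so it works as a yes/no test.
print(re.search(r"t", to_search_in) is not None)          # True

# re.escape makes a literal substring safe to use as a pattern.
needle = "a.b"
print(re.search(re.escape(needle), "xa.by") is not None)  # True
print(re.search(re.escape(needle), "xaXby") is not None)  # False
```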