Check if multiple strings exist in another string

Question: Check if multiple strings exist in another string

How can I check if any of the strings in an array exists in another string?

Like:

a = ['a', 'b', 'c']
str = "a123"
if a in str:
  print "some of the strings found in str"
else:
  print "no strings found in str"

That code doesn't work; it's just to show what I want to achieve.


Answer 0

You can use any:

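a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]

# True as soon as one of the candidates occurs in a_string
if any(x in a_string for x in matches):
    print("some of the strings found in a_string")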

Similarly, to check whether all the strings from the list are found, use all instead of any.
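
To make the difference concrete, here is a small sketch that runs both checks on the same data. The names found_any and found_all are only illustrative and do not come from the answer.

```python
a_string = "A string is more than its parts!"
matches = ["more", "wholesome", "milk"]

# any() is True as soon as one candidate is a substring of a_string.
found_any = any(x in a_string for x in matches)

# all() is True only if every candidate is a substring of a_string.
found_all = all(x in a_string for x in matches)

print(found_any)  # True: "more" occurs in a_string
print(found_all)  # False: "wholesome" and "milk" do not
```

Both functions short-circuit, so the generator expression stops being evaluated once the result is known.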


Answer 1

any() is by far the best approach if all you want is True or False, but if you want to know specifically which string or strings match, you can use a couple of things.

If you want the first match (with False as a default):

match = next((x for x in a if x in str), False)

If you want to get all matches (including duplicates):

matches = [x for x in a if x in str]

If you want to get all non-duplicate matches (disregarding order):

matches = {x for x in a if x in str}

If you want to get all non-duplicate matches in the order they appear in a:

matches = []
for x in a:
    if x in str and x not in matches:
        matches.append(x)
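
As a possible alternative to the loop above, the following sketch gets the same ordered, de-duplicated result with dict.fromkeys. It assumes Python 3.7 or later, where dictionaries keep insertion order. The sample list and the variable name text (used instead of str to avoid shadowing the built-in) are illustrative only.

```python
a = ['c', 'a', 'c', 'b', 'a']   # hypothetical input containing duplicates
text = "abc123"                  # stands in for str from the question

# dict.fromkeys keeps the first occurrence of each key, in order,
# so converting its keys back to a list removes later duplicates.
matches = list(dict.fromkeys(x for x in a if x in text))

print(matches)  # ['c', 'a', 'b']
```

This also avoids the repeated `x not in matches` list scan, which can matter when a is long.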

Answer 2

You should be careful if the strings in a or str get longer. The straightforward solutions take O(S*(A^2)), where S is the length of str and A is the sum of the lengths of all strings in a. For a faster solution, look at the Aho-Corasick algorithm for string matching, which runs in linear time O(S+A).
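
For readers who want to see the idea, below is a minimal, untuned sketch of an Aho-Corasick matcher in pure Python. It is not taken from the answer: the function names build_automaton and find_matches are hypothetical, and it assumes all patterns are non-empty strings. It builds a trie of the patterns, adds failure links with a breadth-first pass, and then scans the text once.

```python
from collections import deque

def build_automaton(patterns):
    """Build trie transitions, failure links and output lists."""
    goto = [{}]    # goto[node][char] -> child node
    fail = [0]     # fail[node] -> node for the longest proper suffix
    out = [[]]     # out[node] -> patterns that end at this node
    for pat in patterns:
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pat)

    # Breadth-first pass: depth-1 nodes keep the root as failure link.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            queue.append(child)
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target.
            out[child] = out[child] + out[fail[child]]
    return goto, fail, out

def find_matches(text, patterns):
    """Return the set of patterns that occur somewhere in text."""
    goto, fail, out = build_automaton(patterns)
    node = 0
    found = set()
    for ch in text:
        # Follow failure links until a transition on ch exists.
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        found.update(out[node])
    return found

print(find_matches("a123", ["a", "b", "c"]))
print(find_matches("ushers", ["he", "she", "his", "hers"]))
```

The text is scanned only once regardless of how many patterns there are. For production use, a maintained library is likely a better choice than this sketch.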


Answer 3

Just to add some diversity with regex:

import re

# str is the string from the question, e.g. "a123"
if any(re.findall(r'a|b|c', str, re.IGNORECASE)):
    print('possible matches thanks to regex')
else:
    print('no matches')

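Or, if your list is too long to write out by hand, build the pattern from it: any(re.findall('|'.join(map(re.escape, a)), str, re.IGNORECASE)). Using re.escape keeps regex metacharacters such as . or + in the list items from being treated as pattern syntax.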
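
A slightly more defensive variant of the regex approach might look like the sketch below. The helper name contains_any and its ignore_case parameter are made up for illustration. It uses re.search, which stops at the first hit instead of collecting every match, and it guards against an empty list, since an empty pattern would match any string.

```python
import re

def contains_any(text, candidates, ignore_case=True):
    """Return True if any candidate occurs in text as a literal substring."""
    if not candidates:
        return False  # an empty alternation would match everything
    # Escape each candidate so characters like '.' or '+' are literal.
    pattern = "|".join(re.escape(c) for c in candidates)
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, text, flags) is not None

print(contains_any("a123", ["a", "b", "c"]))       # True
print(contains_any("1+1=2", ["+"]))                # True
print(contains_any("abc", ["a.c"]))                # False
print(contains_any("A123", ["a"], ignore_case=False))  # False
```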


Answer 4

You need to iterate over the elements of a.

a = ['a', 'b', 'c']
str = "a123"
found_a_string = False
for item in a:
    if item in str:
        found_a_string = True
        break  # one match is enough, no need to check the rest

if found_a_string:
    print("found a match")
else:
    print("no match found")

Answer 5

jbernadas already mentioned the Aho-Corasick algorithm as a way to reduce complexity.

Here is one way to use it in Python:

  1. Download aho_corasick.py from here

  2. Put it in the same directory as your main Python file and name it aho_corasick.py

  3. Try the algorithm with the following code:

    from aho_corasick import aho_corasick #(string, keywords)
    
    print(aho_corasick(string, ["keyword1", "keyword2"]))
    

Note that the search is case-sensitive.


Answer 6

a = ['a', 'b', 'c']
str = "a123"

# One True per element of a that occurs in str
a_match = [True for match in a if match in str]

if True in a_match:
    print("some of the strings found in str")
else:
    print("no strings found in str")

Answer 7

It depends on the context. If you want to check for a single literal (any single character such as a, e, w, etc.), the in operator is enough:

original_word = "hackerearth"
if 'h' in original_word:
    print("YES")

If you want to check whether any of the characters of original_word occurs in yourinput, use any:

if any(your_required in yourinput for your_required in original_word):
    print("yes")

If you want every character of original_word to occur in yourinput, use all:

original_word = ['h', 'a', 'c', 'k', 'e', 'r', 'e', 'a', 'r', 't', 'h']
yourinput = str(input()).lower()
if all(requested_word in yourinput for requested_word in original_word):
    print("yes")

Answer 8

Some more information on how to get all list elements that occur in the string:

a = ['a', 'b', 'c']
str = "a123"
# Keep only the elements of a that are substrings of str
matches = list(filter(lambda x: x in str, a))  # ['a']

Answer 9

A surprisingly fast approach is to use set:

a = ['a', 'b', 'c']
str = "a123"
if set(a) & set(str):
    print("some of the strings found in str")
else:
    print("no strings found in str")

This only works if a contains no multi-character values (if it does, use any as shown in the earlier answers). When every value is a single character, it is simpler to specify a as a string: a = 'abc'.
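
The limitation is easy to see with a small example. set(text) splits the string into single characters, so a multi-character entry in a can never be in the intersection. The sample values and the name text below are illustrative only.

```python
a = ['ab', 'c']      # 'ab' is a multi-character value
text = "ab123"       # stands in for str from the question

# set(text) is {'a', 'b', '1', '2', '3'}, so 'ab' is not a member.
common = set(a) & set(text)
print(common)        # set()

# A substring test with any() finds 'ab' as expected.
found = any(x in text for x in a)
print(found)         # True
```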


Answer 10

strlist = ['SUCCESS', 'Done', 'SUCCESSFUL']
res = False

# Read the log line by line; the with block closes the file afterwards.
with open('test.txt', 'r') as flog:
    for line in flog:
        for fstr in strlist:
            if fstr in line:
                print('found')
                res = True

if res:
    print('res true')
else:
    print('res false')


Answer 11

I would use this kind of function for speed:

def check_string(string, substring_list):
    for substring in substring_list:
        if substring in string:
            return True
    return False

Answer 12

data = "firstName and favoriteFood"
mandatory_fields = ['firstName', 'lastName', 'age']

# for each: report every missing field separately
for field in mandatory_fields:
    if field not in data:
        print("Error, missing req field {0}".format(field))

# still fine, multiple if statements
if ('firstName' not in data or
        'lastName' not in data or
        'age' not in data):
    print("Error, missing a req field")

# not very readable, list comprehension collecting the missing fields
missing_fields = [x for x in mandatory_fields if x not in data]
if missing_fields:
    print("Error, missing fields {0}".format(", ".join(missing_fields)))