标签归档:case-sensitive

不区分大小写的正则表达式,无需重新编译?

问题:不区分大小写的正则表达式,无需重新编译?

在Python中,我可以使用re.compile以下命令将正则表达式编译为不区分大小写:

>>> s = 'TeSt'
>>> casesensitive = re.compile('test')
>>> ignorecase = re.compile('test', re.IGNORECASE)
>>> 
>>> print casesensitive.match(s)
None
>>> print ignorecase.match(s)
<_sre.SRE_Match object at 0x02F0B608>

有没有办法做同样的事情,但是不用re.compile。在文档中找不到Perl的i后缀(例如m/test/i)。

In Python, I can compile a regular expression to be case-insensitive using re.compile:

>>> s = 'TeSt'
>>> casesensitive = re.compile('test')
>>> ignorecase = re.compile('test', re.IGNORECASE)
>>> 
>>> print casesensitive.match(s)
None
>>> print ignorecase.match(s)
<_sre.SRE_Match object at 0x02F0B608>

Is there a way to do the same, but without using re.compile. I can’t find anything like Perl’s i suffix (e.g. m/test/i) in the documentation.


回答 0

传递re.IGNORECASEflags的PARAM searchmatchsub

re.search('test', 'TeSt', re.IGNORECASE)
re.match('test', 'TeSt', re.IGNORECASE)
re.sub('test', 'xxxx', 'Testing', flags=re.IGNORECASE)

Pass re.IGNORECASE to the flags param of search, match, or sub:

re.search('test', 'TeSt', re.IGNORECASE)
re.match('test', 'TeSt', re.IGNORECASE)
re.sub('test', 'xxxx', 'Testing', flags=re.IGNORECASE)

回答 1

您还可以使用不带IGNORECASE标志(已在Python 2.7.3中进行测试)的搜索/匹配来执行不区分大小写的搜索:

re.search(r'(?i)test', 'TeSt').group()    ## returns 'TeSt'
re.match(r'(?i)test', 'TeSt').group()     ## returns 'TeSt'

You can also perform case insensitive searches using search/match without the IGNORECASE flag (tested in Python 2.7.3):

re.search(r'(?i)test', 'TeSt').group()    ## returns 'TeSt'
re.match(r'(?i)test', 'TeSt').group()     ## returns 'TeSt'

回答 2

不区分大小写的标记(?i)可以直接合并到regex模式中:

>>> import re
>>> s = 'This is one Test, another TEST, and another test.'
>>> re.findall('(?i)test', s)
['Test', 'TEST', 'test']

The case-insensitive marker, (?i) can be incorporated directly into the regex pattern:

>>> import re
>>> s = 'This is one Test, another TEST, and another test.'
>>> re.findall('(?i)test', s)
['Test', 'TEST', 'test']

回答 3

您还可以在模式编译期间定义不区分大小写的代码:

pattern = re.compile('FIle:/+(.*)', re.IGNORECASE)

You can also define case insensitive during the pattern compile:

pattern = re.compile('FIle:/+(.*)', re.IGNORECASE)

回答 4

进口中

import re

在运行时处理中:

RE_TEST = r'test'
if re.match(RE_TEST, 'TeSt', re.IGNORECASE):

应当指出,不使用re.compile是浪费。每次调用上述match方法时,都会编译正则表达式。这在其他编程语言中也是错误的做法。下面是更好的做法。

在应用程序初始化中:

self.RE_TEST = re.compile('test', re.IGNORECASE)

在运行时处理中:

if self.RE_TEST.match('TeSt'):

In imports

import re

In run time processing:

RE_TEST = r'test'
if re.match(RE_TEST, 'TeSt', re.IGNORECASE):

It should be mentioned that not using re.compile is wasteful. Every time the above match method is called, the regular expression will be compiled. This is also faulty practice in other programming languages. The below is the better practice.

In app initialization:

self.RE_TEST = re.compile('test', re.IGNORECASE)

In run time processing:

if self.RE_TEST.match('TeSt'):

回答 5

#'re.IGNORECASE' for case insensitive results short form re.I
#'re.match' returns the first match located from the start of the string. 
#'re.search' returns location of the where the match is found 
#'re.compile' creates a regex object that can be used for multiple matches

 >>> s = r'TeSt'   
 >>> print (re.match(s, r'test123', re.I))
 <_sre.SRE_Match object; span=(0, 4), match='test'>
 # OR
 >>> pattern = re.compile(s, re.I)
 >>> print(pattern.match(r'test123'))
 <_sre.SRE_Match object; span=(0, 4), match='test'>
#'re.IGNORECASE' for case insensitive results short form re.I
#'re.match' returns the first match located from the start of the string. 
#'re.search' returns location of the where the match is found 
#'re.compile' creates a regex object that can be used for multiple matches

 >>> s = r'TeSt'   
 >>> print (re.match(s, r'test123', re.I))
 <_sre.SRE_Match object; span=(0, 4), match='test'>
 # OR
 >>> pattern = re.compile(s, re.I)
 >>> print(pattern.match(r'test123'))
 <_sre.SRE_Match object; span=(0, 4), match='test'>

回答 6

要执行不区分大小写的操作,请提供re.IGNORECASE

>>> import re
>>> test = 'UPPER TEXT, lower text, Mixed Text'
>>> re.findall('text', test, flags=re.IGNORECASE)
['TEXT', 'text', 'Text']

如果我们要替换与大小写匹配的文本…

>>> def matchcase(word):
        def replace(m):
            text = m.group()
            if text.isupper():
                return word.upper()
            elif text.islower():
                return word.lower()
            elif text[0].isupper():
                return word.capitalize()
            else:
                return word
        return replace

>>> re.sub('text', matchcase('word'), test, flags=re.IGNORECASE)
'UPPER WORD, lower word, Mixed Word'

To perform case-insensitive operations, supply re.IGNORECASE

>>> import re
>>> test = 'UPPER TEXT, lower text, Mixed Text'
>>> re.findall('text', test, flags=re.IGNORECASE)
['TEXT', 'text', 'Text']

and if we want to replace text matching the case…

>>> def matchcase(word):
        def replace(m):
            text = m.group()
            if text.isupper():
                return word.upper()
            elif text.islower():
                return word.lower()
            elif text[0].isupper():
                return word.capitalize()
            else:
                return word
        return replace

>>> re.sub('text', matchcase('word'), test, flags=re.IGNORECASE)
'UPPER WORD, lower word, Mixed Word'

回答 7

如果您想替换但仍保留以前str的样式。有可能的。

例如:高亮显示字符串“ test asdasd TEST asd tEst asdasd”。

sentence = "test asdasd TEST asd tEst asdasd"
result = re.sub(
  '(test)', 
  r'<b>\1</b>',  # \1 here indicates first matching group.
  sentence, 
  flags=re.IGNORECASE)

测试 asdasd TEST ASD 测试 asdasd

If you would like to replace but still keeping the style of previous str. It is possible.

For example: highlight the string “test asdasd TEST asd tEst asdasd”.

sentence = "test asdasd TEST asd tEst asdasd"
result = re.sub(
  '(test)', 
  r'<b>\1</b>',  # \1 here indicates first matching group.
  sentence, 
  flags=re.IGNORECASE)

test asdasd TEST asd tEst asdasd


回答 8

对于不区分大小写的正则表达式(Regex):通过两种方式添加代码:

  1. flags=re.IGNORECASE

    Regx3GList = re.search("(WCDMA:)((\d*)(,?))*", txt, **re.IGNORECASE**)
  2. 不区分大小写的标记 (?i)

    Regx3GList = re.search("**(?i)**(WCDMA:)((\d*)(,?))*", txt)

For Case insensitive regular expression(Regex): There are two ways by adding in your code:

  1. flags=re.IGNORECASE

    Regx3GList = re.search("(WCDMA:)((\d*)(,?))*", txt, **re.IGNORECASE**)
    
  2. The case-insensitive marker (?i)

    Regx3GList = re.search("**(?i)**(WCDMA:)((\d*)(,?))*", txt)