Python Regular Expressions: How to Get the Position and Value of Each Match - Python 实用宝典

Question: Python regular expressions: how to get the position and value of each match

How can I get the start and end positions of all matches using the re module? For example, given the pattern r'[a-z]' and the string 'a1b2c3d4', I'd like to get the position of each letter it finds. Ideally, I'd also like to get the text of each match back.

Answer 0

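import re

p = re.compile("[a-z]")
# finditer() yields one match object per non-overlapping match
for m in p.finditer('a1b2c3d4'):
    # m.start() is the match's starting index; m.group() is the matched text
    print(m.start(), m.group())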
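
The question also asks for the end position, which the loop above does not print. A minimal sketch of one way to collect the start, end, and text together follows; the helper name find_letter_positions is hypothetical and not part of the original answer.

```python
import re

def find_letter_positions(text):
    """Return (start, end, matched_text) for every lowercase letter in text."""
    pattern = re.compile(r"[a-z]")
    # m.end() is the index just past the match, so text[start:end] is the match
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]

positions = find_letter_positions("a1b2c3d4")
print(positions)  # [(0, 1, 'a'), (2, 3, 'b'), (4, 5, 'c'), (6, 7, 'd')]
```

Returning a list of tuples keeps the positions and text available after the loop, rather than only printing them.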

Answer 1

Taken from the Regular Expression HOWTO:

span() returns both start and end indexes in a single tuple. Since the match method only checks if the RE matches at the start of a string, start() will always be zero. However, the search method of RegexObject instances scans through the string, so the match may not start at zero in that case.

>>> import re
>>> p = re.compile('[a-z]+')
>>> print(p.match('::: message'))
None
>>> m = p.search('::: message'); print(m)
<re.Match object; span=(4, 11), match='message'>
>>> m.group()
'message'
>>> m.span()
(4, 11)
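
As a small illustration of the difference described above, the following sketch runs match() and search() on the same string and uses the span to slice the original text. The variable names are chosen for the example only.

```python
import re

pattern = re.compile(r"[a-z]+")
text = "::: message"

anchored = pattern.match(text)   # None: the text does not start with a letter
scanned = pattern.search(text)   # scans forward and finds 'message'

start, end = scanned.span()
print(anchored)         # None
print(start, end)       # 4 11
print(text[start:end])  # message
```

Because match() can return None, code that calls span() on its result should check for None first.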

Combine that with:

In Python 2.2, the finditer() method is also available, returning a sequence of MatchObject instances as an iterator.

>>> p = re.compile( ... )
>>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
>>> iterator
<callable_iterator object at 0x401833ac>
>>> for match in iterator:
...     print(match.span())
...
(0, 2)
(22, 24)
(29, 31)

You should be able to do something along the lines of:

for match in re.finditer(r'[a-z]', 'a1b2c3d4'):
    print(match.span())
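
The finditer() excerpt quoted above elides its pattern. As a sketch only, assuming the elided pattern was something like r"\d+" (an assumption, not stated in the excerpt), the same spans can be reproduced along with the matched text:

```python
import re

# Assumption: r"\d+" stands in for the pattern elided in the quoted excerpt.
number_pattern = re.compile(r"\d+")
text = "12 drummers drumming, 11 ... 10 ..."

spans = [m.span() for m in number_pattern.finditer(text)]
values = [m.group() for m in number_pattern.finditer(text)]
print(spans)   # [(0, 2), (22, 24), (29, 31)]
print(values)  # ['12', '11', '10']
```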

Answer 2

For Python 3.x

from re import finditer
for match in finditer("pattern", "string"):
    print(match.span(), match.group())

For each match in the string, this prints one line containing a (start, end) tuple followed by the matched text. The end index is exclusive: it is the position just past the last matched character.
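
To make the span convention concrete, here is a short sketch with a made-up pattern and input (both are illustrative, not from the answer). It checks that slicing with the span reproduces the match, and records the index of the last matched character as end - 1.

```python
import re

text = "cat hat bat"
records = []
for m in re.finditer(r"[a-z]at", text):
    start, end = m.span()
    # end is exclusive, so slicing with the span reproduces the match
    assert text[start:end] == m.group()
    records.append((start, end - 1, m.group()))  # end - 1 is the last index
print(records)  # [(0, 2, 'cat'), (4, 6, 'hat'), (8, 10, 'bat')]
```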

Answer 3

Note that span() and group() accept a group index when the regex contains multiple capture groups:

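import re

regex_with_3_groups = r"([a-z])([0-9]+)([A-Z])"
string = "a1B c22D"  # example input, not from the original answer
for match in re.finditer(regex_with_3_groups, string):
    # index 0 is the whole match; indexes 1 to 3 are the capture groups
    for idx in range(0, 4):
        print(match.span(idx), match.group(idx))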
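
Groups can also be addressed by name, and a group that did not take part in a match reports a span of (-1, -1) and a value of None. The sketch below shows both; the pattern, the group names letter and digits, and the input string are all invented for the example.

```python
import re

# Hypothetical pattern: a letter followed by an optional run of digits.
pattern = re.compile(r"(?P<letter>[a-z])(?P<digits>[0-9]+)?")

rows = []
for m in pattern.finditer("x12 y"):
    for name in ("letter", "digits"):
        # span() and group() accept a group name as well as an index
        rows.append((name, m.span(name), m.group(name)))

for row in rows:
    print(row)
```

For the second match, "y", the optional digits group does not participate, so its row is ('digits', (-1, -1), None). Checking for None before using a group's span avoids slicing with negative indices by accident.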
