python re.sub group: number after \number

Question: python re.sub group: number after \number

How can I replace foobar with foo123bar?

This doesn’t work:

>>> re.sub(r'(foo)', r'\1123', 'foobar')
'J3bar'

This works:

>>> re.sub(r'(foo)', r'\1hi', 'foobar')
'foohibar'

I think this is a common issue whenever a digit has to follow something like \number. Can anyone give me a hint on how to handle this?


Answer 0

The answer is:

re.sub(r'(foo)', r'\g<1>123', 'foobar')

Relevant excerpt from the docs:

In addition to character escapes and backreferences as described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character ‘0’. The backreference \g<0> substitutes in the entire substring matched by the RE.
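
To make the excerpt concrete, here is a minimal sketch comparing the ambiguous and unambiguous forms. The variable names and the group name `word` are only illustrative. The 'J3bar' result from the question seems to arise because \112 gets read as a three-digit octal escape (the character 'J') followed by a literal '3'. That reading is my interpretation of the behaviour, not something stated in the excerpt.

```python
import re

text = "foobar"

# Ambiguous: the template parser consumes "\112" as an octal escape,
# then treats the trailing "3" as a literal character.
ambiguous = re.sub(r"(foo)", r"\1123", text)

# Unambiguous: \g<1> is group 1, and "123" is literal text.
numbered = re.sub(r"(foo)", r"\g<1>123", text)

# The same idea with a named group (the name "word" is made up for this example).
named = re.sub(r"(?P<word>foo)", r"\g<word>123", text)

# Alternative: a replacement function sidesteps template parsing entirely.
by_function = re.sub(r"(foo)", lambda m: m.group(1) + "123", text)

print(ambiguous)    # J3bar
print(numbered)     # foo123bar
print(named)        # foo123bar
print(by_function)  # foo123bar
```

The \g<1> form is probably the smallest change to the original call. The function form may be easier to read when the replacement text is built dynamically.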