Python non-greedy regular expressions

Question

How do I write a Python regex like "(.*)" such that, given "a (b) c (d) e", Python matches "b" instead of "b) c (d"?

I know that I can use "[^)]" instead of ".", but I’m looking for a more general solution that keeps my regex a little cleaner. Is there any way to tell Python “hey, match this as soon as possible”?

Answer 1

You seek the all-powerful *?

From the docs, Greedy versus Non-Greedy

the non-greedy qualifiers *?, +?, ??, or {m,n}? […] match as little text as possible.
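To make the quoted rule concrete, here is a small sketch that runs each greedy quantifier and its non-greedy counterpart against the same input. The sample string "aaaa" and the dictionary names are only illustrative choices, not something from the docs.

```python
import re

text = "aaaa"

# Greedy forms take as many characters as the quantifier allows.
greedy = {
    "a*": re.match(r"a*", text).group(),
    "a+": re.match(r"a+", text).group(),
    "a?": re.match(r"a?", text).group(),
    "a{2,3}": re.match(r"a{2,3}", text).group(),
}

# Non-greedy forms take as few characters as the quantifier allows.
lazy = {
    "a*?": re.match(r"a*?", text).group(),
    "a+?": re.match(r"a+?", text).group(),
    "a??": re.match(r"a??", text).group(),
    "a{2,3}?": re.match(r"a{2,3}?", text).group(),
}

print(greedy)
print(lazy)
```

Note that *? and ?? are happy with an empty match when nothing after them in the pattern forces them to consume more.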

Answer 2

>>> import re
>>> x = "a (b) c (d) e"
>>> re.search(r"\(.*\)", x).group()
'(b) c (d)'
>>> re.search(r"\(.*?\)", x).group()
'(b)'

According to the docs:

The ‘*’, ‘+’, and ‘?’ qualifiers are all greedy; they match as much text as possible. Sometimes this behavior isn’t desired; if the RE <.*> is matched against ‘<H1>title</H1>’, it will match the entire string, and not just ‘<H1>’. Adding ‘?’ after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using .*? in the previous expression will match only ‘<H1>’.
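The docs example can be checked directly. This is just a sketch of the quoted passage; the variable names are mine.

```python
import re

html = "<H1>title</H1>"

# Greedy: ".*" runs to the last ">" in the string.
greedy_match = re.match(r"<.*>", html).group()

# Non-greedy: ".*?" stops at the first ">" it can.
lazy_match = re.match(r"<.*?>", html).group()

# With findall, the non-greedy pattern picks out each tag separately.
tags = re.findall(r"<.*?>", html)

print(greedy_match)
print(lazy_match)
print(tags)
```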

Answer 3

Would not \\(.*?\\) work (that is, r"\(.*?\)" written as a raw string)? That is the non-greedy syntax.

Answer 4

As the others have said, using the ? modifier on the * quantifier will solve your immediate problem. But be careful: you are starting to stray into areas where regexes stop working and you need a parser instead. For instance, the string “(foo (bar)) baz” will cause you problems.
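Here is what goes wrong with the nested string, followed by a minimal sketch of the "parser" idea: a depth counter that collects only the outermost groups. The helper top_level_groups is a hypothetical name of my own, and it simply ignores unbalanced parentheses, so treat it as an illustration rather than a full parser.

```python
import re

nested = "(foo (bar)) baz"

# The non-greedy pattern stops at the first ")", cutting the outer group short.
lazy_result = re.search(r"\((.*?)\)", nested).group(1)


def top_level_groups(text):
    """Return the contents of each outermost parenthesized group (hypothetical helper)."""
    groups = []
    depth = 0
    start = None
    for index, char in enumerate(text):
        if char == "(":
            if depth == 0:
                start = index + 1  # content begins after the opening parenthesis
            depth += 1
        elif char == ")":
            if depth == 0:
                continue  # unmatched ")", ignored in this sketch
            depth -= 1
            if depth == 0:
                groups.append(text[start:index])
    return groups


print(lazy_result)
print(top_level_groups(nested))
print(top_level_groups("a (b) c (d) e"))
```

Counting depth is the part a plain regex in the re module cannot do, which is why nesting is usually the point where a small scanner or a real parser takes over.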

Answer 5

Using an ungreedy match is a good start, but I’d also suggest that you reconsider any use of .* at all. What about this?

groups = re.search(r"\([^)]*\)", x)
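One practical difference between the two approaches, sketched below with made-up inputs: on a single line they agree, but "." does not match a newline unless re.DOTALL is set, while the negated class [^)] does.

```python
import re

x = "a (b) c (d) e"

# On a single line, the negated class and the non-greedy quantifier agree.
negated = re.search(r"\([^)]*\)", x).group()
lazy = re.search(r"\(.*?\)", x).group()

# Across a line break they differ: "." skips newlines unless re.DOTALL is set.
multiline = "a (b\nc) d"
negated_multiline = re.search(r"\([^)]*\)", multiline)
lazy_multiline = re.search(r"\(.*?\)", multiline)
lazy_dotall = re.search(r"\(.*?\)", multiline, re.DOTALL)

print(negated, lazy)
print(negated_multiline, lazy_multiline, lazy_dotall)
```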

Answer 6

Do you want it to match “(b)”? Do as Zitrax and Paolo have suggested. Do you want it to match “b”? Then use a capturing group:

>>> import re
>>> x = "a (b) c (d) e"
>>> re.search(r"\((.*?)\)", x).group(1)
'b'
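If you want every parenthesized value rather than just the first, the same pattern works with re.findall and re.finditer. A short sketch (variable names are illustrative):

```python
import re

x = "a (b) c (d) e"

# With one capturing group, findall returns the captured text of every match.
contents = re.findall(r"\((.*?)\)", x)

# finditer yields match objects, which also expose where each capture starts.
positions = [(m.group(1), m.start(1)) for m in re.finditer(r"\((.*?)\)", x)]

print(contents)
print(positions)
```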

Answer 7

To start with, I do not suggest using “*” in regexes. Yes, I know, it is the most commonly used repetition quantifier, but it is nevertheless a bad idea. This is because, while it does match any number of repetitions of the preceding element, “any” includes 0, which is usually something you want to treat as an error, not accept. Instead, I suggest using the + sign, which matches one or more repetitions. What’s more, from what I can see, you are dealing with fixed-length parenthesized expressions. As a result, you can probably use the {m,n} syntax to specify the desired length explicitly.
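A quick sketch of the difference between *, + and {m,n}, using a made-up input that contains an empty pair of parentheses. I use the negated class [^)] here rather than "." so that the one-or-more version cannot swallow a closing parenthesis.

```python
import re

text = "f() g(x) h(xyz)"

# "*" accepts zero characters, so the empty "()" produces an empty capture.
zero_or_more = re.findall(r"\(([^)]*)\)", text)

# "+" requires at least one character, so "()" is skipped.
one_or_more = re.findall(r"\(([^)]+)\)", text)

# "{1,2}" bounds the length: only contents of one or two characters match.
bounded = re.findall(r"\(([^)]{1,2})\)", text)

print(zero_or_more)
print(one_or_more)
print(bounded)
```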

However, if you really do need non-greedy repetition, I suggest consulting the all-powerful ?. When placed at the end of any regex repetition quantifier, it forces that part of the regex to match as little text as possible.

That being said, I would be very careful with the ?, as it, like the Sonic Screwdriver in Doctor Who, has a tendency to do, how should I put it, “slightly” undesired things if not carefully calibrated. For example, with your example input, it would identify ((1) (note the lack of a second closing parenthesis) as a match.
