Split a string based on a regular expression - Python 实用宝典

Question: Split a string based on a regular expression

I have the output of a command in tabular form. I'm parsing this output from a result file and storing it in a string. The elements in each row are separated by one or more whitespace characters, so I'm using a regular expression to match one or more spaces and split on them. However, a single-space element is inserted between every pair of elements:

>>> import re
>>> str1="a    b     c      d" # spaces are irregular
>>> str1
'a    b     c      d'
>>> str2=re.split("( )+", str1)
>>> str2
['a', ' ', 'b', ' ', 'c', ' ', 'd'] # 1 space element between!!!

Is there a better way to do this?

After each split, str2 is appended to a list.

Answer 0

By using ( and ), you are creating a capturing group, and re.split includes the text captured by such groups in its result. If you simply remove the parentheses, you will not have this problem.

>>> str1 = "a    b     c      d"
>>> re.split(" +", str1)
['a', 'b', 'c', 'd']

However, there is no need for a regex here: str.split with no delimiter specified splits on runs of whitespace for you. This would be the best way in this case.

>>> str1.split()
['a', 'b', 'c', 'd']
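
One practical difference between the two approaches shows up when a row has leading or trailing whitespace. The sketch below uses a hypothetical padded string, `padded`, which is not from the question: str.split() with no arguments drops the padding, while re.split returns an empty string at each end.

```python
import re

# Hypothetical input with leading and trailing spaces (not from the question).
padded = "  a    b     c      d  "

# str.split() with no arguments ignores leading and trailing whitespace.
plain_result = padded.split()

# re.split() yields an empty string on the outer side of a leading
# or trailing separator match.
regex_result = re.split(r"\s+", padded)

print(plain_result)  # ['a', 'b', 'c', 'd']
print(regex_result)  # ['', 'a', 'b', 'c', 'd', '']
```

If the regex form is needed anyway, calling padded.strip() before splitting avoids the empty strings.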

If you really want a regex, you can use this (\s matches any whitespace character, which states the intent more clearly than a literal space):

>>> re.split(r"\s+", str1)
['a', 'b', 'c', 'd']

Or you can find all runs of non-whitespace characters:

>>> re.findall(r'\S+',str1)
['a', 'b', 'c', 'd']

Answer 1

The str.split method will automatically remove all white space between items:

>>> str1 = "a    b     c      d"
>>> str1.split()
['a', 'b', 'c', 'd']

Docs are here: http://docs.python.org/library/stdtypes.html#str.split

Answer 2

When you use re.split and the split pattern contains capturing groups, the groups are retained in the output. If you don’t want this, use a non-capturing group instead.
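
As a minimal sketch of that point, using the string from the question: the variable names below are illustrative only. A repeated capturing group keeps just its last repetition, which is why the question's output contains single spaces; a non-capturing group, written (?:...), keeps nothing.

```python
import re

str1 = "a    b     c      d"

# Repeated capturing group: only the last repetition (one space) is
# captured, so a single space appears between the elements.
repeated_group = re.split("( )+", str1)

# Capturing group around the whole run: each full run of spaces is kept.
whole_run_group = re.split("( +)", str1)

# Non-capturing group: the separators are dropped from the result.
non_capturing = re.split("(?: )+", str1)

print(repeated_group)   # ['a', ' ', 'b', ' ', 'c', ' ', 'd']
print(whole_run_group)
print(non_capturing)    # ['a', 'b', 'c', 'd']
```

The capturing form is useful when the separators must be preserved, for example to rebuild the original string with "".join(whole_run_group).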

Answer 3

It's very simple, actually. Try this:

str1="a    b     c      d"
splitStr1 = str1.split()
print(splitStr1)
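
For the tabular output described in the question, the same call can be applied to each row. This is only a sketch: the sample text in `output` is hypothetical, since the question does not show the real command output.

```python
# Hypothetical command output; the real data is not shown in the question.
output = """name    size     owner
foo     12       alice
bar     3        bob
"""

# splitlines() yields one string per row; split() breaks each row on
# runs of whitespace. Blank lines are skipped.
rows = [line.split() for line in output.splitlines() if line.strip()]

for row in rows:
    print(row)
```

Each entry in rows is then a list of the elements in one row, which matches the "append each split to a list" step the question describes.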
