Replace all non-alphanumeric characters in a string

Question: Replace all non-alphanumeric characters in a string

I have a string in which I want to replace any character that isn’t a standard letter or digit (a-z or 0-9) with an asterisk. For example, “h^&ell`.,|o w]{+orld” becomes “h*ell*o*w*orld”. Note that a run of several such characters, such as “^&”, is replaced with a single asterisk. How would I go about doing this?


Answer 0

Regex to the rescue!

import re

s = re.sub('[^0-9a-zA-Z]+', '*', s)

Example:

>>> re.sub('[^0-9a-zA-Z]+', '*', 'h^&ell`.,|o w]{+orld')
'h*ell*o*w*orld'
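To make the role of the `+` quantifier concrete, here is a small sketch (the function names `collapse_runs` and `star_each` are made up for illustration and are not part of the answer). With `+`, each maximal run of non-alphanumeric characters is one match and so yields one asterisk; without it, every offending character gets its own asterisk.

```python
import re

def collapse_runs(s):
    # '+' turns each maximal run of non-alphanumerics into a single match,
    # so the whole run is replaced by one '*'.
    return re.sub(r'[^0-9a-zA-Z]+', '*', s)

def star_each(s):
    # Without '+', each non-alphanumeric character is matched separately.
    return re.sub(r'[^0-9a-zA-Z]', '*', s)

print(collapse_runs('h^&i'))  # h*i
print(star_each('h^&i'))      # h**i
```

The raw-string prefix `r'...'` is not needed for this particular pattern, but it is a common habit for regular expressions.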

Answer 1

The Pythonic way.

print("".join(c if c.isalnum() else "*" for c in s))

This doesn’t collapse multiple consecutive non-matching characters into one asterisk, though, i.e.

"h^&i" => "h**i", not "h*i" as in the regex solutions.
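If you want to stay regex-free and still collapse runs, one possible sketch uses `itertools.groupby` to split the string into runs of characters that share the same `str.isalnum` result. The function name `star_non_alnum` is hypothetical. Note that `str.isalnum` also accepts non-ASCII letters and digits, so this is not strictly equivalent to `[^0-9a-zA-Z]`.

```python
from itertools import groupby

def star_non_alnum(s):
    # groupby yields (key, run) pairs, where each run is a maximal stretch
    # of consecutive characters with the same isalnum() result.
    return "".join(
        "".join(run) if is_alnum else "*"   # keep alnum runs, collapse the rest
        for is_alnum, run in groupby(s, key=str.isalnum)
    )

print(star_non_alnum("h^&i"))  # h*i
```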


Answer 2

Try:

s = filter(str.isalnum, s)

In Python 3:

s = ''.join(filter(str.isalnum, s))

Edit: I realized that the OP wants to replace non-alphanumeric characters with ‘*’ rather than remove them, so my answer does not fit.


Answer 3

Use \W, which is equivalent to [^a-zA-Z0-9_] when the LOCALE and UNICODE flags are not set. Check the documentation: https://docs.python.org/2/library/re.html

import re
s = 'h^&ell`.,|o w]{+orld'
replaced_string = re.sub(r'\W+', '*', s)
# replaced_string is now 'h*ell*o*w*orld'

Update: This solution leaves underscores untouched as well, because \W does not match them. If you want only letters and digits to be kept, the solution by nneonneo is more appropriate.
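A quick sketch of the underscore difference, using a made-up input string and hypothetical helper names (`star_non_word`, `star_non_alnum`):

```python
import re

def star_non_word(s):
    # \W does not match '_', so underscores survive.
    return re.sub(r'\W+', '*', s)

def star_non_alnum(s):
    # The explicit character class treats '_' like any other symbol.
    return re.sub(r'[^0-9a-zA-Z]+', '*', s)

s = 'snake_case & more'
print(star_non_word(s))   # snake_case*more
print(star_non_alnum(s))  # snake*case*more
```

Also keep in mind that in Python 3, \W on str patterns is Unicode-aware by default, so accented letters count as word characters unless you pass flags=re.ASCII.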