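Question: Split by comma and strip whitespace in Python
I have some Python code that splits on commas, but doesn't strip the whitespace:
>>> string = "blah, lots  ,  of ,  spaces, here "
>>> mylist = string.split(',')
>>> print(mylist)
['blah', ' lots  ', '  of ', '  spaces', ' here ']
I would rather end up with the whitespace removed, like this:
['blah', 'lots', 'of', 'spaces', 'here']
I am aware that I could loop through the list and strip() each item but, as this is Python, I'm guessing there's a quicker, easier and more elegant way of doing it.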
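Answer 0
Use a list comprehension: it is simpler, and just as easy to read as a for loop.
my_string = "blah, lots  ,  of ,  spaces, here "
result = [x.strip() for x in my_string.split(',')]
# result is ["blah", "lots", "of", "spaces", "here"]
See: Python docs on List Comprehension
A good 2 second explanation of list comprehension.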
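If you need this in several places, the comprehension can be wrapped in a small helper. The sketch below is only an illustration: the name split_and_strip and the drop_empty flag are made up here, not part of the answer or of the standard library.

```python
def split_and_strip(text, sep=",", drop_empty=False):
    """Split text on sep and strip surrounding whitespace from each piece."""
    # Hypothetical helper; the name and the drop_empty flag are assumptions.
    parts = [piece.strip() for piece in text.split(sep)]
    if drop_empty:
        # Discard pieces that were empty or whitespace-only, e.g. from "a,,b".
        parts = [piece for piece in parts if piece]
    return parts


my_string = "blah, lots  ,  of ,  spaces, here "
print(split_and_strip(my_string))
# ['blah', 'lots', 'of', 'spaces', 'here']
print(split_and_strip("a, ,b,", drop_empty=True))
# ['a', 'b']
```

Whether empty items should be kept depends on your data, which is why the flag defaults to leaving them in, matching the plain comprehension.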
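Answer 1
Split using a regular expression. Note that I made the case more general by adding leading spaces. The list comprehension removes the empty strings at the front and back.
>>> import re
>>> string = "  blah, lots  ,  of ,  spaces, here "
>>> pattern = re.compile(r"^\s+|\s*,\s*|\s+$")
>>> print([x for x in pattern.split(string) if x])
['blah', 'lots', 'of', 'spaces', 'here']
This works even if ^\s+ doesn't match:
>>> string = "foo,   bar  "
>>> print([x for x in pattern.split(string) if x])
['foo', 'bar']
Here's why you need ^\s+:
>>> string = "  blah, lots  ,  of ,  spaces, here "
>>> pattern = re.compile(r"\s*,\s*|\s+$")
>>> print([x for x in pattern.split(string) if x])
['  blah', 'lots', 'of', 'spaces', 'here']
See the leading spaces in blah?
Clarification: the above uses the Python 3 interpreter, but the results are the same in Python 2.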
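As a possible variation on the regex approach (not what the answer itself does), you can strip the ends of the string first, so the pattern only has to describe the separator. A minimal sketch:

```python
import re

string = "  blah, lots  ,  of ,  spaces, here "

# Strip the ends first, then split on a comma with optional whitespace
# on either side; no anchors or filtering are needed for this input.
result = re.split(r"\s*,\s*", string.strip())
print(result)
# ['blah', 'lots', 'of', 'spaces', 'here']
```

One difference from the filtered version: empty items between two commas are kept here, so "a,,b" yields ['a', '', 'b'].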
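Answer 2
I came to add:
map(str.strip, string.split(','))
but saw that Jason Orendorff had already mentioned it in a comment.
Reading Glenn Maynard's comment on the same answer, which suggests list comprehensions over map, I started to wonder why. I assumed he meant for performance reasons, but of course he might have meant for stylistic reasons, or something else (Glenn?).
So a quick (possibly flawed?) test on my box, applying the three methods in a loop, revealed:
[word.strip() for word in string.split(',')]
$ time ./list_comprehension.py
real    0m22.876s
map(lambda s: s.strip(), string.split(','))
$ time ./map_with_lambda.py
real    0m25.736s
map(str.strip, string.split(','))
$ time ./map_with_str.strip.py
real    0m19.428s
That makes map(str.strip, string.split(',')) the winner, although they all seem to be in the same ballpark.
Certainly, though, map (with or without a lambda) should not necessarily be ruled out for performance reasons, and to me it is at least as clear as a list comprehension.
Edit:
Python 2.6.5 on Ubuntu 10.04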
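The scripts used for the timings above are not shown. If you want to repeat a comparison like this on a current interpreter, something along these lines might do; it uses the standard timeit module rather than the shell's time command, and the repeat count is an arbitrary choice. Numbers will differ by machine and Python version, so none are listed.

```python
import timeit

string = "blah, lots  ,  of ,  spaces, here "

# The three candidates; list() forces evaluation because map() is lazy
# in Python 3 (in Python 2 it returned a list directly).
candidates = {
    "list comprehension": lambda: [word.strip() for word in string.split(",")],
    "map with lambda": lambda: list(map(lambda s: s.strip(), string.split(","))),
    "map with str.strip": lambda: list(map(str.strip, string.split(","))),
}

for name, func in candidates.items():
    # number=100_000 is an arbitrary repeat count chosen for illustration.
    seconds = timeit.timeit(func, number=100_000)
    print(f"{name}: {seconds:.3f}s")
```

Each candidate is wrapped in a lambda, so all three pay the same extra function-call overhead and the comparison stays roughly fair.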
Answer 3
Just remove the spaces from the string before you split it.
mylist = my_string.replace(' ','').split(',')
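This works for the question's example, but it is worth seeing how it behaves on other input. The sample string below is made up for illustration: it has spaces inside the items and a tab before the last one.

```python
my_string = "new york , los angeles,\tboston"

# Removing every space also removes the spaces inside each item,
# and the tab before "boston" is left in place.
print(my_string.replace(" ", "").split(","))
# ['newyork', 'losangeles', '\tboston']

# Stripping each piece only touches leading and trailing whitespace.
print([item.strip() for item in my_string.split(",")])
# ['new york', 'los angeles', 'boston']
```

So replace(' ', '') is fine when the items never contain spaces, and the per-item strip() is the safer choice otherwise.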
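Answer 4
I know this has already been answered, but if you end up doing this a lot, regular expressions may be a better way to go:
>>> import re
>>> string = "blah, lots  ,  of ,  spaces, here "
>>> re.sub(r'\s', '', string).split(',')
['blah', 'lots', 'of', 'spaces', 'here']
The \s matches any whitespace character, and we just replace it with an empty string ''. You can find more info here: http://docs.python.org/library/re.html#re.sub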
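To make the "any whitespace character" point concrete, here is a small sketch with a made-up input containing a tab and a newline. It also shows the flip side: whitespace inside an item is deleted as well.

```python
import re

string = "blah,\tlots\n, of ,  spaces, here "

# \s matches spaces, tabs and newlines alike, so all of them are removed.
print(re.sub(r"\s", "", string).split(","))
# ['blah', 'lots', 'of', 'spaces', 'here']

# This also deletes whitespace that sits inside an item.
print(re.sub(r"\s", "", "new york, boston").split(","))
# ['newyork', 'boston']
```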
Answer 5
import re
result = [x for x in re.split(',| ', your_string) if x != '']
This works fine for me.
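Answer 6
re (as in regular expressions) allows splitting on multiple characters at once:
>>> import re
>>> string = "blah, lots  ,  of ,  spaces, here "
>>> re.split(', ', string)
['blah', 'lots  ', ' of ', ' spaces', 'here ']
This doesn't work well for your example string, but works nicely for a comma-space separated list. For your example string, you can use re.split's ability to split on regex patterns to get a "split-on-this-or-that" effect.
>>> re.split('[, ]', string)
['blah', '', 'lots', '', '', '', '', 'of', '', '', '', 'spaces', '', 'here', '']
Unfortunately, that's ugly, but a filter will do the trick (in Python 3, filter returns an iterator, hence the list() call):
>>> list(filter(None, re.split('[, ]', string)))
['blah', 'lots', 'of', 'spaces', 'here']
Voila!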
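For this particular "split on comma or space" effect, a regex is not strictly required. As an alternative sketch (not part of the answer above), str.split() called with no arguments splits on runs of whitespace and never returns empty strings:

```python
string = "blah, lots  ,  of ,  spaces, here "

# Turn every comma into a space, then let split() with no arguments
# split on runs of whitespace and drop the empty strings itself.
result = string.replace(",", " ").split()
print(result)
# ['blah', 'lots', 'of', 'spaces', 'here']
```

Like the '[, ]' pattern, this treats a space inside an item as a separator, so it only suits data whose items contain no whitespace.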
Answer 7
map(lambda s: s.strip(), mylist) would be a little better than explicitly looping. Or, for the whole thing at once: map(lambda s: s.strip(), string.split(','))
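One thing to keep in mind on Python 3: map() returns a lazy iterator rather than a list, so the result usually needs wrapping. The sketch below uses str.strip in place of the lambda, which is an equivalent substitution for str inputs, not what the answer wrote.

```python
string = "blah, lots  ,  of ,  spaces, here "

# In Python 3, map() returns a lazy iterator, not a list.
stripped = map(str.strip, string.split(","))

# Wrap it in list() when an actual list is needed.
mylist = list(stripped)
print(mylist)
# ['blah', 'lots', 'of', 'spaces', 'here']
```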
Answer 8
s = 'bla, buu, jii'
sp = s.split(',')
for st in sp:
    # strip() drops the space left after each comma
    print(st.strip())
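Answer 9
import re
# strip() first, so leading or trailing whitespace does not produce empty strings
mylist = re.split(r'\s*[,\s]\s*', string.strip())
Simply put: split on a comma or at least one whitespace character, with or without surrounding whitespace.
Please try it!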
Answer 10
map(lambda s: s.strip(), mylist) would be a little better than explicitly looping.
Or, for the whole thing at once:
map(lambda s: s.strip(), string.split(','))
That's basically everything you need.