问题:删除一长串文本中的所有换行符

基本上,我是在要求用户在控制台中输入文本字符串,但是该字符串很长,并且包含许多换行符。我将如何获取用户的字符串并删除所有换行符以使其成为一行文本。我获取字符串的方法非常简单。

string = raw_input("Please enter string: ")

我应该从用户那里获取字符串吗?我在Mac上运行Python 2.7.4。

PS显然,我是一个菜鸟,所以即使解决方案不是最有效的,也应感谢使用最简单语法的解决方案。

Basically, I’m asking the user to input a string of text into the console, but the string is very long and includes many line breaks. How would I take the user’s string and delete all line breaks to make it a single line of text. My method for acquiring the string is very simple.

string = raw_input("Please enter string: ")

Is there a different way I should be grabbing the string from the user? I’m running Python 2.7.4 on a Mac.

P.S. Clearly I’m a noob, so even if a solution isn’t the most efficient, the one that uses the most simple syntax would be appreciated.


回答 0

您如何输入换行符raw_input?但是,一旦您的字符串中包含一些字符,就想摆脱replace它们。

>>> mystr = raw_input('please enter string: ')
please enter string: hello world, how do i enter line breaks?
>>> # pressing enter didn't work...
...
>>> mystr
'hello world, how do i enter line breaks?'
>>> mystr.replace(' ', '')
'helloworld,howdoienterlinebreaks?'
>>>

在上面的示例中,我替换了所有空格。该字符串'\n'表示换行符。并\r代表回车(如果您在Windows上,则可能会得到这些,一秒钟replace就会为您处理!)。

基本上:

# you probably want to use a space ' ' to replace `\n`
mystring = mystring.replace('\n', ' ').replace('\r', '')

还要注意,调用变量是一个坏主意string,因为这会遮盖模块string。我会避免使用但会在某些时候使用的另一个名称:file。为了同样的原因。

How do you enter line breaks with raw_input? But, once you have a string with some characters in it you want to get rid of, just replace them.

>>> mystr = raw_input('please enter string: ')
please enter string: hello world, how do i enter line breaks?
>>> # pressing enter didn't work...
...
>>> mystr
'hello world, how do i enter line breaks?'
>>> mystr.replace(' ', '')
'helloworld,howdoienterlinebreaks?'
>>>

In the example above, I replaced all spaces. The string '\n' represents newlines. And \r represents carriage returns (if you’re on windows, you might be getting these and a second replace will handle them for you!).

basically:

# you probably want to use a space ' ' to replace `\n`
mystring = mystring.replace('\n', ' ').replace('\r', '')

Note also, that it is a bad idea to call your variable string, as this shadows the module string. Another name I’d avoid but would love to use sometimes: file. For the same reason.


回答 1

您可以尝试使用字符串替换:

string = string.replace('\r', '').replace('\n', '')

You can try using string replace:

string = string.replace('\r', '').replace('\n', '')

回答 2

您可以在不使用分隔符arg的情况下拆分字符串,这会将连续的空格视为单个分隔符(包括换行符和制表符)。然后使用空格加入:

In : " ".join("\n\nsome    text \r\n with multiple whitespace".split())
Out: 'some text with multiple whitespace'

https://docs.python.org/2/library/stdtypes.html#str.split

You can split the string with no separator arg, which will treat consecutive whitespace as a single separator (including newlines and tabs). Then join using a space:

In : " ".join("\n\nsome    text \r\n with multiple whitespace".split())
Out: 'some text with multiple whitespace'

https://docs.python.org/2/library/stdtypes.html#str.split


回答 3

根据Xbello评论更新:

string = my_string.rstrip('\r\n')

在这里阅读更多

updated based on Xbello comment:

string = my_string.rstrip('\r\n')

read more here


回答 4

另一个选择是正则表达式:

>>> import re
>>> re.sub("\n|\r", "", "Foo\n\rbar\n\rbaz\n\r")
'Foobarbaz'

Another option is regex:

>>> import re
>>> re.sub("\n|\r", "", "Foo\n\rbar\n\rbaz\n\r")
'Foobarbaz'

回答 5

一种考虑方法

  • 字符串开头/结尾的其他白色字符
  • 每行开头/结尾处的其他白色字符
  • 各种结束符

它需要这样的多行字符串,可能会很杂乱,例如

test_str = '\nhej ho \n aaa\r\n   a\n '

并产生漂亮的单行字符串

>>> ' '.join([line.strip() for line in test_str.strip().splitlines()])
'hej ho aaa a'

更新:要修复产生多余空格的多个换行符:

' '.join([line.strip() for line in test_str.strip().splitlines() if line.strip()])

这也适用于以下情况 test_str = '\nhej ho \n aaa\r\n\n\n\n\n a\n '

A method taking into consideration

  • additional white characters at the beginning/end of string
  • additional white characters at the beginning/end of every line
  • various end-line characters

it takes such a multi-line string which may be messy e.g.

test_str = '\nhej ho \n aaa\r\n   a\n '

and produces nice one-line string

>>> ' '.join([line.strip() for line in test_str.strip().splitlines()])
'hej ho aaa a'

UPDATE: To fix multiple new-line character producing redundant spaces:

' '.join([line.strip() for line in test_str.strip().splitlines() if line.strip()])

This works for the following too test_str = '\nhej ho \n aaa\r\n\n\n\n\n a\n '


回答 6

如果有人决定使用replace,则应r'\n'改用'\n'

mystring = mystring.replace(r'\n', ' ').replace(r'\r', '')

If anybody decides to use replace, you should try r'\n' instead '\n'

mystring = mystring.replace(r'\n', ' ').replace(r'\r', '')

回答 7

rstrip的问题在于它不能在所有情况下都起作用(正如我本人所见的那样)。相反,您可以使用-text = text.replace(“ \ n”,“”),这将删除所有带有空格的新行\ n。

预先感谢你们的支持。

The problem with rstrip is that it does not work in all cases (as I myself have seen few). Instead you can use – text= text.replace(“\n”,” “) this will remove all new line \n with a space.

Thanks in advance guys for your upvotes.


声明:本站所有文章,如无特殊说明或标注,均为本站原创发布。任何个人或组织,在未征得本站同意时,禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益,可联系我们进行处理。