Tag archive: trailing

Why are trailing commas allowed in a list?

Question: Why are trailing commas allowed in a list?

I am curious why in Python a trailing comma in a list is valid syntax, and it seems that Python simply ignores it:

>>> ['a','b',]
['a', 'b']

It makes sense for a tuple, since ('a') and ('a',) are two different things, but why for lists?


Answer 0

The main advantages are that it makes multi-line lists easier to edit and that it reduces clutter in diffs.

Changing:

s = ['manny',
     'mo',
     'jack',
]

to:

s = ['manny',
     'mo',
     'jack',
     'roger',
]

involves only a one-line change in the diff:

  s = ['manny',
       'mo',
       'jack',
+      'roger',
  ]

This beats the more confusing multi-line diff when the trailing comma was omitted:

  s = ['manny',
       'mo',
-      'jack'
+      'jack',
+      'roger'
  ]

The latter diff makes it harder to see that only one line was added and that the other line didn’t change content.

It also reduces the risk of doing this:

s = ['manny',
     'mo',
     'jack'
     'roger'  # Added this line, but forgot to add a comma on the previous line
]

and triggering implicit string literal concatenation, producing s = ['manny', 'mo', 'jackroger'] instead of the intended result.
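
To make the concatenation pitfall concrete, here is a minimal sketch. The variable names are illustrative and not part of the original answer.

```python
# Adjacent string literals are joined by the parser, so a forgotten comma
# silently merges two items instead of raising an error.
missing_comma = [
    'manny',
    'mo',
    'jack'
    'roger'
]

# With a trailing comma on every line, appending an item cannot merge anything.
trailing_comma = [
    'manny',
    'mo',
    'jack',
    'roger',
]

print(missing_comma)   # ['manny', 'mo', 'jackroger']
print(trailing_comma)  # ['manny', 'mo', 'jack', 'roger']
```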


Answer 1

Allowing trailing commas in an array is a common syntactic convention. Languages like C and Java allow it, and Python seems to have adopted the convention for its list data structure. It’s particularly useful when generating code that populates a list: just generate a sequence of elements and commas, with no need to treat the last element as a special case that shouldn’t have a comma after it.
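
As a rough sketch of the code-generation case, the hypothetical helper below (render_list_literal is an invented name) emits every element with a trailing comma and then parses the result back to confirm it is valid Python.

```python
import ast


def render_list_literal(name, items):
    """Build source text for a list assignment, one item per line."""
    lines = [f"{name} = ["]
    for item in items:
        # Every element gets a comma; the last one needs no special handling.
        lines.append(f"    {item!r},")
    lines.append("]")
    return "\n".join(lines)


source = render_list_literal("s", ["manny", "mo", "jack"])
print(source)

# Parse the right-hand side back to check that the generated text is valid.
parsed = ast.literal_eval(source.split("=", 1)[1].strip())
print(parsed)  # ['manny', 'mo', 'jack']
```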


Answer 2

It helps to eliminate a certain kind of bug. It’s sometimes clearer to write lists on multiple lines. But during later maintenance you may want to rearrange the items.

l1 = [
        1,
        2,
        3,
        4,
        5
]

# Now you want to rearrange

l1 = [
        1,
        2,
        3,
        5
        4,
]

# Now you have an error

But if you allow trailing commas, and use them, you can easily rearrange the lines without introducing an error.
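
One way to check this without breaking a real module is to compile both variants from strings. This is only a sketch; the compiles helper is an invented name.

```python
# The rearranged list from above: the line "5" lost its comma when moved.
without_trailing = """l1 = [
    1,
    2,
    3,
    5
    4,
]"""

# The same rearrangement when every line already ended with a comma.
with_trailing = """l1 = [
    1,
    2,
    3,
    5,
    4,
]"""


def compiles(source):
    """Return True if the source text is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(without_trailing))  # False
print(compiles(with_trailing))     # True
```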


Answer 3

A tuple is different because ('a') is evaluated using implicit continuation, with the ()s acting as a precedence operator, whereas ('a',) refers to a length-1 tuple.

Your original example would have been tuple('a')
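
A quick check of the distinction, as a small sketch with illustrative variable names:

```python
parenthesized = ('a')   # just the string 'a' inside grouping parentheses
one_tuple = ('a',)      # the comma is what makes this a tuple

print(type(parenthesized))        # <class 'str'>
print(type(one_tuple))            # <class 'tuple'>
print(tuple('a') == one_tuple)    # True
print(['a', 'b',] == ['a', 'b'])  # True: the trailing comma changes nothing in a list
```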


Answer 4

The main reason is to make diffs less complicated. For example, say you have a list:

list = [
    'a',
    'b',
    'c'
]

and you want to add another element to it. You will end up doing this:

list = [
    'a',
    'b',
    'c',
    'd'
]

Thus, the diff will show that two lines have changed: a ‘,’ added on the line with ‘c’, and ‘d’ added as a new last line.

So Python allows a trailing ‘,’ after the last element of a list, to prevent the extra diff noise, which can cause confusion.
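
The size of the two diffs can be compared with the standard difflib module. This is a sketch under the assumption that a unified diff is a fair stand-in for what a version control tool would show; changed_lines and the sample strings are invented for illustration.

```python
import difflib


def changed_lines(before, after):
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


no_comma_before = "items = [\n    'a',\n    'b',\n    'c'\n]"
no_comma_after = "items = [\n    'a',\n    'b',\n    'c',\n    'd'\n]"
comma_before = "items = [\n    'a',\n    'b',\n    'c',\n]"
comma_after = "items = [\n    'a',\n    'b',\n    'c',\n    'd',\n]"

print(len(changed_lines(no_comma_before, no_comma_after)))  # 3
print(len(changed_lines(comma_before, comma_after)))        # 1
```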


How do I remove leading and trailing zeros in a string? Python

Question: How do I remove leading and trailing zeros in a string? Python

I have several alphanumeric strings like these

listOfNum = ['000231512-n','1209123100000-n00000','alphanumeric0000', '000alphanumeric']

The desired output for removing trailing zeros would be:

listOfNum = ['000231512-n','1209123100000-n','alphanumeric', '000alphanumeric']

The desired output for removing leading zeros would be:

listOfNum = ['231512-n','1209123100000-n00000','alphanumeric0000', 'alphanumeric']

The desired output for removing both leading and trailing zeros would be:

listOfNum = ['231512-n','1209123100000-n', 'alphanumeric', 'alphanumeric']

For now I’ve been doing it the following way. Please suggest a better way if there is one:

listOfNum = ['000231512-n','1209123100000-n00000','alphanumeric0000', \
'000alphanumeric']
trailingremoved = []
leadingremoved = []
bothremoved = []

# Remove trailing
for i in listOfNum:
  while i[-1] == "0":
    i = i[:-1]
  trailingremoved.append(i)

# Remove leading
for i in listOfNum:
  while i[0] == "0":
    i = i[1:]
  leadingremoved.append(i)

# Remove both
for i in listOfNum:
  while i[0] == "0":
    i = i[1:]
  while i[-1] == "0":
    i = i[:-1]
  bothremoved.append(i)

Answer 0

What about a basic

your_string.strip("0")

to remove both trailing and leading zeros? If you’re only interested in removing trailing zeros, use .rstrip instead (and .lstrip for only the leading ones).

[More info in the doc.]

You could use some list comprehension to get the sequences you want like so:

trailing_removed = [s.rstrip("0") for s in listOfNum]
leading_removed = [s.lstrip("0") for s in listOfNum]
both_removed = [s.strip("0") for s in listOfNum]
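
As a self-contained check against the outputs requested in the question, the sketch below runs the three comprehensions on the sample data. The snake_case name list_of_num is just a renaming of listOfNum. It also shows that an all-zero string strips to an empty string, a case where the question's while loops would raise IndexError.

```python
list_of_num = ['000231512-n', '1209123100000-n00000', 'alphanumeric0000', '000alphanumeric']

trailing_removed = [s.rstrip("0") for s in list_of_num]
leading_removed = [s.lstrip("0") for s in list_of_num]
both_removed = [s.strip("0") for s in list_of_num]

print(trailing_removed)
# ['000231512-n', '1209123100000-n', 'alphanumeric', '000alphanumeric']
print(leading_removed)
# ['231512-n', '1209123100000-n00000', 'alphanumeric0000', 'alphanumeric']
print(both_removed)
# ['231512-n', '1209123100000-n', 'alphanumeric', 'alphanumeric']

# An all-zero string becomes empty rather than raising an error.
print(repr("0000".strip("0")))  # ''
```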

Answer 1

Remove leading + trailing ‘0’:

list = [i.strip('0') for i in listOfNum ]

Remove leading ‘0’:

list = [ i.lstrip('0') for i in listOfNum ]

Remove trailing ‘0’:

list = [ i.rstrip('0') for i in listOfNum ]

Answer 2

You can simply do this with a bool:

if int(number) == float(number):
    number = int(number)
else:
    number = float(number)

Answer 3

Did you try with strip() :

listOfNum = ['231512-n','1209123100000-n00000','alphanumeric0000', 'alphanumeric']
print([item.strip('0') for item in listOfNum])

>>> ['231512-n', '1209123100000-n', 'alphanumeric', 'alphanumeric']

Answer 4

str.strip is the best approach for this situation, but more_itertools.strip is also a general solution that strips both leading and trailing elements from an iterable:

Code

import more_itertools as mit


iterables = ["231512-n\n","  12091231000-n00000","alphanum0000", "00alphanum"]
pred = lambda x: x in {"0", "\n", " "}
list("".join(mit.strip(i, pred)) for i in iterables)
# ['231512-n', '12091231000-n', 'alphanum', 'alphanum']

Details

Notice, here we strip both leading and trailing "0"s among other elements that satisfy a predicate. This tool is not limited to strings.

See also docs for more examples of

more_itertools is a third-party library installable via > pip install more_itertools.


Answer 5

Assuming you have other data types (and not only string) in your list try this. This removes trailing and leading zeros from strings and leaves other data types untouched. This also handles the special case s = ‘0’

e.g

a = ['001', '200', 'akdl00', 200, 100, '0']

b = [(lambda x: x.strip('0') if isinstance(x,str) and len(x) != 1 else x)(x) for x in a]

b
>>>['1', '2', 'akdl', 200, 100, '0']
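
A possible variation, written as a named function instead of an inline lambda. Note that this is an assumption about what one might want, not the answer's exact behavior: the lambda above only protects one-character strings, whereas this sketch (strip_zeros is an invented name) maps any all-zero string to '0'.

```python
def strip_zeros(value):
    """Strip leading/trailing '0' from strings; leave other types untouched."""
    if not isinstance(value, str):
        return value
    stripped = value.strip('0')
    # An all-zero string such as '0' or '000' would strip to '', so fall back to '0'.
    if value and not stripped:
        return '0'
    return stripped


a = ['001', '200', 'akdl00', 200, 100, '0', '000']
print([strip_zeros(x) for x in a])
# ['1', '2', 'akdl', 200, 100, '0', '0']
```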


How can I remove a trailing newline?

Question: How can I remove a trailing newline?

What is the Python equivalent of Perl’s chomp function, which removes the last character of a string if it is a newline?


Answer 0

Try the method rstrip() (see doc Python 2 and Python 3)

>>> 'test string\n'.rstrip()
'test string'

Python’s rstrip() method strips all kinds of trailing whitespace by default, not just one newline as Perl does with chomp.

>>> 'test string \n \r\n\n\r \n\n'.rstrip()
'test string'

To strip only newlines:

>>> 'test string \n \r\n\n\r \n\n'.rstrip('\n')
'test string \n \r\n\n\r '

There are also the methods lstrip() and strip():

>>> s = "   \n\r\n  \n  abc   def \n\r\n  \n  "
>>> s.strip()
'abc   def'
>>> s.lstrip()
'abc   def \n\r\n  \n  '
>>> s.rstrip()
'   \n\r\n  \n  abc   def'

Answer 1

I would say the “pythonic” way to get lines without trailing newline characters is splitlines().

>>> text = "line 1\nline 2\r\nline 3\nline 4"
>>> text.splitlines()
['line 1', 'line 2', 'line 3', 'line 4']

Answer 2

The canonical way to strip end-of-line (EOL) characters is to use the string rstrip() method removing any trailing \r or \n. Here are examples for Mac, Windows, and Unix EOL characters.

>>> 'Mac EOL\r'.rstrip('\r\n')
'Mac EOL'
>>> 'Windows EOL\r\n'.rstrip('\r\n')
'Windows EOL'
>>> 'Unix EOL\n'.rstrip('\r\n')
'Unix EOL'

Using ‘\r\n’ as the parameter to rstrip means that it will strip out any trailing combination of ‘\r’ or ‘\n’. That’s why it works in all three cases above.

This nuance matters in rare cases. For example, I once had to process a text file which contained an HL7 message. The HL7 standard requires a trailing ‘\r’ as its EOL character. The Windows machine on which I was using this message had appended its own ‘\r\n’ EOL character. Therefore, the end of each line looked like ‘\r\r\n’. Using rstrip(‘\r\n’) would have taken off the entire ‘\r\r\n’ which is not what I wanted. In that case, I simply sliced off the last two characters instead.
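
The difference can be reproduced with a placeholder string. The text "segment" below is made up and is not a real HL7 message; the endswith guard is an added safety check, not something the answer describes.

```python
line = "segment\r\r\n"  # placeholder content ending in '\r' plus a Windows '\r\n'

# rstrip treats '\r\n' as a set of characters and removes all three.
print(repr(line.rstrip("\r\n")))  # 'segment'

# Slicing off exactly the appended '\r\n' keeps the original trailing '\r'.
stripped_once = line[:-2] if line.endswith("\r\n") else line
print(repr(stripped_once))  # 'segment\r'
```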

Note that unlike Perl’s chomp function, this will strip all specified characters at the end of the string, not just one:

>>> "Hello\n\n\n".rstrip("\n")
'Hello'

Answer 3

Note that rstrip doesn’t act exactly like Perl’s chomp() because it doesn’t modify the string. That is, in Perl:

$x="a\n";

chomp $x

results in $x being "a".

but in Python:

x="a\n"

x.rstrip()

will mean that the value of x is still "a\n". Even x=x.rstrip() doesn’t always give the same result, as it strips all whitespace from the end of the string, not just one newline at most.
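
Both points can be seen in a short sketch (variable names are illustrative):

```python
x = "a\n"
x.rstrip()           # the result is discarded; strings are immutable
unchanged = x
print(repr(unchanged))  # 'a\n'

x = x.rstrip()       # rebinding the name is what actually "modifies" x
print(repr(x))       # 'a'

# Without an argument, rstrip removes all trailing whitespace, not one newline.
padded = "a  \n\n"
print(repr(padded.rstrip()))      # 'a'
print(repr(padded.rstrip("\n")))  # 'a  '
```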


Answer 4

I might use something like this:

import os
s = s.rstrip(os.linesep)

I think the problem with rstrip("\n") is that you’ll probably want to make sure the line separator is portable (some antiquated systems are rumored to use "\r\n"). The other gotcha is that rstrip will strip out repeated whitespace. Hopefully os.linesep will contain the right characters. The above works for me.


Answer 5

You may use line = line.rstrip('\n'). This will strip all newlines from the end of the string, not just one.


Answer 6

s = s.rstrip()

will remove all trailing whitespace, including newlines, at the end of the string s. The assignment is needed because rstrip returns a new string instead of modifying the original string.


Answer 7

This would exactly replicate Perl’s chomp (minus its behavior on arrays) for the “\n” line terminator:

def chomp(x):
    if x.endswith("\r\n"): return x[:-2]
    if x.endswith("\n") or x.endswith("\r"): return x[:-1]
    return x

(Note: it does not modify the string in place; it does not strip extra trailing whitespace; it takes \r\n into account.)


Answer 8

"line 1\nline 2\r\n...".replace('\n', '').replace('\r', '')
>>> 'line 1line 2...'

or you could always get geekier with regexps :)

have fun!


Answer 9

you can use strip:

line = line.strip()

demo:

>>> "\n\n hello world \n\n".strip()
'hello world'

Answer 10

rstrip doesn’t do the same thing as chomp, on so many levels. Read http://perldoc.perl.org/functions/chomp.html and see that chomp is very complex indeed.

However, my main point is that chomp removes at most 1 line ending, whereas rstrip will remove as many as it can.

Here you can see rstrip removing all the newlines:

>>> import os, re
>>> 'foo\n\n'.rstrip(os.linesep)
'foo'

A much closer approximation of typical Perl chomp usage can be accomplished with re.sub, like this:

>>> re.sub(os.linesep + r'\Z','','foo\n\n')
'foo\n'

Answer 11

Careful with "foo".rstrip(os.linesep): that will only chomp the newline characters for the platform where your Python is being executed. Imagine you’re chomping the lines of a Windows file under Linux, for instance:

$ python
Python 2.7.1 (r271:86832, Mar 18 2011, 09:09:48) 
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys
>>> sys.platform
'linux2'
>>> "foo\r\n".rstrip(os.linesep)
'foo\r'
>>>

Use "foo".rstrip("\r\n") instead, as Mike says above.
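
The underlying rule is that the argument to rstrip is a set of characters to remove, not a literal suffix or a pattern. A brief sketch with made-up sample strings:

```python
windows_line = "foo\r\n"

# Only '\n' is in the set, so the '\r' is left behind.
print(repr(windows_line.rstrip("\n")))    # 'foo\r'

# Both characters are in the set, so either ending is removed.
print(repr(windows_line.rstrip("\r\n")))  # 'foo'

# Any trailing run of the listed characters goes, in any order.
print(repr("foo\n\r\r\n".rstrip("\r\n")))  # 'foo'

# A '|' in the argument is a literal character to strip, not "or".
print(repr("a|b|\r\n".rstrip("\r|\n")))    # 'a|b'
```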


Answer 12

An example in Python’s documentation simply uses line.strip().

Perl’s chomp function removes one linebreak sequence from the end of a string only if it’s actually there.

Here is how I plan to do that in Python, if process is conceptually the function that I need in order to do something useful to each line from this file:

import os
sep_pos = -len(os.linesep)
with open("file.txt") as f:
    for line in f:
        if line[sep_pos:] == os.linesep:
            line = line[:sep_pos]
        process(line)
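
One caveat worth checking when reading files in Python 3: in text mode with the default newline=None, open() translates '\r\n' and '\r' to '\n' while reading, so lines end in '\n' regardless of os.linesep. The sketch below uses a temporary file with mixed endings, and removing a single '\n' stands in for the os.linesep comparison above; this is an alternative, not the answer's own code.

```python
import os
import tempfile

# Write a small file with mixed line endings in binary mode.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\r\nsecond\nthird")
    path = tmp.name

try:
    with open(path) as f:  # text mode, default newline=None
        raw_lines = list(f)
finally:
    os.remove(path)

# Remove at most one trailing newline from each line.
lines = [line[:-1] if line.endswith("\n") else line for line in raw_lines]

print(raw_lines)  # ['first\n', 'second\n', 'third']
print(lines)      # ['first', 'second', 'third']
```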

Answer 13

I don’t program in Python, but I came across an FAQ at python.org advocating S.rstrip("\r\n") for Python 2.2 or later.


Answer 14

import re

r_unwanted = re.compile("[\n\t\r]")
r_unwanted.sub("", your_text)

Answer 15

I find it convenient to be able to get the chomped lines via an iterator, parallel to the way you can get the un-chomped lines from a file object. You can do so with the following code:

import operator

def chomped_lines(it):
    # Lazily call line.rstrip('\r\n') on every item of the iterable.
    return map(operator.methodcaller('rstrip', '\r\n'), it)

Sample usage:

with open("file.txt") as infile:
    for line in chomped_lines(infile):
        process(line)

Answer 16

Workaround for a special case:

If the newline character is the last character (as is the case with most file inputs), then for any element in the collection you can slice as follows:

foobar= foobar[:-1]

to slice out your newline character.


Answer 17

If your question is to clean up all the line breaks in a multiple line str object (oldstr), you can split it into a list according to the delimiter ‘\n’ and then join this list into a new str(newstr).

newstr = "".join(oldstr.split('\n'))


Answer 18

It looks like there is no perfect analog for Perl’s chomp. In particular, rstrip cannot handle multi-character newline delimiters like \r\n. However, splitlines does, as pointed out here. Following my answer on a different question, you can combine join and splitlines to remove/replace all newlines from a string s:

''.join(s.splitlines())

The following removes exactly one trailing newline (as chomp would, I believe). Passing True as the keepends argument to splitlines retains the delimiters. Then, splitlines is called again to remove the delimiters on just the last “line”:

def chomp(s):
    if len(s):
        lines = s.splitlines(True)
        last = lines.pop()
        return ''.join(lines + last.splitlines())
    else:
        return ''
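
A few spot checks of this function, repeated here so the sketch runs on its own. One thing to be aware of: str.splitlines recognizes more line boundaries than '\n' and '\r\n' (for example form feed '\x0c'), so this chomp removes those as well.

```python
def chomp(s):
    # Same approach as above: keep delimiters, then strip only the last line's.
    if len(s):
        lines = s.splitlines(True)
        last = lines.pop()
        return ''.join(lines + last.splitlines())
    else:
        return ''


print(repr(chomp("foo\n\n")))   # 'foo\n'
print(repr(chomp("foo\r\n")))   # 'foo'
print(repr(chomp("foo")))       # 'foo'
print(repr(chomp("")))          # ''
print(repr(chomp("foo\x0c")))   # 'foo' (form feed counts as a line boundary)
```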

Answer 19

I’m bubbling up my regular expression based answer from one I posted earlier in the comments of another answer. I think using re is a clearer, more explicit solution to this problem than str.rstrip.

>>> import re

If you want to remove one or more trailing newline chars:

>>> re.sub(r'[\n\r]+$', '', '\nx\r\n')
'\nx'

If you want to remove newline chars everywhere (not just trailing):

>>> re.sub(r'[\n\r]+', '', '\nx\r\n')
'x'

If you want to remove only 1-2 trailing newline chars (i.e., \r, \n, \r\n, \n\r, \r\r, \n\n)

>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r\n')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n')
'\nx'

I have a feeling that what most people really want here is to remove just one occurrence of a trailing newline, either \r\n or \n, and nothing more.

>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n\n', count=1)
'\nx\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n\r\n', count=1)
'\nx\r\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n', count=1)
'\nx'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n', count=1)
'\nx'

(The ?: is to create a non-capturing group.)

(By the way, this is not what '...'.rstrip('\n').rstrip('\r') does, which may not be clear to others stumbling upon this thread. str.rstrip strips as many of the trailing characters as possible, so a string like foo\n\n\n would be reduced to foo, whereas you may have wanted to preserve the other newlines after stripping a single trailing one.)
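
The count=1 argument matters because, without re.MULTILINE, $ matches both at the very end of the string and just before a final newline, so a pattern ending in $ can match twice. Anchoring with \Z, which matches only at the very end, is an alternative; the sketch below illustrates it, and the chomp helper name is an assumption of this example rather than part of the answer.

```python
import re

# '$' matches before the final '\n' and at the end, so both newlines go.
print(repr(re.sub(r'\n$', '', 'x\n\n')))           # 'x'

# Limiting to one substitution removes only the first match.
print(repr(re.sub(r'\n$', '', 'x\n\n', count=1)))  # 'x\n'

# '\Z' matches only at the very end, so no count is needed.
print(repr(re.sub(r'\n\Z', '', 'x\n\n')))          # 'x\n'


def chomp(s):
    """Remove a single trailing '\\r\\n' or '\\n', if present."""
    return re.sub(r'(?:\r\n|\n)\Z', '', s)


print(repr(chomp('x\r\n\r\n')))  # 'x\r\n'
```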


Answer 20

>>> '   spacious   '.rstrip()
'   spacious'
>>> "AABAA".rstrip("A")
'AAB'
>>> "ABBA".rstrip("AB")  # both AB and BA are stripped
''
>>> "ABCABBA".rstrip("AB")
'ABC'

Answer 21

Just use :

line = line.rstrip("\n")

or

line = line.strip("\n")

You don’t need any of this complicated stuff


Answer 22

s = '''Hello  World \t\n\r\tHi There'''
# import the module string   
import string
# use the method translate to convert 
s.translate({ord(c): None for c in string.whitespace})
>>'HelloWorldHiThere'

With regex (requires import re)

s = '''  Hello  World 
\t\n\r\tHi '''
print(re.sub(r"\s+", "", s), sep='')  # \s matches all white spaces
>HelloWorldHi

Replace \n,\t,\r

s.replace('\n', '').replace('\t','').replace('\r','')
>'  Hello  World Hi '

With regex

s = '''Hello  World \t\n\r\tHi There'''
regex = re.compile(r'[\n\r\t]')
regex.sub("", s)
>'Hello  World Hi There'

With join

s = '''Hello  World \t\n\r\tHi There'''
' '.join(s.split())
>'Hello World Hi There'

Answer 23

There are three types of line endings that we normally encounter: \n, \r and \r\n. A rather simple regular expression in re.sub, namely r"\r?\n?$", is able to catch them all.

(And we gotta catch ’em all, am I right?)

import re

re.sub(r"\r?\n?$", "", the_text, 1)

With the last argument, we limit the number of occurrences replaced to one, mimicking chomp to some extent. Example:

import re

text_1 = "hellothere\n\n\n"
text_2 = "hellothere\n\n\r"
text_3 = "hellothere\n\n\r\n"

a = re.sub(r"\r?\n?$", "", text_1, 1)
b = re.sub(r"\r?\n?$", "", text_2, 1)
c = re.sub(r"\r?\n?$", "", text_3, 1)

… where a == b == c is True.


Answer 24

If you are concerned about speed (say you have a looong list of strings) and you know the nature of the newline char, string slicing is actually faster than rstrip. A little test to illustrate this:

import time

loops = 50000000

def method1(loops=loops):
    test_string = 'num\n'
    t0 = time.time()
    for num in range(loops):  # range replaces Python 2's xrange
        out_string = test_string[:-1]
    t1 = time.time()
    print('Method 1: ' + str(t1 - t0))

def method2(loops=loops):
    test_string = 'num\n'
    t0 = time.time()
    for num in range(loops):  # range replaces Python 2's xrange
        out_string = test_string.rstrip()
    t1 = time.time()
    print('Method 2: ' + str(t1 - t0))

method1()
method2()

Output:

Method 1: 3.92700004578
Method 2: 6.73000001907
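
For a less hand-rolled measurement, the standard timeit module can run the same comparison. This is only a sketch: the iteration count is an arbitrary choice, and the timings will differ by machine and Python version, so no output is shown.

```python
import timeit

test_string = 'num\n'
iterations = 100_000  # arbitrary; raise it for more stable numbers

# Time slicing off the last character versus calling rstrip().
slice_time = timeit.timeit(lambda: test_string[:-1], number=iterations)
rstrip_time = timeit.timeit(lambda: test_string.rstrip(), number=iterations)

print('slice : ' + str(slice_time))
print('rstrip: ' + str(rstrip_time))
```

Keep in mind that the two operations are only interchangeable when every string is known to end in exactly one newline character.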

Answer 25


This works for both Windows and Linux line endings (re.sub is a bit expensive, but this is an option if you want a regex-only solution):

import re 
if re.search("(\\r|)\\n$", line):
    line = re.sub("(\\r|)\\n$", "", line)


Answer 26

First split lines then join them by any separator you like:

x = ' '.join(x.splitlines())

should work like a charm.


Answer 27

A catch-all (note that rstrip takes a set of characters, not a pattern, so no | is needed):

line = line.rstrip('\r\n')