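Tag archive: leading-zero

Why does Python 3 allow "00" as a literal for 0 but not allow "01" as a literal for 1?

Question: Why does Python 3 allow "00" as a literal for 0 but not allow "01" as a literal for 1?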

Why does Python 3 allow “00” as a literal for 0 but not allow “01” as a literal for 1? Is there a good reason? This inconsistency baffles me. (And we’re talking about Python 3, which purposely broke backward compatibility in order to achieve goals like consistency.)

For example:

>>> from datetime import time
>>> time(16, 00)
datetime.time(16, 0)
>>> time(16, 01)
  File "<stdin>", line 1
    time(16, 01)
              ^
SyntaxError: invalid token
>>>

Answer 0

Per https://docs.python.org/3/reference/lexical_analysis.html#integer-literals:

Integer literals are described by the following lexical definitions:

integer        ::=  decimalinteger | octinteger | hexinteger | bininteger
decimalinteger ::=  nonzerodigit digit* | "0"+
nonzerodigit   ::=  "1"..."9"
digit          ::=  "0"..."9"
octinteger     ::=  "0" ("o" | "O") octdigit+
hexinteger     ::=  "0" ("x" | "X") hexdigit+
bininteger     ::=  "0" ("b" | "B") bindigit+
octdigit       ::=  "0"..."7"
hexdigit       ::=  digit | "a"..."f" | "A"..."F"
bindigit       ::=  "0" | "1"

There is no limit for the length of integer literals apart from what can be stored in available memory.

Note that leading zeros in a non-zero decimal number are not allowed. This is for disambiguation with C-style octal literals, which Python used before version 3.0.

As noted here, leading zeros in a non-zero decimal number are not allowed. "0"+ is legal as a very special case, which wasn’t present in Python 2:

integer        ::=  decimalinteger | octinteger | hexinteger | bininteger
decimalinteger ::=  nonzerodigit digit* | "0"
octinteger     ::=  "0" ("o" | "O") octdigit+ | "0" octdigit+

SVN commit r55866 implemented PEP 3127 in the tokenizer, which forbids the old 0<octal> numbers. However, curiously, it also adds this note:

/* in any case, allow '0' as a literal */

with a special nonzero flag that only throws a SyntaxError if the following sequence of digits contains a nonzero digit.
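
The check described above can be modelled in a few lines of Python. This is only an illustrative sketch, not CPython's tokenizer code (which is written in C): the helper name `scan_leading_zero_literal` is hypothetical, and it assumes a plain digit string that starts with "0", ignoring prefixes, floats and digit separators.

```python
def scan_leading_zero_literal(text):
    """Model the check applied to a digit string that starts with "0".

    Hypothetical helper for illustration; it only handles plain ASCII digits.
    """
    if not text or text[0] != "0" or any(ch not in "0123456789" for ch in text):
        raise ValueError("expected a digit string starting with '0'")

    nonzero = False  # plays the role of the flag described above
    for ch in text[1:]:
        if ch != "0":
            nonzero = True

    if nonzero:
        # A non-zero digit after the leading zero is what gets rejected.
        raise SyntaxError("leading zeros in a non-zero decimal literal: " + text)
    return 0  # any run of zeros is simply the integer 0


print(scan_leading_zero_literal("0000"))  # 0
```

Under this model "0", "00" and "0000" are all accepted, while "01" or "0010" raise SyntaxError, which matches the behaviour shown in the question.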

This is odd because PEP 3127 does not allow this case:

This PEP proposes that the ability to specify an octal number by using a leading zero will be removed from the language in Python 3.0 (and the Python 3.0 preview mode of 2.6), and that a SyntaxError will be raised whenever a leading “0” is immediately followed by another digit.

(emphasis mine)

So, the fact that multiple zeros are allowed is technically violating the PEP, and was basically implemented as a special case by Georg Brandl. He made the corresponding documentation change to note that "0"+ was a valid case for decimalinteger (previously that had been covered under octinteger).

We’ll probably never know exactly why Georg chose to make "0"+ valid – it may forever remain an odd corner case in Python.


UPDATE [28 Jul 2015]: This question led to a lively discussion thread on python-ideas in which Georg chimed in:

Steven D’Aprano wrote:

Why was it defined that way? […] Why would we write 0000 to get zero?

I could tell you, but then I’d have to kill you.

Georg

Later on, the thread spawned this bug report aiming to get rid of this special case. Here, Georg says:

I don’t recall the reason for this deliberate change (as seen from the docs change).

I’m unable to come up with a good reason for this change now […]

and thus we have it: the precise reason behind this inconsistency is lost to time.

Finally, note that the bug report was rejected: leading zeros will continue to be accepted only on zero integers for the rest of Python 3.x.


Answer 1

It’s a special case ("0"+)

2.4.4. Integer literals

Integer literals are described by the following lexical definitions:

integer        ::=  decimalinteger | octinteger | hexinteger | bininteger
decimalinteger ::=  nonzerodigit digit* | "0"+
nonzerodigit   ::=  "1"..."9"
digit          ::=  "0"..."9"
octinteger     ::=  "0" ("o" | "O") octdigit+
hexinteger     ::=  "0" ("x" | "X") hexdigit+
bininteger     ::=  "0" ("b" | "B") bindigit+
octdigit       ::=  "0"..."7"
hexdigit       ::=  digit | "a"..."f" | "A"..."F"
bindigit       ::=  "0" | "1"

If you look at the grammar, it's easy to see that 0 needs a special case. I'm not sure why the '+' is considered necessary there, though. Time to dig through the dev mailing list…
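
One way to see why 0 needs its own alternative is to transcribe the `decimalinteger` rule into a regular expression. This is a rough sketch that models the grammar exactly as quoted; the names `DECIMALINTEGER`, `NO_ZERO_CASE` and `matches` are made up for illustration, and the underscore digit separators accepted by newer Python versions are not part of the quoted rule and are ignored here.

```python
import re

# Transcription of: decimalinteger ::= nonzerodigit digit* | "0"+
DECIMALINTEGER = re.compile(r"[1-9][0-9]*|0+")
# The same rule with the "0"+ alternative left out, for comparison.
NO_ZERO_CASE = re.compile(r"[1-9][0-9]*")


def matches(pattern, text):
    """Return True when the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


for literal in ("0", "00", "1", "10", "01"):
    print(literal, matches(DECIMALINTEGER, literal), matches(NO_ZERO_CASE, literal))
```

Without the second alternative, not even a single "0" would be a valid decimal integer; with it, "0" and "00" match while "01" still does not.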


It is interesting to note that in Python 2, more than one 0 was parsed as an octinteger (the end result is still 0, though):

decimalinteger ::=  nonzerodigit digit* | "0"
octinteger     ::=  "0" ("o" | "O") octdigit+ | "0" octdigit+

Answer 2

Python2 used the leading zero to specify octal numbers:

>>> 010
8

To avoid this (misleading?) behaviour, Python3 requires explicit prefixes 0b, 0o, 0x:

>>> 0o10
8
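
As a small illustration of the explicit prefixes, and of how this differs from parsing strings with `int()`, here is a short sketch. The variable names are only for the example.

```python
# Explicit prefixes in Python 3 source code: binary, octal, hexadecimal.
from_prefix = (0b10, 0o10, 0x10)
print(from_prefix)  # (2, 8, 16)

# Strings are a different matter: int() accepts leading zeros in base 10.
decimal_value = int("010")
print(decimal_value)  # 10

# Pass the base explicitly to read the same digits as octal.
octal_value = int("010", 8)
print(octal_value)  # 8

# Base 0 applies the source-literal rules, so "010" is rejected.
try:
    int("010", 0)
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```

So the restriction only concerns literals in source code; text read from a file or from user input can still carry leading zeros when converted with `int()` in base 10.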

How to remove leading and trailing zeros in a string? Python

Question: How to remove leading and trailing zeros in a string? Python

I have several alphanumeric strings like these

listOfNum = ['000231512-n','1209123100000-n00000','alphanumeric0000', '000alphanumeric']

The desired output for removing trailing zeros would be:

listOfNum = ['000231512-n','1209123100000-n','alphanumeric', '000alphanumeric']

The desired output for removing leading zeros would be:

listOfNum = ['231512-n','1209123100000-n00000','alphanumeric0000', 'alphanumeric']

The desired output for removing both leading and trailing zeros would be:

listOfNum = ['231512-n','1209123100000-n', 'alphanumeric', 'alphanumeric']

For now I've been doing it the following way; please suggest a better way if there is one:

listOfNum = ['000231512-n','1209123100000-n00000','alphanumeric0000', \
'000alphanumeric']
trailingremoved = []
leadingremoved = []
bothremoved = []

# Remove trailing
for i in listOfNum:
  while i[-1] == "0":
    i = i[:-1]
  trailingremoved.append(i)

# Remove leading
for i in listOfNum:
  while i[0] == "0":
    i = i[1:]
  leadingremoved.append(i)

# Remove both
for i in listOfNum:
  while i[0] == "0":
    i = i[1:]
  while i[-1] == "0":
    i = i[:-1]
  bothremoved.append(i)

Answer 0

What about a basic

your_string.strip("0")

to remove both trailing and leading zeros? If you're only interested in removing trailing zeros, use .rstrip instead (and .lstrip for only the leading ones).

[More info in the doc.]

You could use list comprehensions to get the sequences you want, like so:

trailing_removed = [s.rstrip("0") for s in listOfNum]
leading_removed = [s.lstrip("0") for s in listOfNum]
both_removed = [s.strip("0") for s in listOfNum]
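
A short sketch of how `str.strip` behaves on the question's data, plus two edge cases worth knowing. The name `list_of_num` is just a snake_case copy of the question's list.

```python
list_of_num = ["000231512-n", "1209123100000-n00000", "alphanumeric0000", "000alphanumeric"]

both_removed = [s.strip("0") for s in list_of_num]
print(both_removed)
# ['231512-n', '1209123100000-n', 'alphanumeric', 'alphanumeric']

# The argument is a set of characters, not a prefix or suffix:
# every leading/trailing "0" or "n" is removed here.
chars_example = "0100-n00".strip("0n")
print(chars_example)  # 100-

# A string made only of zeros is stripped down to the empty string.
all_zeros = "000".strip("0")
print(repr(all_zeros))  # ''
```

Note that the `while` loops in the question would raise an IndexError on an all-zero string such as "000", because indexing an empty string fails, whereas `strip` simply returns an empty string.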

Answer 1

Remove leading + trailing '0':

stripped = [i.strip('0') for i in listOfNum]

Remove leading '0':

stripped = [i.lstrip('0') for i in listOfNum]

Remove trailing '0':

stripped = [i.rstrip('0') for i in listOfNum]

Answer 2

You can simply do this with a boolean check:

if int(number) == float(number):
    number = int(number)
else:
    number = float(number)
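
The comparison above works when `number` is already numeric. For a string such as "12.50", `int()` raises a ValueError, so a string-friendly variant could go through `float` first. This is a minimal sketch; `to_number` is a hypothetical helper name, and very large values may lose precision because they pass through a float.

```python
def to_number(text):
    """Return an int when the parsed value is whole, otherwise a float."""
    value = float(text)  # float() tolerates leading and trailing zeros
    return int(value) if value.is_integer() else value


print(to_number("0012.500"))  # 12.5
print(to_number("007.000"))   # 7
```

This only applies to purely numeric strings; the alphanumeric strings from the question (for example '000231512-n') cannot be parsed this way and still need `strip`.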

Answer 3

Did you try strip()?

listOfNum = ['231512-n','1209123100000-n00000','alphanumeric0000', 'alphanumeric']
print([item.strip('0') for item in listOfNum])
# ['231512-n', '1209123100000-n', 'alphanumeric', 'alphanumeric']

Answer 4

str.strip is the best approach for this situation, but more_itertools.strip is also a general solution that strips both leading and trailing elements from an iterable:

Code

import more_itertools as mit


iterables = ["231512-n\n","  12091231000-n00000","alphanum0000", "00alphanum"]
pred = lambda x: x in {"0", "\n", " "}
list("".join(mit.strip(i, pred)) for i in iterables)
# ['231512-n', '12091231000-n', 'alphanum', 'alphanum']

Details

Notice, here we strip both leading and trailing "0"s among other elements that satisfy a predicate. This tool is not limited to strings.

See also the docs for more examples.

more_itertools is a third-party library installable via > pip install more_itertools.
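
If installing a third-party package is not an option, the same idea can be approximated with the standard library. The sketch below is not the more_itertools implementation; `strip_iterable` is a hypothetical helper that builds a list in memory, which is fine for short inputs.

```python
from itertools import dropwhile


def strip_iterable(iterable, pred):
    """Drop leading and trailing items for which pred(item) is true."""
    # dropwhile skips the leading items that satisfy the predicate.
    items = list(dropwhile(pred, iterable))
    # Pop trailing items while they satisfy the predicate.
    while items and pred(items[-1]):
        items.pop()
    return items


pred = lambda x: x in {"0", "\n", " "}
print("".join(strip_iterable("  12091231000-n00000", pred)))  # 12091231000-n
print(strip_iterable([0, 0, 3, 0, 5, 0], lambda x: x == 0))   # [3, 0, 5]
```

As in the answer above, the predicate decides what counts as padding, so the helper works for lists and other iterables as well as strings.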


Answer 5

Assuming you have other data types (and not only strings) in your list, try this. It removes trailing and leading zeros from strings and leaves other data types untouched. It also handles the special case s = '0'.

e.g.

a = ['001', '200', 'akdl00', 200, 100, '0']

b = [x.strip('0') if isinstance(x, str) and len(x) != 1 else x for x in a]

print(b)
# ['1', '2', 'akdl', 200, 100, '0']
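
The `len(x) != 1` check covers the single-character '0', but a longer all-zero string such as '000' would still be stripped to an empty string. A possible variant is sketched below, under the assumption that any non-empty all-zero string should collapse to '0'; `strip_zeros` is a hypothetical helper name.

```python
def strip_zeros(value):
    """Strip leading/trailing zeros from strings; leave other types untouched."""
    if not isinstance(value, str):
        return value
    stripped = value.strip("0")
    # Assumption: a non-empty string made only of zeros should become "0".
    if value and not stripped:
        return "0"
    return stripped


a = ["001", "200", "akdl00", 200, 100, "0", "000", ""]
print([strip_zeros(x) for x in a])
# ['1', '2', 'akdl', 200, 100, '0', '0', '']
```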