Tag archive: pep

Better to "try" something and catch the exception, or test if it is possible first to avoid an exception?

Should I test if something is valid or just try to do it and catch the exception?

  • Is there any solid documentation saying that one way is preferred?
  • Is one way more pythonic?

For example, should I:

if len(my_list) >= 4:
    x = my_list[3]
else:
    x = 'NO_ABC'

Or:

try:
    x = my_list[3]
except IndexError:
    x = 'NO_ABC'

Some thoughts…
PEP 20 says:

Errors should never pass silently.
Unless explicitly silenced.

Should using a try instead of an if be interpreted as an error passing silently? And if so, are you explicitly silencing it by using it in this way, therefore making it OK?


I’m not referring to situations where you can only do things 1 way; for example:

try:
    import foo
except ImportError:
    import baz

Answer 0

You should prefer try/except over if/else if that results in

  • speed-ups (for example by preventing extra lookups)
  • cleaner code (fewer lines/easier to read)

Often, these go hand-in-hand.


speed-ups

In the case of trying to find an element in a long list by:

try:
    x = my_list[index]
except IndexError:
    x = 'NO_ABC'

try/except is the best option when the index is probably in the list and IndexError is usually not raised. This way you avoid the extra lookup needed by if index < len(my_list).

Python encourages the use of exceptions, which you handle, to borrow a phrase from Dive Into Python. Your example not only handles the exception gracefully rather than letting it pass silently; the exception also occurs only in the exceptional case of the index not being found (hence the word "exception").


cleaner code

The official Python documentation mentions EAFP ("easier to ask for forgiveness than permission"), and Rob Knight notes that catching errors rather than avoiding them can result in cleaner, easier-to-read code. His example goes like this:

Worse (LBYL ‘look before you leap’):

#check whether int conversion will raise an error
if not isinstance(s, str) or not s.isdigit():
    return None
elif len(s) > 10:    #too many digits for int conversion
    return None
else:
    return int(s)

Better (EAFP: Easier to ask for forgiveness than permission):

try:
    return int(s)
except (TypeError, ValueError, OverflowError): #int conversion failed
    return None
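
To make the EAFP fragment above runnable, here is a minimal sketch that wraps it in a function. The name to_int_or_none is an assumption chosen for illustration; it does not appear in the original answer.

```python
def to_int_or_none(s):
    """Return int(s), or None if the conversion fails for any expected reason."""
    try:
        return int(s)
    except (TypeError, ValueError, OverflowError):
        # TypeError: s is not a string or number (e.g. None)
        # ValueError: s is a string that does not spell an integer
        # OverflowError: s is a float infinity
        return None


print(to_int_or_none("42"))
print(to_int_or_none("-5"))
print(to_int_or_none("abc"))
print(to_int_or_none(None))
```

Note that the LBYL version rejects "-5" because "-5".isdigit() is False, whereas the EAFP version accepts everything int() itself accepts.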

Answer 1

In this particular case, you should use something else entirely:

x = myDict.get("ABC", "NO_ABC")

In general, though: If you expect the test to fail frequently, use if. If the test is expensive relative to just trying the operation and catching the exception if it fails, use try. If neither one of these conditions applies, go with whatever reads easier.


Answer 2

Using try and except directly rather than inside an if guard should always be done if there is any possibility of a race condition. For example, if you want to ensure that a directory exists, do not do this:

import os, sys
if not os.path.isdir('foo'):
  try:
    os.mkdir('foo')
  except OSError as e:  # "as" syntax works in Python 2.6+ and 3
    print(e)
    sys.exit(1)

If another thread or process creates the directory between isdir and mkdir, you’ll exit. Instead, do this:

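import os, sys, errno
try:
  os.mkdir('foo')
except OSError as e:  # "as" syntax works in Python 2.6+ and 3
  # An already existing 'foo' is fine; any other failure is fatal
  if e.errno != errno.EEXIST:
    print(e)
    sys.exit(1)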

That will only exit if the ‘foo’ directory can’t be created.
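
In Python 3 the same race-free pattern can be written with the FileExistsError subclass of OSError. The sketch below is only an illustration: the helper name ensure_dir is an assumption, and it runs inside a temporary directory so it leaves nothing behind.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path; succeed quietly if it already exists as a directory."""
    try:
        os.mkdir(path)
    except FileExistsError:
        # Something is already there. Accept it only if it is a directory;
        # otherwise re-raise so the caller sees the problem.
        if not os.path.isdir(path):
            raise


base = tempfile.mkdtemp()
target = os.path.join(base, "foo")
ensure_dir(target)  # creates the directory
ensure_dir(target)  # second call is a no-op instead of an error
print(os.path.isdir(target))
```

For the common case, os.makedirs(path, exist_ok=True) from the standard library gives similar behaviour in one call.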


Answer 3

If it’s trivial to check whether something will fail before you do it, you should probably favor that. After all, constructing exceptions (including their associated tracebacks) takes time.

Exceptions should be used for:

  1. things that are unexpected, or…
  2. things where you need to jump more than one level of logic (e.g. where a break doesn’t get you far enough), or…
  3. things where you don’t know exactly what is going to be handling the exception ahead of time, or…
  4. things where checking ahead of time for failure is expensive (relative to just attempting the operation)

Note that oftentimes, the real answer is “neither” – for instance, in your first example, what you really should do is just use .get() to provide a default:

x = myDict.get('ABC', 'NO_ABC')

Answer 4

As the other posts mention, it depends on the situation. There are a few dangers with using try/except in place of checking the validity of your data in advance, especially when using it on bigger projects.

  • The code in the try block may have a chance to wreak all sorts of havoc before the exception is caught – if you proactively check beforehand with an if statement you can avoid this.
  • If the code called in your try block raises a common exception type, like TypeError or ValueError, you may not actually catch the exception you were expecting to catch: something else may raise the same exception class before or after the line where your expected exception would be raised.

e.g., suppose you had:

try:
    x = my_list[index_list[3]]
except IndexError:
    x = 'NO_ABC'

The IndexError says nothing about whether it occurred when trying to get an element of index_list or my_list.
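
One way to remove that ambiguity, sketched here under the assumption that only a missing my_list element should fall back to the default, is to keep the try block as narrow as possible. The function name lookup is hypothetical.

```python
def lookup(my_list, index_list, default="NO_ABC"):
    # This lookup is deliberately outside the try block: a too-short
    # index_list is treated as a real bug and raises IndexError.
    index = index_list[3]
    try:
        return my_list[index]
    except IndexError:
        # Only a missing my_list element reaches this handler.
        return default


print(lookup([10, 20, 30], [0, 0, 0, 1]))
print(lookup([10], [0, 0, 0, 5]))
```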


Answer 5

Should using a try instead of an if be interpreted as an error passing silently? And if so, are you explicitly silencing it by using it in this way, therefore making it OK?

Using try is acknowledging that an error may pass, which is the opposite of having it pass silently. Using except is causing it not to pass at all.

Using try: except: is preferred in cases where if: else: logic is more complicated. Simple is better than complex; complex is better than complicated; and it’s easier to ask for forgiveness than permission.

What “errors should never pass silently” is warning about, is the case where code could raise an exception that you know about, and where your design admits the possibility, but you haven’t designed in a way to deal with the exception. Explicitly silencing an error, in my view, would be doing something like pass in an except block, which should only be done with an understanding that “doing nothing” really is the correct error handling in the particular situation. (This is one of the few times where I feel like a comment in well-written code is probably really needed.)

However, in your particular example, neither is appropriate:

x = myDict.get('ABC', 'NO_ABC')

The reason everyone is pointing this out – even though you acknowledge your desire to understand in general, and inability to come up with a better example – is that equivalent side-steps actually exist in quite a lot of cases, and looking for them is the first step in solving the problem.


Answer 6

Whenever you use try/except for control flow, ask yourself:

  1. Is it easy to see when the try block succeeds and when it fails?
  2. Are you aware of all side effects inside the try block?
  3. Are you aware of all cases in which the try block throws the exception?
  4. If the implementation of the try block changes, will your control flow still behave as expected?

If the answer to one or more of these questions is ‘no’, there might be a lot of forgiveness to ask for; most likely from your future self.


An example. I recently saw code in a larger project that looked like this:

try:
    y = foo(x)
except ProgrammingError:
    y = bar(x)

Talking to the programmer, it turned out that the intended control flow was:

If x is an integer, do y = foo(x).

If x is a list of integers, do y = bar(x).

This worked because foo made a database query and the query would be successful if x was an integer and throw a ProgrammingError if x was a list.

Using try/except is a bad choice here:

  1. The name of the exception, ProgrammingError, does not give away the actual problem (that x is not an integer), which makes it difficult to see what is going on.
  2. The ProgrammingError is raised during a database call, which wastes time. Things would get truly horrible if it turned out that foo writes something to the database before it throws an exception, or alters the state of some other system.
  3. It is unclear if ProgrammingError is only raised when x is a list of integers. Suppose, for instance, that there is a typo in foo's database query. This might also raise a ProgrammingError. The consequence is that bar(x) is now also called when x is an integer. This might raise cryptic exceptions or produce unforeseeable results.
  4. The try/except block adds a requirement to all future implementations of foo. Whenever we change foo, we must now think about how it handles lists and make sure that it throws a ProgrammingError and not, say, an AttributeError or no error at all.
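
The intended control flow described above could instead be spelled out with an explicit check. This is a sketch only: foo and bar here are hypothetical stand-ins (the real ones made database queries), and compute is a name invented for the example.

```python
def foo(x):
    """Hypothetical stand-in for the query that handles one integer."""
    return x * 2


def bar(xs):
    """Hypothetical stand-in for the query that handles a list of integers."""
    return [x * 2 for x in xs]


def compute(x):
    # The dispatch rule is now visible in the code instead of being
    # implied by which exception foo happens to raise.
    if isinstance(x, int):
        return foo(x)
    if isinstance(x, list):
        return bar(x)
    raise TypeError("expected int or list of int, got %s" % type(x).__name__)


print(compute(3))
print(compute([1, 2]))
```

With this shape, a typo inside foo surfaces as its own error rather than silently rerouting the call to bar.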

Answer 7

For general background, consider reading Idioms and Anti-Idioms in Python: Exceptions.

In your particular case, as others stated, you should use dict.get():

get(key[, default])

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
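
A short illustration of the documented behaviour, using a made-up dictionary named my_dict:

```python
my_dict = {"ABC": 1}

present = my_dict.get("ABC", "NO_ABC")  # key exists, so its value is returned
missing = my_dict.get("XYZ", "NO_ABC")  # key absent, so the default is returned
implicit = my_dict.get("XYZ")           # no default given, so None is returned

print(present, missing, implicit)
```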


What's the function like sum() but for multiplication? product()?

Python’s sum() function returns the sum of numbers in an iterable.

sum([3,4,5]) == 3 + 4 + 5 == 12

I’m looking for the function that returns the product instead.

somelib.somefunc([3,4,5]) == 3 * 4 * 5 == 60

I’m pretty sure such a function exists, but I can’t find it.


Answer 0

Update:

In Python 3.8, the prod function was added to the math module. See: math.prod().
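
A minimal usage sketch, assuming Python 3.8 or newer (the variable names are only illustrative):

```python
import math

product = math.prod([3, 4, 5])          # 3 * 4 * 5
empty = math.prod([])                   # empty iterable gives the start value, 1
scaled = math.prod([3, 4, 5], start=2)  # 2 * 3 * 4 * 5

print(product, empty, scaled)
```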

Older info: Python 3.7 and prior

The function you’re looking for would be called prod() or product() but Python doesn’t have that function. So, you need to write your own (which is easy).

Pronouncement on prod()

Yes, that’s right. Guido rejected the idea for a built-in prod() function because he thought it was rarely needed.

Alternative with reduce()

As you suggested, it is not hard to make your own using reduce() and operator.mul():

from functools import reduce  # Required in Python 3
import operator
def prod(iterable):
    return reduce(operator.mul, iterable, 1)

>>> prod(range(1, 5))
24

Note, in Python 3, the reduce() function was moved to the functools module.

Specific case: Factorials

As a side note, the primary motivating use case for prod() is to compute factorials. We already have support for that in the math module:

>>> import math

>>> math.factorial(10)
3628800

Alternative with logarithms

If your data consists of floats, you can compute a product using sum() with exponents and logarithms:

>>> from math import log, exp

>>> data = [1.2, 1.5, 2.5, 0.9, 14.2, 3.8]
>>> exp(sum(map(log, data)))
218.53799999999993

>>> 1.2 * 1.5 * 2.5 * 0.9 * 14.2 * 3.8
218.53799999999998

Note, the use of log() requires that all the inputs are positive.


Answer 1

Actually, Guido vetoed the idea: http://bugs.python.org/issue1093

But, as noted in that issue, you can make one pretty easily:

from functools import reduce # Valid in Python 2.6+, required in Python 3
import operator

reduce(operator.mul, (3, 4, 5), 1)

Answer 2

There isn’t one built in, but it’s simple to roll your own, as demonstrated here:

from functools import reduce  # required in Python 3, where reduce is no longer a builtin
import operator
def prod(factors):
    return reduce(operator.mul, factors, 1)

See answers to this question:

Which Python module is suitable for data manipulation in a list?


Answer 3

There’s a prod() in numpy that does what you’re asking for.


Answer 4

Numeric.product 

( or

reduce(lambda x,y:x*y,[3,4,5])

)


Answer 5

Use this

def prod(iterable):
    p = 1
    for n in iterable:
        p *= n
    return p

Since there’s no built-in prod function.


Answer 6

I prefer the answers a and b above using functools.reduce() and the answer using numpy.prod(), but here is yet another solution using itertools.accumulate():

import itertools
import operator
prod = list(itertools.accumulate((3, 4, 5), operator.mul))[-1]

Answer 7

Perhaps not a “builtin”, but I consider it one. Anyway, just use numpy:

import numpy 
prod_sum = numpy.prod(some_list)

E731 do not assign a lambda expression, use a def

I get this pep8 warning whenever I use lambda expressions. Are lambda expressions not recommended? If not why?


Answer 0

The recommendation in PEP-8 you are running into is:

Always use a def statement instead of an assignment statement that binds a lambda expression directly to a name.

Yes:

def f(x): return 2*x 

No:

f = lambda x: 2*x 

The first form means that the name of the resulting function object is specifically ‘f’ instead of the generic ‘<lambda>’. This is more useful for tracebacks and string representations in general. The use of the assignment statement eliminates the sole benefit a lambda expression can offer over an explicit def statement (i.e. that it can be embedded inside a larger expression)

Assigning lambdas to names basically just duplicates the functionality of def – and in general, it’s best to do something a single way to avoid confusion and increase clarity.

The legitimate use case for lambda is where you want to use a function without assigning it, e.g:

sorted(players, key=lambda player: player.rank)

In general, the main argument against doing this is that def statements will result in more lines of code. My main response to that would be: yes, and that is fine. Unless you are code golfing, minimising the number of lines isn’t something you should be doing: go for clear over short.
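
The naming difference PEP-8 refers to is easy to observe. In this small sketch the names double and triple are made up for the comparison:

```python
def double(x):
    return 2 * x


triple = lambda x: 3 * x  # noqa: E731 (assigned only to show the difference)

# The def'd function carries its own name; the lambda does not.
print(double.__name__)
print(triple.__name__)
```

The first print shows double and the second shows <lambda>, which is also what would appear in a traceback.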


Answer 1

Here is the story, I had a simple lambda function which I was using twice.

a = map(lambda x : x + offset, simple_list)
b = map(lambda x : x + offset, another_simple_list)

This is just for illustration; I have faced a couple of different versions of this.

Now, to keep things DRY, I start to reuse this common lambda.

f = lambda x : x + offset
a = map(f, simple_list)
b = map(f, another_simple_list)

At this point my code quality checker complains about lambda being a named function so I convert it into a function.

def f(x):
    return x + offset
a = map(f, simple_list)
b = map(f, another_simple_list)

Now the checker complains that a function has to be bounded by one blank line before and after.

def f(x):
    return x + offset

a = map(f, simple_list)
b = map(f, another_simple_list)

Now we have 6 lines of code instead of the original 2, with no increase in readability and no increase in being pythonic. At this point the code checker complains that the function has no docstring.

In my opinion, this rule is better ignored when breaking it makes sense; use your judgement.


Answer 2

Lattyware is absolutely right: Basically PEP-8 wants you to avoid things like

f = lambda x: 2 * x

and instead use

def f(x):
    return 2 * x

However, as addressed in a recent bugreport (Aug 2014), statements such as the following are now compliant:

a.f = lambda x: 2 * x
a["f"] = lambda x: 2 * x

Since my PEP-8 checker doesn’t implement this correctly yet, I turned off E731 for the time being.


Answer 3

I also encountered a situation in which it was even impossible to use a def(ined) function.

class SomeClass(object):
  # pep-8 does not allow this
  f = lambda x: x + 1  # NOQA

  def not_reachable(self, x):
    return x + 1

  @staticmethod
  def also_not_reachable(x):
    return x + 1

  @classmethod
  def also_not_reachable(cls, x):
    return x + 1

  some_mapping = {
      'object1': {'name': "Object 1", 'func': f},
      'object2': {'name': "Object 2", 'func': some_other_func},
  }

In this case, I really wanted to make a mapping which belonged to the class. Some objects in the mapping needed the same function. It would be illogical to put a named function outside of the class. I have not found a way to refer to a method (staticmethod, classmethod or normal) from inside the class body. SomeClass does not exist yet when the code is run, so referring to it from the class isn’t possible either.
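
One possible workaround, offered as a sketch rather than as the poster's solution: while a class body executes, a function defined with def is an ordinary name in that namespace, so a mapping written later in the same body can refer to it by its bare name. The helper name _add_one is an assumption for illustration.

```python
class SomeClass:
    def _add_one(x):
        # Plain function stored in the class namespace. It takes no self
        # because it is meant to be called only through some_mapping.
        return x + 1

    some_mapping = {
        "object1": {"name": "Object 1", "func": _add_one},
        "object2": {"name": "Object 2", "func": _add_one},
    }


func = SomeClass.some_mapping["object1"]["func"]
print(func(1))
```

The caveat is that calling the helper on an instance (SomeClass()._add_one()) would pass the instance as x, so this only fits helpers that are reached through the mapping.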


Why is there no xrange function in Python3?

Recently I started using Python3 and its lack of xrange hurts.

Simple example:

1) Python2:

from time import time as t
def count():
  st = t()
  [x for x in xrange(10000000) if x%4 == 0]
  et = t()
  print et-st
count()

2) Python3:

from time import time as t

def xrange(x):
    return iter(range(x))

def count():
    st = t()
    [x for x in xrange(10000000) if x%4 == 0]
    et = t()
    print (et-st)
count()

The results are, respectively:

1) 1.53888392448 2) 3.215819835662842

Why is that? I mean, why has xrange been removed? It’s such a great tool to learn, for beginners like myself, as we all were at some point. Why remove it? Can somebody point me to the proper PEP? I can’t find it.

Cheers.


Answer 0

Some performance measurements, using timeit instead of trying to do it manually with time.

First, Apple 2.7.2 64-bit:

In [37]: %timeit collections.deque((x for x in xrange(10000000) if x%4 == 0), maxlen=0)
1 loops, best of 3: 1.05 s per loop

Now, python.org 3.3.0 64-bit:

In [83]: %timeit collections.deque((x for x in range(10000000) if x%4 == 0), maxlen=0)
1 loops, best of 3: 1.32 s per loop

In [84]: %timeit collections.deque((x for x in xrange(10000000) if x%4 == 0), maxlen=0)
1 loops, best of 3: 1.31 s per loop

In [85]: %timeit collections.deque((x for x in iter(range(10000000)) if x%4 == 0), maxlen=0) 
1 loops, best of 3: 1.33 s per loop

Apparently, 3.x range really is a bit slower than 2.x xrange. And the OP’s xrange function has nothing to do with it. (Not surprising, as a one-time call to the __iter__ slot isn’t likely to be visible among 10000000 calls to whatever happens in the loop, but someone brought it up as a possibility.)

But it’s only 30% slower. How did the OP get 2x as slow? Well, if I repeat the same tests with 32-bit Python, I get 1.58 vs. 3.12. So my guess is that this is yet another of those cases where 3.x has been optimized for 64-bit performance in ways that hurt 32-bit.

But does it really matter? Check this out, with 3.3.0 64-bit again:

In [86]: %timeit [x for x in range(10000000) if x%4 == 0]
1 loops, best of 3: 3.65 s per loop

So, building the list takes more than twice as long as the entire iteration.

And as for “consumes much more resources than Python 2.6+”, from my tests, it looks like a 3.x range is exactly the same size as a 2.x xrange—and, even if it were 10x as big, building the unnecessary list is still about 10000000x more of a problem than anything the range iteration could possibly do.

And what about an explicit for loop instead of the C loop inside deque?

In [87]: def consume(x):
   ....:     for i in x:
   ....:         pass
In [88]: %timeit consume(x for x in range(10000000) if x%4 == 0)
1 loops, best of 3: 1.85 s per loop

So, almost as much time wasted in the for statement as in the actual work of iterating the range.

If you’re worried about optimizing the iteration of a range object, you’re probably looking in the wrong place.
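
The %timeit lines above are IPython magics. Outside IPython, a roughly comparable measurement can be sketched with the standard timeit module. The helper name consume_filtered and the smaller loop size are assumptions made to keep the example quick; absolute numbers will differ by machine and Python version.

```python
import collections
import timeit


def consume_filtered(n):
    """Exhaust the filtered generator without building a list."""
    collections.deque((x for x in range(n) if x % 4 == 0), maxlen=0)


# Run the statement once per trial, three trials, and keep the fastest,
# which mirrors the "best of 3" figure that %timeit reports.
timings = timeit.repeat(lambda: consume_filtered(100000), repeat=3, number=1)
best = min(timings)
print("best of 3: %.4f s" % best)
```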


Meanwhile, you keep asking why xrange was removed, no matter how many times people tell you the same thing, but I’ll repeat it again: It was not removed: it was renamed to range, and the 2.x range is what was removed.

Here’s some proof that the 3.3 range object is a direct descendant of the 2.x xrange object (and not of the 2.x range function): the source to 3.3 range and 2.7 xrange. You can even see the change history (linked to, I believe, the change that replaced the last instance of the string “xrange” anywhere in the file).

So, why is it slower?

Well, for one, they’ve added a lot of new features. For another, they’ve done all kinds of changes all over the place (especially inside iteration) that have minor side effects. And there’d been a lot of work to dramatically optimize various important cases, even if it sometimes slightly pessimizes less important cases. Add this all up, and I’m not surprised that iterating a range as fast as possible is now a bit slower. It’s one of those less-important cases that nobody would ever care enough to focus on. No one is likely to ever have a real-life use case where this performance difference is the hotspot in their code.


Answer 1

Python3’s range is Python2’s xrange. There’s no need to wrap an iter around it. To get an actual list in Python3, you need to use list(range(...))

If you want something that works with Python2 and Python3, try this

try:
    xrange
except NameError:
    xrange = range

Answer 2

Python 3’s range type works just like Python 2’s xrange. I’m not sure why you’re seeing a slowdown, since the iterator returned by your xrange function is exactly what you’d get if you iterated over range directly.

I’m not able to reproduce the slowdown on my system. Here’s how I tested:

Python 2, with xrange:

Python 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import timeit
>>> timeit.timeit("[x for x in xrange(1000000) if x%4]",number=100)
18.631936646865853

Python 3, with range is a tiny bit faster:

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import timeit
>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)
17.31399508687869

I recently learned that Python 3’s range type has some other neat features, such as support for slicing: range(10,100,2)[5:25:5] is range(20, 60, 10)!
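
A quick sketch of those features, assuming Python 3.3 or newer (where range objects compare equal by their contents):

```python
r = range(10, 100, 2)

sliced = r[5:25:5]       # slicing a range gives another range, not a list
print(sliced)
print(list(sliced))

print(len(r))            # length is computed, not counted
print(50 in r, 51 in r)  # membership test without iterating over the range
```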


Answer 3

One way to fix up your python2 code is:

import sys

if sys.version_info >= (3, 0):
    def xrange(*args, **kwargs):
        return iter(range(*args, **kwargs))

Answer 4

xrange in Python 2 returns a lazy sequence object that produces its values on demand, while range is just a function that builds a full list. I don’t know why xrange was dropped in Python 3.


Answer 5

comp:~$ python Python 2.7.6 (default, Jun 22 2015, 17:58:13) [GCC 4.8.2] on linux2

>>> import timeit
>>> timeit.timeit("[x for x in xrange(1000000) if x%4]",number=100)

5.656799077987671

>>> timeit.timeit("[x for x in xrange(1000000) if x%4]",number=100)

5.579368829727173

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)

21.54827117919922

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)

22.014557123184204

With timeit number=1 param:

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=1)

0.2245171070098877

>>> timeit.timeit("[x for x in xrange(1000000) if x%4]",number=1)

0.10750913619995117

comp:~$ python3 Python 3.4.3 (default, Oct 14 2015, 20:28:29) [GCC 4.8.4] on linux

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)

9.113872020003328

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=100)

9.07014398300089

With timeit number=1, 2, 3, 4, it runs quickly and scales linearly:

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=1)

0.09329321900440846

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=2)

0.18501482300052885

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=3)

0.2703447980020428

>>> timeit.timeit("[x for x in range(1000000) if x%4]",number=4)

0.36209142999723554

So it seems that if we measure a single run of the loop, like timeit.timeit("[x for x in range(1000000) if x%4]",number=1) (as we would actually use it in real code), python3 works quickly enough, but in repeated loops python 2 xrange() wins in speed against range() from python 3.