Can I set max_retries for requests.request?

Question: Can I set max_retries for requests.request?

The Python requests module is simple and elegant, but one thing bugs me. It is possible to get a requests.exceptions.ConnectionError with a message like:

Max retries exceeded with url: ...

This implies that requests can attempt to access the data several times. But there is not a single mention of this possibility anywhere in the docs. Looking at the source code I didn’t find any place where I could alter the default (presumably 0) value.

So is it possible to somehow set the maximum number of retries for requests?


Answer 0

It is the underlying urllib3 library that does the retrying. To set a different maximum retry count, use alternative transport adapters:

import requests
from requests.adapters import HTTPAdapter

s = requests.Session()
s.mount('http://stackoverflow.com', HTTPAdapter(max_retries=5))

The max_retries argument takes an integer or a Retry() object; the latter gives you fine-grained control over what kinds of failures are retried (an integer value is turned into a Retry() instance which only handles connection failures; errors after a connection is made are by default not handled as these could lead to side-effects).
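As a quick check of the integer-to-Retry conversion described above, the following sketch inspects the adapter's max_retries attribute. It assumes a reasonably recent requests release that uses the separately installed urllib3; the variable names are illustrative only.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# An integer is converted into a Retry instance by the adapter.
adapter_from_int = HTTPAdapter(max_retries=5)

# A Retry object is used as given and allows finer control.
custom = Retry(total=5, connect=3, read=2)
adapter_from_retry = HTTPAdapter(max_retries=custom)

session = requests.Session()
session.mount('http://stackoverflow.com', adapter_from_int)

print(type(adapter_from_int.max_retries).__name__)
print(adapter_from_int.max_retries.total)
```

Only URLs that start with the mounted prefix use this adapter; every other URL still goes through the session's default adapters.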


Old answer, predating the release of requests 1.2.1:

The requests library doesn’t really make this configurable, nor does it intend to (see this pull request). Currently (requests 1.1), the retries count is set to 0. If you really want to set it to a higher value, you’ll have to set this globally:

import requests

requests.adapters.DEFAULT_RETRIES = 5

This constant is not documented; use it at your own peril as future releases could change how this is handled.

Update: and this did change; in version 1.2.1 the option to set the max_retries parameter on the HTTPAdapter() class was added, so that now you have to use alternative transport adapters, see above. The monkey-patch approach no longer works, unless you also patch the HTTPAdapter.__init__() defaults (very much not recommended).


Answer 1

This will not only change the max_retries but also enable a backoff strategy which makes requests to all http:// addresses sleep for a period of time before retrying (to a total of 5 times):

import requests
from urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

s = requests.Session()

retries = Retry(total=5,
                backoff_factor=0.1,
                status_forcelist=[ 500, 502, 503, 504 ])

s.mount('http://', HTTPAdapter(max_retries=retries))

s.get('http://httpstat.us/500')

As per documentation for Retry: if the backoff_factor is 0.1, then sleep() will sleep for [0.1s, 0.2s, 0.4s, …] between retries. It will also force a retry if the status code returned is 500, 502, 503 or 504.
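To make the sleep sequence quoted above concrete, here is a small sketch that computes it. The formula backoff_factor * 2**n is an assumption chosen to reproduce the quoted sequence; the exact formula, the handling of the first retry, and the maximum delay differ between urllib3 versions, so check the Retry documentation for the release you use. The function name backoff_delays is hypothetical.

```python
def backoff_delays(backoff_factor, retries):
    """Return the assumed sleep time (seconds) before each retry."""
    # Assumption: the delay doubles with every retry, starting at backoff_factor.
    return [backoff_factor * (2 ** n) for n in range(retries)]


delays = backoff_delays(0.1, 5)
print(delays)
print(sum(delays))
```

Under this assumption, five retries with a factor of 0.1 add only a few seconds of waiting in total, so a larger factor may be needed if the server takes longer to recover.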

Various other options to Retry allow for more granular control:

  • total – Total number of retries to allow.
  • connect – How many connection-related errors to retry on.
  • read – How many times to retry on read errors.
  • redirect – How many redirects to perform.
  • method_whitelist – Set of uppercased HTTP method verbs that we should retry on (renamed to allowed_methods in urllib3 1.26; the old name was removed in urllib3 2.0).
  • status_forcelist – A set of HTTP status codes that we should force a retry on.
  • backoff_factor – A backoff factor to apply between attempts.
  • raise_on_redirect – Whether, if the number of redirects is exhausted, to raise a MaxRetryError, or to return a response with a response code in the 3xx range.
  • raise_on_status – Similar meaning to raise_on_redirect: whether we should raise an exception, or return a response, if status falls in status_forcelist range and retries have been exhausted.

NB: raise_on_status was relatively new when this answer was written and had not yet made it into a release of urllib3 or requests; it is available in current urllib3 releases. Note that urllib3 is a third-party package and not part of the Python standard library.

To make requests retry on specific HTTP status codes, use status_forcelist. For example, status_forcelist=[503] will retry on status code 503 (service unavailable).

By default, the retry only fires for these conditions:

  • Could not get a connection from the pool.
  • TimeoutError
  • HTTPException raised (from http.client in Python 3, else httplib). These seem to be low-level HTTP exceptions, such as a malformed URL or protocol.
  • SocketError
  • ProtocolError

Notice that these are all exceptions that prevent a regular HTTP response from being received. If any regular response is generated, no retry is done. Without using the status_forcelist, even a response with status 500 will not be retried.
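The effect of status_forcelist can be checked without any network traffic. The sketch below calls Retry.is_retry(), the method urllib3 consults to decide whether a response with a given method and status code should be retried. Treat it as an illustration of this one check, not of the full retry logic; the variable names are illustrative.

```python
from urllib3.util.retry import Retry

default_policy = Retry(total=5)
forced_policy = Retry(total=5, status_forcelist=[500, 502, 503, 504])

# Without status_forcelist, a 500 response is handed back to the caller as-is.
print(default_policy.is_retry('GET', 500))

# With status_forcelist, the same response triggers a retry.
print(forced_policy.is_retry('GET', 500))

# Statuses outside the list are still not retried.
print(forced_policy.is_retry('GET', 404))
```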

To make it behave in a manner which is more intuitive for working with a remote API or web server, I would use the above code snippet, which forces retries on statuses 500, 502, 503 and 504, all of which are not uncommon on the web and (possibly) recoverable given a big enough backoff period.

EDITED: Import Retry class directly from urllib3.


Answer 2

Be careful: the DEFAULT_RETRIES approach in Martijn Pieters’s answer doesn’t work for versions 1.2.1+. You can’t set the retry count globally without patching the library.

You can do this instead:

import requests
from requests.adapters import HTTPAdapter

s = requests.Session()
s.mount('http://www.github.com', HTTPAdapter(max_retries=5))
s.mount('https://www.github.com', HTTPAdapter(max_retries=5))
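If the retry count should apply to every host rather than only to www.github.com, the adapter can be mounted on the bare scheme prefixes, much like the 'http://' mount in the previous answer. A minimal sketch, where make_session is a hypothetical helper name:

```python
import requests
from requests.adapters import HTTPAdapter


def make_session(max_retries=5):
    """Return a Session whose http:// and https:// requests share a retry count."""
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=max_retries)
    # Mounting on the scheme prefix covers every host reached over that scheme.
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session


s = make_session()
# Requests made through `s` (for example s.get(...)) now use the mounted adapter.
print(s.get_adapter('https://www.github.com').max_retries.total)
```

This is still a per-session setting: calls made through the module-level functions such as requests.get() do not pick it up.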

Answer 3

After struggling a bit with some of the answers here, I found a library called backoff that worked better for my situation. A basic example:

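import backoff
import requests

# Retry with exponential backoff on any requests exception, up to 5 tries.
# Give up immediately when a response arrived with a status below 500
# (for example a 4xx client error), since retrying would not help.
@backoff.on_exception(
    backoff.expo,
    requests.exceptions.RequestException,
    max_tries=5,
    giveup=lambda e: e.response is not None and e.response.status_code < 500
)
def publish(self, data):
    # `url` is not defined in the original snippet; it is assumed to exist elsewhere.
    r = requests.post(url, timeout=10, json=data)
    r.raise_for_status()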

I’d still recommend giving the library’s native functionality a shot, but if you run into any problems or need broader control, backoff is an option.


Answer 4

A cleaner way to gain higher control might be to package the retry stuff into a function and make that function retriable using a decorator and whitelist the exceptions.

I have created the same here: http://www.praddy.in/retry-decorator-whitelisted-exceptions/

Reproducing the code from that link:

import functools
import time

# Assumption: the original does not show where these exceptions come from;
# urllib3.exceptions provides both names.
from urllib3.exceptions import ConnectTimeoutError, TimeoutError


def retry(exceptions, delay=0, times=2):
    """
    A decorator for retrying a function call with a specified delay in case of a set of exceptions

    Parameter List
    -------------
    :param exceptions: A tuple of all exceptions that need to be caught for retry
                       e.g. retry(exceptions=(Timeout, ReadTimeout))
    :param delay: Amount of delay (seconds) needed between successive retries.
    :param times: Number of times the function is attempted in total
    """
    def outer_wrapper(function):
        @functools.wraps(function)
        def inner_wrapper(*args, **kwargs):
            final_excep = None
            for counter in range(times):  # xrange in Python 2
                if counter > 0:
                    time.sleep(delay)
                final_excep = None
                try:
                    return function(*args, **kwargs)
                except exceptions as e:
                    final_excep = e  # or log it

            # Every attempt failed: re-raise the last exception seen.
            if final_excep is not None:
                raise final_excep
        return inner_wrapper

    return outer_wrapper


@retry(exceptions=(TimeoutError, ConnectTimeoutError), delay=0, times=3)
def call_api():
    ...  # function body not shown in the original
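
To see how such a decorator behaves, the sketch below applies a condensed variant of it to a function that fails twice before succeeding. The names flaky and attempts are hypothetical, and the built-in ConnectionError stands in for whatever exceptions the real call raises.

```python
import functools
import time


def retry(exceptions, delay=0, times=2):
    """Condensed variant of the decorator above: attempt the call up to `times` times."""
    def outer_wrapper(function):
        @functools.wraps(function)
        def inner_wrapper(*args, **kwargs):
            for attempt in range(times):
                if attempt > 0:
                    time.sleep(delay)
                try:
                    return function(*args, **kwargs)
                except exceptions:
                    if attempt == times - 1:
                        raise  # out of attempts: propagate the last error
        return inner_wrapper
    return outer_wrapper


attempts = []  # records each call so the retries can be counted


@retry(exceptions=(ConnectionError,), delay=0, times=3)
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("simulated failure")
    return "ok"


result = flaky()
print(result, len(attempts))
```

Exceptions that are not in the tuple passed to the decorator are raised on the first attempt, which is what makes the whitelist useful.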