python请求超时。获取整个响应

问题:python请求超时。获取整个响应

我正在收集网站列表上的统计信息,为了简化起见,我使用了请求。这是我的代码:

data=[]
websites=['http://google.com', 'http://bbc.co.uk']
for w in websites:
    r= requests.get(w, verify=False)
    data.append( (r.url, len(r.content), r.elapsed.total_seconds(), str([(l.status_code, l.url) for l in r.history]), str(r.headers.items()), str(r.cookies.items())) )

现在,我想requests.get在10秒后超时,以免循环陷入困境。

这个问题以前也很有趣但是没有一个答案是正确的。我将为此悬赏,以得到一个不错的答案。

我听说也许不使用请求是个好主意,但是我应该如何获得请求所提供的好处。(元组中的)

I’m gathering statistics on a list of websites and I’m using requests for it for simplicity. Here is my code:

data=[]
websites=['http://google.com', 'http://bbc.co.uk']
for w in websites:
    r= requests.get(w, verify=False)
    data.append( (r.url, len(r.content), r.elapsed.total_seconds(), str([(l.status_code, l.url) for l in r.history]), str(r.headers.items()), str(r.cookies.items())) )

Now, I want requests.get to timeout after 10 seconds so the loop doesn’t get stuck.

This question has been of interest before too but none of the answers are clean. I will be putting some bounty on this to get a nice answer.

I hear that maybe not using requests is a good idea but then how should I get the nice things requests offer. (the ones in the tuple)


回答 0

怎样使用eventlet?如果您想在10秒后使请求超时,即使正在接收数据,此代码段也将为您服务:

import requests
import eventlet
eventlet.monkey_patch()

with eventlet.Timeout(10):
    requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip", verify=False)

What about using eventlet? If you want to timeout the request after 10 seconds, even if data is being received, this snippet will work for you:

import requests
import eventlet
eventlet.monkey_patch()

with eventlet.Timeout(10):
    requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip", verify=False)

回答 1

设置超时参数

r = requests.get(w, verify=False, timeout=10) # 10 seconds

只要您不stream=True对该请求进行设置,requests.get()如果连接花费的时间超过十秒钟,或者服务器发送的数据超过十秒钟,这将导致呼叫超时。

Set the timeout parameter:

r = requests.get(w, verify=False, timeout=10) # 10 seconds

As long as you don’t set stream=True on that request, this will cause the call to requests.get() to timeout if the connection takes more than ten seconds, or if the server doesn’t send data for more than ten seconds.


回答 2

更新:https//requests.readthedocs.io/en/master/user/advanced/#timeouts

在新版本中requests

如果为超时指定单个值,则如下所示:

r = requests.get('https://github.com', timeout=5)

超时值将同时应用于connectread超时。如果要单独设置值,请指定一个元组:

r = requests.get('https://github.com', timeout=(3.05, 27))

如果远程服务器非常慢,则可以通过将None传递为超时值,然后获取一杯咖啡,从而使Requests永远等待响应。

r = requests.get('https://github.com', timeout=None)

我的旧的(可能是过时的)答案(很久以前发布了):

还有其他方法可以解决此问题:

1.使用TimeoutSauce内部类

来自:https : //github.com/kennethreitz/requests/issues/1928#issuecomment-35811896

import requests from requests.adapters import TimeoutSauce

class MyTimeout(TimeoutSauce):
    def __init__(self, *args, **kwargs):
        connect = kwargs.get('connect', 5)
        read = kwargs.get('read', connect)
        super(MyTimeout, self).__init__(connect=connect, read=read)

requests.adapters.TimeoutSauce = MyTimeout

此代码应使我们将读取超时设置为等于连接超时,这是您在Session.get()调用中传递的超时值。(请注意,我实际上尚未测试此代码,因此可能需要进行一些快速调试,我只是将其直接写到GitHub窗口中。)

2.使用来自kevinburke的请求分支: https : //github.com/kevinburke/requests/tree/connect-timeout

从其文档中:https : //github.com/kevinburke/requests/blob/connect-timeout/docs/user/advanced.rst

如果为超时指定单个值,则如下所示:

r = requests.get('https://github.com', timeout=5)

超时值将同时应用于连接和读取超时。如果要单独设置值,请指定一个元组:

r = requests.get('https://github.com', timeout=(3.05, 27))

kevinburke已请求将其合并到主要请求项目中,但尚未被接受。

UPDATE: https://requests.readthedocs.io/en/master/user/advanced/#timeouts

In new version of requests:

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.

r = requests.get('https://github.com', timeout=None)

My old (probably outdated) answer (which was posted long time ago):

There are other ways to overcome this problem:

1. Use the TimeoutSauce internal class

From: https://github.com/kennethreitz/requests/issues/1928#issuecomment-35811896

import requests from requests.adapters import TimeoutSauce

class MyTimeout(TimeoutSauce):
    def __init__(self, *args, **kwargs):
        connect = kwargs.get('connect', 5)
        read = kwargs.get('read', connect)
        super(MyTimeout, self).__init__(connect=connect, read=read)

requests.adapters.TimeoutSauce = MyTimeout

This code should cause us to set the read timeout as equal to the connect timeout, which is the timeout value you pass on your Session.get() call. (Note that I haven’t actually tested this code, so it may need some quick debugging, I just wrote it straight into the GitHub window.)

2. Use a fork of requests from kevinburke: https://github.com/kevinburke/requests/tree/connect-timeout

From its documentation: https://github.com/kevinburke/requests/blob/connect-timeout/docs/user/advanced.rst

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

kevinburke has requested it to be merged into the main requests project, but it hasn’t been accepted yet.


回答 3

timeout = int(seconds)

由于 requests >= 2.4.0,您可以使用timeout参数,即:

requests.get('https://duckduckgo.com/', timeout=10)

注意:

timeout不是整个响应下载的时间限制;相反,exception如果服务器在超时秒内未发出响应(更确切地说,在超时秒内未在基础套接字上接收到任何字节),则引发。如果未明确指定超时,则请求不会超时。

timeout = int(seconds)

Since requests >= 2.4.0, you can use the timeout argument, i.e:

requests.get('https://duckduckgo.com/', timeout=10)

Note:

timeout is not a time limit on the entire response download; rather, an exception is raised if the server has not issued a response for timeout seconds ( more precisely, if no bytes have been received on the underlying socket for timeout seconds). If no timeout is specified explicitly, requests do not time out.


回答 4

要创建超时,您可以使用信号

解决此问题的最佳方法可能是

  1. 设置异常作为警报信号的处理程序
  2. 延迟十秒拨打警报信号
  3. 在一个try-except-finally块内调用该函数。
  4. 如果该功能超时,则将到达except块。
  5. 在finally块中,您将中止警报,因此以后不会对其进行信号处理。

这是一些示例代码:

import signal
from time import sleep

class TimeoutException(Exception):
    """ Simple Exception to be called on timeouts. """
    pass

def _timeout(signum, frame):
    """ Raise an TimeoutException.

    This is intended for use as a signal handler.
    The signum and frame arguments passed to this are ignored.

    """
    # Raise TimeoutException with system default timeout message
    raise TimeoutException()

# Set the handler for the SIGALRM signal:
signal.signal(signal.SIGALRM, _timeout)
# Send the SIGALRM signal in 10 seconds:
signal.alarm(10)

try:    
    # Do our code:
    print('This will take 11 seconds...')
    sleep(11)
    print('done!')
except TimeoutException:
    print('It timed out!')
finally:
    # Abort the sending of the SIGALRM signal:
    signal.alarm(0)

有一些注意事项:

  1. 它不是线程安全的,信号总是传递到主线程,因此您不能将其放在任何其他线程中。
  2. 在信号调度和实际代码执行之后会有一点延迟。这意味着即使仅睡眠十秒钟,该示例也会超时。

但是,所有这些都在标准python库中!除了睡眠功能导入外,它只是一个导入。如果要在许多地方使用超时,则可以轻松地将TimeoutException,_timeout和单数放在函数中,然后调用它。或者,您可以制作一个装饰器并将其放置在函数上,请参见下面链接的答案。

您还可以将其设置为“上下文管理器”,以便将其与以下with语句一起使用:

import signal
class Timeout():
    """ Timeout for use with the `with` statement. """

    class TimeoutException(Exception):
        """ Simple Exception to be called on timeouts. """
        pass

    def _timeout(signum, frame):
        """ Raise an TimeoutException.

        This is intended for use as a signal handler.
        The signum and frame arguments passed to this are ignored.

        """
        raise Timeout.TimeoutException()

    def __init__(self, timeout=10):
        self.timeout = timeout
        signal.signal(signal.SIGALRM, Timeout._timeout)

    def __enter__(self):
        signal.alarm(self.timeout)

    def __exit__(self, exc_type, exc_value, traceback):
        signal.alarm(0)
        return exc_type is Timeout.TimeoutException

# Demonstration:
from time import sleep

print('This is going to take maximum 10 seconds...')
with Timeout(10):
    sleep(15)
    print('No timeout?')
print('Done')

这种上下文管理器方法的一个缺点是您无法知道代码是否实际超时。

资料来源和推荐读物:

To create a timeout you can use signals.

The best way to solve this case is probably to

  1. Set an exception as the handler for the alarm signal
  2. Call the alarm signal with a ten second delay
  3. Call the function inside a try-except-finally block.
  4. The except block is reached if the function timed out.
  5. In the finally block you abort the alarm, so it’s not singnaled later.

Here is some example code:

import signal
from time import sleep

class TimeoutException(Exception):
    """ Simple Exception to be called on timeouts. """
    pass

def _timeout(signum, frame):
    """ Raise an TimeoutException.

    This is intended for use as a signal handler.
    The signum and frame arguments passed to this are ignored.

    """
    # Raise TimeoutException with system default timeout message
    raise TimeoutException()

# Set the handler for the SIGALRM signal:
signal.signal(signal.SIGALRM, _timeout)
# Send the SIGALRM signal in 10 seconds:
signal.alarm(10)

try:    
    # Do our code:
    print('This will take 11 seconds...')
    sleep(11)
    print('done!')
except TimeoutException:
    print('It timed out!')
finally:
    # Abort the sending of the SIGALRM signal:
    signal.alarm(0)

There are some caveats to this:

  1. It is not threadsafe, signals are always delivered to the main thread, so you can’t put this in any other thread.
  2. There is a slight delay after the scheduling of the signal and the execution of the actual code. This means that the example would time out even if it only slept for ten seconds.

But, it’s all in the standard python library! Except for the sleep function import it’s only one import. If you are going to use timeouts many places You can easily put the TimeoutException, _timeout and the singaling in a function and just call that. Or you can make a decorator and put it on functions, see the answer linked below.

You can also set this up as a “context manager” so you can use it with the with statement:

import signal
class Timeout():
    """ Timeout for use with the `with` statement. """

    class TimeoutException(Exception):
        """ Simple Exception to be called on timeouts. """
        pass

    def _timeout(signum, frame):
        """ Raise an TimeoutException.

        This is intended for use as a signal handler.
        The signum and frame arguments passed to this are ignored.

        """
        raise Timeout.TimeoutException()

    def __init__(self, timeout=10):
        self.timeout = timeout
        signal.signal(signal.SIGALRM, Timeout._timeout)

    def __enter__(self):
        signal.alarm(self.timeout)

    def __exit__(self, exc_type, exc_value, traceback):
        signal.alarm(0)
        return exc_type is Timeout.TimeoutException

# Demonstration:
from time import sleep

print('This is going to take maximum 10 seconds...')
with Timeout(10):
    sleep(15)
    print('No timeout?')
print('Done')

One possible down side with this context manager approach is that you can’t know if the code actually timed out or not.

Sources and recommended reading:


回答 5

尝试使用超时和错误处理此请求:

import requests
try: 
    url = "http://google.com"
    r = requests.get(url, timeout=10)
except requests.exceptions.Timeout as e: 
    print e

Try this request with timeout & error handling:

import requests
try: 
    url = "http://google.com"
    r = requests.get(url, timeout=10)
except requests.exceptions.Timeout as e: 
    print e

回答 6

设置stream=True和使用r.iter_content(1024)。是的,eventlet.Timeout只是对我不起作用。

try:
    start = time()
    timeout = 5
    with get(config['source']['online'], stream=True, timeout=timeout) as r:
        r.raise_for_status()
        content = bytes()
        content_gen = r.iter_content(1024)
        while True:
            if time()-start > timeout:
                raise TimeoutError('Time out! ({} seconds)'.format(timeout))
            try:
                content += next(content_gen)
            except StopIteration:
                break
        data = content.decode().split('\n')
        if len(data) in [0, 1]:
            raise ValueError('Bad requests data')
except (exceptions.RequestException, ValueError, IndexError, KeyboardInterrupt,
        TimeoutError) as e:
    print(e)
    with open(config['source']['local']) as f:
        data = [line.strip() for line in f.readlines()]

讨论在这里https://redd.it/80kp1h

Set stream=True and use r.iter_content(1024). Yes, eventlet.Timeout just somehow doesn’t work for me.

try:
    start = time()
    timeout = 5
    with get(config['source']['online'], stream=True, timeout=timeout) as r:
        r.raise_for_status()
        content = bytes()
        content_gen = r.iter_content(1024)
        while True:
            if time()-start > timeout:
                raise TimeoutError('Time out! ({} seconds)'.format(timeout))
            try:
                content += next(content_gen)
            except StopIteration:
                break
        data = content.decode().split('\n')
        if len(data) in [0, 1]:
            raise ValueError('Bad requests data')
except (exceptions.RequestException, ValueError, IndexError, KeyboardInterrupt,
        TimeoutError) as e:
    print(e)
    with open(config['source']['local']) as f:
        data = [line.strip() for line in f.readlines()]

The discussion is here https://redd.it/80kp1h


回答 7

这可能有点过分,但是Celery分布式任务队列对超时有很好的支持。

特别是,您可以定义一个软时间限制,它仅会在您的过程中引发异常(以便您可以清理)和/或一个硬时间限制,当超过该时间限制时,该硬时间限制将终止任务。

在幕后,它使用与“之前”帖子中引用的信号方法相同,但以更易用和易管理的方式。而且,如果您要监视的网站列表很长,您可能会受益于其主要功能-各种方式来管理大量任务的执行。

This may be overkill, but the Celery distributed task queue has good support for timeouts.

In particular, you can define a soft time limit that just raises an exception in your process (so you can clean up) and/or a hard time limit that terminates the task when the time limit has been exceeded.

Under the covers, this uses the same signals approach as referenced in your “before” post, but in a more usable and manageable way. And if the list of web sites you are monitoring is long, you might benefit from its primary feature — all kinds of ways to manage the execution of a large number of tasks.


回答 8

我相信您可以使用multiprocessing而不依赖第三方套餐:

import multiprocessing
import requests

def call_with_timeout(func, args, kwargs, timeout):
    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    # define a wrapper of `return_dict` to store the result.
    def function(return_dict):
        return_dict['value'] = func(*args, **kwargs)

    p = multiprocessing.Process(target=function, args=(return_dict,))
    p.start()

    # Force a max. `timeout` or wait for the process to finish
    p.join(timeout)

    # If thread is still active, it didn't finish: raise TimeoutError
    if p.is_alive():
        p.terminate()
        p.join()
        raise TimeoutError
    else:
        return return_dict['value']

call_with_timeout(requests.get, args=(url,), kwargs={'timeout': 10}, timeout=60)

传递给kwargs的超时是从服务器获取任何响应timeout的超时,自变量是获取完整响应的超时。

I believe you can use multiprocessing and not depend on a 3rd party package:

import multiprocessing
import requests

def call_with_timeout(func, args, kwargs, timeout):
    manager = multiprocessing.Manager()
    return_dict = manager.dict()

    # define a wrapper of `return_dict` to store the result.
    def function(return_dict):
        return_dict['value'] = func(*args, **kwargs)

    p = multiprocessing.Process(target=function, args=(return_dict,))
    p.start()

    # Force a max. `timeout` or wait for the process to finish
    p.join(timeout)

    # If thread is still active, it didn't finish: raise TimeoutError
    if p.is_alive():
        p.terminate()
        p.join()
        raise TimeoutError
    else:
        return return_dict['value']

call_with_timeout(requests.get, args=(url,), kwargs={'timeout': 10}, timeout=60)

The timeout passed to kwargs is the timeout to get any response from the server, the argument timeout is the timeout to get the complete response.


回答 9

超时=(连接超时,数据读取超时)或给出一个参数(超时= 1)

import requests

try:
    req = requests.request('GET', 'https://www.google.com',timeout=(1,1))
    print(req)
except requests.ReadTimeout:
    print("READ TIME OUT")

timeout = (connection timeout, data read timeout) or give a single argument(timeout=1)

import requests

try:
    req = requests.request('GET', 'https://www.google.com',timeout=(1,1))
    print(req)
except requests.ReadTimeout:
    print("READ TIME OUT")

回答 10

此代码适用于socketError 11004和10060 …

# -*- encoding:UTF-8 -*-
__author__ = 'ACE'
import requests
from PyQt4.QtCore import *
from PyQt4.QtGui import *


class TimeOutModel(QThread):
    Existed = pyqtSignal(bool)
    TimeOut = pyqtSignal()

    def __init__(self, fun, timeout=500, parent=None):
        """
        @param fun: function or lambda
        @param timeout: ms
        """
        super(TimeOutModel, self).__init__(parent)
        self.fun = fun

        self.timeer = QTimer(self)
        self.timeer.setInterval(timeout)
        self.timeer.timeout.connect(self.time_timeout)
        self.Existed.connect(self.timeer.stop)
        self.timeer.start()

        self.setTerminationEnabled(True)

    def time_timeout(self):
        self.timeer.stop()
        self.TimeOut.emit()
        self.quit()
        self.terminate()

    def run(self):
        self.fun()


bb = lambda: requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip")

a = QApplication([])

z = TimeOutModel(bb, 500)
print 'timeout'

a.exec_()

this code working for socketError 11004 and 10060……

# -*- encoding:UTF-8 -*-
__author__ = 'ACE'
import requests
from PyQt4.QtCore import *
from PyQt4.QtGui import *


class TimeOutModel(QThread):
    Existed = pyqtSignal(bool)
    TimeOut = pyqtSignal()

    def __init__(self, fun, timeout=500, parent=None):
        """
        @param fun: function or lambda
        @param timeout: ms
        """
        super(TimeOutModel, self).__init__(parent)
        self.fun = fun

        self.timeer = QTimer(self)
        self.timeer.setInterval(timeout)
        self.timeer.timeout.connect(self.time_timeout)
        self.Existed.connect(self.timeer.stop)
        self.timeer.start()

        self.setTerminationEnabled(True)

    def time_timeout(self):
        self.timeer.stop()
        self.TimeOut.emit()
        self.quit()
        self.terminate()

    def run(self):
        self.fun()


bb = lambda: requests.get("http://ipv4.download.thinkbroadband.com/1GB.zip")

a = QApplication([])

z = TimeOutModel(bb, 500)
print 'timeout'

a.exec_()

回答 11

尽管存在与请求有关的问题,但我发现使用pycurl CURLOPT_TIMEOUT或CURLOPT_TIMEOUT_MS 非常容易。

无需线程或信令:

import pycurl
import StringIO

url = 'http://www.example.com/example.zip'
timeout_ms = 1000
raw = StringIO.StringIO()
c = pycurl.Curl()
c.setopt(pycurl.TIMEOUT_MS, timeout_ms)  # total timeout in milliseconds
c.setopt(pycurl.WRITEFUNCTION, raw.write)
c.setopt(pycurl.NOSIGNAL, 1)
c.setopt(pycurl.URL, url)
c.setopt(pycurl.HTTPGET, 1)
try:
    c.perform()
except pycurl.error:
    traceback.print_exc() # error generated on timeout
    pass # or just pass if you don't want to print the error

Despite the question being about requests, I find this very easy to do with pycurl CURLOPT_TIMEOUT or CURLOPT_TIMEOUT_MS.

No threading or signaling required:

import pycurl
import StringIO

url = 'http://www.example.com/example.zip'
timeout_ms = 1000
raw = StringIO.StringIO()
c = pycurl.Curl()
c.setopt(pycurl.TIMEOUT_MS, timeout_ms)  # total timeout in milliseconds
c.setopt(pycurl.WRITEFUNCTION, raw.write)
c.setopt(pycurl.NOSIGNAL, 1)
c.setopt(pycurl.URL, url)
c.setopt(pycurl.HTTPGET, 1)
try:
    c.perform()
except pycurl.error:
    traceback.print_exc() # error generated on timeout
    pass # or just pass if you don't want to print the error

回答 12

如果您使用该选项stream=True,则可以执行以下操作:

r = requests.get(
    'http://url_to_large_file',
    timeout=1,  # relevant only for underlying socket
    stream=True)

with open('/tmp/out_file.txt'), 'wb') as f:
    start_time = time.time()
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:  # filter out keep-alive new chunks
            f.write(chunk)
        if time.time() - start_time > 8:
            raise Exception('Request took longer than 8s')

该解决方案不需要信号或多处理。

In case you’re using the option stream=True you can do this:

r = requests.get(
    'http://url_to_large_file',
    timeout=1,  # relevant only for underlying socket
    stream=True)

with open('/tmp/out_file.txt'), 'wb') as f:
    start_time = time.time()
    for chunk in r.iter_content(chunk_size=1024):
        if chunk:  # filter out keep-alive new chunks
            f.write(chunk)
        if time.time() - start_time > 8:
            raise Exception('Request took longer than 8s')

The solution does not need signals or multiprocessing.


回答 13

只是另一个解决方案(可从http://docs.python-requests.org/en/master/user/advanced/#streaming-uploads获得

上传之前,您可以确定内容大小:

TOO_LONG = 10*1024*1024  # 10 Mb
big_url = "http://ipv4.download.thinkbroadband.com/1GB.zip"
r = requests.get(big_url, stream=True)
print (r.headers['content-length'])
# 1073741824  

if int(r.headers['content-length']) < TOO_LONG:
    # upload content:
    content = r.content

但是请注意,发件人可能在“内容长度”响应字段中设置了错误的值。

Just another one solution (got it from http://docs.python-requests.org/en/master/user/advanced/#streaming-uploads)

Before upload you can find out the content size:

TOO_LONG = 10*1024*1024  # 10 Mb
big_url = "http://ipv4.download.thinkbroadband.com/1GB.zip"
r = requests.get(big_url, stream=True)
print (r.headers['content-length'])
# 1073741824  

if int(r.headers['content-length']) < TOO_LONG:
    # upload content:
    content = r.content

But be careful, a sender can set up incorrect value in the ‘content-length’ response field.


回答 14

如果是这样,请创建一个监视程序线程,该线程在10秒后会弄乱请求的内部状态,例如:

  • 关闭底层套接字,理想情况下
  • 如果请求重试该操作,则会触发异常

请注意,根据系统库,您可能无法设置DNS解析的截止日期。

If it comes to that, create a watchdog thread that messes up requests’ internal state after 10 seconds, e.g.:

  • closes the underlying socket, and ideally
  • triggers an exception if requests retries the operation

Note that depending on the system libraries you may be unable to set deadline on DNS resolution.


回答 15

好吧,我在此页面上尝试了许多解决方案,但仍然面临不稳定,随机挂起,连接性能差的问题。

我现在正在使用Curl,即使实现如此差劲,我也很高兴它具有“最大时间”功能和全局性能:

content=commands.getoutput('curl -m6 -Ss "http://mywebsite.xyz"')

在这里,我定义了一个6秒的最大时间参数,包括连接和传输时间。

我确定Curl有一个不错的python绑定,如果您更喜欢使用pythonic语法:)

Well, I tried many solutions on this page and still faced instabilities, random hangs, poor connections performance.

I’m now using Curl and i’m really happy about it’s “max time” functionnality and about the global performances, even with such a poor implementation :

content=commands.getoutput('curl -m6 -Ss "http://mywebsite.xyz"')

Here, I defined a 6 seconds max time parameter, englobing both connection and transfer time.

I’m sure Curl has a nice python binding, if you prefer to stick to the pythonic syntax :)


回答 16

有一个名为timeout-decorator的程序包,您可以使用它来使任何python函数超时。

@timeout_decorator.timeout(5)
def mytest():
    print("Start")
    for i in range(1,10):
        time.sleep(1)
        print("{} seconds have passed".format(i))

它使用一些此处建议的信号方法。另外,您可以告诉它使用多处理而不是信号(例如,如果您处于多线程环境中)。

There is a package called timeout-decorator that you can use to time out any python function.

@timeout_decorator.timeout(5)
def mytest():
    print("Start")
    for i in range(1,10):
        time.sleep(1)
        print("{} seconds have passed".format(i))

It uses the signals approach that some answers here suggest. Alternatively, you can tell it to use multiprocessing instead of signals (e.g. if you are in a multi-thread environment).


回答 17

我正在使用请求2.2.1,eventlet不适用于我。相反,我可以使用gevent超时,因为在我的服务中将gevent用于gunicorn。

import gevent
import gevent.monkey
gevent.monkey.patch_all(subprocess=True)
try:
    with gevent.Timeout(5):
        ret = requests.get(url)
        print ret.status_code, ret.content
except gevent.timeout.Timeout as e:
    print "timeout: {}".format(e.message)

请注意,一般的异常处理不会捕获gevent.timeout.Timeout。因此,无论是显式捕获gevent.timeout.Timeout 还是传递要像这样使用的其他异常:with gevent.Timeout(5, requests.exceptions.Timeout):尽管引发此异常时未传递任何消息。

I’m using requests 2.2.1 and eventlet didn’t work for me. Instead I was able use gevent timeout instead since gevent is used in my service for gunicorn.

import gevent
import gevent.monkey
gevent.monkey.patch_all(subprocess=True)
try:
    with gevent.Timeout(5):
        ret = requests.get(url)
        print ret.status_code, ret.content
except gevent.timeout.Timeout as e:
    print "timeout: {}".format(e.message)

Please note that gevent.timeout.Timeout is not caught by general Exception handling. So either explicitly catch gevent.timeout.Timeout or pass in a different exception to be used like so: with gevent.Timeout(5, requests.exceptions.Timeout): although no message is passed when this exception is raised.


回答 18

我想出了一个更直接的解决方案,该解决方案虽然丑陋,但可以解决实际问题。它有点像这样:

resp = requests.get(some_url, stream=True)
resp.raw._fp.fp._sock.settimeout(read_timeout)
# This will load the entire response even though stream is set
content = resp.content

您可以在此处阅读完整的说明

I came up with a more direct solution that is admittedly ugly but fixes the real problem. It goes a bit like this:

resp = requests.get(some_url, stream=True)
resp.raw._fp.fp._sock.settimeout(read_timeout)
# This will load the entire response even though stream is set
content = resp.content

You can read the full explanation here