Python请求-打印整个HTTP请求(原始)?

问题:Python请求-打印整个HTTP请求(原始)?

在使用requests模块时,有什么方法可以打印原始HTTP请求?

我不仅想要标题,还想要请求行,标题和内容打印输出。是否有可能看到HTTP请求最终构造了什么?

While using the requests module, is there any way to print the raw HTTP request?

I don’t want just the headers, I want the request line, headers, and content printout. Is it possible to see what ultimately is constructed from HTTP request?


回答 0

从v1.2.3开始,请求添加了PreparedRequest对象。按照文档“它包含将发送到服务器的确切字节”。

可以使用它来漂亮地打印请求,如下所示:

import requests

req = requests.Request('POST','http://stackoverflow.com',headers={'X-Custom':'Test'},data='a=1&b=2')
prepared = req.prepare()

def pretty_print_POST(req):
    """
    At this point it is completely built and ready
    to be fired; it is "prepared".

    However pay attention at the formatting used in 
    this function because it is programmed to be pretty 
    printed and may differ from the actual request.
    """
    print('{}\n{}\r\n{}\r\n\r\n{}'.format(
        '-----------START-----------',
        req.method + ' ' + req.url,
        '\r\n'.join('{}: {}'.format(k, v) for k, v in req.headers.items()),
        req.body,
    ))

pretty_print_POST(prepared)

生成:

-----------START-----------
POST http://stackoverflow.com/
Content-Length: 7
X-Custom: Test

a=1&b=2

然后,您可以使用以下命令发送实际请求:

s = requests.Session()
s.send(prepared)

这些链接是可用的最新文档,因此它们的内容可能会发生变化: 高级-准备的请求API-较低级别的类

Since v1.2.3 Requests added the PreparedRequest object. As per the documentation “it contains the exact bytes that will be sent to the server”.

One can use this to pretty print a request, like so:

import requests

req = requests.Request('POST','http://stackoverflow.com',headers={'X-Custom':'Test'},data='a=1&b=2')
prepared = req.prepare()

def pretty_print_POST(req):
    """
    At this point it is completely built and ready
    to be fired; it is "prepared".

    However pay attention at the formatting used in 
    this function because it is programmed to be pretty 
    printed and may differ from the actual request.
    """
    print('{}\n{}\r\n{}\r\n\r\n{}'.format(
        '-----------START-----------',
        req.method + ' ' + req.url,
        '\r\n'.join('{}: {}'.format(k, v) for k, v in req.headers.items()),
        req.body,
    ))

pretty_print_POST(prepared)

which produces:

-----------START-----------
POST http://stackoverflow.com/
Content-Length: 7
X-Custom: Test

a=1&b=2

Then you can send the actual request with this:

s = requests.Session()
s.send(prepared)

These links are to the latest documentation available, so they might change in content: Advanced – Prepared requests and API – Lower level classes


回答 1

import requests
response = requests.post('http://httpbin.org/post', data={'key1':'value1'})
print(response.request.body)
print(response.request.headers)

我正在使用请求版本2.18.4和Python 3

import requests
response = requests.post('http://httpbin.org/post', data={'key1':'value1'})
print(response.request.body)
print(response.request.headers)

I am using requests version 2.18.4 and Python 3


回答 2

注意:此答案已过时。requests 作为AntonioHerraizS的答复文件,较新的支持版本直接获取请求内容

无法获得请求的真实原始内容requests,因为它仅处理更高级别的对象,例如标头方法类型requests利用urllib3发送请求,但urllib3 不能与原始数据处理-它使用httplib。这是请求的代表性堆栈跟踪:

-> r= requests.get("http://google.com")
  /usr/local/lib/python2.7/dist-packages/requests/api.py(55)get()
-> return request('get', url, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/api.py(44)request()
-> return session.request(method=method, url=url, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/sessions.py(382)request()
-> resp = self.send(prep, **send_kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/sessions.py(485)send()
-> r = adapter.send(request, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/adapters.py(324)send()
-> timeout=timeout
  /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(478)urlopen()
-> body=body, headers=headers)
  /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(285)_make_request()
-> conn.request(method, url, **httplib_request_kw)
  /usr/lib/python2.7/httplib.py(958)request()
-> self._send_request(method, url, body, headers)

httplib机器内部,我们可以看到HTTPConnection._send_request间接使用HTTPConnection._send_output,它最终创建了原始请求主体(如果存在),并使用HTTPConnection.send了分别发送它们。send终于到达插座。

由于没有钩子可以做您想做的事情,因此,万不得已时,您可以Monkey补丁httplib来获取内容。这是一个脆弱的解决方案,如果httplib进行了更改,您可能需要对其进行调整。如果您打算使用此解决方案分发软件,则可能要考虑打包httplib而不是使用系统包,这很容易,因为它是纯python模块。

las,不费吹灰之力,解决方案:

import requests
import httplib

def patch_send():
    old_send= httplib.HTTPConnection.send
    def new_send( self, data ):
        print data
        return old_send(self, data) #return is not necessary, but never hurts, in case the library is changed
    httplib.HTTPConnection.send= new_send

patch_send()
requests.get("http://www.python.org")

产生输出:

GET / HTTP/1.1
Host: www.python.org
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.1.0 CPython/2.7.3 Linux/3.2.0-23-generic-pae

Note: this answer is outdated. Newer versions of requests support getting the request content directly, as AntonioHerraizS’s answer documents.

It’s not possible to get the true raw content of the request out of requests, since it only deals with higher level objects, such as headers and method type. requests uses urllib3 to send requests, but urllib3 also doesn’t deal with raw data – it uses httplib. Here’s a representative stack trace of a request:

-> r= requests.get("http://google.com")
  /usr/local/lib/python2.7/dist-packages/requests/api.py(55)get()
-> return request('get', url, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/api.py(44)request()
-> return session.request(method=method, url=url, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/sessions.py(382)request()
-> resp = self.send(prep, **send_kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/sessions.py(485)send()
-> r = adapter.send(request, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/adapters.py(324)send()
-> timeout=timeout
  /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(478)urlopen()
-> body=body, headers=headers)
  /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(285)_make_request()
-> conn.request(method, url, **httplib_request_kw)
  /usr/lib/python2.7/httplib.py(958)request()
-> self._send_request(method, url, body, headers)

Inside the httplib machinery, we can see HTTPConnection._send_request indirectly uses HTTPConnection._send_output, which finally creates the raw request and body (if it exists), and uses HTTPConnection.send to send them separately. send finally reaches the socket.

Since there’s no hooks for doing what you want, as a last resort you can monkey patch httplib to get the content. It’s a fragile solution, and you may need to adapt it if httplib is changed. If you intend to distribute software using this solution, you may want to consider packaging httplib instead of using the system’s, which is easy, since it’s a pure python module.

Alas, without further ado, the solution:

import requests
import httplib

def patch_send():
    old_send= httplib.HTTPConnection.send
    def new_send( self, data ):
        print data
        return old_send(self, data) #return is not necessary, but never hurts, in case the library is changed
    httplib.HTTPConnection.send= new_send

patch_send()
requests.get("http://www.python.org")

which yields the output:

GET / HTTP/1.1
Host: www.python.org
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.1.0 CPython/2.7.3 Linux/3.2.0-23-generic-pae

回答 3

更好的主意是使用requests_toolbelt库,该库可以将请求和响应都作为字符串转储出去,以供您打印到控制台。它使用上述解决方案无法很好处理的文件和编码来处理所有棘手的情况。

就这么简单:

import requests
from requests_toolbelt.utils import dump

resp = requests.get('https://httpbin.org/redirect/5')
data = dump.dump_all(resp)
print(data.decode('utf-8'))

资料来源:https : //toolbelt.readthedocs.org/en/latest/dumputils.html

您只需输入以下内容即可安装:

pip install requests_toolbelt

An even better idea is to use the requests_toolbelt library, which can dump out both requests and responses as strings for you to print to the console. It handles all the tricky cases with files and encodings which the above solution does not handle well.

It’s as easy as this:

import requests
from requests_toolbelt.utils import dump

resp = requests.get('https://httpbin.org/redirect/5')
data = dump.dump_all(resp)
print(data.decode('utf-8'))

Source: https://toolbelt.readthedocs.org/en/latest/dumputils.html

You can simply install it by typing:

pip install requests_toolbelt

回答 4

这是一个代码,与上面相同,但是带有响应头:

import socket
def patch_requests():
    old_readline = socket._fileobject.readline
    if not hasattr(old_readline, 'patched'):
        def new_readline(self, size=-1):
            res = old_readline(self, size)
            print res,
            return res
        new_readline.patched = True
        socket._fileobject.readline = new_readline
patch_requests()

我花了很多时间寻找这个,所以如果有人需要,我就把它留在这里。

Here is a code, which makes the same, but with response headers:

import socket
def patch_requests():
    old_readline = socket._fileobject.readline
    if not hasattr(old_readline, 'patched'):
        def new_readline(self, size=-1):
            res = old_readline(self, size)
            print res,
            return res
        new_readline.patched = True
        socket._fileobject.readline = new_readline
patch_requests()

I spent a lot of time searching for this, so I’m leaving it here, if someone needs.


回答 5

我使用以下函数来格式化请求。就像@AntonioHerraizS一样,除了它还会在主体中漂亮地打印JSON对象,并标记请求的所有部分。

format_json = functools.partial(json.dumps, indent=2, sort_keys=True)
indent = functools.partial(textwrap.indent, prefix='  ')

def format_prepared_request(req):
    """Pretty-format 'requests.PreparedRequest'

    Example:
        res = requests.post(...)
        print(format_prepared_request(res.request))

        req = requests.Request(...)
        req = req.prepare()
        print(format_prepared_request(res.request))
    """
    headers = '\n'.join(f'{k}: {v}' for k, v in req.headers.items())
    content_type = req.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(json.loads(req.body))
        except json.JSONDecodeError:
            body = req.body
    else:
        body = req.body
    s = textwrap.dedent("""
    REQUEST
    =======
    endpoint: {method} {url}
    headers:
    {headers}
    body:
    {body}
    =======
    """).strip()
    s = s.format(
        method=req.method,
        url=req.url,
        headers=indent(headers),
        body=indent(body),
    )
    return s

我有一个类似的函数来格式化响应:

def format_response(resp):
    """Pretty-format 'requests.Response'"""
    headers = '\n'.join(f'{k}: {v}' for k, v in resp.headers.items())
    content_type = resp.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(resp.json())
        except json.JSONDecodeError:
            body = resp.text
    else:
        body = resp.text
    s = textwrap.dedent("""
    RESPONSE
    ========
    status_code: {status_code}
    headers:
    {headers}
    body:
    {body}
    ========
    """).strip()

    s = s.format(
        status_code=resp.status_code,
        headers=indent(headers),
        body=indent(body),
    )
    return s

I use the following function to format requests. It’s like @AntonioHerraizS except it will pretty-print JSON objects in the body as well, and it labels all parts of the request.

format_json = functools.partial(json.dumps, indent=2, sort_keys=True)
indent = functools.partial(textwrap.indent, prefix='  ')

def format_prepared_request(req):
    """Pretty-format 'requests.PreparedRequest'

    Example:
        res = requests.post(...)
        print(format_prepared_request(res.request))

        req = requests.Request(...)
        req = req.prepare()
        print(format_prepared_request(res.request))
    """
    headers = '\n'.join(f'{k}: {v}' for k, v in req.headers.items())
    content_type = req.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(json.loads(req.body))
        except json.JSONDecodeError:
            body = req.body
    else:
        body = req.body
    s = textwrap.dedent("""
    REQUEST
    =======
    endpoint: {method} {url}
    headers:
    {headers}
    body:
    {body}
    =======
    """).strip()
    s = s.format(
        method=req.method,
        url=req.url,
        headers=indent(headers),
        body=indent(body),
    )
    return s

And I have a similar function to format the response:

def format_response(resp):
    """Pretty-format 'requests.Response'"""
    headers = '\n'.join(f'{k}: {v}' for k, v in resp.headers.items())
    content_type = resp.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(resp.json())
        except json.JSONDecodeError:
            body = resp.text
    else:
        body = resp.text
    s = textwrap.dedent("""
    RESPONSE
    ========
    status_code: {status_code}
    headers:
    {headers}
    body:
    {body}
    ========
    """).strip()

    s = s.format(
        status_code=resp.status_code,
        headers=indent(headers),
        body=indent(body),
    )
    return s

回答 6

requests支持所谓的事件挂钩(从2.23开始,实际上只有response挂钩)。该钩子可用于请求以打印完整的请求-响应对数据,包括有效的URL,标头和正文,例如:

import textwrap
import requests

def print_roundtrip(response, *args, **kwargs):
    format_headers = lambda d: '\n'.join(f'{k}: {v}' for k, v in d.items())
    print(textwrap.dedent('''
        ---------------- request ----------------
        {req.method} {req.url}
        {reqhdrs}

        {req.body}
        ---------------- response ----------------
        {res.status_code} {res.reason} {res.url}
        {reshdrs}

        {res.text}
    ''').format(
        req=response.request, 
        res=response, 
        reqhdrs=format_headers(response.request.headers), 
        reshdrs=format_headers(response.headers), 
    ))

requests.get('https://httpbin.org/', hooks={'response': print_roundtrip})

运行它会打印:

---------------- request ----------------
GET https://httpbin.org/
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive

None
---------------- response ----------------
200 OK https://httpbin.org/
Date: Thu, 14 May 2020 17:16:13 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 9593
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

<!DOCTYPE html>
<html lang="en">
...
</html>

如果响应为二进制res.textres.content则可能需要更改为。

requests supports so called event hooks (as of 2.23 there’s actually only response hook). The hook can be used on a request to print full request-response pair’s data, including effective URL, headers and bodies, like:

import textwrap
import requests

def print_roundtrip(response, *args, **kwargs):
    format_headers = lambda d: '\n'.join(f'{k}: {v}' for k, v in d.items())
    print(textwrap.dedent('''
        ---------------- request ----------------
        {req.method} {req.url}
        {reqhdrs}

        {req.body}
        ---------------- response ----------------
        {res.status_code} {res.reason} {res.url}
        {reshdrs}

        {res.text}
    ''').format(
        req=response.request, 
        res=response, 
        reqhdrs=format_headers(response.request.headers), 
        reshdrs=format_headers(response.headers), 
    ))

requests.get('https://httpbin.org/', hooks={'response': print_roundtrip})

Running it prints:

---------------- request ----------------
GET https://httpbin.org/
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive

None
---------------- response ----------------
200 OK https://httpbin.org/
Date: Thu, 14 May 2020 17:16:13 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 9593
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

<!DOCTYPE html>
<html lang="en">
...
</html>

You may want to change res.text to res.content if the response is binary.