Python 实用宝典

Question 1

While using the requests module, is there any way to print the raw HTTP request?

I don’t want just the headers, I want the request line, headers, and content printout. Is it possible to see what ultimately is constructed from HTTP request?

Question 2

Since v1.2.3 Requests added the PreparedRequest object. As per the documentation “it contains the exact bytes that will be sent to the server”.

One can use this to pretty print a request, like so:

import requests

req = requests.Request('POST','http://stackoverflow.com',headers={'X-Custom':'Test'},data='a=1&b=2')
prepared = req.prepare()

def pretty_print_POST(req):
    """
    At this point it is completely built and ready
    to be fired; it is "prepared".

    However pay attention at the formatting used in 
    this function because it is programmed to be pretty 
    printed and may differ from the actual request.
    """
    print('{}\n{}\r\n{}\r\n\r\n{}'.format(
        '-----------START-----------',
        req.method + ' ' + req.url,
        '\r\n'.join('{}: {}'.format(k, v) for k, v in req.headers.items()),
        req.body,
    ))

pretty_print_POST(prepared)

which produces:

-----------START-----------
POST http://stackoverflow.com/
Content-Length: 7
X-Custom: Test

a=1&b=2

Then you can send the actual request with this:

s = requests.Session()
s.send(prepared)

These links are to the latest documentation available, so they might change in content: Advanced – Prepared requests and API – Lower level classes

Question 3

import requests
response = requests.post('http://httpbin.org/post', data={'key1':'value1'})
print(response.request.body)
print(response.request.headers)

I am using requests version 2.18.4 and Python 3

Question 4

Note: this answer is outdated. Newer versions of requests support getting the request content directly, as AntonioHerraizS’s answer documents.

It’s not possible to get the true raw content of the request out of requests, since it only deals with higher level objects, such as headers and method type. requests uses urllib3 to send requests, but urllib3 also doesn’t deal with raw data – it uses httplib. Here’s a representative stack trace of a request:

-> r= requests.get("http://google.com")
  /usr/local/lib/python2.7/dist-packages/requests/api.py(55)get()
-> return request('get', url, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/api.py(44)request()
-> return session.request(method=method, url=url, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/sessions.py(382)request()
-> resp = self.send(prep, **send_kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/sessions.py(485)send()
-> r = adapter.send(request, **kwargs)
  /usr/local/lib/python2.7/dist-packages/requests/adapters.py(324)send()
-> timeout=timeout
  /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(478)urlopen()
-> body=body, headers=headers)
  /usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/connectionpool.py(285)_make_request()
-> conn.request(method, url, **httplib_request_kw)
  /usr/lib/python2.7/httplib.py(958)request()
-> self._send_request(method, url, body, headers)

Inside the httplib machinery, we can see HTTPConnection._send_request indirectly uses HTTPConnection._send_output, which finally creates the raw request and body (if it exists), and uses HTTPConnection.send to send them separately. send finally reaches the socket.

Since there’s no hooks for doing what you want, as a last resort you can monkey patch httplib to get the content. It’s a fragile solution, and you may need to adapt it if httplib is changed. If you intend to distribute software using this solution, you may want to consider packaging httplib instead of using the system’s, which is easy, since it’s a pure python module.

Alas, without further ado, the solution:

import requests
import httplib

def patch_send():
    old_send= httplib.HTTPConnection.send
    def new_send( self, data ):
        print data
        return old_send(self, data) #return is not necessary, but never hurts, in case the library is changed
    httplib.HTTPConnection.send= new_send

patch_send()
requests.get("http://www.python.org")

which yields the output:

GET / HTTP/1.1
Host: www.python.org
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.1.0 CPython/2.7.3 Linux/3.2.0-23-generic-pae

Question 5

An even better idea is to use the requests_toolbelt library, which can dump out both requests and responses as strings for you to print to the console. It handles all the tricky cases with files and encodings which the above solution does not handle well.

It’s as easy as this:

import requests
from requests_toolbelt.utils import dump

resp = requests.get('https://httpbin.org/redirect/5')
data = dump.dump_all(resp)
print(data.decode('utf-8'))

Source: https://toolbelt.readthedocs.org/en/latest/dumputils.html

You can simply install it by typing:

pip install requests_toolbelt

Question 6

Here is a code, which makes the same, but with response headers:

import socket
def patch_requests():
    old_readline = socket._fileobject.readline
    if not hasattr(old_readline, 'patched'):
        def new_readline(self, size=-1):
            res = old_readline(self, size)
            print res,
            return res
        new_readline.patched = True
        socket._fileobject.readline = new_readline
patch_requests()

I spent a lot of time searching for this, so I’m leaving it here, if someone needs.

Question 7

I use the following function to format requests. It’s like @AntonioHerraizS except it will pretty-print JSON objects in the body as well, and it labels all parts of the request.

format_json = functools.partial(json.dumps, indent=2, sort_keys=True)
indent = functools.partial(textwrap.indent, prefix='  ')

def format_prepared_request(req):
    """Pretty-format 'requests.PreparedRequest'

    Example:
        res = requests.post(...)
        print(format_prepared_request(res.request))

        req = requests.Request(...)
        req = req.prepare()
        print(format_prepared_request(res.request))
    """
    headers = '\n'.join(f'{k}: {v}' for k, v in req.headers.items())
    content_type = req.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(json.loads(req.body))
        except json.JSONDecodeError:
            body = req.body
    else:
        body = req.body
    s = textwrap.dedent("""
    REQUEST
    =======
    endpoint: {method} {url}
    headers:
    {headers}
    body:
    {body}
    =======
    """).strip()
    s = s.format(
        method=req.method,
        url=req.url,
        headers=indent(headers),
        body=indent(body),
    )
    return s

And I have a similar function to format the response:

def format_response(resp):
    """Pretty-format 'requests.Response'"""
    headers = '\n'.join(f'{k}: {v}' for k, v in resp.headers.items())
    content_type = resp.headers.get('Content-Type', '')
    if 'application/json' in content_type:
        try:
            body = format_json(resp.json())
        except json.JSONDecodeError:
            body = resp.text
    else:
        body = resp.text
    s = textwrap.dedent("""
    RESPONSE
    ========
    status_code: {status_code}
    headers:
    {headers}
    body:
    {body}
    ========
    """).strip()

    s = s.format(
        status_code=resp.status_code,
        headers=indent(headers),
        body=indent(body),
    )
    return s

Question 8

requests supports so called event hooks (as of 2.23 there’s actually only response hook). The hook can be used on a request to print full request-response pair’s data, including effective URL, headers and bodies, like:

import textwrap
import requests

def print_roundtrip(response, *args, **kwargs):
    format_headers = lambda d: '\n'.join(f'{k}: {v}' for k, v in d.items())
    print(textwrap.dedent('''
        ---------------- request ----------------
        {req.method} {req.url}
        {reqhdrs}

        {req.body}
        ---------------- response ----------------
        {res.status_code} {res.reason} {res.url}
        {reshdrs}

        {res.text}
    ''').format(
        req=response.request, 
        res=response, 
        reqhdrs=format_headers(response.request.headers), 
        reshdrs=format_headers(response.headers), 
    ))

requests.get('https://httpbin.org/', hooks={'response': print_roundtrip})

Running it prints:

---------------- request ----------------
GET https://httpbin.org/
User-Agent: python-requests/2.23.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive

None
---------------- response ----------------
200 OK https://httpbin.org/
Date: Thu, 14 May 2020 17:16:13 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 9593
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

<!DOCTYPE html>
<html lang="en">
...
</html>

You may want to change res.text to res.content if the response is binary.

Python 实用宝典

Python请求-打印整个HTTP请求（原始）？

问题：Python请求-打印整个HTTP请求（原始）？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

有趣好用的Python教程