无论内容类型标头如何,都可以在Python Flask中获取原始POST正文

问题:无论内容类型标头如何,都可以在Python Flask中获取原始POST正文

以前,我问过如何获取Flask请求中的数据,因为它request.data是空的。答案解释request.data为原始帖子正文,但如果分析表单数据将为空。我如何无条件获得原始职位?

@app.route('/', methods=['POST'])
def parse_request():
    data = request.data  # empty in some cases
    # always need raw data here, not parsed form data

Previously, I asked How to get data received in Flask request because request.data was empty. The answer explained that request.data is the raw post body, but will be empty if form data is parsed. How can I get the raw post body unconditionally?

@app.route('/', methods=['POST'])
def parse_request():
    data = request.data  # empty in some cases
    # always need raw data here, not parsed form data

回答 0

使用request.get_data()获得的原始数据,而不管内容类型。该数据被缓存,您可以随后访问request.datarequest.jsonrequest.form随意。

如果您request.data首先访问,它将首先调用get_data一个参数以解析表单数据。如果请求具有形式的内容类型(multipart/form-dataapplication/x-www-form-urlencoded,或application/x-url-encoded),则原始数据将被消耗。request.data并且request.json在这种情况下将显示为空。

Use request.get_data() to get the raw data, regardless of content type. The data is cached and you can subsequently access request.data, request.json, request.form at will.

If you access request.data first, it will call get_data with an argument to parse form data first. If the request has a form content type (multipart/form-data, application/x-www-form-urlencoded, or application/x-url-encoded) then the raw data will be consumed. request.data and request.json will appear empty in this case.


回答 1

request.stream是WSGI服务器传递给应用程序的原始数据流。读取时不进行任何解析,尽管通常会这样做request.get_data()

data = request.stream.read()

如果该流先前被request.data或其他属性读取,则将为空。

request.stream is the stream of raw data passed to the application by the WSGI server. No parsing is done when reading it, although you usually want request.get_data() instead.

data = request.stream.read()

The stream will be empty if it was previously read by request.data or another attribute.


回答 2

我创建了一个WSGI中间件,用于存储environ['wsgi.input']流中的原始内容。我将值保存在WSGI环境中,因此可以从request.environ['body_copy']我的应用程序中访问它。

在Werkzeug或Flask中这不是必需的,因为request.get_data()无论内容类型如何,都将获取原始数据,但是可以更好地处理HTTP和WSGI行为。

这会将整个主体读入内存,如果发布了一个大文件,这将是一个问题。如果Content-Length缺少标题,它将不会读取任何内容,因此它将无法处理流式请求。

from io import BytesIO

class WSGICopyBody(object):
    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        length = int(environ.get('CONTENT_LENGTH') or 0)
        body = environ['wsgi.input'].read(length)
        environ['body_copy'] = body
        # replace the stream since it was exhausted by read()
        environ['wsgi.input'] = BytesIO(body)
        return self.application(environ, start_response)

app.wsgi_app = WSGICopyBody(app.wsgi_app)
request.environ['body_copy']

I created a WSGI middleware that stores the raw body from the environ['wsgi.input'] stream. I saved the value in the WSGI environ so I could access it from request.environ['body_copy'] within my app.

This isn’t necessary in Werkzeug or Flask, as request.get_data() will get the raw data regardless of content type, but with better handling of HTTP and WSGI behavior.

This reads the entire body into memory, which will be an issue if for example a large file is posted. This won’t read anything if the Content-Length header is missing, so it won’t handle streaming requests.

from io import BytesIO

class WSGICopyBody(object):
    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        length = int(environ.get('CONTENT_LENGTH') or 0)
        body = environ['wsgi.input'].read(length)
        environ['body_copy'] = body
        # replace the stream since it was exhausted by read()
        environ['wsgi.input'] = BytesIO(body)
        return self.application(environ, start_response)

app.wsgi_app = WSGICopyBody(app.wsgi_app)
request.environ['body_copy']

回答 3

request.data如果request.headers["Content-Type"]被识别为表单数据,则将为空,并将其解析为request.form。要获取原始数据而不管内容类型如何,请使用request.get_data()

request.data调用request.get_data(parse_form_data=True),这导致表单数据的行为不同。

request.data will be empty if request.headers["Content-Type"] is recognized as form data, which will be parsed into request.form. To get the raw data regardless of content type, use request.get_data().

request.data calls request.get_data(parse_form_data=True), which results in the different behavior for form data.