How many concurrent requests does a single Flask process receive?

Question: How many concurrent requests does a single Flask process receive?

I’m building an app with Flask, but I don’t know much about WSGI and its HTTP base, Werkzeug. When I start serving a Flask application with gunicorn and 4 worker processes, does this mean that I can handle 4 concurrent requests?

I do mean concurrent requests, and not requests per second or anything else.


Answer 0

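When running the development server, which is what you get by running app.run(), you get a single synchronous process, which means at most 1 request is being processed at a time. (This describes Flask before version 1.0; since 1.0 the development server handles requests in threads by default.)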

By sticking Gunicorn in front of it in its default configuration and simply increasing the number of --workers, what you get is essentially a number of processes (managed by Gunicorn) that each behave like the app.run() development server. 4 workers == 4 concurrent requests. This is because Gunicorn uses its included sync worker type by default.
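
The "4 workers == 4 concurrent requests" rule can be sketched without Gunicorn at all. The following is only a simulation, not Gunicorn's implementation: pool threads stand in for sync worker processes, and the function name simulate_sync_workers and the sleep-based "request" are made up for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def simulate_sync_workers(num_workers, num_requests, request_seconds=0.1):
    """Return the highest number of requests that were in progress at once."""
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def handle_request(_request_id):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(request_seconds)  # stand-in for the work done by a view
        with lock:
            state["active"] -= 1

    # Each pool thread plays the role of one sync worker: it takes a
    # request, finishes it, and only then picks up the next one.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(handle_request, range(num_requests)))
    return state["peak"]


if __name__ == "__main__":
    # 12 requests offered to 4 workers: the peak should not exceed 4.
    print(simulate_sync_workers(num_workers=4, num_requests=12))
```

However many requests are queued, the number in progress at any moment is capped by the number of workers; the rest simply wait their turn.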

It is important to note that Gunicorn also includes asynchronous workers, namely eventlet and gevent (and also tornado, but that seems best used with the Tornado framework). By specifying one of these async workers with the --worker-class flag, what you get is Gunicorn managing a number of async processes, each of which manages its own concurrency. These processes don’t use threads, but instead coroutines. Basically, within each process, still only 1 thing can be happening at a time (1 thread), but a coroutine can be ‘paused’ while it is waiting on external processes to finish (think database queries or waiting on network I/O).
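
The "paused while waiting" behaviour can be illustrated with the standard library's asyncio. This is a stand-in chosen so the example runs without extra packages; gevent and eventlet use a different mechanism (green threads), but the single-thread, cooperative idea is the same. The names serve_requests and run_demo are hypothetical.

```python
import asyncio
import threading


async def serve_requests(num_requests, wait_seconds=0.05):
    """Handle num_requests coroutines on one thread; report what happened."""
    state = {"active": 0, "peak": 0}
    thread_ids = set()

    async def handle_request(request_id):
        thread_ids.add(threading.get_ident())
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        # The coroutine is paused here, so the event loop is free to start
        # other requests while this one waits (stand-in for a database query).
        await asyncio.sleep(wait_seconds)
        state["active"] -= 1
        return request_id

    results = await asyncio.gather(
        *(handle_request(i) for i in range(num_requests))
    )
    return results, state["peak"], len(thread_ids)


def run_demo(num_requests=50):
    """Run the simulation and return (results, peak_in_flight, threads_used)."""
    return asyncio.run(serve_requests(num_requests))


if __name__ == "__main__":
    results, peak, threads_used = run_demo()
    print(f"{len(results)} requests, peak in flight {peak}, threads used {threads_used}")
```

All requests are in flight together even though only one thread ever runs, because each one spends nearly all of its time waiting rather than computing.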

This means, if you’re using one of Gunicorn’s async workers, each worker can handle many more than a single request at a time. Just how many workers are best depends on the nature of your app, its environment, the hardware it runs on, etc. More details can be found on Gunicorn’s design page, and notes on how gevent works are on its intro page.
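
As a rough sketch, the same choices can be written in a Gunicorn configuration file, which is plain Python. The setting names below (bind, workers, worker_class, worker_connections) are Gunicorn settings; the values are illustrative assumptions, not recommendations, and the gevent worker needs the gevent package installed.

```python
# gunicorn.conf.py (values are illustrative; tune them for your own app)

bind = "127.0.0.1:8000"    # address and port to listen on

workers = 4                # number of worker processes

# The default worker class is "sync" (one request per worker at a time).
# "gevent" or "eventlet" select the asynchronous workers described above.
worker_class = "gevent"

# Upper limit on simultaneous clients per async worker.
worker_connections = 1000
```

It would be used with something like gunicorn -c gunicorn.conf.py myapp:app, where myapp:app is a placeholder for your own module and application object.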


Answer 1

Currently there is a far simpler solution than the ones already provided. When running your application you just have to pass along the threaded=True parameter to the app.run() call, like:

app.run(host="your.host", port=4321, threaded=True)
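
To see what one-thread-per-request buys you, here is a sketch that uses only the standard library instead of Flask, on the assumption that threaded=True amounts to handling each request in its own thread. The wsgiref server stands in for the development server, and ThreadingWSGIServer, QuietHandler and peak_concurrency are names invented for this example.

```python
import threading
import time
import urllib.request
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer, make_server


class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
    """Handles each request in its own thread (the threaded=True idea)."""
    daemon_threads = True


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request log lines


def peak_concurrency(server_class, num_requests=4, work_seconds=0.2):
    """Serve num_requests simultaneous requests; return the peak in progress."""
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def app(environ, start_response):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(work_seconds)  # stand-in for slow work in a view
        with lock:
            state["active"] -= 1
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]

    # Port 0 asks the OS for any free port.
    server = make_server("127.0.0.1", 0, app,
                         server_class=server_class, handler_class=QuietHandler)
    url = f"http://127.0.0.1:{server.server_port}/"
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # An empty ProxyHandler bypasses any proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def fetch():
        with opener.open(url, timeout=10) as response:
            response.read()

    clients = [threading.Thread(target=fetch) for _ in range(num_requests)]
    for client in clients:
        client.start()
    for client in clients:
        client.join()
    server.shutdown()
    server.server_close()
    return state["peak"]


if __name__ == "__main__":
    print("single-threaded peak:", peak_concurrency(WSGIServer))
    print("threaded peak:", peak_concurrency(ThreadingWSGIServer))
```

The single-threaded server never has more than one request in progress, while the threaded one overlaps them.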

Another option, according to the Werkzeug docs, is to use the processes parameter, which takes a number greater than 1 indicating the maximum number of concurrent processes used to handle requests:

  • threaded – should the process handle each request in a separate thread?
  • processes – if greater than 1 then handle each request in a new process up to this maximum number of concurrent processes.

Something like:

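app.run(host="your.host", port=4321, threaded=False, processes=3)  # up to 3 processes; threaded=False is needed on Flask 1.0+, where threaded defaults to True and cannot be combined with processes > 1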

See the run() method documentation for more information, as well as the blog post that led me to find the solution and the API references.


Note: the Flask docs for the run() method indicate that using it in a production environment is discouraged because (quote): “While lightweight and easy to use, Flask’s built-in server is not suitable for production as it doesn’t scale well.”

However, they do point to their Deployment Options page for the recommended ways to do this when going for production.


Answer 2

Flask will process one request per thread at the same time. If you have 2 processes with 4 threads each, that’s 8 concurrent requests.
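
The arithmetic is just a product. A tiny sketch, with a hypothetical helper name, assuming every thread handles exactly one request at a time:

```python
def max_concurrent_requests(processes, threads_per_process):
    """Upper bound on requests in flight when each thread serves one request."""
    return processes * threads_per_process


if __name__ == "__main__":
    print(max_concurrent_requests(processes=2, threads_per_process=4))  # 2 * 4 = 8
```

With Gunicorn, a setup like this would correspond to running with --workers 2 --threads 4.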

Flask doesn’t spawn or manage threads or processes. That’s the responsibility of the WSGI gateway (e.g. gunicorn).


Answer 3

No, you can definitely handle more than that.

It’s important to remember that deep down, assuming you are running a single-core machine, the CPU really only runs one instruction* at a time.

Namely, the CPU can only execute a very limited set of instructions, and it can’t execute more than one instruction per clock tick (many instructions even take more than 1 tick).

Therefore, most concurrency we talk about in computer science is software concurrency. In other words, there are layers of software implementation that abstract the bottom level CPU from us and make us think we are running code concurrently.

These “things” can be processes, which are units of code that run concurrently in the sense that each process thinks it’s running in its own world with its own, non-shared memory.

Another example is threads, which are units of code inside processes that allow concurrency as well.
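
The difference between the two can be sketched in a few lines. This is an illustration only; run_threads and child_process_pid are hypothetical helper names, and the child process is simply a second Python interpreter started for the demo.

```python
import os
import subprocess
import sys
import threading


def run_threads(count=4):
    """Start `count` threads that all write into one shared list."""
    shared_log = []  # lives in this process's memory

    def record(name):
        # Every thread appends to the same list and reports the same PID.
        shared_log.append((name, os.getpid()))

    threads = [
        threading.Thread(target=record, args=(f"thread-{i}",))
        for i in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared_log


def child_process_pid():
    """Start a separate interpreter process and return its PID."""
    # The child has its own PID and its own memory, so it cannot see
    # or modify any of the parent's Python objects.
    completed = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getpid())"],
        capture_output=True, text=True, check=True,
    )
    return int(completed.stdout.strip())


if __name__ == "__main__":
    print("threads:", run_threads())
    print("parent pid:", os.getpid(), "child pid:", child_process_pid())
```

Threads see each other's data directly, which is why they need locks around shared state, while processes have to exchange data explicitly.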

The reason your 4 worker processes may be able to handle more than 4 requests is that they can fire off threads to handle more and more requests, provided the server is configured with a threaded worker type.

The actual request limit depends on the HTTP server chosen, I/O, OS, hardware, network connection, etc.

Good luck!

*Instructions are the most basic commands the CPU can run. Examples: adding two numbers, or jumping from one instruction to another.