标签归档:process

确保只运行一个程序实例

问题:确保只运行一个程序实例

有没有一种Python方式可以只运行一个程序实例?

我想出的唯一合理的解决方案是尝试将其作为服务器在某个端口上运行,然后试图绑定到同一端口的第二个程序失败。但这不是一个好主意,也许有比这更轻巧的东西了吗?

(考虑到程序有时可能会失败,例如segfault-因此“锁定文件”之类的东西将无法工作)

Is there a Pythonic way to have only one instance of a program running?

The only reasonable solution I’ve come up with is trying to run it as a server on some port, then second program trying to bind to same port – fails. But it’s not really a great idea, maybe there’s something more lightweight than this?

(Take into consideration that program is expected to fail sometimes, i.e. segfault – so things like “lock file” won’t work)


回答 0

以下代码可以完成此工作,它是跨平台的,并且可以在Python 2.4-3.2上运行。我在Windows,OS X和Linux上进行了测试。

from tendo import singleton
me = singleton.SingleInstance() # will sys.exit(-1) if other instance is running

最新的代码版本位于singleton.py中。请在这里提交错误

您可以使用以下方法之一安装tend:

The following code should do the job, it is cross-platform and runs on Python 2.4-3.2. I tested it on Windows, OS X and Linux.

from tendo import singleton
me = singleton.SingleInstance() # will sys.exit(-1) if other instance is running

The latest code version is available singleton.py. Please file bugs here.

You can install tend using one of the following methods:


回答 1

简单,跨平台的解决方案,在发现了另一个问题Zgoda酒店

import fcntl
import os
import sys

def instance_already_running(label="default"):
    """
    Detect if an an instance with the label is already running, globally
    at the operating system level.

    Using `os.open` ensures that the file pointer won't be closed
    by Python's garbage collector after the function's scope is exited.

    The lock will be released when the program exits, or could be
    released if the file pointer were closed.
    """

    lock_file_pointer = os.open(f"/tmp/instance_{label}.lock", os.O_WRONLY)

    try:
        fcntl.lockf(lock_file_pointer, fcntl.LOCK_EX | fcntl.LOCK_NB)
        already_running = False
    except IOError:
        already_running = True

    return already_running

很像S.Lott的建议,但是带有代码。

Simple, cross-platform solution, found in another question by zgoda:

import fcntl
import os
import sys

def instance_already_running(label="default"):
    """
    Detect if an an instance with the label is already running, globally
    at the operating system level.

    Using `os.open` ensures that the file pointer won't be closed
    by Python's garbage collector after the function's scope is exited.

    The lock will be released when the program exits, or could be
    released if the file pointer were closed.
    """

    lock_file_pointer = os.open(f"/tmp/instance_{label}.lock", os.O_WRONLY)

    try:
        fcntl.lockf(lock_file_pointer, fcntl.LOCK_EX | fcntl.LOCK_NB)
        already_running = False
    except IOError:
        already_running = True

    return already_running

A lot like S.Lott’s suggestion, but with the code.


回答 2

此代码特定于Linux。它使用“抽象” UNIX域套接字,但是它很简单并且不会留下过时的锁定文件。与上面的解决方案相比,我更喜欢它,因为它不需要专门保留的TCP端口。

try:
    import socket
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    ## Create an abstract socket, by prefixing it with null. 
    s.bind( '\0postconnect_gateway_notify_lock') 
except socket.error as e:
    error_code = e.args[0]
    error_string = e.args[1]
    print "Process already running (%d:%s ). Exiting" % ( error_code, error_string) 
    sys.exit (0) 

postconnect_gateway_notify_lock可以更改唯一字符串,以允许需要强制执行单个实例的多个程序。

This code is Linux specific. It uses ‘abstract’ UNIX domain sockets, but it is simple and won’t leave stale lock files around. I prefer it to the solution above because it doesn’t require a specially reserved TCP port.

try:
    import socket
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    ## Create an abstract socket, by prefixing it with null. 
    s.bind( '\0postconnect_gateway_notify_lock') 
except socket.error as e:
    error_code = e.args[0]
    error_string = e.args[1]
    print "Process already running (%d:%s ). Exiting" % ( error_code, error_string) 
    sys.exit (0) 

The unique string postconnect_gateway_notify_lock can be changed to allow multiple programs that need a single instance enforced.


回答 3

我不知道它是否足够的pythonic,但是在Java世界中,在定义的端口上进行侦听是一种使用广泛的解决方案,因为它可以在所有主要平台上使用,并且在崩溃的程序上没有任何问题。

侦听端口的另一个优点是可以将命令发送到正在运行的实例。例如,当用户第二次启动该程序时,您可以向运行中的实例发送命令以告诉它打开另一个窗口(例如Firefox就是这样做的。我不知道他们是否使用TCP端口或命名管道,或者这样的东西,虽然)。

I don’t know if it’s pythonic enough, but in the Java world listening on a defined port is a pretty widely used solution, as it works on all major platforms and doesn’t have any problems with crashing programs.

Another advantage of listening to a port is that you could send a command to the running instance. For example when the users starts the program a second time, you could send the running instance a command to tell it to open another window (that’s what Firefox does, for example. I don’t know if they use TCP ports or named pipes or something like that, ‘though).


回答 4

以前从未编写过python,但这是我刚刚在mycheckpoint中实现的功能,以防止crond将其启动两次或更多次:

import os
import sys
import fcntl
fh=0
def run_once():
    global fh
    fh=open(os.path.realpath(__file__),'r')
    try:
        fcntl.flock(fh,fcntl.LOCK_EX|fcntl.LOCK_NB)
    except:
        os._exit(0)

run_once()

在另一期(http://stackoverflow.com/questions/2959474)发布后,找到了Slava-N的建议。此功能称为函数,它锁定正在执行的脚本文件(不是pid文件)并保持锁定状态,直到脚本结束(正常或错误)。

Never written python before, but this is what I’ve just implemented in mycheckpoint, to prevent it being started twice or more by crond:

import os
import sys
import fcntl
fh=0
def run_once():
    global fh
    fh=open(os.path.realpath(__file__),'r')
    try:
        fcntl.flock(fh,fcntl.LOCK_EX|fcntl.LOCK_NB)
    except:
        os._exit(0)

run_once()

Found Slava-N’s suggestion after posting this in another issue (http://stackoverflow.com/questions/2959474). This one is called as a function, locks the executing scripts file (not a pid file) and maintains the lock until the script ends (normal or error).


回答 5

使用一个pid文件。您有一些已知的位置,“ / path / to / pidfile”,并且在启动时会执行以下操作(部分伪代码,因为我是咖啡前人士,所以不想那么努力):

import os, os.path
pidfilePath = """/path/to/pidfile"""
if os.path.exists(pidfilePath):
   pidfile = open(pidfilePath,"r")
   pidString = pidfile.read()
   if <pidString is equal to os.getpid()>:
      # something is real weird
      Sys.exit(BADCODE)
   else:
      <use ps or pidof to see if the process with pid pidString is still running>
      if  <process with pid == 'pidString' is still running>:
          Sys.exit(ALREADAYRUNNING)
      else:
          # the previous server must have crashed
          <log server had crashed>
          <reopen pidfilePath for writing>
          pidfile.write(os.getpid())
else:
    <open pidfilePath for writing>
    pidfile.write(os.getpid())

因此,换句话说,您正在检查是否存在pidfile。如果不是,请将您的pid写入该文件。如果pidfile存在,则检查pid是否为正在运行的进程的pid;如果是这样,则您有另一个正在运行的实时进程,因此只需关闭。如果不是,则先前的进程崩溃了,因此将其记录下来,然后将您自己的pid写入旧文件中。然后继续。

Use a pid file. You have some known location, “/path/to/pidfile” and at startup you do something like this (partially pseudocode because I’m pre-coffee and don’t want to work all that hard):

import os, os.path
pidfilePath = """/path/to/pidfile"""
if os.path.exists(pidfilePath):
   pidfile = open(pidfilePath,"r")
   pidString = pidfile.read()
   if <pidString is equal to os.getpid()>:
      # something is real weird
      Sys.exit(BADCODE)
   else:
      <use ps or pidof to see if the process with pid pidString is still running>
      if  <process with pid == 'pidString' is still running>:
          Sys.exit(ALREADAYRUNNING)
      else:
          # the previous server must have crashed
          <log server had crashed>
          <reopen pidfilePath for writing>
          pidfile.write(os.getpid())
else:
    <open pidfilePath for writing>
    pidfile.write(os.getpid())

So, in other words, you’re checking if a pidfile exists; if not, write your pid to that file. If the pidfile does exist, then check to see if the pid is the pid of a running process; if so, then you’ve got another live process running, so just shut down. If not, then the previous process crashed, so log it, and then write your own pid to the file in place of the old one. Then continue.


回答 6

您已经在另一个线程中找到了对类似问题的答复,因此为了完整起见,请参见如何在Windows上实现名为Mutex的相同目的。

http://code.activestate.com/recipes/474070/

You already found reply to similar question in another thread, so for completeness sake see how to achieve the same on Windows uning named mutex.

http://code.activestate.com/recipes/474070/


回答 7

这可能有效。

  1. 尝试将PID文件创建到已知位置。如果失败,则有人将文件锁定了,您就完成了。

  2. 正常完成后,请关闭并删除PID文件,以便其他人可以覆盖它。

您可以将程序包装在Shell脚本中,即使程序崩溃,该脚本也会删除PID文件。

如果程序挂起,也可以使用PID文件将其杀死。

This may work.

  1. Attempt create a PID file to a known location. If you fail, someone has the file locked, you’re done.

  2. When you finish normally, close and remove the PID file, so someone else can overwrite it.

You can wrap your program in a shell script that removes the PID file even if your program crashes.

You can, also, use the PID file to kill the program if it hangs.


回答 8

在UNIX上,使用锁定文件是一种非常普遍的方法。如果崩溃,则必须手动清理。您可以将PID存储在文件中,并在启动时检查是否有使用此PID的进程,否则将覆盖锁定文件。(但是,您还需要在read-file-check-pid-rewrite-file周围加锁)。您将在os -package中找到获取和检查pid所需的内容。检查是否存在具有给定pid的进程的常见方法是向其发送非致命信号。

其他替代方法可以将其与羊群或posix信号量结合使用。

如saua所建议的那样,打开网络插座可能是最简单,最方便的方法。

Using a lock-file is a quite common approach on unix. If it crashes, you have to clean up manually. You could stor the PID in the file, and on startup check if there is a process with this PID, overriding the lock-file if not. (However, you also need a lock around the read-file-check-pid-rewrite-file). You will find what you need for getting and checking pid in the os-package. The common way of checking if there exists a process with a given pid, is to send it a non-fatal signal.

Other alternatives could be combining this with flock or posix semaphores.

Opening a network socket, as saua proposed, would probably be the easiest and most portable.


回答 9

对于将wxPython用于其应用程序的任何人,您都可以使用此处记录的功能 wx.SingleInstanceChecker

我个人使用一个子类,wx.App该子类利用wx.SingleInstanceChecker和返回存在的应用程序现有实例,并False从中返回OnInit()

import wx

class SingleApp(wx.App):
    """
    class that extends wx.App and only permits a single running instance.
    """

    def OnInit(self):
        """
        wx.App init function that returns False if the app is already running.
        """
        self.name = "SingleApp-%s".format(wx.GetUserId())
        self.instance = wx.SingleInstanceChecker(self.name)
        if self.instance.IsAnotherRunning():
            wx.MessageBox(
                "An instance of the application is already running", 
                "Error", 
                 wx.OK | wx.ICON_WARNING
            )
            return False
        return True

这是一个简单的直接替换wx.App,禁止多个实例。要使用它,只需在代码中将替换wx.AppSingleApp,如下所示:

app = SingleApp(redirect=False)
frame = wx.Frame(None, wx.ID_ANY, "Hello World")
frame.Show(True)
app.MainLoop()

For anybody using wxPython for their application, you can use the function wx.SingleInstanceChecker documented here.

I personally use a subclass of wx.App which makes use of wx.SingleInstanceChecker and returns False from OnInit() if there is an existing instance of the app already executing like so:

import wx

class SingleApp(wx.App):
    """
    class that extends wx.App and only permits a single running instance.
    """

    def OnInit(self):
        """
        wx.App init function that returns False if the app is already running.
        """
        self.name = "SingleApp-%s".format(wx.GetUserId())
        self.instance = wx.SingleInstanceChecker(self.name)
        if self.instance.IsAnotherRunning():
            wx.MessageBox(
                "An instance of the application is already running", 
                "Error", 
                 wx.OK | wx.ICON_WARNING
            )
            return False
        return True

This is a simple drop-in replacement for wx.App that prohibits multiple instances. To use it simply replace wx.App with SingleApp in your code like so:

app = SingleApp(redirect=False)
frame = wx.Frame(None, wx.ID_ANY, "Hello World")
frame.Show(True)
app.MainLoop()

回答 10

这是我最终只能使用Windows的解决方案。将以下内容放入一个模块,可能称为“ onlyone.py”或其他任何模块。将该模块直接包含在__ main __ python脚本文件中。

import win32event, win32api, winerror, time, sys, os
main_path = os.path.abspath(sys.modules['__main__'].__file__).replace("\\", "/")

first = True
while True:
        mutex = win32event.CreateMutex(None, False, main_path + "_{<paste YOUR GUID HERE>}")
        if win32api.GetLastError() == 0:
            break
        win32api.CloseHandle(mutex)
        if first:
            print "Another instance of %s running, please wait for completion" % main_path
            first = False
        time.sleep(1)

说明

该代码尝试创建一个互斥锁,其名称来自脚本的完整路径。我们使用正斜杠来避免与实际文件系统的潜在混淆。

优点

  • 不需要配置或“魔术”标识符,请根据需要在许多不同的脚本中使用它。
  • 周围没有陈旧的文件,互斥锁将与您一起消亡。
  • 等待时打印有用的消息

Here is my eventual Windows-only solution. Put the following into a module, perhaps called ‘onlyone.py’, or whatever. Include that module directly into your __ main __ python script file.

import win32event, win32api, winerror, time, sys, os
main_path = os.path.abspath(sys.modules['__main__'].__file__).replace("\\", "/")

first = True
while True:
        mutex = win32event.CreateMutex(None, False, main_path + "_{<paste YOUR GUID HERE>}")
        if win32api.GetLastError() == 0:
            break
        win32api.CloseHandle(mutex)
        if first:
            print "Another instance of %s running, please wait for completion" % main_path
            first = False
        time.sleep(1)

Explanation

The code attempts to create a mutex with name derived from the full path to the script. We use forward-slashes to avoid potential confusion with the real file system.

Advantages

  • No configuration or ‘magic’ identifiers needed, use it in as many different scripts as needed.
  • No stale files left around, the mutex dies with you.
  • Prints a helpful message when waiting

回答 11

Windows上对此的最佳解决方案是使用@zgoda建议的互斥锁。

import win32event
import win32api
from winerror import ERROR_ALREADY_EXISTS

mutex = win32event.CreateMutex(None, False, 'name')
last_error = win32api.GetLastError()

if last_error == ERROR_ALREADY_EXISTS:
   print("App instance already running")

有些答案使用fctnl了Windows上不可用的(也包含在@sorin tento软件包中),并且如果您尝试使用像pyinstaller静态导入这样的软件包来冻结python应用程序,则会引发错误。

此外,使用锁定文件方法还会导致read-only数据库文件出现问题(sqlite3)。

The best solution for this on windows is to use mutexes as suggested by @zgoda.

import win32event
import win32api
from winerror import ERROR_ALREADY_EXISTS

mutex = win32event.CreateMutex(None, False, 'name')
last_error = win32api.GetLastError()

if last_error == ERROR_ALREADY_EXISTS:
   print("App instance already running")

Some answers use fctnl (included also in @sorin tendo package) which is not available on windows and should you try to freeze your python app using a package like pyinstaller which does static imports, it throws an error.

Also, using the lock file method, creates a read-only problem with database files( experienced this with sqlite3).


回答 12

我将其发布为答案,因为我是新用户,并且Stack Overflow尚未允许我投票。

Sorin Sbarnea的解决方案可在OS X,Linux和Windows下为我工作,对此我深表感谢。

但是,tempfile.gettempdir()在OS X和Windows下表现为一种方式,而在其他some / many / all(?)* nixes下表现为另一种方式(忽略OS X也是Unix!)。区别对于此代码很重要。

OS X和Windows具有用户特定的临时目录,因此,一个用户创建的临时文件对另一用户不可见。相比之下,在许多版本的* nix(我测试过Ubuntu 9,RHEL 5,OpenSolaris 2008和FreeBSD 8)下,临时目录对于所有用户都是/ tmp。

这意味着在多用户计算机上创建锁文件时,它是在/ tmp中创建的,只有第一次创建锁文件的用户才能运行该应用程序。

一种可能的解决方案是将当前用户名嵌入到锁定文件的名称中。

值得注意的是,OP抢占端口的解决方案在多用户计算机上也会出现异常。

I’m posting this as an answer because I’m a new user and Stack Overflow won’t let me vote yet.

Sorin Sbarnea’s solution works for me under OS X, Linux and Windows, and I am grateful for it.

However, tempfile.gettempdir() behaves one way under OS X and Windows and another under other some/many/all(?) *nixes (ignoring the fact that OS X is also Unix!). The difference is important to this code.

OS X and Windows have user-specific temp directories, so a tempfile created by one user isn’t visible to another user. By contrast, under many versions of *nix (I tested Ubuntu 9, RHEL 5, OpenSolaris 2008 and FreeBSD 8), the temp dir is /tmp for all users.

That means that when the lockfile is created on a multi-user machine, it’s created in /tmp and only the user who creates the lockfile the first time will be able to run the application.

A possible solution is to embed the current username in the name of the lock file.

It’s worth noting that the OP’s solution of grabbing a port will also misbehave on a multi-user machine.


回答 13

single_process在我的gentoo上使用;

pip install single_process

例如

from single_process import single_process

@single_process
def main():
    print 1

if __name__ == "__main__":
    main()   

参考:https : //pypi.python.org/pypi/single_process/1.0

I use single_process on my gentoo;

pip install single_process

example:

from single_process import single_process

@single_process
def main():
    print 1

if __name__ == "__main__":
    main()   

refer: https://pypi.python.org/pypi/single_process/1.0


回答 14

我一直怀疑使用进程组应该有一个不错的POSIXy解决方案,而不必使用文件系统,但是我不太确定。就像是:

启动时,您的进程会向特定组中的所有进程发送“ kill -0”。如果存在任何此类进程,则退出。然后,它加入该组。没有其他进程使用该组。

但是,这是一个竞争条件-多个进程都可以恰好同时执行此操作,并且最终都加入了该团队并同时运行。到您添加某种互斥锁使其具有水密性时,您不再需要过程组。

如果您的过程仅由cron启动,每分钟或每小时启动一次,这可能是可以接受的,但是这让我有些紧张,因为它恰恰在您不希望的那一天出错。

我猜毕竟这不是一个很好的解决方案,除非有人可以改进?

I keep suspecting there ought to be a good POSIXy solution using process groups, without having to hit the file system, but I can’t quite nail it down. Something like:

On startup, your process sends a ‘kill -0’ to all processes in a particular group. If any such processes exist, it exits. Then it joins the group. No other processes use that group.

However, this has a race condition – multiple processes could all do this at precisely the same time and all end up joining the group and running simultaneously. By the time you’ve added some sort of mutex to make it watertight, you no longer need the process groups.

This might be acceptable if your process only gets started by cron, once every minute or every hour, but it makes me a bit nervous that it would go wrong precisely on the day when you don’t want it to.

I guess this isn’t a very good solution after all, unless someone can improve on it?


回答 15

上周,我遇到了这个确切的问题,尽管我确实找到了一些好的解决方案,但我还是决定制作一个非常简单干净的python软件包并将其上传到PyPI。它与tendo的不同之处在于它可以锁定任何字符串资源名称。尽管您当然可以锁定__file__以达到相同的效果。

安装方式: pip install quicklock

使用它非常简单:

[nate@Nates-MacBook-Pro-3 ~/live] python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from quicklock import singleton
>>> # Let's create a lock so that only one instance of a script will run
...
>>> singleton('hello world')
>>>
>>> # Let's try to do that again, this should fail
...
>>> singleton('hello world')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/nate/live/gallery/env/lib/python2.7/site-packages/quicklock/quicklock.py", line 47, in singleton
    raise RuntimeError('Resource <{}> is currently locked by <Process {}: "{}">'.format(resource, other_process.pid, other_process.name()))
RuntimeError: Resource <hello world> is currently locked by <Process 24801: "python">
>>>
>>> # But if we quit this process, we release the lock automatically
...
>>> ^D
[nate@Nates-MacBook-Pro-3 ~/live] python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from quicklock import singleton
>>> singleton('hello world')
>>>
>>> # No exception was thrown, we own 'hello world'!

看一下:https : //pypi.python.org/pypi/quicklock

I ran into this exact problem last week, and although I did find some good solutions, I decided to make a very simple and clean python package and uploaded it to PyPI. It differs from tendo in that it can lock any string resource name. Although you could certainly lock __file__ to achieve the same effect.

Install with: pip install quicklock

Using it is extremely simple:

[nate@Nates-MacBook-Pro-3 ~/live] python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from quicklock import singleton
>>> # Let's create a lock so that only one instance of a script will run
...
>>> singleton('hello world')
>>>
>>> # Let's try to do that again, this should fail
...
>>> singleton('hello world')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/nate/live/gallery/env/lib/python2.7/site-packages/quicklock/quicklock.py", line 47, in singleton
    raise RuntimeError('Resource <{}> is currently locked by <Process {}: "{}">'.format(resource, other_process.pid, other_process.name()))
RuntimeError: Resource <hello world> is currently locked by <Process 24801: "python">
>>>
>>> # But if we quit this process, we release the lock automatically
...
>>> ^D
[nate@Nates-MacBook-Pro-3 ~/live] python
Python 2.7.6 (default, Sep  9 2014, 15:04:36)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from quicklock import singleton
>>> singleton('hello world')
>>>
>>> # No exception was thrown, we own 'hello world'!

Take a look: https://pypi.python.org/pypi/quicklock


回答 16

在罗伯托·罗萨里奥(Roberto Rosario)的回答的基础上,我提出了以下功能:

SOCKET = None
def run_single_instance(uniq_name):
    try:
        import socket
        global SOCKET
        SOCKET = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        ## Create an abstract socket, by prefixing it with null.
        # this relies on a feature only in linux, when current process quits, the
        # socket will be deleted.
        SOCKET.bind('\0' + uniq_name)
        return True
    except socket.error as e:
        return False

我们需要定义全局SOCKET变量,因为只有在整个过程退出时才会对其进行垃圾收集。如果我们在函数中声明局部变量,则在函数退出后它将超出范围,因此将删除套接字。

所有的功劳应该归功于罗伯托·罗萨里奥(Roberto Rosario),因为我只是澄清和阐述了他的代码。而且此代码仅在Linux上有效,如https://troydhanson.github.io/network/Unix_domain_sockets.html中以下引用的文字所述:

Linux具有一项特殊功能:如果UNIX域套接字的路径名以空字节\ 0开头,则其名称不会映射到文件系统中。因此,它不会与文件系统中的其他名称冲突。同样,当服务器在抽象命名空间中关闭其UNIX域侦听套接字时,其文件也将被删除。使用常规的UNIX域套接字,该文件在服务器关闭后仍然存在。

Building upon Roberto Rosario’s answer, I come up with the following function:

SOCKET = None
def run_single_instance(uniq_name):
    try:
        import socket
        global SOCKET
        SOCKET = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        ## Create an abstract socket, by prefixing it with null.
        # this relies on a feature only in linux, when current process quits, the
        # socket will be deleted.
        SOCKET.bind('\0' + uniq_name)
        return True
    except socket.error as e:
        return False

We need to define global SOCKET vaiable since it will only be garbage collected when the whole process quits. If we declare a local variable in the function, it will go out of scope after the function exits, thus the socket be deleted.

All the credit should go to Roberto Rosario, since I only clarify and elaborate upon his code. And this code will work only on Linux, as the following quoted text from https://troydhanson.github.io/network/Unix_domain_sockets.html explains:

Linux has a special feature: if the pathname for a UNIX domain socket begins with a null byte \0, its name is not mapped into the filesystem. Thus it won’t collide with other names in the filesystem. Also, when a server closes its UNIX domain listening socket in the abstract namespace, its file is deleted; with regular UNIX domain sockets, the file persists after the server closes it.


回答 17

linux示例

此方法是基于创建临时文件后关闭应用程序而自动删除的。程序启动后,我们验证文件是否存在;如果文件存在(正在执行中),则程序关闭;否则,它将创建文件并继续执行程序。

from tempfile import *
import time
import os
import sys


f = NamedTemporaryFile( prefix='lock01_', delete=True) if not [f  for f in     os.listdir('/tmp') if f.find('lock01_')!=-1] else sys.exit()

YOUR CODE COMES HERE

linux example

This method is based on the creation of a temporary file automatically deleted after you close the application. the program launch we verify the existence of the file; if the file exists ( there is a pending execution) , the program is closed ; otherwise it creates the file and continues the execution of the program.

from tempfile import *
import time
import os
import sys


f = NamedTemporaryFile( prefix='lock01_', delete=True) if not [f  for f in     os.listdir('/tmp') if f.find('lock01_')!=-1] else sys.exit()

YOUR CODE COMES HERE

回答 18

在Linux系统上,还可以询问 pgrep -a实例数,该脚本位于进程列表中(选项-a显示完整的命令行字符串)。例如

import os
import sys
import subprocess

procOut = subprocess.check_output( "/bin/pgrep -u $UID -a python", shell=True, 
                                   executable="/bin/bash", universal_newlines=True)

if procOut.count( os.path.basename(__file__)) > 1 :        
    sys.exit( ("found another instance of >{}<, quitting."
              ).format( os.path.basename(__file__)))

-u $UID如果限制适用于所有用户,请删除。免责声明:a)假定脚本的(基本)名称是唯一的,b)可能存在竞争条件。

On a Linux system one could also ask pgrep -a for the number of instances, the script is found in the process list (option -a reveals the full command line string). E.g.

import os
import sys
import subprocess

procOut = subprocess.check_output( "/bin/pgrep -u $UID -a python", shell=True, 
                                   executable="/bin/bash", universal_newlines=True)

if procOut.count( os.path.basename(__file__)) > 1 :        
    sys.exit( ("found another instance of >{}<, quitting."
              ).format( os.path.basename(__file__)))

Remove -u $UID if the restriction should apply to all users. Disclaimer: a) it is assumed that the script’s (base)name is unique, b) there might be race conditions.


回答 19

import sys,os

# start program
try:  # (1)
    os.unlink('lock')  # (2)
    fd=os.open("lock", os.O_CREAT|os.O_EXCL) # (3)  
except: 
    try: fd=os.open("lock", os.O_CREAT|os.O_EXCL) # (4) 
    except:  
        print "Another Program running !.."  # (5)
        sys.exit()  

# your program  ...
# ...

# exit program
try: os.close(fd)  # (6)
except: pass
try: os.unlink('lock')  
except: pass
sys.exit()  
import sys,os

# start program
try:  # (1)
    os.unlink('lock')  # (2)
    fd=os.open("lock", os.O_CREAT|os.O_EXCL) # (3)  
except: 
    try: fd=os.open("lock", os.O_CREAT|os.O_EXCL) # (4) 
    except:  
        print "Another Program running !.."  # (5)
        sys.exit()  

# your program  ...
# ...

# exit program
try: os.close(fd)  # (6)
except: pass
try: os.unlink('lock')  
except: pass
sys.exit()  

如何检查Python中是否存在具有给定pid的进程?

问题:如何检查Python中是否存在具有给定pid的进程?

有没有办法检查pid是否对应于有效进程?我从其他来源获取pid os.getpid(),我需要检查计算机上是否不存在具有该pid的进程。

我需要它在Unix和Windows中可用。我也在检查PID是否未使用。

Is there a way to check to see if a pid corresponds to a valid process? I’m getting a pid from a different source other than from os.getpid() and I need to check to see if a process with that pid doesn’t exist on the machine.

I need it to be available in Unix and Windows. I’m also checking to see if the PID is NOT in use.


回答 0

如果pid没有运行,则向pid发送信号0将引发OSError异常,否则不执行任何操作。

import os

def check_pid(pid):        
    """ Check For the existence of a unix pid. """
    try:
        os.kill(pid, 0)
    except OSError:
        return False
    else:
        return True

Sending signal 0 to a pid will raise an OSError exception if the pid is not running, and do nothing otherwise.

import os

def check_pid(pid):        
    """ Check For the existence of a unix pid. """
    try:
        os.kill(pid, 0)
    except OSError:
        return False
    else:
        return True

回答 1

看一下psutil模块:

psutil(Python系统和进程实用程序)是一个跨平台的库,用于检索有关Python中正在运行的进程系统利用率(CPU,内存,磁盘,网络)的信息。[…]它目前支持32位64位体系结构的LinuxWindowsOSXFreeBSDSun Solaris,Python版本从2.6到3.4(Python 2.4和2.5的用户可以使用2.1.3版本) 。PyPy也可以工作。

它具有一个称为的函数pid_exists(),可用于检查具有给定pid的进程是否存在。

这是一个例子:

import psutil
pid = 12345
if psutil.pid_exists(pid):
    print("a process with pid %d exists" % pid)
else:
    print("a process with pid %d does not exist" % pid)

以供参考:

Have a look at the psutil module:

psutil (python system and process utilities) is a cross-platform library for retrieving information on running processes and system utilization (CPU, memory, disks, network) in Python. […] It currently supports Linux, Windows, OSX, FreeBSD and Sun Solaris, both 32-bit and 64-bit architectures, with Python versions from 2.6 to 3.4 (users of Python 2.4 and 2.5 may use 2.1.3 version). PyPy is also known to work.

It has a function called pid_exists() that you can use to check whether a process with the given pid exists.

Here’s an example:

import psutil
pid = 12345
if psutil.pid_exists(pid):
    print("a process with pid %d exists" % pid)
else:
    print("a process with pid %d does not exist" % pid)

For reference:


回答 2

mluebke代码不是100%正确;kill()还会引发EPERM(访问被拒绝),在这种情况下,这显然意味着存在进程。这应该工作:

(根据Jason R. Coombs评论编辑)

import errno
import os

def pid_exists(pid):
    """Check whether pid exists in the current process table.
    UNIX only.
    """
    if pid < 0:
        return False
    if pid == 0:
        # According to "man 2 kill" PID 0 refers to every process
        # in the process group of the calling process.
        # On certain systems 0 is a valid PID but we have no way
        # to know that in a portable fashion.
        raise ValueError('invalid PID 0')
    try:
        os.kill(pid, 0)
    except OSError as err:
        if err.errno == errno.ESRCH:
            # ESRCH == No such process
            return False
        elif err.errno == errno.EPERM:
            # EPERM clearly means there's a process to deny access to
            return True
        else:
            # According to "man 2 kill" possible error values are
            # (EINVAL, EPERM, ESRCH)
            raise
    else:
        return True

除非您使用pywin32,ctypes或C扩展模块,否则无法在Windows上执行此操作。如果可以从外部库中获取依赖,则可以使用psutil

>>> import psutil
>>> psutil.pid_exists(2353)
True

mluebke code is not 100% correct; kill() can also raise EPERM (access denied) in which case that obviously means a process exists. This is supposed to work:

(edited as per Jason R. Coombs comments)

import errno
import os

def pid_exists(pid):
    """Check whether pid exists in the current process table.
    UNIX only.
    """
    if pid < 0:
        return False
    if pid == 0:
        # According to "man 2 kill" PID 0 refers to every process
        # in the process group of the calling process.
        # On certain systems 0 is a valid PID but we have no way
        # to know that in a portable fashion.
        raise ValueError('invalid PID 0')
    try:
        os.kill(pid, 0)
    except OSError as err:
        if err.errno == errno.ESRCH:
            # ESRCH == No such process
            return False
        elif err.errno == errno.EPERM:
            # EPERM clearly means there's a process to deny access to
            return True
        else:
            # According to "man 2 kill" possible error values are
            # (EINVAL, EPERM, ESRCH)
            raise
    else:
        return True

You can’t do this on Windows unless you use pywin32, ctypes or a C extension module. If you’re OK with depending from an external lib you can use psutil:

>>> import psutil
>>> psutil.pid_exists(2353)
True

回答 3

仅当所讨论的进程归运行测试的用户所有时,涉及向进程发送“信号0”的答案才有效。否则,即使pid存在于系统中,您也将OSError由于权限而获得权限

为了绕过此限制,您可以检查是否/proc/<pid>存在:

import os

def is_running(pid):
    if os.path.isdir('/proc/{}'.format(pid)):
        return True
    return False

显然,这仅适用于基于Linux的系统。

The answers involving sending ‘signal 0’ to the process will work only if the process in question is owned by the user running the test. Otherwise you will get an OSError due to permissions, even if the pid exists in the system.

In order to bypass this limitation you can check if /proc/<pid> exists:

import os

def is_running(pid):
    if os.path.isdir('/proc/{}'.format(pid)):
        return True
    return False

This applies to linux based systems only, obviously.


回答 4

在Python 3.3+中,可以使用异常名称代替errno常量。Posix版本

import os

def pid_exists(pid): 
    if pid < 0: return False #NOTE: pid == 0 returns True
    try:
        os.kill(pid, 0) 
    except ProcessLookupError: # errno.ESRCH
        return False # No such process
    except PermissionError: # errno.EPERM
        return True # Operation not permitted (i.e., process exists)
    else:
        return True # no error, we can send a signal to the process

In Python 3.3+, you could use exception names instead of errno constants. Posix version:

import os

def pid_exists(pid): 
    if pid < 0: return False #NOTE: pid == 0 returns True
    try:
        os.kill(pid, 0) 
    except ProcessLookupError: # errno.ESRCH
        return False # No such process
    except PermissionError: # errno.EPERM
        return True # Operation not permitted (i.e., process exists)
    else:
        return True # no error, we can send a signal to the process

回答 5

在此处查找特定于Windows的方法,以获取运行进程及其ID的完整列表。就像

from win32com.client import GetObject
def get_proclist():
    WMI = GetObject('winmgmts:')
    processes = WMI.InstancesOf('Win32_Process')
    return [process.Properties_('ProcessID').Value for process in processes]

然后,您可以根据此列表验证pid。我对性能成本一无所知,因此,如果您要经常进行pid验证,则最好检查一下。

对于* NIx,只需使用mluebke的溶液即可。

Look here for windows-specific way of getting full list of running processes with their IDs. It would be something like

from win32com.client import GetObject
def get_proclist():
    WMI = GetObject('winmgmts:')
    processes = WMI.InstancesOf('Win32_Process')
    return [process.Properties_('ProcessID').Value for process in processes]

You can then verify pid you get against this list. I have no idea about performance cost, so you’d better check this if you’re going to do pid verification often.

For *NIx, just use mluebke’s solution.


回答 6

在ntrrgc的基础上,我增强了Windows版本,因此它检查进程退出代码并检查权限:

def pid_exists(pid):
    """Check whether pid exists in the current process table."""
    if os.name == 'posix':
        import errno
        if pid < 0:
            return False
        try:
            os.kill(pid, 0)
        except OSError as e:
            return e.errno == errno.EPERM
        else:
            return True
    else:
        import ctypes
        kernel32 = ctypes.windll.kernel32
        HANDLE = ctypes.c_void_p
        DWORD = ctypes.c_ulong
        LPDWORD = ctypes.POINTER(DWORD)
        class ExitCodeProcess(ctypes.Structure):
            _fields_ = [ ('hProcess', HANDLE),
                ('lpExitCode', LPDWORD)]

        SYNCHRONIZE = 0x100000
        process = kernel32.OpenProcess(SYNCHRONIZE, 0, pid)
        if not process:
            return False

        ec = ExitCodeProcess()
        out = kernel32.GetExitCodeProcess(process, ctypes.byref(ec))
        if not out:
            err = kernel32.GetLastError()
            if kernel32.GetLastError() == 5:
                # Access is denied.
                logging.warning("Access is denied to get pid info.")
            kernel32.CloseHandle(process)
            return False
        elif bool(ec.lpExitCode):
            # print ec.lpExitCode.contents
            # There is an exist code, it quit
            kernel32.CloseHandle(process)
            return False
        # No exit code, it's running.
        kernel32.CloseHandle(process)
        return True

Building upon ntrrgc’s I’ve beefed up the windows version so it checks the process exit code and checks for permissions:

def pid_exists(pid):
    """Check whether pid exists in the current process table."""
    if os.name == 'posix':
        import errno
        if pid < 0:
            return False
        try:
            os.kill(pid, 0)
        except OSError as e:
            return e.errno == errno.EPERM
        else:
            return True
    else:
        import ctypes
        kernel32 = ctypes.windll.kernel32
        HANDLE = ctypes.c_void_p
        DWORD = ctypes.c_ulong
        LPDWORD = ctypes.POINTER(DWORD)
        class ExitCodeProcess(ctypes.Structure):
            _fields_ = [ ('hProcess', HANDLE),
                ('lpExitCode', LPDWORD)]

        SYNCHRONIZE = 0x100000
        process = kernel32.OpenProcess(SYNCHRONIZE, 0, pid)
        if not process:
            return False

        ec = ExitCodeProcess()
        out = kernel32.GetExitCodeProcess(process, ctypes.byref(ec))
        if not out:
            err = kernel32.GetLastError()
            if kernel32.GetLastError() == 5:
                # Access is denied.
                logging.warning("Access is denied to get pid info.")
            kernel32.CloseHandle(process)
            return False
        elif bool(ec.lpExitCode):
            # print ec.lpExitCode.contents
            # There is an exist code, it quit
            kernel32.CloseHandle(process)
            return False
        # No exit code, it's running.
        kernel32.CloseHandle(process)
        return True

回答 7

结合GiampaoloRodolà对POSIX的回答我对Windows 的回答,我得到了:

import os
if os.name == 'posix':
    def pid_exists(pid):
        """Check whether pid exists in the current process table."""
        import errno
        if pid < 0:
            return False
        try:
            os.kill(pid, 0)
        except OSError as e:
            return e.errno == errno.EPERM
        else:
            return True
else:
    def pid_exists(pid):
        import ctypes
        kernel32 = ctypes.windll.kernel32
        SYNCHRONIZE = 0x100000

        process = kernel32.OpenProcess(SYNCHRONIZE, 0, pid)
        if process != 0:
            kernel32.CloseHandle(process)
            return True
        else:
            return False

Combining Giampaolo Rodolà’s answer for POSIX and mine for Windows I got this:

import os
if os.name == 'posix':
    def pid_exists(pid):
        """Check whether pid exists in the current process table."""
        import errno
        if pid < 0:
            return False
        try:
            os.kill(pid, 0)
        except OSError as e:
            return e.errno == errno.EPERM
        else:
            return True
else:
    def pid_exists(pid):
        import ctypes
        kernel32 = ctypes.windll.kernel32
        SYNCHRONIZE = 0x100000

        process = kernel32.OpenProcess(SYNCHRONIZE, 0, pid)
        if process != 0:
            kernel32.CloseHandle(process)
            return True
        else:
            return False

回答 8

在Windows中,您可以通过以下方式进行操作:

import ctypes
PROCESS_QUERY_INFROMATION = 0x1000
def checkPid(pid):
    processHandle = ctypes.windll.kernel32.OpenProcess(PROCESS_QUERY_INFROMATION, 0,pid)
    if processHandle == 0:
        return False
    else:
        ctypes.windll.kernel32.CloseHandle(processHandle)
    return True

首先,在此代码中,您尝试获取给定pid的进程的句柄。如果句柄有效,则关闭该句柄进行处理并返回True;否则,返回true。否则,您返回False。OpenProcess文档:https : //msdn.microsoft.com/zh-cn/library/windows/desktop/ms684320%28v=vs.85%29.aspx

In Windows, you can do it in this way:

import ctypes
PROCESS_QUERY_INFROMATION = 0x1000
def checkPid(pid):
    processHandle = ctypes.windll.kernel32.OpenProcess(PROCESS_QUERY_INFROMATION, 0,pid)
    if processHandle == 0:
        return False
    else:
        ctypes.windll.kernel32.CloseHandle(processHandle)
    return True

First of all, in this code you try to get a handle for process with pid given. If the handle is valid, then close the handle for process and return True; otherwise, you return False. Documentation for OpenProcess: https://msdn.microsoft.com/en-us/library/windows/desktop/ms684320%28v=vs.85%29.aspx


回答 9

例如,如果您要检查banshee是否正在运行,该功能将在Linux上运行…(banshee是音乐播放器)

import subprocess

def running_process(process):
    "check if process is running. < process > is the name of the process."

    proc = subprocess.Popen(["if pgrep " + process + " >/dev/null 2>&1; then echo 'True'; else echo 'False'; fi"], stdout=subprocess.PIPE, shell=True)

    (Process_Existance, err) = proc.communicate()
    return Process_Existance

# use the function
print running_process("banshee")

This will work for Linux, for example if you want to check if banshee is running… (banshee is a music player)

import subprocess

def running_process(process):
    "check if process is running. < process > is the name of the process."

    proc = subprocess.Popen(["if pgrep " + process + " >/dev/null 2>&1; then echo 'True'; else echo 'False'; fi"], stdout=subprocess.PIPE, shell=True)

    (Process_Existance, err) = proc.communicate()
    return Process_Existance

# use the function
print running_process("banshee")

回答 10

我想将PID用于获得它的任何目的并优雅地处理错误。否则,这将是经典竞赛(当您检查PID有效时,PID可能有效,但稍后消失)

I’d say use the PID for whatever purpose you’re obtaining it and handle the errors gracefully. Otherwise, it’s a classic race (the PID may be valid when you check it’s valid, but go away an instant later)


检查python脚本是否正在运行

问题:检查python脚本是否正在运行

我有一个作为后台程序运行的python守护程序/如何快速查看(使用python)我的守护程序是否正在运行,如果没有运行,请启动它?

我想通过这种方式来修复守护程序的任何崩溃,因此该脚本不必手动运行,它将在调用后立即自动运行,然后保持运行状态。

如何检查(使用python)脚本是否正在运行?

I have a python daemon running as a part of my web app/ How can I quickly check (using python) if my daemon is running and, if not, launch it?

I want to do it that way to fix any crashes of the daemon, and so the script does not have to be run manually, it will automatically run as soon as it is called and then stay running.

How can i check (using python) if my script is running?


回答 0

将pidfile放在某个地方(例如/ tmp)。然后,您可以通过检查文件中的PID是否存在来检查进程是否正在运行。干净关闭文件时,不要忘记删除文件,启动时请检查文件。

#/usr/bin/env python

import os
import sys

pid = str(os.getpid())
pidfile = "/tmp/mydaemon.pid"

if os.path.isfile(pidfile):
    print "%s already exists, exiting" % pidfile
    sys.exit()
file(pidfile, 'w').write(pid)
try:
    # Do some actual work here
finally:
    os.unlink(pidfile)

然后,您可以通过检查/tmp/mydaemon.pid的内容是否为现有进程来查看该进程是否正在运行。Monit(如上所述)可以为您完成此操作,或者您可以编写一个简单的Shell脚本使用ps的返回代码为您检查它。

ps up `cat /tmp/mydaemon.pid ` >/dev/null && echo "Running" || echo "Not running"

为了获得更多的荣誉,您可以使用atexit模块来确保您的程序在任何情况下(被杀死,引发异常等)都清理其pidfile。

Drop a pidfile somewhere (e.g. /tmp). Then you can check to see if the process is running by checking to see if the PID in the file exists. Don’t forget to delete the file when you shut down cleanly, and check for it when you start up.

#/usr/bin/env python

import os
import sys

pid = str(os.getpid())
pidfile = "/tmp/mydaemon.pid"

if os.path.isfile(pidfile):
    print "%s already exists, exiting" % pidfile
    sys.exit()
file(pidfile, 'w').write(pid)
try:
    # Do some actual work here
finally:
    os.unlink(pidfile)

Then you can check to see if the process is running by checking to see if the contents of /tmp/mydaemon.pid are an existing process. Monit (mentioned above) can do this for you, or you can write a simple shell script to check it for you using the return code from ps.

ps up `cat /tmp/mydaemon.pid ` >/dev/null && echo "Running" || echo "Not running"

For extra credit, you can use the atexit module to ensure that your program cleans up its pidfile under any circumstances (when killed, exceptions raised, etc.).


回答 1

在Linux系统上很方便的一种技术是使用域套接字:

import socket
import sys
import time

def get_lock(process_name):
    # Without holding a reference to our socket somewhere it gets garbage
    # collected when the function exits
    get_lock._lock_socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)

    try:
        # The null byte (\0) means the the socket is created 
        # in the abstract namespace instead of being created 
        # on the file system itself.
        # Works only in Linux
        get_lock._lock_socket.bind('\0' + process_name)
        print 'I got the lock'
    except socket.error:
        print 'lock exists'
        sys.exit()


get_lock('running_test')
while True:
    time.sleep(3)

它是原子的,避免了如果向您的进程发送SIGKILL时锁定文件到处乱码的问题

您可以在文档中阅读有关socket.close垃圾回收时套接字自动关闭的信息。

A technique that is handy on a Linux system is using domain sockets:

import socket
import sys
import time

def get_lock(process_name):
    # Without holding a reference to our socket somewhere it gets garbage
    # collected when the function exits
    get_lock._lock_socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)

    try:
        # The null byte (\0) means the socket is created 
        # in the abstract namespace instead of being created 
        # on the file system itself.
        # Works only in Linux
        get_lock._lock_socket.bind('\0' + process_name)
        print 'I got the lock'
    except socket.error:
        print 'lock exists'
        sys.exit()


get_lock('running_test')
while True:
    time.sleep(3)

It is atomic and avoids the problem of having lock files lying around if your process gets sent a SIGKILL

You can read in the documentation for socket.close that sockets are automatically closed when garbage collected.


回答 2

PID库可以做的正是这一点。

from pid import PidFile

with PidFile():
  do_something()

它还会自动处理pidfile存在但进程未运行的情况。

The pid library can do exactly this.

from pid import PidFile

with PidFile():
  do_something()

It will also automatically handle the case where the pidfile exists but the process is not running.


回答 3

当然,来自Dan的示例将无法正常工作。

实际上,如果脚本崩溃,出现异常或不清除pid文件,则脚本将运行多次。

我建议其他网站提供以下内容:

这是检查是否已经存在锁定文件

\#/usr/bin/env python
import os
import sys
if os.access(os.path.expanduser("~/.lockfile.vestibular.lock"), os.F_OK):
        #if the lockfile is already there then check the PID number
        #in the lock file
        pidfile = open(os.path.expanduser("~/.lockfile.vestibular.lock"), "r")
        pidfile.seek(0)
        old_pid = pidfile.readline()
        # Now we check the PID from lock file matches to the current
        # process PID
        if os.path.exists("/proc/%s" % old_pid):
                print "You already have an instance of the program running"
                print "It is running as process %s," % old_pid
                sys.exit(1)
        else:
                print "File is there but the program is not running"
                print "Removing lock file for the: %s as it can be there because of the program last time it was run" % old_pid
                os.remove(os.path.expanduser("~/.lockfile.vestibular.lock"))

这是代码的一部分,我们在其中将PID文件放置在锁定文件中

pidfile = open(os.path.expanduser("~/.lockfile.vestibular.lock"), "w")
pidfile.write("%s" % os.getpid())
pidfile.close()

此代码将检查pid与现有运行进程相比的值,从而避免重复执行。

希望对您有所帮助。

Of course the example from Dan will not work as it should be.

Indeed, if the script crash, rise an exception, or does not clean pid file, the script will be run multiple times.

I suggest the following based from another website:

This is to check if there is already a lock file existing

\#/usr/bin/env python
import os
import sys
if os.access(os.path.expanduser("~/.lockfile.vestibular.lock"), os.F_OK):
        #if the lockfile is already there then check the PID number
        #in the lock file
        pidfile = open(os.path.expanduser("~/.lockfile.vestibular.lock"), "r")
        pidfile.seek(0)
        old_pid = pidfile.readline()
        # Now we check the PID from lock file matches to the current
        # process PID
        if os.path.exists("/proc/%s" % old_pid):
                print "You already have an instance of the program running"
                print "It is running as process %s," % old_pid
                sys.exit(1)
        else:
                print "File is there but the program is not running"
                print "Removing lock file for the: %s as it can be there because of the program last time it was run" % old_pid
                os.remove(os.path.expanduser("~/.lockfile.vestibular.lock"))

This is part of code where we put a PID file in the lock file

pidfile = open(os.path.expanduser("~/.lockfile.vestibular.lock"), "w")
pidfile.write("%s" % os.getpid())
pidfile.close()

This code will check the value of pid compared to existing running process., avoiding double execution.

I hope it will help.


回答 4

有很好的软件包可用于在UNIX上重新启动进程。monit是一本有关构建和配置它的很棒的教程。通过一些调整,您可以拥有可靠的可靠技术来保持您的守护程序。

There are very good packages for restarting processes on UNIX. One that has a great tutorial about building and configuring it is monit. With some tweaking you can have a rock solid proven technology keeping up your daemon.


回答 5

我的解决方案是检查在Windows和ubuntu linux上测试过的进程和命令行参数

import psutil
import os

def is_running(script):
    for q in psutil.process_iter():
        if q.name().startswith('python'):
            if len(q.cmdline())>1 and script in q.cmdline()[1] and q.pid !=os.getpid():
                print("'{}' Process is already running".format(script))
                return True

    return False


if not is_running("test.py"):
    n = input("What is Your Name? ")
    print ("Hello " + n)

My solution is to check for the process and command line arguments Tested on windows and ubuntu linux

import psutil
import os

def is_running(script):
    for q in psutil.process_iter():
        if q.name().startswith('python'):
            if len(q.cmdline())>1 and script in q.cmdline()[1] and q.pid !=os.getpid():
                print("'{}' Process is already running".format(script))
                return True

    return False


if not is_running("test.py"):
    n = input("What is Your Name? ")
    print ("Hello " + n)

回答 6

有很多选择。一种方法是使用系统调用或为您执行此类调用的python库。另一种方法是产生一个类似以下的过程:

ps ax | grep processName

并解析输出。许多人选择这种方法,在我看来,这不一定是不好的方法。

There are a myriad of options. One method is using system calls or python libraries that perform such calls for you. The other is simply to spawn out a process like:

ps ax | grep processName

and parse the output. Many people choose this approach, it isn’t necessarily a bad approach in my view.


回答 7

遇到了这个老问题,自己寻找解决方案。

使用psutil

import psutil
import sys
from subprocess import Popen

for process in psutil.process_iter():
    if process.cmdline() == ['python', 'your_script.py']:
        sys.exit('Process found: exiting.')

print('Process not found: starting it.')
Popen(['python', 'your_script.py'])

Came across this old question looking for solution myself.

Use psutil:

import psutil
import sys
from subprocess import Popen

for process in psutil.process_iter():
    if process.cmdline() == ['python', 'your_script.py']:
        sys.exit('Process found: exiting.')

print('Process not found: starting it.')
Popen(['python', 'your_script.py'])

回答 8

我非常喜欢管理守护程序的Supervisor。它是用Python编写的,因此有很多示例说明了如何与Python交互或从中扩展。为了您的目的,XML-RPC流程控制API应该可以正常工作。

I’m a big fan of Supervisor for managing daemons. It’s written in Python, so there are plenty of examples of how to interact with or extend it from Python. For your purposes the XML-RPC process control API should work nicely.


回答 9

试试这个其他版本

def checkPidRunning(pid):        
    '''Check For the existence of a unix pid.
    '''
    try:
        os.kill(pid, 0)
    except OSError:
        return False
    else:
        return True

# Entry point
if __name__ == '__main__':
    pid = str(os.getpid())
    pidfile = os.path.join("/", "tmp", __program__+".pid")

    if os.path.isfile(pidfile) and checkPidRunning(int(file(pidfile,'r').readlines()[0])):
            print "%s already exists, exiting" % pidfile
            sys.exit()
    else:
        file(pidfile, 'w').write(pid)

    # Do some actual work here
    main()

    os.unlink(pidfile)

Try this other version

def checkPidRunning(pid):        
    '''Check For the existence of a unix pid.
    '''
    try:
        os.kill(pid, 0)
    except OSError:
        return False
    else:
        return True

# Entry point
if __name__ == '__main__':
    pid = str(os.getpid())
    pidfile = os.path.join("/", "tmp", __program__+".pid")

    if os.path.isfile(pidfile) and checkPidRunning(int(file(pidfile,'r').readlines()[0])):
            print "%s already exists, exiting" % pidfile
            sys.exit()
    else:
        file(pidfile, 'w').write(pid)

    # Do some actual work here
    main()

    os.unlink(pidfile)

回答 10

与其开发自己的PID文件解决方案(它比您想象的要具有更多的细微之处和极端情况),不如看监督 -这是一个过程控制系统,可以轻松地将作业控制和守护程序行为包装在现有的Python周围脚本。

Rather than developing your own PID file solution (which has more subtleties and corner cases than you might think), have a look at supervisord — this is a process control system that makes it easy to wrap job control and daemon behaviors around an existing Python script.


回答 11

其他答案对于诸如cron作业之类的事情非常有用,但是如果您正在运行守护程序,则应使用daemontools之类的工具对其进行监视。

The other answers are great for things like cron jobs, but if you’re running a daemon you should monitor it with something like daemontools.


回答 12

ps ax | grep processName

如果您在pycharm中的调试脚本始终退出

pydevd.py --multiproc --client 127.0.0.1 --port 33882 --file processName
ps ax | grep processName

if yor debug script in pycharm always exit

pydevd.py --multiproc --client 127.0.0.1 --port 33882 --file processName

回答 13

试试这个:

#/usr/bin/env python
import os, sys, atexit

try:
    # Set PID file
    def set_pid_file():
        pid = str(os.getpid())
        f = open('myCode.pid', 'w')
        f.write(pid)
        f.close()

    def goodby():
        pid = str('myCode.pid')
        os.remove(pid)

    atexit.register(goodby)
    set_pid_file()
    # Place your code here

except KeyboardInterrupt:
    sys.exit(0)

try this:

#/usr/bin/env python
import os, sys, atexit

try:
    # Set PID file
    def set_pid_file():
        pid = str(os.getpid())
        f = open('myCode.pid', 'w')
        f.write(pid)
        f.close()

    def goodby():
        pid = str('myCode.pid')
        os.remove(pid)

    atexit.register(goodby)
    set_pid_file()
    # Place your code here

except KeyboardInterrupt:
    sys.exit(0)

回答 14

这是更有用的代码(检查python是否完全执行了脚本):

#! /usr/bin/env python

import os
from sys import exit


def checkPidRunning(pid):
    global script_name
    if pid<1:
        print "Incorrect pid number!"
        exit()
    try:
        os.kill(pid, 0)
    except OSError:
        print "Abnormal termination of previous process."
        return False
    else:
        ps_command = "ps -o command= %s | grep -Eq 'python .*/%s'" % (pid,script_name)
        process_exist = os.system(ps_command)
        if process_exist == 0:
            return True
        else:
            print "Process with pid %s is not a Python process. Continue..." % pid
            return False


if __name__ == '__main__':
    script_name = os.path.basename(__file__)
    pid = str(os.getpid())
    pidfile = os.path.join("/", "tmp/", script_name+".pid")
    if os.path.isfile(pidfile):
        print "Warning! Pid file %s existing. Checking for process..." % pidfile
        r_pid = int(file(pidfile,'r').readlines()[0])
        if checkPidRunning(r_pid):
            print "Python process with pid = %s is already running. Exit!" % r_pid
            exit()
        else:
            file(pidfile, 'w').write(pid)
    else:
        file(pidfile, 'w').write(pid)

# main programm
....
....

os.unlink(pidfile)

这是字符串:

ps_command = "ps -o command= %s | grep -Eq 'python .*/%s'" % (pid,script_name)

如果“ grep”成功,并且当前正在使用脚本名称作为参数运行“ python”进程,则返回0。

Here is more useful code (with checking if exactly python executes the script):

#! /usr/bin/env python

import os
from sys import exit


def checkPidRunning(pid):
    global script_name
    if pid<1:
        print "Incorrect pid number!"
        exit()
    try:
        os.kill(pid, 0)
    except OSError:
        print "Abnormal termination of previous process."
        return False
    else:
        ps_command = "ps -o command= %s | grep -Eq 'python .*/%s'" % (pid,script_name)
        process_exist = os.system(ps_command)
        if process_exist == 0:
            return True
        else:
            print "Process with pid %s is not a Python process. Continue..." % pid
            return False


if __name__ == '__main__':
    script_name = os.path.basename(__file__)
    pid = str(os.getpid())
    pidfile = os.path.join("/", "tmp/", script_name+".pid")
    if os.path.isfile(pidfile):
        print "Warning! Pid file %s existing. Checking for process..." % pidfile
        r_pid = int(file(pidfile,'r').readlines()[0])
        if checkPidRunning(r_pid):
            print "Python process with pid = %s is already running. Exit!" % r_pid
            exit()
        else:
            file(pidfile, 'w').write(pid)
    else:
        file(pidfile, 'w').write(pid)

# main programm
....
....

os.unlink(pidfile)

Here is string:

ps_command = "ps -o command= %s | grep -Eq 'python .*/%s'" % (pid,script_name)

returns 0 if “grep” is successful, and the process “python” is currently running with the name of your script as a parameter .


回答 15

一个简单的示例,如果您仅查找进程名称,则不存在:

import os

def pname_exists(inp):
    os.system('ps -ef > /tmp/psef')
    lines=open('/tmp/psef', 'r').read().split('\n')
    res=[i for i in lines if inp in i]
    return True if res else False

Result:
In [21]: pname_exists('syslog')
Out[21]: True

In [22]: pname_exists('syslog_')
Out[22]: False

A simple example if you only are looking for a process name exist or not:

import os

def pname_exists(inp):
    os.system('ps -ef > /tmp/psef')
    lines=open('/tmp/psef', 'r').read().split('\n')
    res=[i for i in lines if inp in i]
    return True if res else False

Result:
In [21]: pname_exists('syslog')
Out[21]: True

In [22]: pname_exists('syslog_')
Out[22]: False

回答 16

请考虑以下示例来解决您的问题:

#!/usr/bin/python
# -*- coding: latin-1 -*-

import os, sys, time, signal

def termination_handler (signum,frame):
    global running
    global pidfile
    print 'You have requested to terminate the application...'
    sys.stdout.flush()
    running = 0
    os.unlink(pidfile)

running = 1
signal.signal(signal.SIGINT,termination_handler)

pid = str(os.getpid())
pidfile = '/tmp/'+os.path.basename(__file__).split('.')[0]+'.pid'

if os.path.isfile(pidfile):
    print "%s already exists, exiting" % pidfile
    sys.exit()
else:
    file(pidfile, 'w').write(pid)

# Do some actual work here

while running:
  time.sleep(10)

我建议使用此脚本,因为它只能执行一次。

Consider the following example to solve your problem:

#!/usr/bin/python
# -*- coding: latin-1 -*-

import os, sys, time, signal

def termination_handler (signum,frame):
    global running
    global pidfile
    print 'You have requested to terminate the application...'
    sys.stdout.flush()
    running = 0
    os.unlink(pidfile)

running = 1
signal.signal(signal.SIGINT,termination_handler)

pid = str(os.getpid())
pidfile = '/tmp/'+os.path.basename(__file__).split('.')[0]+'.pid'

if os.path.isfile(pidfile):
    print "%s already exists, exiting" % pidfile
    sys.exit()
else:
    file(pidfile, 'w').write(pid)

# Do some actual work here

while running:
  time.sleep(10)

I suggest this script because it can be executed one time only.


回答 17

使用bash查找具有当前脚本名称的进程。没有多余的文件。

import commands
import os
import time
import sys

def stop_if_already_running():
    script_name = os.path.basename(__file__)
    l = commands.getstatusoutput("ps aux | grep -e '%s' | grep -v grep | awk '{print $2}'| awk '{print $2}'" % script_name)
    if l[1]:
        sys.exit(0);

要测试,请添加

stop_if_already_running()
print "running normally"
while True:
    time.sleep(3)

Using bash to look for a process with the current script’s name. No extra file.

import commands
import os
import time
import sys

def stop_if_already_running():
    script_name = os.path.basename(__file__)
    l = commands.getstatusoutput("ps aux | grep -e '%s' | grep -v grep | awk '{print $2}'| awk '{print $2}'" % script_name)
    if l[1]:
        sys.exit(0);

To test, add

stop_if_already_running()
print "running normally"
while True:
    time.sleep(3)

回答 18

这是我在Linux中用来避免启动脚本(如果已运行)的方法:

import os
import sys


script_name = os.path.basename(__file__)
pidfile = os.path.join("/tmp", os.path.splitext(script_name)[0]) + ".pid"


def create_pidfile():
    if os.path.exists(pidfile):
        with open(pidfile, "r") as _file:
            last_pid = int(_file.read())

        # Checking if process is still running
        last_process_cmdline = "/proc/%d/cmdline" % last_pid
        if os.path.exists(last_process_cmdline):
            with open(last_process_cmdline, "r") as _file:
                cmdline = _file.read()
            if script_name in cmdline:
                raise Exception("Script already running...")

    with open(pidfile, "w") as _file:
        pid = str(os.getpid())
        _file.write(pid)


def main():
    """Your application logic goes here"""


if __name__ == "__main__":
    create_pidfile()
    main()

这种方法效果很好,无需依赖任何外部模块。

This is what I use in Linux to avoid starting a script if already running:

import os
import sys


script_name = os.path.basename(__file__)
pidfile = os.path.join("/tmp", os.path.splitext(script_name)[0]) + ".pid"


def create_pidfile():
    if os.path.exists(pidfile):
        with open(pidfile, "r") as _file:
            last_pid = int(_file.read())

        # Checking if process is still running
        last_process_cmdline = "/proc/%d/cmdline" % last_pid
        if os.path.exists(last_process_cmdline):
            with open(last_process_cmdline, "r") as _file:
                cmdline = _file.read()
            if script_name in cmdline:
                raise Exception("Script already running...")

    with open(pidfile, "w") as _file:
        pid = str(os.getpid())
        _file.write(pid)


def main():
    """Your application logic goes here"""


if __name__ == "__main__":
    create_pidfile()
    main()

This approach works good without any dependency on an external module.


线程和多处理模块之间有什么区别?

问题:线程和多处理模块之间有什么区别?

我正在学习如何在Python中使用threadingmultiprocessing模块并行运行某些操作并加快代码速度。

我发现很难理解一个threading.Thread()对象与一个对象之间的区别(也许是因为我没有任何理论背景)multiprocessing.Process()

另外,对我来说,如何实例化一个作业队列并使其只有4个(例如)并行运行,而另一个则等待资源释放后再执行,对我来说也不是很清楚。

我发现文档中的示例很清楚,但并不十分详尽。一旦尝试使事情复杂化,我就会收到很多奇怪的错误(例如无法腌制的方法,等等)。

那么,什么时候应该使用threadingand multiprocessing模块?

您能否将我链接到一些资源,以解释这两个模块的概念以及如何在复杂的任务中正确使用它们?

I am learning how to use the threading and the multiprocessing modules in Python to run certain operations in parallel and speed up my code.

I am finding this hard (maybe because I don’t have any theoretical background about it) to understand what the difference is between a threading.Thread() object and a multiprocessing.Process() one.

Also, it is not entirely clear to me how to instantiate a queue of jobs and having only 4 (for example) of them running in parallel, while the other wait for resources to free before being executed.

I find the examples in the documentation clear, but not very exhaustive; as soon as I try to complicate things a bit, I receive a lot of weird errors (like a method that can’t be pickled, and so on).

So, when should I use the threading and multiprocessing modules?

Can you link me to some resources that explain the concepts behind these two modules and how to use them properly for complex tasks?


回答 0

什么朱利奥·佛朗哥说,对于多线程多处理与真一般

但是,Python *还有一个问题:有一个全局解释器锁,可以防止同一进程中的两个线程同时运行Python代码。这意味着,如果您有8个核心,并且将代码更改为使用8个线程,则它将无法使用800%的CPU并无法以8倍的速度运行;它会使用相同的100%CPU,并以相同的速度运行。(实际上,它的运行速度会稍慢一些,因为即使您没有任何共享数据,线程处理也会带来额外的开销,但是现在暂时忽略它。)

也有exceptions。如果您的代码繁重的计算实际上不是在Python中发生的,而是在某些具有自定义C代码的库中执行的,这些代码可以正确地进行GIL处理,例如numpy应用程序,则您将从线程中获得预期的性能收益。如果繁重的计算是由运行并等待的某些子进程完成的,则情况也是如此。

更重要的是,在某些情况下,这无关紧要。例如,网络服务器花费大部分时间来读取网络中的数据包,而GUI应用花费大部分时间来等待用户事件。在网络服务器或GUI应用程序中使用线程的原因之一是允许您执行长时间运行的“后台任务”,而不会阻止主线程继续为网络数据包或GUI事件提供服务。这在Python线程中工作得很好。(从技术上讲,这意味着Python线程为您提供了并发性,即使它们没有为您提供核心并行性。)

但是,如果您使用纯Python编写受CPU约束的程序,则使用更多线程通常无济于事。

对于GIL,使用单独的进程没有这种问题,因为每个进程都有自己的单独的GIL。当然,线程和进程之间仍然具有与其他任何语言相同的权衡取舍–在进程之间共享数据比在线程之间共享更加困难,而且成本更高,运行大量进程或创建和销毁这些开销可能会很高等等。但是GIL在处理方面的平衡上权衡沉重,这对于C或Java而言并非如此。因此,您会发现自己在Python中比在C或Java中使用多处理的频率更高。


同时,Python的“含电池”理念带来了一些好消息:编写代码很容易,只需进行一次更改即可在线程和进程之间来回切换。

如果您根据独立的“作业”来设计代码,除了输入和输出,这些作业不与其他作业(或主程序)共享任何内容,则可以使用该concurrent.futures库在线程池周围编写代码,如下所示:

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    executor.submit(job, argument)
    executor.map(some_function, collection_of_independent_things)
    # ...

您甚至可以获取这些作业的结果,并将其传递给其他作业,按执行顺序或完成顺序等待;等等。阅读有关Future对象的部分以获取详细信息。

现在,如果事实证明您的程序一直在使用100%CPU,并且添加更多线程只会使其速度变慢,那么您就遇到了GIL问题,因此您需要切换到进程。您要做的就是更改第一行:

with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:

唯一真正的警告是,作业的自变量和返回值必须可腌制(而不需要花费太多时间或内存来腌制)才能使用跨进程。通常这不是问题,但有时是问题。


但是,如果您的工作不能自给自足怎么办?如果您可以根据将消息从一个传递到另一个的工作来设计代码,那仍然很容易。您可能必须使用threading.Threadmultiprocessing.Process代替依赖池。并且您将必须显式创建queue.Queuemultiprocessing.Queue对象。(还有很多其他选择,例如管道,套接字,带有斑点的文件等等。但是,要点是,如果执行器的自动魔力不足,则必须手动执行某些操作。)

但是,如果您甚至不能依靠消息传递怎么办?如果您需要两项工作来同时改变同一个结构并看到彼此的变化,该怎么办?在这种情况下,您将需要进行手动同步(锁定,信号量,条件等),并且,如果要使用进程,则需要显式的共享内存对象进行引导。这是当多线程(或多处理)变得困难时。如果可以避免,那就太好了;如果不能,那么您将需要阅读的内容超过某人可以提供的答案。


通过评论,您想了解Python中的线程和进程之间的区别。的确,如果您阅读了朱利奥·佛朗哥(Giulio Franco)的答案和我的知识以及我们所有的链接,那应该涵盖了所有内容……但是总结肯定会很有用,所以这里是:

  1. 线程默认共享数据;流程没有。
  2. 作为(1)的结果,在进程之间发送数据通常需要对其进行酸洗和酸洗。**
  3. (1)的另一个结果是,在进程之间直接共享数据通常需要将其放入低级格式,如Value,Array和ctypesTypes。
  4. 流程不受GIL约束。
  5. 在某些平台(主要是Windows)上,创建和销毁进程的成本要高得多。
  6. 对流程有一些额外的限制,其中某些限制在不同平台上有所不同。有关详细信息,请参见编程指南
  7. threading模块不具有该模块的某些功能multiprocessing。(您可以使用multiprocessing.dummy大多数缺少的API放在线程之上,也可以使用更高级别的模块,例如concurrent.futures,不必担心。)

*出现此问题的实际上不是Python语言,而是该语言的“标准”实现CPython。其他一些实现没有JIL,例如Jython。

**如果您正在使用fork start方法进行多处理(在大多数非Windows平台上可以使用),则每个子进程都将获得启动子级时父级拥有的任何资源,这可能是将数据传递给子级的另一种方式。

What Giulio Franco says is true for multithreading vs. multiprocessing in general.

However, Python* has an added issue: There’s a Global Interpreter Lock that prevents two threads in the same process from running Python code at the same time. This means that if you have 8 cores, and change your code to use 8 threads, it won’t be able to use 800% CPU and run 8x faster; it’ll use the same 100% CPU and run at the same speed. (In reality, it’ll run a little slower, because there’s extra overhead from threading, even if you don’t have any shared data, but ignore that for now.)

There are exceptions to this. If your code’s heavy computation doesn’t actually happen in Python, but in some library with custom C code that does proper GIL handling, like a numpy app, you will get the expected performance benefit from threading. The same is true if the heavy computation is done by some subprocess that you run and wait on.

More importantly, there are cases where this doesn’t matter. For example, a network server spends most of its time reading packets off the network, and a GUI app spends most of its time waiting for user events. One reason to use threads in a network server or GUI app is to allow you to do long-running “background tasks” without stopping the main thread from continuing to service network packets or GUI events. And that works just fine with Python threads. (In technical terms, this means Python threads give you concurrency, even though they don’t give you core-parallelism.)

But if you’re writing a CPU-bound program in pure Python, using more threads is generally not helpful.

Using separate processes has no such problems with the GIL, because each process has its own separate GIL. Of course you still have all the same tradeoffs between threads and processes as in any other languages—it’s more difficult and more expensive to share data between processes than between threads, it can be costly to run a huge number of processes or to create and destroy them frequently, etc. But the GIL weighs heavily on the balance toward processes, in a way that isn’t true for, say, C or Java. So, you will find yourself using multiprocessing a lot more often in Python than you would in C or Java.


Meanwhile, Python’s “batteries included” philosophy brings some good news: It’s very easy to write code that can be switched back and forth between threads and processes with a one-liner change.

If you design your code in terms of self-contained “jobs” that don’t share anything with other jobs (or the main program) except input and output, you can use the concurrent.futures library to write your code around a thread pool like this:

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    executor.submit(job, argument)
    executor.map(some_function, collection_of_independent_things)
    # ...

You can even get the results of those jobs and pass them on to further jobs, wait for things in order of execution or in order of completion, etc.; read the section on Future objects for details.

Now, if it turns out that your program is constantly using 100% CPU, and adding more threads just makes it slower, then you’re running into the GIL problem, so you need to switch to processes. All you have to do is change that first line:

with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:

The only real caveat is that your jobs’ arguments and return values have to be pickleable (and not take too much time or memory to pickle) to be usable cross-process. Usually this isn’t a problem, but sometimes it is.


But what if your jobs can’t be self-contained? If you can design your code in terms of jobs that pass messages from one to another, it’s still pretty easy. You may have to use threading.Thread or multiprocessing.Process instead of relying on pools. And you will have to create queue.Queue or multiprocessing.Queue objects explicitly. (There are plenty of other options—pipes, sockets, files with flocks, … but the point is, you have to do something manually if the automatic magic of an Executor is insufficient.)

But what if you can’t even rely on message passing? What if you need two jobs to both mutate the same structure, and see each others’ changes? In that case, you will need to do manual synchronization (locks, semaphores, conditions, etc.) and, if you want to use processes, explicit shared-memory objects to boot. This is when multithreading (or multiprocessing) gets difficult. If you can avoid it, great; if you can’t, you will need to read more than someone can put into an SO answer.


From a comment, you wanted to know what’s different between threads and processes in Python. Really, if you read Giulio Franco’s answer and mine and all of our links, that should cover everything… but a summary would definitely be useful, so here goes:

  1. Threads share data by default; processes do not.
  2. As a consequence of (1), sending data between processes generally requires pickling and unpickling it.**
  3. As another consequence of (1), directly sharing data between processes generally requires putting it into low-level formats like Value, Array, and ctypes types.
  4. Processes are not subject to the GIL.
  5. On some platforms (mainly Windows), processes are much more expensive to create and destroy.
  6. There are some extra restrictions on processes, some of which are different on different platforms. See Programming guidelines for details.
  7. The threading module doesn’t have some of the features of the multiprocessing module. (You can use multiprocessing.dummy to get most of the missing API on top of threads, or you can use higher-level modules like concurrent.futures and not worry about it.)

* It’s not actually Python, the language, that has this issue, but CPython, the “standard” implementation of that language. Some other implementations don’t have a GIL, like Jython.

** If you’re using the fork start method for multiprocessing—which you can on most non-Windows platforms—each child process gets any resources the parent had when the child was started, which can be another way to pass data to children.


回答 1

一个进程中可以存在多个线程。属于同一进程的线程共享同一内存区域(可以读取和写入相同的变量,并且可以互相干扰)。相反,不同的进程驻留在不同的内存区域中,并且每个进程都有自己的变量。为了进行通信,进程必须使用其他通道(文件,管道或套接字)。

如果要并行化计算,则可能需要多线程处理,因为您可能希望线程在同一内存上进行协作。

说到性能,线程的创建和管理速度比进程要快(因为操作系统不需要分配整个新的虚拟内存区域),并且线程间通信通常比进程间通信快。但是线程很难编程。线程可以互相干扰,并且可以互相写入内存,但是这种情况并不总是很明显(由于多种因素,主要是指令重新排序和内存缓存),因此您将需要同步原语来控制访问您的变量。

Multiple threads can exist in a single process. The threads that belong to the same process share the same memory area (can read from and write to the very same variables, and can interfere with one another). On the contrary, different processes live in different memory areas, and each of them has its own variables. In order to communicate, processes have to use other channels (files, pipes or sockets).

If you want to parallelize a computation, you’re probably going to need multithreading, because you probably want the threads to cooperate on the same memory.

Speaking about performance, threads are faster to create and manage than processes (because the OS doesn’t need to allocate a whole new virtual memory area), and inter-thread communication is usually faster than inter-process communication. But threads are harder to program. Threads can interfere with one another, and can write to each other’s memory, but the way this happens is not always obvious (due to several factors, mainly instruction reordering and memory caching), and so you are going to need synchronization primitives to control access to your variables.


回答 2

我相信此链接可以优雅地回答您的问题。

简而言之,如果您的一个子问题不得不等待另一个子问题完成,那么多线程就可以了(例如,在I / O繁重的操作中);相反,如果您的子问题确实可能同时发生,则建议进行多处理。但是,创建的进程数不会超过核心数。

I believe this link answers your question in an elegant way.

To be short, if one of your sub-problems has to wait while another finishes, multithreading is good (in I/O heavy operations, for example); by contrast, if your sub-problems could really happen at the same time, multiprocessing is suggested. However, you won’t create more processes than your number of cores.


回答 3

Python文档引号

我在以下位置突出了有关Process vs Threads和GIL的重要Python文档报价:CPython中的全局解释器锁(GIL)是什么?

进程与线程实验

我做了一些基准测试,以便更具体地显示差异。

在基准测试中,我为8个超线程上的各种线程的CPU和IO绑定时间计时 CPU。每个线程提供的功总是相同的,因此,更多线程意味着提供更多的总功。

结果是:

绘制数据

结论:

  • 对于CPU限制的工作,多处理总是更快,大概是由于GIL

  • 用于IO绑定工作。两者的速度完全一样

  • 由于我在8个超线程计算机上,因此线程最多只能扩展到大约4倍,而不是预期的8倍。

    与C POSIX绑定的CPU工作达到预期的8倍加速相比,它是什么time(1)输出中的“ real”,“ user”和“ sys”是什么意思?

    TODO:我不知道是什么原因,一定还有其他Python低效率正在发挥作用。

测试代码:

#!/usr/bin/env python3

import multiprocessing
import threading
import time
import sys

def cpu_func(result, niters):
    '''
    A useless CPU bound function.
    '''
    for i in range(niters):
        result = (result * result * i + 2 * result * i * i + 3) % 10000000
    return result

class CpuThread(threading.Thread):
    def __init__(self, niters):
        super().__init__()
        self.niters = niters
        self.result = 1
    def run(self):
        self.result = cpu_func(self.result, self.niters)

class CpuProcess(multiprocessing.Process):
    def __init__(self, niters):
        super().__init__()
        self.niters = niters
        self.result = 1
    def run(self):
        self.result = cpu_func(self.result, self.niters)

class IoThread(threading.Thread):
    def __init__(self, sleep):
        super().__init__()
        self.sleep = sleep
        self.result = self.sleep
    def run(self):
        time.sleep(self.sleep)

class IoProcess(multiprocessing.Process):
    def __init__(self, sleep):
        super().__init__()
        self.sleep = sleep
        self.result = self.sleep
    def run(self):
        time.sleep(self.sleep)

if __name__ == '__main__':
    cpu_n_iters = int(sys.argv[1])
    sleep = 1
    cpu_count = multiprocessing.cpu_count()
    input_params = [
        (CpuThread, cpu_n_iters),
        (CpuProcess, cpu_n_iters),
        (IoThread, sleep),
        (IoProcess, sleep),
    ]
    header = ['nthreads']
    for thread_class, _ in input_params:
        header.append(thread_class.__name__)
    print(' '.join(header))
    for nthreads in range(1, 2 * cpu_count):
        results = [nthreads]
        for thread_class, work_size in input_params:
            start_time = time.time()
            threads = []
            for i in range(nthreads):
                thread = thread_class(work_size)
                threads.append(thread)
                thread.start()
            for i, thread in enumerate(threads):
                thread.join()
            results.append(time.time() - start_time)
        print(' '.join('{:.6e}'.format(result) for result in results))

GitHub上游+在同一目录上绘制代码

在带有CPU的Lenovo ThinkPad P51笔记本电脑上的Ubuntu 18.10,Python 3.6.7上进行了测试:Intel Core i7-7820HQ CPU(4核/ 8线程),RAM:2x三星M471A2K43BB1-CRC(2x 16GiB),SSD:Samsung MZVLB512HAJQ- 000L7(3,000 MB / s)。

可视化在给定时间正在运行的线程

这篇帖子https://rohanvarma.me/GIL/告诉我,只要调度线程的target=参数threading.Thread与相同,就可以运行回调multiprocessing.Process

这使我们可以精确查看每次运行哪个线程。完成此操作后,我们将看到类似的内容(我制作了此特定图形):

            +--------------------------------------+
            + Active threads / processes           +
+-----------+--------------------------------------+
|Thread   1 |********     ************             |
|         2 |        *****            *************|
+-----------+--------------------------------------+
|Process  1 |***  ************** ******  ****      |
|         2 |** **** ****** ** ********* **********|
+-----------+--------------------------------------+
            + Time -->                             +
            +--------------------------------------+

这将表明:

  • 线程由GIL完全序列化
  • 进程可以并行运行

Python documentation quotes

I’ve highlighted the key Python documentation quotes about Process vs Threads and the GIL at: What is the global interpreter lock (GIL) in CPython?

Process vs thread experiments

I did a bit of benchmarking in order to show the difference more concretely.

In the benchmark, I timed CPU and IO bound work for various numbers of threads on an 8 hyperthread CPU. The work supplied per thread is always the same, such that more threads means more total work supplied.

The results were:

Plot data.

Conclusions:

  • for CPU bound work, multiprocessing is always faster, presumably due to the GIL

  • for IO bound work. both are exactly the same speed

  • threads only scale up to about 4x instead of the expected 8x since I’m on an 8 hyperthread machine.

    Contrast that with a C POSIX CPU-bound work which reaches the expected 8x speedup: What do ‘real’, ‘user’ and ‘sys’ mean in the output of time(1)?

    TODO: I don’t know the reason for this, there must be other Python inefficiencies coming into play.

Test code:

#!/usr/bin/env python3

import multiprocessing
import threading
import time
import sys

def cpu_func(result, niters):
    '''
    A useless CPU bound function.
    '''
    for i in range(niters):
        result = (result * result * i + 2 * result * i * i + 3) % 10000000
    return result

class CpuThread(threading.Thread):
    def __init__(self, niters):
        super().__init__()
        self.niters = niters
        self.result = 1
    def run(self):
        self.result = cpu_func(self.result, self.niters)

class CpuProcess(multiprocessing.Process):
    def __init__(self, niters):
        super().__init__()
        self.niters = niters
        self.result = 1
    def run(self):
        self.result = cpu_func(self.result, self.niters)

class IoThread(threading.Thread):
    def __init__(self, sleep):
        super().__init__()
        self.sleep = sleep
        self.result = self.sleep
    def run(self):
        time.sleep(self.sleep)

class IoProcess(multiprocessing.Process):
    def __init__(self, sleep):
        super().__init__()
        self.sleep = sleep
        self.result = self.sleep
    def run(self):
        time.sleep(self.sleep)

if __name__ == '__main__':
    cpu_n_iters = int(sys.argv[1])
    sleep = 1
    cpu_count = multiprocessing.cpu_count()
    input_params = [
        (CpuThread, cpu_n_iters),
        (CpuProcess, cpu_n_iters),
        (IoThread, sleep),
        (IoProcess, sleep),
    ]
    header = ['nthreads']
    for thread_class, _ in input_params:
        header.append(thread_class.__name__)
    print(' '.join(header))
    for nthreads in range(1, 2 * cpu_count):
        results = [nthreads]
        for thread_class, work_size in input_params:
            start_time = time.time()
            threads = []
            for i in range(nthreads):
                thread = thread_class(work_size)
                threads.append(thread)
                thread.start()
            for i, thread in enumerate(threads):
                thread.join()
            results.append(time.time() - start_time)
        print(' '.join('{:.6e}'.format(result) for result in results))

GitHub upstream + plotting code on same directory.

Tested on Ubuntu 18.10, Python 3.6.7, in a Lenovo ThinkPad P51 laptop with CPU: Intel Core i7-7820HQ CPU (4 cores / 8 threads), RAM: 2x Samsung M471A2K43BB1-CRC (2x 16GiB), SSD: Samsung MZVLB512HAJQ-000L7 (3,000 MB/s).

Visualize which threads are running at a given time

This post https://rohanvarma.me/GIL/ taught me that you can run a callback whenever a thread is scheduled with the target= argument of threading.Thread and the same for multiprocessing.Process.

This allows us to view exactly which thread runs at each time. When this is done, we would see something like (I made this particular graph up):

            +--------------------------------------+
            + Active threads / processes           +
+-----------+--------------------------------------+
|Thread   1 |********     ************             |
|         2 |        *****            *************|
+-----------+--------------------------------------+
|Process  1 |***  ************** ******  ****      |
|         2 |** **** ****** ** ********* **********|
+-----------+--------------------------------------+
            + Time -->                             +
            +--------------------------------------+

which would show that:

  • threads are fully serialized by the GIL
  • processes can run in parallel

回答 4

这是python 2.6.x的一些性能数据,这些数据使人们质疑在IO绑定的情况下线程比多处理的性能更高。这些结果来自40个处理器的IBM System x3650 M4 BD。

IO绑定处理:进程池的性能优于线程池

>>> do_work(50, 300, 'thread','fileio')
do_work function took 455.752 ms

>>> do_work(50, 300, 'process','fileio')
do_work function took 319.279 ms

CPU限制处理:进程池的性能优于线程池

>>> do_work(50, 2000, 'thread','square')
do_work function took 338.309 ms

>>> do_work(50, 2000, 'process','square')
do_work function took 287.488 ms

这些不是严格的测试,但是它们告诉我,与线程处理相比,多处理并非完全没有表现。

交互式python控制台中用于上述测试的代码

from multiprocessing import Pool
from multiprocessing.pool import ThreadPool
import time
import sys
import os
from glob import glob

text_for_test = str(range(1,100000))

def fileio(i):
 try :
  os.remove(glob('./test/test-*'))
 except : 
  pass
 f=open('./test/test-'+str(i),'a')
 f.write(text_for_test)
 f.close()
 f=open('./test/test-'+str(i),'r')
 text = f.read()
 f.close()


def square(i):
 return i*i

def timing(f):
 def wrap(*args):
  time1 = time.time()
  ret = f(*args)
  time2 = time.time()
  print '%s function took %0.3f ms' % (f.func_name, (time2-time1)*1000.0)
  return ret
 return wrap

result = None

@timing
def do_work(process_count, items, process_type, method) :
 pool = None
 if process_type == 'process' :
  pool = Pool(processes=process_count)
 else :
  pool = ThreadPool(processes=process_count)
 if method == 'square' : 
  multiple_results = [pool.apply_async(square,(a,)) for a in range(1,items)]
  result = [res.get()  for res in multiple_results]
 else :
  multiple_results = [pool.apply_async(fileio,(a,)) for a in range(1,items)]
  result = [res.get()  for res in multiple_results]


do_work(50, 300, 'thread','fileio')
do_work(50, 300, 'process','fileio')

do_work(50, 2000, 'thread','square')
do_work(50, 2000, 'process','square')

Here’s some performance data for python 2.6.x that calls to question the notion that threading is more performant that multiprocessing in IO-bound scenarios. These results are from a 40-processor IBM System x3650 M4 BD.

IO-Bound Processing : Process Pool performed better than Thread Pool

>>> do_work(50, 300, 'thread','fileio')
do_work function took 455.752 ms

>>> do_work(50, 300, 'process','fileio')
do_work function took 319.279 ms

CPU-Bound Processing : Process Pool performed better than Thread Pool

>>> do_work(50, 2000, 'thread','square')
do_work function took 338.309 ms

>>> do_work(50, 2000, 'process','square')
do_work function took 287.488 ms

These aren’t rigorous tests, but they tell me that multiprocessing isn’t entirely unperformant in comparison to threading.

Code used in the interactive python console for the above tests

from multiprocessing import Pool
from multiprocessing.pool import ThreadPool
import time
import sys
import os
from glob import glob

text_for_test = str(range(1,100000))

def fileio(i):
 try :
  os.remove(glob('./test/test-*'))
 except : 
  pass
 f=open('./test/test-'+str(i),'a')
 f.write(text_for_test)
 f.close()
 f=open('./test/test-'+str(i),'r')
 text = f.read()
 f.close()


def square(i):
 return i*i

def timing(f):
 def wrap(*args):
  time1 = time.time()
  ret = f(*args)
  time2 = time.time()
  print '%s function took %0.3f ms' % (f.func_name, (time2-time1)*1000.0)
  return ret
 return wrap

result = None

@timing
def do_work(process_count, items, process_type, method) :
 pool = None
 if process_type == 'process' :
  pool = Pool(processes=process_count)
 else :
  pool = ThreadPool(processes=process_count)
 if method == 'square' : 
  multiple_results = [pool.apply_async(square,(a,)) for a in range(1,items)]
  result = [res.get()  for res in multiple_results]
 else :
  multiple_results = [pool.apply_async(fileio,(a,)) for a in range(1,items)]
  result = [res.get()  for res in multiple_results]


do_work(50, 300, 'thread','fileio')
do_work(50, 300, 'process','fileio')

do_work(50, 2000, 'thread','square')
do_work(50, 2000, 'process','square')

回答 5

好吧,朱利奥·佛朗哥(Giulio Franco)回答了大多数问题。我将进一步阐述消费者-生产者问题,我想这将使您走上使用多线程应用程序的解决方案的正确轨道。

fill_count = Semaphore(0) # items produced
empty_count = Semaphore(BUFFER_SIZE) # remaining space
buffer = Buffer()

def producer(fill_count, empty_count, buffer):
    while True:
        item = produceItem()
        empty_count.down();
        buffer.push(item)
        fill_count.up()

def consumer(fill_count, empty_count, buffer):
    while True:
        fill_count.down()
        item = buffer.pop()
        empty_count.up()
        consume_item(item)

您可以从以下网站阅读有关同步原语的更多信息:

 http://linux.die.net/man/7/sem_overview
 http://docs.python.org/2/library/threading.html

伪代码在上面。我想您应该搜索生产者-消费者问题以获取更多参考。

Well, most of the question is answered by Giulio Franco. I will further elaborate on the consumer-producer problem, which I suppose will put you on the right track for your solution to using a multithreaded app.

fill_count = Semaphore(0) # items produced
empty_count = Semaphore(BUFFER_SIZE) # remaining space
buffer = Buffer()

def producer(fill_count, empty_count, buffer):
    while True:
        item = produceItem()
        empty_count.down();
        buffer.push(item)
        fill_count.up()

def consumer(fill_count, empty_count, buffer):
    while True:
        fill_count.down()
        item = buffer.pop()
        empty_count.up()
        consume_item(item)

You could read more on the synchronization primitives from:

 http://linux.die.net/man/7/sem_overview
 http://docs.python.org/2/library/threading.html

The pseudocode is above. I suppose you should search the producer-consumer-problem to get more references.


如何在Python中启动后台进程?

问题:如何在Python中启动后台进程?

我正在尝试将Shell脚本移植到可读性更高的python版本。原始的shell脚本在后台使用“&”启动多个进程(实用程序,监视器等)。如何在python中达到相同的效果?我希望这些过程在Python脚本完成后不会消失。我敢肯定它与守护程序的概念有关,但是我找不到如何轻松实现此目的。

I’m trying to port a shell script to the much more readable python version. The original shell script starts several processes (utilities, monitors, etc.) in the background with “&”. How can I achieve the same effect in python? I’d like these processes not to die when the python scripts complete. I am sure it’s related to the concept of a daemon somehow, but I couldn’t find how to do this easily.


回答 0

注意:此答案的最新版本比2009年发布时要少。subprocess现在建议在文档中使用其他答案中显示的模块

(请注意,子流程模块提供了更强大的工具来生成新流程并检索其结果;使用该模块比使用这些功能更可取。)


如果您希望您的进程在后台启动,则可以使用system()与您的Shell脚本相同的方式来使用和调用它,也可以spawn

import os
os.spawnl(os.P_DETACH, 'some_long_running_command')

(或者,您也可以尝试使用便携性较差的os.P_NOWAIT标志)。

请参阅此处文档

Note: This answer is less current than it was when posted in 2009. Using the subprocess module shown in other answers is now recommended in the docs

(Note that the subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using these functions.)


If you want your process to start in the background you can either use system() and call it in the same way your shell script did, or you can spawn it:

import os
os.spawnl(os.P_DETACH, 'some_long_running_command')

(or, alternatively, you may try the less portable os.P_NOWAIT flag).

See the documentation here.


回答 1

尽管jkp的解决方案有效,但是更新的方式(以及文档建议的方式)是使用subprocess模块。对于简单的命令,它等效,但是如果您要执行复杂的操作,它提供了更多选项。

您的案例示例:

import subprocess
subprocess.Popen(["rm","-r","some.file"])

这将rm -r somefile在后台运行。请注意,调用.communicate()从返回的对象Popen将一直阻塞,直到完成为止,因此,如果要使其在后台运行,请不要这样做:

import subprocess
ls_output=subprocess.Popen(["sleep", "30"])
ls_output.communicate()  # Will block for 30 seconds

请参阅此处的文档。

另外,有一点需要澄清:这里使用的“背景”纯粹是一个外壳概念;从技术上讲,您的意思是希望在等待进程完成时生成一个没有阻塞的进程。但是,我在这里使用“背景”来指代类似外壳背景的行为。

While jkp‘s solution works, the newer way of doing things (and the way the documentation recommends) is to use the subprocess module. For simple commands its equivalent, but it offers more options if you want to do something complicated.

Example for your case:

import subprocess
subprocess.Popen(["rm","-r","some.file"])

This will run rm -r some.file in the background. Note that calling .communicate() on the object returned from Popen will block until it completes, so don’t do that if you want it to run in the background:

import subprocess
ls_output=subprocess.Popen(["sleep", "30"])
ls_output.communicate()  # Will block for 30 seconds

See the documentation here.

Also, a point of clarification: “Background” as you use it here is purely a shell concept; technically, what you mean is that you want to spawn a process without blocking while you wait for it to complete. However, I’ve used “background” here to refer to shell-background-like behavior.


回答 2

您可能需要答案“如何在Python中调用外部命令”

最简单的方法是使用该os.system函数,例如:

import os
os.system("some_command &")

基本上,传递给system函数的所有内容都将与将其传递给脚本中的shell一样执行。

You probably want the answer to “How to call an external command in Python”.

The simplest approach is to use the os.system function, e.g.:

import os
os.system("some_command &")

Basically, whatever you pass to the system function will be executed the same as if you’d passed it to the shell in a script.


回答 3

我在这里找到这个:

在Windows(win xp)上,父进程longtask.py只有在完成工作后才能完成。这不是您想要的CGI脚本。问题并非特定于Python,在PHP社区中,问题是相同的。

解决方案是将DETACHED_PROCESS 过程创建标志传递给CreateProcesswin API中的基础函数。如果碰巧安装了pywin32,则可以从win32process模块​​中导入该标志,否则,您应该自己定义它:

DETACHED_PROCESS = 0x00000008

pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=DETACHED_PROCESS).pid

I found this here:

On windows (win xp), the parent process will not finish until the longtask.py has finished its work. It is not what you want in CGI-script. The problem is not specific to Python, in PHP community the problems are the same.

The solution is to pass DETACHED_PROCESS Process Creation Flag to the underlying CreateProcess function in win API. If you happen to have installed pywin32 you can import the flag from the win32process module, otherwise you should define it yourself:

DETACHED_PROCESS = 0x00000008

pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=DETACHED_PROCESS).pid

回答 4

subprocess.Popen()close_fds=True参数一起使用,这将允许将生成的子流程与Python流程本身分离,甚至在Python退出后也可以继续运行。

https://gist.github.com/yinjimmy/d6ad0742d03d54518e9f

import os, time, sys, subprocess

if len(sys.argv) == 2:
    time.sleep(5)
    print 'track end'
    if sys.platform == 'darwin':
        subprocess.Popen(['say', 'hello'])
else:
    print 'main begin'
    subprocess.Popen(['python', os.path.realpath(__file__), '0'], close_fds=True)
    print 'main end'

Use subprocess.Popen() with the close_fds=True parameter, which will allow the spawned subprocess to be detached from the Python process itself and continue running even after Python exits.

https://gist.github.com/yinjimmy/d6ad0742d03d54518e9f

import os, time, sys, subprocess

if len(sys.argv) == 2:
    time.sleep(5)
    print 'track end'
    if sys.platform == 'darwin':
        subprocess.Popen(['say', 'hello'])
else:
    print 'main begin'
    subprocess.Popen(['python', os.path.realpath(__file__), '0'], close_fds=True)
    print 'main end'

回答 5

您可能想开始研究os模块以派生不同的线程(通过打开交互式会话并发出help(os))。相关功能是fork和任何exec功能。为了让您了解如何启动,请在执行fork的函数中放入类似的内容(该函数需要使用列表或元组’args’作为包含程序名称及其参数的参数;您可能还需要为新线程定义stdin,out和err):

try:
    pid = os.fork()
except OSError, e:
    ## some debug output
    sys.exit(1)
if pid == 0:
    ## eventually use os.putenv(..) to set environment variables
    ## os.execv strips of args[0] for the arguments
    os.execv(args[0], args)

You probably want to start investigating the os module for forking different threads (by opening an interactive session and issuing help(os)). The relevant functions are fork and any of the exec ones. To give you an idea on how to start, put something like this in a function that performs the fork (the function needs to take a list or tuple ‘args’ as an argument that contains the program’s name and its parameters; you may also want to define stdin, out and err for the new thread):

try:
    pid = os.fork()
except OSError, e:
    ## some debug output
    sys.exit(1)
if pid == 0:
    ## eventually use os.putenv(..) to set environment variables
    ## os.execv strips of args[0] for the arguments
    os.execv(args[0], args)

回答 6

捕获输出并在后台运行 threading

本答案所述,如果您使用捕获输出,stdout=然后尝试进行read(),则该过程将阻塞。

但是,在某些情况下您需要这样做。例如,我想启动两个进程,它们通过它们之间的端口进行通信,并将它们的stdout保存到日志文件和stdout中。

threading模块使我们能够做到这一点。

首先,看看如何在此问题中单独完成输出重定向:Python Popen:同时写入stdout和日志文件

然后:

main.py

#!/usr/bin/env python3

import os
import subprocess
import sys
import threading

def output_reader(proc, file):
    while True:
        byte = proc.stdout.read(1)
        if byte:
            sys.stdout.buffer.write(byte)
            sys.stdout.flush()
            file.buffer.write(byte)
        else:
            break

with subprocess.Popen(['./sleep.py', '0'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc1, \
     subprocess.Popen(['./sleep.py', '10'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc2, \
     open('log1.log', 'w') as file1, \
     open('log2.log', 'w') as file2:
    t1 = threading.Thread(target=output_reader, args=(proc1, file1))
    t2 = threading.Thread(target=output_reader, args=(proc2, file2))
    t1.start()
    t2.start()
    t1.join()
    t2.join()

sleep.py

#!/usr/bin/env python3

import sys
import time

for i in range(4):
    print(i + int(sys.argv[1]))
    sys.stdout.flush()
    time.sleep(0.5)

运行后:

./main.py

标准输出每0.5秒更新一次,每两行包含一次:

0
10
1
11
2
12
3
13

每个日志文件都包含给定进程的相应日志。

灵感来源:https//eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/

已在Ubuntu 18.04,Python 3.6.7上测试。

Both capture output and run on background with threading

As mentioned on this answer, if you capture the output with stdout= and then try to read(), then the process blocks.

However, there are cases where you need this. For example, I wanted to launch two processes that talk over a port between them, and save their stdout to a log file and stdout.

The threading module allows us to do that.

First, have a look at how to do the output redirection part alone in this question: Python Popen: Write to stdout AND log file simultaneously

Then:

main.py

#!/usr/bin/env python3

import os
import subprocess
import sys
import threading

def output_reader(proc, file):
    while True:
        byte = proc.stdout.read(1)
        if byte:
            sys.stdout.buffer.write(byte)
            sys.stdout.flush()
            file.buffer.write(byte)
        else:
            break

with subprocess.Popen(['./sleep.py', '0'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc1, \
     subprocess.Popen(['./sleep.py', '10'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc2, \
     open('log1.log', 'w') as file1, \
     open('log2.log', 'w') as file2:
    t1 = threading.Thread(target=output_reader, args=(proc1, file1))
    t2 = threading.Thread(target=output_reader, args=(proc2, file2))
    t1.start()
    t2.start()
    t1.join()
    t2.join()

sleep.py

#!/usr/bin/env python3

import sys
import time

for i in range(4):
    print(i + int(sys.argv[1]))
    sys.stdout.flush()
    time.sleep(0.5)

After running:

./main.py

stdout get updated every 0.5 seconds for every two lines to contain:

0
10
1
11
2
12
3
13

and each log file contains the respective log for a given process.

Inspired by: https://eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/

Tested on Ubuntu 18.04, Python 3.6.7.