Windows上的RuntimeError尝试Python多处理

问题:Windows上的RuntimeError尝试Python多处理

我正在尝试在Windows机器上使用Threading and Multiprocessing的第一个正式python程序。我无法启动进程,但是python给出了以下消息。问题是,我没有在模块中启动线程。线程在类内的单独模块中处理。

编辑:顺便说一句,此代码在ubuntu上运行良好。在窗户上不太

RuntimeError: 
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.
            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:
                if __name__ == '__main__':
                    freeze_support()
                    ...
            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

我的原始代码很长,但是我能够以节略的版本重现该错误。它分为两个文件,第一个是主模块,除了导入处理进程/线程和调用方法的模块外,几乎没有其他作用。第二个模块是代码的关键所在。


testMain.py:

import parallelTestModule

extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)

parallelTestModule.py:

import multiprocessing
from multiprocessing import Process
import threading

class ThreadRunner(threading.Thread):
    """ This class represents a single instance of a running thread"""
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
    def run(self):
        print self.name,'\n'

class ProcessRunner:
    """ This class represents a single instance of a running process """
    def runp(self, pid, numThreads):
        mythreads = []
        for tid in range(numThreads):
            name = "Proc-"+str(pid)+"-Thread-"+str(tid)
            th = ThreadRunner(name)
            mythreads.append(th) 
        for i in mythreads:
            i.start()
        for i in mythreads:
            i.join()

class ParallelExtractor:    
    def runInParallel(self, numProcesses, numThreads):
        myprocs = []
        prunner = ProcessRunner()
        for pid in range(numProcesses):
            pr = Process(target=prunner.runp, args=(pid, numThreads)) 
            myprocs.append(pr) 
#        if __name__ == 'parallelTestModule':    #This didnt work
#        if __name__ == '__main__':              #This obviously doesnt work
#        multiprocessing.freeze_support()        #added after seeing error to no avail
        for i in myprocs:
            i.start()

        for i in myprocs:
            i.join()

I am trying my very first formal python program using Threading and Multiprocessing on a windows machine. I am unable to launch the processes though, with python giving the following message. The thing is, I am not launching my threads in the main module. The threads are handled in a separate module inside a class.

EDIT: By the way this code runs fine on ubuntu. Not quite on windows

RuntimeError: 
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.
            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:
                if __name__ == '__main__':
                    freeze_support()
                    ...
            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

My original code is pretty long, but I was able to reproduce the error in an abridged version of the code. It is split in two files, the first is the main module and does very little other than import the module which handles processes/threads and calls a method. The second module is where the meat of the code is.


testMain.py:

import parallelTestModule

extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)

parallelTestModule.py:

import multiprocessing
from multiprocessing import Process
import threading

class ThreadRunner(threading.Thread):
    """ This class represents a single instance of a running thread"""
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
    def run(self):
        print self.name,'\n'

class ProcessRunner:
    """ This class represents a single instance of a running process """
    def runp(self, pid, numThreads):
        mythreads = []
        for tid in range(numThreads):
            name = "Proc-"+str(pid)+"-Thread-"+str(tid)
            th = ThreadRunner(name)
            mythreads.append(th) 
        for i in mythreads:
            i.start()
        for i in mythreads:
            i.join()

class ParallelExtractor:    
    def runInParallel(self, numProcesses, numThreads):
        myprocs = []
        prunner = ProcessRunner()
        for pid in range(numProcesses):
            pr = Process(target=prunner.runp, args=(pid, numThreads)) 
            myprocs.append(pr) 
#        if __name__ == 'parallelTestModule':    #This didnt work
#        if __name__ == '__main__':              #This obviously doesnt work
#        multiprocessing.freeze_support()        #added after seeing error to no avail
        for i in myprocs:
            i.start()

        for i in myprocs:
            i.join()

回答 0

在Windows上,子进程将在启动时导入(即执行)主模块。您需要if __name__ == '__main__':在主模块中插入防护,以避免递归创建子流程。

已修改testMain.py

import parallelTestModule

if __name__ == '__main__':    
    extractor = parallelTestModule.ParallelExtractor()
    extractor.runInParallel(numProcesses=2, numThreads=4)

On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.

Modified testMain.py:

import parallelTestModule

if __name__ == '__main__':    
    extractor = parallelTestModule.ParallelExtractor()
    extractor.runInParallel(numProcesses=2, numThreads=4)

回答 1

尝试将代码放入testMain.py的主函数中

import parallelTestModule

if __name__ ==  '__main__':
  extractor = parallelTestModule.ParallelExtractor()
  extractor.runInParallel(numProcesses=2, numThreads=4)

文档

"For an explanation of why (on Windows) the if __name__ == '__main__' 
part is necessary, see Programming guidelines."

哪说

“确保主模块可以由新的Python解释器安全地导入,而不会引起意外的副作用(例如,启动新进程)。”

… 通过使用 if __name__ == '__main__'

Try putting your code inside a main function in testMain.py

import parallelTestModule

if __name__ ==  '__main__':
  extractor = parallelTestModule.ParallelExtractor()
  extractor.runInParallel(numProcesses=2, numThreads=4)

See the docs:

"For an explanation of why (on Windows) the if __name__ == '__main__' 
part is necessary, see Programming guidelines."

which say

“Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such a starting a new process).”

… by using if __name__ == '__main__'


回答 2

尽管较早的答案是正确的,但有一点复杂之处将有助于进一步说明。

如果您的主模块导入了另一个模块,在该模块中定义了全局变量或类成员变量并将其初始化为(或使用)一些新对象,则可能必须以相同的方式来限制导入:

if __name__ ==  '__main__':
  import my_module

Though the earlier answers are correct, there’s a small complication it would help to remark on.

In case your main module imports another module in which global variables or class member variables are defined and initialized to (or using) some new objects, you may have to condition that import in the same way:

if __name__ ==  '__main__':
  import my_module

回答 3

正如@Ofer所说,当您使用其他库或模块时,应将所有库或模块导入其中。 if __name__ == '__main__':

因此,就我而言,这样结束:

if __name__ == '__main__':       
    import librosa
    import os
    import pandas as pd
    run_my_program()

As @Ofer said, when you are using another libraries or modules, you should import all of them inside the if __name__ == '__main__':

So, in my case, ended like this:

if __name__ == '__main__':       
    import librosa
    import os
    import pandas as pd
    run_my_program()

回答 4

就我而言,这是代码中的一个简单错误,在创建变量之前就使用了变量。在尝试上述解决方案之前,值得检查一下。上帝知道为什么我得到这个特殊的错误信息。

In my case it was a simple bug in the code, using a variable before it was created. Worth checking that out before trying the above solutions. Why I got this particular error message, Lord knows.