Tag archive: unix

In a Python script, how do I set PYTHONPATH?

Question: In a Python script, how do I set PYTHONPATH?

I know how to set it in my /etc/profile and in my environment variables.

But what if I want to set it during a script? Is it import os, sys? How do I do it?


Answer 0

You don’t set PYTHONPATH, you add entries to sys.path. It’s a list of directories that should be searched for Python packages, so you can just append your directories to that list.

import sys
sys.path.append('/path/to/whatever')

In fact, sys.path is initialized in part by splitting the value of PYTHONPATH on the path separator character (: on Linux-like systems, ; on Windows).

You can also add directories using site.addsitedir, and that method will also take into account .pth files existing within the directories you pass. (That would not be the case with directories you specify in PYTHONPATH.)
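As a rough illustration of the append approach, here is a minimal sketch. The temporary directory and the module name hello_pathdemo are made up for the demo; in practice you would use your own directory.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical directory holding a module that is not installed anywhere.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "hello_pathdemo.py"), "w") as fh:
    fh.write("GREETING = 'hi'\n")

# Only append when the entry is missing, so repeated runs add no duplicates.
if module_dir not in sys.path:
    sys.path.append(module_dir)

# The file was created after interpreter startup, so refresh the finder caches.
importlib.invalidate_caches()
hello_pathdemo = importlib.import_module("hello_pathdemo")
print(hello_pathdemo.GREETING)
```

Once the directory is on sys.path, the import works for the rest of the process, without touching the environment.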


Answer 1

You can get and set environment variables via os.environ:

import os
user_home = os.environ["HOME"]

os.environ["PYTHONPATH"] = "..."

But since your interpreter is already running, this will have no effect on the current process. You’re better off using

import sys
sys.path.append("...")

which is the list that your PYTHONPATH is transformed into on interpreter startup.
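To see the difference, here is a small sketch (the temporary directory is only a stand-in): assigning to os.environ leaves the running interpreter's sys.path alone, but a child interpreter started afterwards inherits the variable and picks it up.

```python
import os
import subprocess
import sys
import tempfile

extra = tempfile.mkdtemp()  # hypothetical directory we would like to expose

before = list(sys.path)
os.environ["PYTHONPATH"] = extra
unchanged = (sys.path == before)  # the running interpreter is not affected

# A freshly started interpreter reads PYTHONPATH while building its sys.path.
child_code = "import sys; print(sys.argv[1] in sys.path)"
result = subprocess.run(
    [sys.executable, "-c", child_code, extra],
    capture_output=True, text=True, check=True,
)
child_sees_it = result.stdout.strip() == "True"
print(unchanged, child_sees_it)
```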


Answer 2

If you call sys.path.append('dir/to/path') without checking whether the entry is already there, you can end up with a long list of duplicates in sys.path. To avoid that, I recommend this:

import sys
import os # only needed if you want os.getcwd(), the current directory

path = '/dir/path' # or os.getcwd() for the current directory
if path not in sys.path:
    sys.path.append(path)

Answer 3

PYTHONPATH ends up in sys.path, which you can modify at runtime.

import sys
sys.path += ["whatever"]

Answer 4

You can set PYTHONPATH with os.environ['PYTHONPATH'] = '/some/path'; you then need to call os.system('python') to start a new Python shell so that the newly added path takes effect.


Answer 5

On Linux this works too:

import sys
sys.path.extend(["/path/to/dotpy/file/"])

Resource u'tokenizers/punkt/english.pickle' not found

Question: Resource u'tokenizers/punkt/english.pickle' not found

My Code:

import nltk.data
tokenizer = nltk.data.load('nltk:tokenizers/punkt/english.pickle')

ERROR Message:

[ec2-user@ip-172-31-31-31 sentiment]$ python mapper_local_v1.0.py
Traceback (most recent call last):
  File "mapper_local_v1.0.py", line 16, in <module>
    tokenizer = nltk.data.load('nltk:tokenizers/punkt/english.pickle')
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 774, in load
    opened_resource = _open(resource_url)
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 888, in _open
    return find(path_, path + ['']).open()
  File "/usr/lib/python2.6/site-packages/nltk/data.py", line 618, in find
    raise LookupError(resource_not_found)
LookupError:

Resource u'tokenizers/punkt/english.pickle' not found.  Please
use the NLTK Downloader to obtain the resource:

    >>> nltk.download()

Searched in:
- '/home/ec2-user/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- u''

I’m trying to run this program on a Unix machine.

As per the error message, I logged into the Python shell on my Unix machine and ran the commands below:

import nltk
nltk.download()

I then downloaded everything available using the d (Download) and l (List) options, but the problem persists.

I tried my best to find a solution on the internet, but everything I found suggested the same steps I had already taken above.


Answer 0

To add to alvas’ answer, you can download only the punkt corpus:

nltk.download('punkt')

Downloading all sounds like overkill to me. Unless that’s what you want.


Answer 1

If you’re looking to only download the punkt model:

import nltk
nltk.download('punkt')

If you’re unsure which data/model you need, you can install the popular datasets, models and taggers from NLTK:

import nltk
nltk.download('popular')

With the above command, there is no need to use the GUI to download the datasets.


Answer 2

I found the solution:

import nltk
nltk.download()

Once the NLTK Downloader starts:

d) Download l) List u) Update c) Config h) Help q) Quit

Downloader> d

Download which package (l=list; x=cancel)? Identifier> punkt


Answer 3

From the shell you can execute:

sudo python -m nltk.downloader punkt 

If you want to install the popular NLTK corpora/models:

sudo python -m nltk.downloader popular

If you want to install all NLTK corpora/models:

sudo python -m nltk.downloader all

To list the resources you have downloaded:

python -c 'import os; import nltk; print(os.listdir(nltk.data.find("corpora")))'
python -c 'import os; import nltk; print(os.listdir(nltk.data.find("tokenizers")))'

Answer 4

import nltk
nltk.download('punkt')

Open the Python prompt and run the above statements.

The sent_tokenize function uses an instance of PunktSentenceTokenizer from the nltk.tokenize.punkt module. This instance has already been trained and works well for many European languages. So it knows what punctuation and characters mark the end of a sentence and the beginning of a new sentence.


Answer 5

The same thing happened to me recently. You just need to download the “punkt” package and it should work.

When you execute “list” (l) after having “downloaded all the available things”, is everything marked like the following line?

[*] punkt............... Punkt Tokenizer Models

If you see this line with the star, it means you have it, and nltk should be able to load it.


Answer 6

Go to the Python console by typing

$ python

in your terminal. Then type the following commands in the Python shell to install the respective packages:

>>> import nltk
>>> nltk.download('punkt')
>>> nltk.download('averaged_perceptron_tagger')

This solved the issue for me.


Answer 7

My issue was that I called nltk.download('all') as the root user, but the process that eventually used nltk was another user who didn’t have access to /root/nltk_data where the content was downloaded.

So I simply recursively copied everything from the download location to one of the paths where NLTK was looking to find it like this:

cp -R /root/nltk_data/ /home/ubuntu/nltk_data
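Before copying anything around, it can help to check which of the directories from the "Searched in" list actually hold the punkt data for the current user. The sketch below uses only the standard library and deliberately does not call NLTK. The helper name find_punkt and the demo directories are made up for illustration, and whether a zip on its own is enough depends on the NLTK version, which is not checked here.

```python
import os
import tempfile


def find_punkt(search_dirs):
    """Report, per directory, whether tokenizers/punkt is unpacked, zipped or missing."""
    report = []
    for base in search_dirs:
        tokenizers = os.path.join(base, "tokenizers")
        if os.path.isdir(os.path.join(tokenizers, "punkt")):
            state = "unpacked"
        elif os.path.isfile(os.path.join(tokenizers, "punkt.zip")):
            state = "zip only"
        else:
            state = "missing"
        report.append((base, state))
    return report


# Demo against throwaway directories; in practice pass the paths from the
# error message, e.g. '/home/ec2-user/nltk_data' and '/usr/share/nltk_data'.
root = tempfile.mkdtemp()
good = os.path.join(root, "good_nltk_data")
empty = os.path.join(root, "empty_nltk_data")
os.makedirs(os.path.join(good, "tokenizers", "punkt"))
os.makedirs(empty)

report = find_punkt([good, empty])
for directory, state in report:
    print(directory, state)
```

Running it as the same user that runs the failing script matters, since a directory such as /root/nltk_data is invisible to other users.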

Answer 8

  1. Execute the following code:

    import nltk
    nltk.download()

  2. After this, the NLTK downloader window will pop up.

  3. Select All packages.
  4. Download punkt.

Answer 9

I was still getting the error despite running the following:

import nltk
nltk.download()

On Google Colab, however, this solved my issue:

   !python3 -c "import nltk; nltk.download('all')"

Answer 10

A simple nltk.download() will not solve this issue. I tried the following and it worked for me:

In the nltk folder, create a tokenizers folder and copy your punkt folder into the tokenizers folder.

This will work. The folder structure must be as shown in the picture.


Answer 11

You need to rearrange your folders: move your tokenizers folder into the nltk_data folder. It doesn’t work if your nltk_data folder contains a corpora folder that in turn contains the tokenizers folder.


Answer 12

For me none of the above worked, so I just downloaded all the files by hand from the web site http://www.nltk.org/nltk_data/ and put them, also by hand, in a “tokenizers” folder inside the “nltk_data” folder. Not a pretty solution but still a solution.


Answer 13

After adding this line of code, the issue will be fixed:

nltk.download('punkt')

Answer 14

I faced the same issue. After downloading everything, the ‘punkt’ error was still there. I searched for the package on my Windows machine at C:\Users\vaibhav\AppData\Roaming\nltk_data\tokenizers and could see ‘punkt.zip’ there. I realized that somehow the zip had not been extracted into C:\Users\vaibhav\AppData\Roaming\nltk_data\tokenizers\punkt. Once I extracted the zip, it worked like music.
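The manual extraction can also be scripted. This is a minimal sketch using the standard zipfile module; the helper name extract_if_needed is made up, the demo builds a placeholder archive instead of touching a real nltk_data folder, and it assumes the zip contains a top-level punkt/ folder.

```python
import os
import tempfile
import zipfile


def extract_if_needed(zip_path):
    """Extract zip_path next to itself unless the target folder already exists."""
    target_parent = os.path.dirname(zip_path)
    target = os.path.splitext(zip_path)[0]  # .../tokenizers/punkt
    if not os.path.isdir(target):
        with zipfile.ZipFile(zip_path) as archive:
            # Assumption: the archive has a top-level "punkt/" folder.
            archive.extractall(target_parent)
    return target


# Demo with a placeholder archive in a throwaway tokenizers folder.
tokenizers = os.path.join(tempfile.mkdtemp(), "tokenizers")
os.makedirs(tokenizers)
demo_zip = os.path.join(tokenizers, "punkt.zip")
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("punkt/english.pickle", b"placeholder")

punkt_dir = extract_if_needed(demo_zip)
print(os.listdir(punkt_dir))
```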


Answer 15

Just make sure you are using Jupyter Notebook, and in a notebook do the following:

import nltk

nltk.download()

A popup window will then appear (showing info from https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml). From that window, download everything.

Then rerun your code.


Answer 16

For me it got solved by using “nltk:”

http://www.nltk.org/howto/data.html

Failed loading english.pickle with nltk.data.load

sent_tokenizer=nltk.data.load('nltk:tokenizers/punkt/english.pickle')

How do I set the PYTHONPATH in an already-created virtualenv?

Question: How do I set the PYTHONPATH in an already-created virtualenv?

What file do I edit, and how? I created a virtual environment.


Answer 0

EDIT #2

The right answer is @arogachev’s one.


If you want to change the PYTHONPATH used in a virtualenv, you can add the following line to your virtualenv’s bin/activate file:

export PYTHONPATH="/the/path/you/want"

This way, the new PYTHONPATH will be set each time you use this virtualenv.

EDIT: (to answer @RamRachum’s comment)

To have it restored to its original value on deactivate, you could add

export OLD_PYTHONPATH="$PYTHONPATH"

before the previously mentioned line, and add the following line to your bin/postdeactivate script.

export PYTHONPATH="$OLD_PYTHONPATH"

Answer 1

The comment by @s29 should be an answer:

One way to add a directory to the virtual environment is to install virtualenvwrapper (which is useful for many things) and then do

mkvirtualenv myenv
workon myenv
add2virtualenv . #for current directory
add2virtualenv ~/my/path

If you want to remove these paths, edit the file myenvhomedir/lib/python2.7/site-packages/_virtualenv_path_extensions.pth

Documentation on virtualenvwrapper can be found at http://virtualenvwrapper.readthedocs.org/en/latest/

Specific documentation on this feature can be found at http://virtualenvwrapper.readthedocs.org/en/latest/command_ref.html?highlight=add2virtualenv


Answer 2

You can create a .pth file that contains the directory to search for, and place it in the site-packages directory. E.g.:

cd $(python -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())")
echo /some/library/path > some-library.pth

The effect is the same as adding /some/library/path to sys.path, and it remains local to the virtualenv setup.
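The same idea can be sketched from Python. The helper write_pth is hypothetical, and the demo writes into a throwaway directory rather than the real site-packages. sysconfig.get_paths()["purelib"] is used here in place of the distutils call above, since distutils is no longer shipped with Python 3.12 and later.

```python
import os
import site
import sys
import sysconfig
import tempfile


def write_pth(site_packages, name, directories):
    """Write <name>.pth into site_packages, one absolute directory per line."""
    pth_path = os.path.join(site_packages, name + ".pth")
    with open(pth_path, "w") as fh:
        for directory in directories:
            fh.write(os.path.abspath(directory) + "\n")
    return pth_path


# Where the active interpreter (or virtualenv) keeps its site-packages.
real_site_packages = sysconfig.get_paths()["purelib"]

# Demo against throwaway directories instead of the real site-packages.
fake_site_packages = tempfile.mkdtemp()
library_dir = tempfile.mkdtemp()  # stands in for /some/library/path
pth_file = write_pth(fake_site_packages, "some-library", [library_dir])

# site.addsitedir processes .pth files in the directory it is given.
site.addsitedir(fake_site_packages)
print(library_dir in sys.path)
```

To apply it for real you would pass real_site_packages as the first argument while the virtualenv is active.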


Answer 3

  1. Activate your virtualenv:

cd venv
source bin/activate

  2. Set or change your Python path by entering the following command:

export PYTHONPATH='/home/django/srmvenv/lib/python3.4'

  3. To check the Python path, start python and enter:

python
>>> import sys
>>> sys.path


Answer 4

I modified my activate script to source the file .virtualenvrc, if it exists in the current directory, and to save/restore PYTHONPATH on activate/deactivate.

You can find the patched activate script here. It’s a drop-in replacement for the activate script created by virtualenv 1.11.6.

Then I added something like this to my .virtualenvrc:

export PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}/some/library/path"
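The ${PYTHONPATH:+$PYTHONPATH:} expansion only emits the old value plus a separator when the variable is already set and non-empty, which avoids a stray leading separator. A rough Python equivalent is sketched below. The function name is made up, and unlike the shell one-liner it also drops empty entries and skips a directory that is already listed.

```python
import os


def append_to_pythonpath(current, new_dir):
    """Return a PYTHONPATH value with new_dir added after the existing entries."""
    # Splitting an empty or missing value yields no entries, so no stray separator.
    parts = [p for p in (current or "").split(os.pathsep) if p]
    if new_dir not in parts:
        parts.append(new_dir)
    return os.pathsep.join(parts)


new_value = append_to_pythonpath(os.environ.get("PYTHONPATH"), "/some/library/path")
print(new_value)
```

The result could then be passed as the PYTHONPATH of a child process, for example through the env argument of subprocess.run.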

Answer 5

It’s already answered here -> Is my virtual environment (python) causing my PYTHONPATH to break?

UNIX/LINUX

Add “export PYTHONPATH=/usr/local/lib/python2.0” to your ~/.bashrc file and source it by typing “source ~/.bashrc” or “. ~/.bashrc”.

WINDOWS XP

1) Go to the Control Panel
2) Double-click System
3) Go to the Advanced tab
4) Click on Environment Variables

In the System Variables window, check if you have a variable named PYTHONPATH. If you have one already, check that it points to the right directories. If you don’t have one already, click the New button and create it.

PYTHON CODE

Alternatively, you can add the following to your code:

import sys
sys.path.append("/home/me/mypy")

What is the reason for performing a double fork when creating a daemon?

Question: What is the reason for performing a double fork when creating a daemon?

I’m trying to create a daemon in python. I’ve found the following question, which has some good resources in it which I am currently following, but I’m curious as to why a double fork is necessary. I’ve scratched around google and found plenty of resources declaring that one is necessary, but not why.

Some mention that it is to prevent the daemon from acquiring a controlling terminal. How would it do this without the second fork? What are the repercussions?


Answer 0

Looking at the code referenced in the question, the justification is:

Fork a second child and exit immediately to prevent zombies. This causes the second child process to be orphaned, making the init process responsible for its cleanup. And, since the first child is a session leader without a controlling terminal, it’s possible for it to acquire one by opening a terminal in the future (System V-based systems). This second fork guarantees that the child is no longer a session leader, preventing the daemon from ever acquiring a controlling terminal.

So it is to ensure that the daemon is re-parented onto init (just in case the process kicking off the daemon is long lived), and removes any chance of the daemon reacquiring a controlling tty. So if neither of these cases applies, then one fork should be sufficient. “Unix Network Programming – Stevens” has a good section on this.


Answer 1

I was trying to understand the double fork and stumbled upon this question. After a lot of research this is what I figured out. Hopefully it will help clarify things for anyone who has the same question.

In Unix every process belongs to a group, which in turn belongs to a session. Here is the hierarchy…

Session (SID) → Process Group (PGID) → Process (PID)

The first process in the process group becomes the process group leader, and the first process in the session becomes the session leader. Every session can have one TTY associated with it. Only a session leader can take control of a TTY. For a process to be truly daemonized (run in the background) we should ensure that the session leader is killed so that there is no possibility of the session ever taking control of the TTY.

I ran Sander Marechal’s python example daemon program from this site on my Ubuntu. Here are the results with my comments.

1. `Parent`    = PID: 28084, PGID: 28084, SID: 28046
2. `Fork#1`    = PID: 28085, PGID: 28084, SID: 28046
3. `Decouple#1`= PID: 28085, PGID: 28085, SID: 28085
4. `Fork#2`    = PID: 28086, PGID: 28085, SID: 28085

Note that the process is the session leader after Decouple#1, because its PID = SID. It could still take control of a TTY.

Note that Fork#2 is no longer the session leader: PID != SID. This process can never take control of a TTY. Truly daemonized.

I personally find the terminology fork-twice confusing. A better idiom might be fork-decouple-fork.


Answer 2

Strictly speaking, the double-fork has nothing to do with re-parenting the daemon as a child of init. All that is necessary to re-parent the child is that the parent must exit. This can be done with only a single fork. Also, doing a double-fork by itself doesn’t re-parent the daemon process to init; the daemon’s parent must exit. In other words, the parent always exits when forking a proper daemon so that the daemon process is re-parented to init.

So why the double fork? POSIX.1-2008 Section 11.1.3, “The Controlling Terminal”, has the answer (emphasis added):

The controlling terminal for a session is allocated by the session leader in an implementation-defined manner. If a session leader has no controlling terminal, and opens a terminal device file that is not already associated with a session without using the O_NOCTTY option (see open()), it is implementation-defined whether the terminal becomes the controlling terminal of the session leader. If a process which is not a session leader opens a terminal file, or the O_NOCTTY option is used on open(), then that terminal shall not become the controlling terminal of the calling process.

This tells us that if a daemon process does something like this …

int fd = open("/dev/console", O_RDWR);

… then the daemon process might acquire /dev/console as its controlling terminal, depending on whether the daemon process is a session leader, and depending on the system implementation. The program can guarantee that the above call will not acquire a controlling terminal if the program first ensures that it is not a session leader.

Normally, when launching a daemon, setsid is called (from the child process after calling fork) to dissociate the daemon from its controlling terminal. However, calling setsid also means that the calling process will be the session leader of the new session, which leaves open the possibility that the daemon could reacquire a controlling terminal. The double-fork technique ensures that the daemon process is not the session leader, which then guarantees that a call to open, as in the example above, will not result in the daemon process reacquiring a controlling terminal.

The double-fork technique is a bit paranoid. It may not be necessary if you know that the daemon will never open a terminal device file. Also, on some systems it may not be necessary even if the daemon does open a terminal device file, since that behavior is implementation-defined. However, one thing that is not implementation-defined is that only a session leader can allocate the controlling terminal. If a process isn’t a session leader, it can’t allocate a controlling terminal. Therefore, if you want to be paranoid and be certain that the daemon process cannot inadvertently acquire a controlling terminal, regardless of any implementation-defined specifics, then the double-fork technique is essential.
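For completeness, the other escape hatch named in the POSIX text, O_NOCTTY, is available from Python as os.O_NOCTTY on POSIX systems. A minimal sketch follows; the helper name is made up, and os.devnull is used only as a harmless stand-in for a terminal device such as /dev/console.

```python
import os


def open_without_ctty(path):
    """Open path read/write, asking the OS not to make it the controlling terminal."""
    # O_NOCTTY is missing on some platforms (e.g. Windows); fall back to 0 there.
    flags = os.O_RDWR | getattr(os, "O_NOCTTY", 0)
    return os.open(path, flags)


fd = open_without_ctty(os.devnull)  # stand-in for a terminal device file
print(fd >= 0)
os.close(fd)
```

This only protects the open calls you control; the second fork covers opens made by library code as well.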


Answer 3

Taken from Bad CTK:

“On some flavors of Unix, you are forced to do a double-fork on startup, in order to go into daemon mode. This is because single forking isn’t guaranteed to detach from the controlling terminal.”


Answer 4

According to “Advanced Programming in the Unix Environment”, by Stevens and Rago, the second fork is more of a recommendation, and it is done to guarantee that the daemon does not acquire a controlling terminal on System V-based systems.


Answer 5

One reason is that the parent process can immediately waitpid() for the child and then forget about it. When the grandchild later dies, its parent is init, which will wait() for it and take it out of the zombie state.

The result is that the parent process doesn’t need to be aware of the forked children, and it also makes it possible to fork long-running processes from libs etc.


Answer 6

The daemon() call has the parent call _exit() if it succeeds. The original motivation may have been to allow the parent to do some extra work while the child is daemonizing.

It may also be based on a mistaken belief that it’s necessary in order to ensure the daemon has no parent process and is reparented to init – but this will happen anyway once the parent dies in the single fork case.

So I suppose it all just boils down to tradition in the end – a single fork is sufficient as long as the parent dies in short order anyway.


Answer 7

A decent discussion of it appears to be at http://www.developerweb.net/forum/showthread.php?t=3025

Quoting mlampkin from there:

…think of the setsid( ) call as the “new” way to do thing (disassociate from the terminal) and the [second] fork( ) call after it as redundancy to deal with the SVr4…


Answer 8

It might be easier to understand in this way:

  • The first fork and setsid will create a new session (but the process ID == session ID).
  • The second fork makes sure the process ID != session ID.
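The two points above can be checked with a short POSIX-only sketch. The function name is made up, and this is not a complete daemonizer: a real one would typically also change directory, reset the umask and redirect the standard streams. The grandchild reports its IDs back through a pipe so the original process can inspect them.

```python
import os


def double_fork_report():
    """Double-fork and return the (pid, pgid, sid) reported by the grandchild."""
    read_fd, write_fd = os.pipe()

    pid = os.fork()  # first fork
    if pid > 0:
        # Original process: reap the short-lived first child, then read the report.
        os.close(write_fd)
        os.waitpid(pid, 0)
        with os.fdopen(read_fd) as reader:
            data = reader.read()
        return tuple(int(x) for x in data.split())

    try:
        # First child: become leader of a new session (process ID == session ID).
        os.close(read_fd)
        os.setsid()
        if os.fork() > 0:  # second fork
            os._exit(0)    # the session leader exits right away
        # Grandchild: lives in the new session but is not its leader.
        report = "%d %d %d" % (os.getpid(), os.getpgrp(), os.getsid(0))
        os.write(write_fd, report.encode())
    finally:
        os._exit(0)


pid, pgid, sid = double_fork_report()
print("PID: %d, PGID: %d, SID: %d" % (pid, pgid, sid))
```

The printed line should have the same shape as the Fork#2 row in Answer 1: the session ID belongs to the first child, so the grandchild's process ID differs from it.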

How do I run a Python script from the command line without cd-ing into its directory? Is it PYTHONPATH?

Question: How do I run a Python script from the command line without cd-ing into its directory? Is it PYTHONPATH?

How can I make any use of PYTHONPATH? When I try to run a script in the path the file is not found. When I cd to the directory holding the script the script runs. So what good is the PYTHONPATH?

$ echo $PYTHONPATH
:/home/randy/lib/python

$ tree -L 1 '/home/randy/lib/python' 
/home/randy/lib/python
├── gbmx_html.py
├── gbmx.py
├── __init__.py
├── __pycache__
├── scripts
└── yesno.py

$ python gbmx.py -h
python: can't open file 'gbmx.py': [Errno 2] No such file or directory

$ cd '/home/randy/lib/python'

After cd to the file directory it runs ..

$ python gbmx.py -h
usage: gbmx.py [-h] [-b]

Why can I not make any use of the PYTHONPATH?


Answer 0

I think you’re a little confused. PYTHONPATH sets the search path for importing python modules, not for executing them like you’re trying.

PYTHONPATH Augment the default search path for module files. The format is the same as the shell’s PATH: one or more directory pathnames separated by os.pathsep (e.g. colons on Unix or semicolons on Windows). Non-existent directories are silently ignored.

In addition to normal directories, individual PYTHONPATH entries may refer to zipfiles containing pure Python modules (in either source or compiled form). Extension modules cannot be imported from zipfiles.

The default search path is installation dependent, but generally begins with prefix/lib/pythonversion (see PYTHONHOME above). It is always appended to PYTHONPATH.

An additional directory will be inserted in the search path in front of PYTHONPATH as described above under Interface options. The search path can be manipulated from within a Python program as the variable sys.path.

http://docs.python.org/2/using/cmdline.html#envvar-PYTHONPATH

What you’re looking for is PATH.

export PATH=$PATH:/home/randy/lib/python 

However, to run your python script as a program, you also need to set a shebang for Python in the first line. Something like this should work:

#!/usr/bin/env python

And give execution privileges to it:

chmod +x /home/randy/lib/python/gbmx.py

Then you should be able to simply run gbmx.py from anywhere.
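To make the import-versus-execute distinction concrete, here is a small sketch using throwaway directories and a made-up module name, demo_tool. With PYTHONPATH pointing at the library directory, running the file by name from another directory still fails, while running it as a module with -m succeeds, because -m looks the name up on sys.path.

```python
import os
import subprocess
import sys
import tempfile

lib_dir = tempfile.mkdtemp()    # stands in for /home/randy/lib/python
other_dir = tempfile.mkdtemp()  # some unrelated working directory
with open(os.path.join(lib_dir, "demo_tool.py"), "w") as fh:
    fh.write("print('demo_tool ran')\n")

env = dict(os.environ, PYTHONPATH=lib_dir)


def run(args):
    """Run the current interpreter with args from other_dir, PYTHONPATH set."""
    return subprocess.run([sys.executable] + args, cwd=other_dir, env=env,
                          capture_output=True, text=True)


by_filename = run(["demo_tool.py"])   # a file path: PYTHONPATH is not consulted
by_module = run(["-m", "demo_tool"])  # a module name: resolved through sys.path

print(by_filename.returncode, by_module.returncode, by_module.stdout.strip())
```

So python -m is one way to use PYTHONPATH for running code from anywhere; putting the script directory on PATH, as described above, is the other.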


Answer 1

You’re confusing PATH and PYTHONPATH. You need to do this:

export PATH=$PATH:/home/randy/lib/python 

PYTHONPATH is used by the python interpreter to determine which modules to load.

PATH is used by the shell to determine which executables to run.
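As a rough illustration of the two lookups, the sketch below uses shutil.which, which searches PATH much like the shell does, and importlib.util.find_spec, which searches sys.path like an import statement. The helper names are hypothetical, and find_module_file is only meant for simple top-level module names.

```python
import importlib.util
import shutil


def find_executable(name):
    """Return the full path of an executable found via PATH, or None."""
    return shutil.which(name)


def find_module_file(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None
```

A script can be importable without being on PATH, and the reverse, because the two functions consult different search lists.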


回答 2

PYTHONPATH仅影响import语句,而不影响顶级Python解释器对作为参数给出的python文件的查找。

需要PYTHONPATH设置不是一个好主意-就像任何依赖于环境变量的东西一样,在不同的机器上一致地复制东西变得棘手。更好的方法是使用可以在Python已经知道的系统相关路径中安装(使用’pip’或distutils)的Python’packages’。

阅读 https://the-hitchhikers-guide-to-packaging.readthedocs.org/en/latest/ (“The Hitchhiker’s Guide to Packaging”),以及 http://docs.python.org/3/tutorial/modules.html ,后者在较低层面解释了PYTHONPATH和软件包。

PYTHONPATH only affects import statements, not the top-level Python interpreter’s lookup of python files given as arguments.

Needing PYTHONPATH to be set is not a great idea – as with anything dependent on environment variables, replicating things consistently across different machines gets tricky. Better is to use Python ‘packages’ which can be installed (using ‘pip’, or distutils) in system-dependent paths which Python already knows about.

Have a read of https://the-hitchhikers-guide-to-packaging.readthedocs.org/en/latest/ – ‘The Hitchhiker’s Guide to Packaging’, and also http://docs.python.org/3/tutorial/modules.html – which explains PYTHONPATH and packages at a lower level.


回答 3

我认为您介于PATH和PYTHONPATH之间。运行“脚本”所需要做的就是将其父目录附加到PATH变量中。您可以通过运行来测试

which myscript.py

另外,如果myscripy.py依赖定制模块,则它们的父目录也必须添加到PYTHONPATH变量中。不幸的是,由于python的设计者显然是在使用药物,因此使用以下代码在repl中测试您的导入内容并不能保证您的PYTHONPATH设置正确,可以在脚本中使用。python编程的这一部分是神奇的,不能在stackoverflow上适当地回答。

$python
Python 2.7.8 blahblahblah
...
>from mymodule.submodule import ClassName
>test = ClassName()
>^D
$myscript_that_needs_mymodule.submodule.py
Traceback (most recent call last):
  File "myscript_that_needs_mymodule.submodule.py", line 5, in <module>
    from mymodule.submodule import ClassName
  File "/path/to/myscript_that_needs_mymodule.submodule.py", line 5, in <module>
    from mymodule.submodule import ClassName
ImportError: No module named submodule

I think you’re mixing up PATH and PYTHONPATH. All you have to do to run a ‘script’ is have its parent directory appended to your PATH variable. You can test this by running

which myscript.py

Also, if myscript.py depends on custom modules, their parent directories must also be added to the PYTHONPATH variable. Unfortunately, because the designers of Python were clearly on drugs, testing your imports in the REPL with the following will not guarantee that your PYTHONPATH is set properly for use in a script. This part of Python programming is magic and can’t be answered appropriately on Stack Overflow.

$python
Python 2.7.8 blahblahblah
...
>from mymodule.submodule import ClassName
>test = ClassName()
>^D
$myscript_that_needs_mymodule.submodule.py
Traceback (most recent call last):
  File "myscript_that_needs_mymodule.submodule.py", line 5, in <module>
    from mymodule.submodule import ClassName
  File "/path/to/myscript_that_needs_mymodule.submodule.py", line 5, in <module>
    from mymodule.submodule import ClassName
ImportError: No module named submodule
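One possible reason for a difference like the one in the transcript above (an assumption, not something the answer confirms) is that the first entry of sys.path differs: in the REPL it refers to the current directory, while for a script it is the directory containing the script. The hypothetical helpers below write a tiny probe script and report what a child interpreter sees as sys.path[0].

```python
import os
import subprocess
import sys


def write_probe_script(directory):
    """Create a small script in `directory` that prints its own sys.path[0]."""
    probe = os.path.join(directory, "probe_sys_path.py")
    with open(probe, "w") as handle:
        handle.write("import sys\nprint(sys.path[0])\n")
    return probe


def first_path_entry_for_script(script_path):
    """Run the script in a child interpreter and return the sys.path[0] it printed."""
    result = subprocess.run(
        [sys.executable, script_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

If a file in the script's directory has the same name as the package being imported, it would be found first and could produce an ImportError of this kind.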

回答 4

如示例中那样设置PYTHONPATH,您应该可以

python -m gmbx

-m选项将使Python在路径中搜索模块,而Python通常会在其中搜索模块,包括您添加到PYTHONPATH中的模块。当您运行像这样的解释器时python gmbx.py,它将查找特定文件,而PYTHONPATH不适用。

With PYTHONPATH set as in your example, you should be able to do

python -m gmbx

The -m option makes Python search for your module in the paths it usually searches for modules, including whatever you added to PYTHONPATH. When you run the interpreter as python gmbx.py, it looks for that particular file, and PYTHONPATH does not apply.
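The standard library's runpy module is documented as the machinery behind the -m switch, so the lookup can be sketched from inside Python. The helper name run_as_module is hypothetical, and temporarily inserting the directory into sys.path stands in for setting PYTHONPATH before startup.

```python
import runpy
import sys


def run_as_module(directory, module_name):
    """Run `module_name` found in `directory` roughly the way `python -m` would."""
    # Stand-in for PYTHONPATH: make the directory visible to the import system.
    sys.path.insert(0, directory)
    try:
        # Locates the module via sys.path and executes it as __main__.
        return runpy.run_module(module_name, run_name="__main__")
    finally:
        # Leave sys.path as we found it.
        sys.path.remove(directory)
```

The function returns the globals of the executed module, which makes it easy to inspect what the module defined.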


如何在Linux和Windows中的Python中使用“ /”(目录分隔符)?

问题:如何在Linux和Windows中的Python中使用“ /”(目录分隔符)?

我已经在python中编写了一个代码,该代码使用/在文件夹中创建特定文件,如果我想在Windows中使用该代码将无法正常工作,有没有一种方法可以在Windows和Linux中使用该代码。

在python中,我使用以下代码:

pathfile=os.path.dirname(templateFile)
rootTree.write(''+pathfile+'/output/log.txt')

当我在Windows计算机中使用我的代码时,我的代码将无法工作。

在Linux和Windows中如何使用“ /”(目录分隔符)?

I have written code in Python that uses / to create a particular file in a folder. It will not work if I run it on Windows. Is there a way to make the same code work on both Windows and Linux?

In python I am using this code:

pathfile=os.path.dirname(templateFile)
rootTree.write(''+pathfile+'/output/log.txt')

When I run this code on, say, a Windows machine, it will not work.

How do I use “/” (directory separator) in both Linux and Windows?


回答 0

使用os.path.join()。范例:os.path.join(pathfile,"output","log.txt")

在您的代码中将是: rootTree.write(os.path.join(pathfile,"output","log.txt"))

Use os.path.join(). Example: os.path.join(pathfile,"output","log.txt").

In your code that would be: rootTree.write(os.path.join(pathfile,"output","log.txt"))
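A minimal, self-contained sketch of that suggestion follows. The function name build_log_path is hypothetical, and template_file stands in for the asker's templateFile variable.

```python
import os


def build_log_path(template_file):
    """Return <directory of template_file>/output/log.txt using the OS separator."""
    pathfile = os.path.dirname(template_file)
    # os.path.join inserts the separator that is correct for the running OS.
    return os.path.join(pathfile, "output", "log.txt")
```

The result could then be passed to rootTree.write(...) as in the question.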


回答 1

用:

import os
print(os.sep)  # Python 3 syntax; prints the separator used by the current OS

查看分隔符在当前操作系统上的外观。
在您的代码中,您可以使用:

import os
path = os.path.join('folder_name', 'file_name')

Use:

import os
print(os.sep)  # Python 3 syntax; prints the separator used by the current OS

to see what the separator looks like on the current OS.
In your code you can use:

import os
path = os.path.join('folder_name', 'file_name')

回答 2

您可以使用os.sep

>>> import os
>>> os.sep
'/'

You can use os.sep:

>>> import os
>>> os.sep
'/'

回答 3

还应提及os.path.normpath(pathname),因为它会在Windows上将/路径分隔符转换为\分隔符。它还会折叠冗余的上级目录引用,即A/B、A/foo/../B和A/./B都会变成A/B。如果您使用的是Windows,那么它们都会变成A\B。

os.path.normpath(pathname) should also be mentioned, as it converts / path separators into \ separators on Windows. It also collapses redundant uplevel references, i.e., A/B, A/foo/../B and A/./B all become A/B. And if you are on Windows, these all become A\B.
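To see both behaviours on any machine, this sketch calls the two platform-specific implementations that os.path is an alias for: posixpath on Unix-like systems and ntpath on Windows. The helper name normalize_both is hypothetical.

```python
import ntpath
import posixpath


def normalize_both(pathname):
    """Return (POSIX result, Windows result) of normpath for the same input."""
    return posixpath.normpath(pathname), ntpath.normpath(pathname)


for example in ("A/B", "A/foo/../B", "A/./B"):
    # Each example collapses to the same path in both flavours.
    print(example, "->", normalize_both(example))
```

In ordinary code you would simply call os.path.normpath and let Python pick the implementation for the running platform.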


回答 4

如果您有幸能够运行Python 3.4+,则可以使用pathlib

from pathlib import Path

path = Path(dir, subdir, filename)  # returns a path of the system's path flavour

或者,等效地,

path = Path(dir) / subdir / filename

If you are fortunate enough to be running Python 3.4+, you can use pathlib:

from pathlib import Path

path = Path(dir, subdir, filename)  # returns a path of the system's path flavour

or, equivalently,

path = Path(dir) / subdir / filename
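The "path flavour" mentioned in the comment can be made visible with pathlib's pure path classes, which can be used on any OS. This is only an illustration; the helper name and the directory and file names are made up.

```python
from pathlib import PurePosixPath, PureWindowsPath


def join_in_both_flavours(directory, subdir, filename):
    """Join the same components as a POSIX path and as a Windows path."""
    posix = PurePosixPath(directory) / subdir / filename
    windows = PureWindowsPath(directory) / subdir / filename
    return str(posix), str(windows)
```

Path itself resolves to the flavour of the machine the code runs on, so application code normally needs only Path.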

回答 5

一些有用的链接将帮助您:

Some useful links that will help you:


回答 6

做一个import os然后使用os.sep

Do an import os and then use os.sep


回答 7

您可以使用os.sep:

 import os
 pathfile=os.path.dirname(templateFile)
 directory = str(pathfile)+os.sep+'output'+os.sep+'log.txt'
 rootTree.write(directory)

You can use os.sep:

 import os
 pathfile=os.path.dirname(templateFile)
 directory = str(pathfile)+os.sep+'output'+os.sep+'log.txt'
 rootTree.write(directory)

回答 8

不要自行建立目录和文件名,请使用python随附的库。

在这种情况下,相关的是os.path。特别是join,它从目录和文件名或目录创建一个新的路径名,然后从完整路径中获取文件名。

你的例子是

import os

pathfile = os.path.dirname(templateFile)
p = os.path.join(pathfile, 'output')
p = os.path.join(p, 'log.txt')
rootTree.write(p)

Don’t build directory and file names yourself; use Python’s included libraries.

In this case the relevant one is os.path, especially join, which creates a new pathname from a directory and a file name or directory, and split, which gets the filename from a full path.

Your example would be

import os

pathfile = os.path.dirname(templateFile)
p = os.path.join(pathfile, 'output')
p = os.path.join(p, 'log.txt')
rootTree.write(p)