Import local function from a module housed in another directory with relative imports in Jupyter Notebook using Python 3

Question

I have a directory structure similar to the following:

meta_project
    project1
        __init__.py
        lib
            module.py
            __init__.py
    notebook_folder
        notebook.ipynb

When working in notebook.ipynb, if I try to use a relative import to access a function function() in module.py with:

from ..project1.lib.module import function

I get the following error:

SystemError                               Traceback (most recent call last)
<ipython-input-7-6393744d93ab> in <module>()
----> 1 from ..project1.lib.module import function

SystemError: Parent module '' not loaded, cannot perform relative import

Is there any way to get this to work using relative imports?

Note, the notebook server is instantiated at the level of the meta_project directory, so it should have access to the information in those files.

Note also that, at least as originally intended, project1 wasn’t thought of as a module and therefore does not have an __init__.py file; it was just meant as a file-system directory. If the solution to the problem requires treating it as a module and including an __init__.py file (even a blank one), that is fine, but doing so is not enough to solve the problem.

I share this directory between machines, and relative imports allow me to use the same code everywhere. I also often use notebooks for quick prototyping, so suggestions that involve hacking together absolute paths are unlikely to be helpful.


Edit: This is unlike "Relative imports in Python 3", which talks about relative imports in Python 3 in general and, in particular, running a script from within a package directory. This question is about working within a Jupyter notebook and trying to call a function in a local module in another directory, which has both different general and particular aspects.


Answer 0

I had almost the same example as you in this notebook where I wanted to illustrate the usage of an adjacent module’s function in a DRY manner.

My solution was to tell Python of that additional module import path by adding a snippet like this one to the notebook:

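import os
import sys

# '..' is resolved against the notebook's working directory,
# so this is the parent folder (meta_project in the question's layout)
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)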

This allows you to import the desired function from the module hierarchy:

from project1.lib.module import function
# use the function normally
function(...)

Note that it is necessary to add empty __init__.py files to project1/ and lib/ folders if you don’t have them already.
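
As a rough sketch, the snippet above can be wrapped in a small helper so that the number of levels to climb is explicit. The name add_ancestor_to_sys_path and its parameters are made up for illustration and are not part of any library; the helper also assumes the notebook's working directory is the folder the notebook lives in, which is worth checking with os.getcwd().

```python
import os
import sys


def add_ancestor_to_sys_path(levels_up=1, start=None):
    """Append the directory `levels_up` levels above `start` to sys.path.

    `start` defaults to the current working directory. Returns the
    absolute path that was (or already had been) added.
    """
    start = os.getcwd() if start is None else start
    # Join `start` with one '..' per level, then normalise to an absolute path.
    target = os.path.abspath(os.path.join(start, *[os.pardir] * levels_up))
    if target not in sys.path:
        sys.path.append(target)
    return target
```

From notebook_folder/notebook.ipynb, calling add_ancestor_to_sys_path(1) should then make `from project1.lib.module import function` resolvable, under the same assumptions as the original snippet.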


Answer 1

Came here searching for best practices in abstracting code to submodules when working in Notebooks. I’m not sure that there is a best practice. I have been proposing this.

A project hierarchy as such:

├── ipynb
│   ├── 20170609-Examine_Database_Requirements.ipynb
│   └── 20170609-Initial_Database_Connection.ipynb
└── lib
    ├── __init__.py
    └── postgres.py

And from 20170609-Initial_Database_Connection.ipynb:

    In [1]: cd ..

    In [2]: from lib.postgres import database_connection

This works because IPython's automagic feature, which is enabled by default, treats a bare cd as the %cd line magic, so no % prefix is needed. Running cd inside a %%bash cell would not have the same effect, since that only changes directory in a subshell.

Considering that 99 times out of 100 I am working in Docker using one of the Project Jupyter Docker images, the following modification is idempotent:

    In [1]: cd /home/jovyan

    In [2]: from lib.postgres import database_connection
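
One thing to watch with a bare `cd ..` is that re-running the cell climbs one more level each time. Outside Docker, a possible way to keep the step repeatable is to remember the starting directory once and always change directory relative to it. This is only a sketch: chdir_from_start and the state dict are hypothetical names, and the later import still relies on the kernel searching the current working directory, as it does in the answer above.

```python
import os


def chdir_from_start(state, relative=os.pardir):
    """Change directory relative to the directory seen on the first call.

    `state` is any dict that survives cell re-runs (for example a
    notebook-level variable). The first call records the current
    directory; later calls reuse it, so repeated calls end up in the
    same place.
    """
    start = state.setdefault("start_dir", os.getcwd())
    target = os.path.abspath(os.path.join(start, relative))
    os.chdir(target)
    return target
```

In a notebook cell this could be used as `NB_STATE = globals().get("NB_STATE", {})` followed by `chdir_from_start(NB_STATE)`, so that re-executing the cell keeps the same dict.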

Answer 2

So far, the accepted answer has worked best for me. However, my concern has always been that I might refactor the notebooks directory into subdirectories, which would require changing module_path in every notebook. I decided to add a Python file within each notebook directory to import the required modules.

Thus, having the following project structure:

project
|__notebooks
   |__explore
      |__ notebook1.ipynb
      |__ notebook2.ipynb
      |__ project_path.py
   |__ explain
       |__notebook1.ipynb
       |__project_path.py
|__lib
   |__ __init__.py
   |__ module.py

I added the file project_path.py in each notebook subdirectory (notebooks/explore and notebooks/explain). This file contains the code for relative imports (from @metakermit):

import sys
import os

module_path = os.path.abspath(os.path.join(os.pardir, os.pardir))
if module_path not in sys.path:
    sys.path.append(module_path)

This way, I just need to do the path setup within the project_path.py file, and not in the notebooks. The notebook files then just need to import project_path before importing lib. For example in 0.0-notebook.ipynb:

import project_path
import lib

The caveat here is that reversing the imports would not work, because lib is not on sys.path until project_path has run. THIS DOES NOT WORK:

import lib
import project_path

Thus care must be taken during imports.
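
Note that os.pardir in project_path.py is resolved against the current working directory, so the file assumes the notebook runs from its own folder. A possible variation, sketched below, anchors on the location of project_path.py itself instead. The helper name add_project_root and its levels_up parameter are hypothetical.

```python
import sys
from pathlib import Path


def add_project_root(helper_file, levels_up=2):
    """Append the directory `levels_up` above the helper file's folder to sys.path.

    For project/notebooks/explore/project_path.py and levels_up=2 this
    resolves to project/.
    """
    # parents[0] is the folder containing the file, parents[1] its parent, etc.
    root = str(Path(helper_file).resolve().parents[levels_up])
    if root not in sys.path:
        sys.path.append(root)
    return root


# Inside project_path.py this would be called as:
# add_project_root(__file__)
```

The import order caveat stays the same: project_path still has to be imported before lib.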


Answer 3

I have just found this pretty solution:

import sys; sys.path.insert(0, '..') # add parent folder path where lib folder is
import lib.store_load # store_load is a file on my library folder

If you just want some functions from that file:

from lib.store_load import your_function_name

With Python 3.3 or later you do not need an __init__.py file in the folder.
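
The __init__.py remark refers to implicit namespace packages (PEP 420, available since Python 3.3). The following self-contained sketch builds a throwaway folder without an __init__.py and imports from it; the folder name demo_nslib is made up so it does not clash with a real package, and the function body is a placeholder.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package directory WITHOUT an __init__.py.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "demo_nslib"))
with open(os.path.join(root, "demo_nslib", "store_load.py"), "w") as f:
    f.write("def your_function_name():\n    return 'loaded'\n")

# Same idea as sys.path.insert(0, '..'), but pointing at the temp folder.
sys.path.insert(0, root)
importlib.invalidate_caches()  # make sure the freshly created files are noticed

from demo_nslib.store_load import your_function_name

result = your_function_name()
print(result)  # prints: loaded
```

Because insert(0, ...) puts the folder at the front of sys.path, anything in it takes priority over installed packages with the same name, which is worth keeping in mind when the folder is called something generic like lib.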


Answer 4

Having researched this topic myself and read the answers, I recommend using the path.py library, since it provides a context manager for changing the current working directory.

You then have something like

import path
if path.Path('../lib').isdir():
    with path.Path('..'):
        import lib

Although, you might just omit the isdir check.

Here I’ll add print statements to make it easy to follow what’s happening:

import path
import pandas

print(path.Path.getcwd())
print(path.Path('../lib').isdir())
if path.Path('../lib').isdir():
    with path.Path('..'):
        print(path.Path.getcwd())
        import lib
        print('Success!')
print(path.Path.getcwd())

which outputs in this example (where lib is at /home/jovyan/shared/notebooks/by-team/data-vis/demos/lib):

/home/jovyan/shared/notebooks/by-team/data-vis/demos/custom-chart
/home/jovyan/shared/notebooks/by-team/data-vis/demos
/home/jovyan/shared/notebooks/by-team/data-vis/demos/custom-chart

Since the solution uses a context manager, you are guaranteed to go back to your previous working directory, no matter what state your kernel was in before the cell and no matter what exceptions are thrown by importing your library code.
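
If installing path.py is not an option, the same pattern can be sketched with the standard library alone. The name working_directory below is my own; on Python 3.11 and later, contextlib.chdir offers a similar ready-made context manager.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the working directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, so the old directory always comes back.
        os.chdir(previous)
```

It would be used the same way as above, for example `with working_directory('..'): import lib`.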


Answer 5

Here’s my 2 cents:

import sys

Add the path where the module file is located. In my case it was the desktop:

sys.path.append('/Users/John/Desktop')

Either import the whole mapping module, in which case you have to use dot notation to reach its classes, like mapping.Shipment():

import mapping  # mapping.py is the name of my module file

shipit = mapping.Shipment()  # Shipment is the name of the class I need from the mapping module

Or import the specific class from the mapping module:

from mapping import Shipment

shipit = Shipment()  # now you don't have to use dot notation


Answer 6

I have found that python-dotenv helps solve this issue pretty effectively. Your project structure ends up changing slightly, but the code in your notebook is a bit simpler and consistent across notebooks.

For your project, do a little install.

pipenv install python-dotenv

Then the project layout changes to:

├── .env (this can be empty)
├── ipynb
│   ├── 20170609-Examine_Database_Requirements.ipynb
│   └── 20170609-Initial_Database_Connection.ipynb
└── lib
    ├── __init__.py
    └── postgres.py

And finally, your import changes to:

import os
import sys

from dotenv import find_dotenv

sys.path.append(os.path.dirname(find_dotenv()))

A +1 for this package is that your notebooks can be several directories deep. python-dotenv will find the closest one in a parent directory and use it. A +2 for this approach is that jupyter will load environment variables from the .env file on startup. Double whammy.
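
To show the idea behind the upward search without the extra dependency, here is a standard-library sketch that looks for a marker file in the current directory and its ancestors. It only imitates the behaviour described above and is not python-dotenv's actual implementation; find_upwards and project_root are hypothetical names.

```python
import os
import sys


def find_upwards(marker=".env", start=None):
    """Return the nearest directory at or above `start` containing `marker`, or None."""
    current = os.path.abspath(os.getcwd() if start is None else start)
    while True:
        if os.path.exists(os.path.join(current, marker)):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            return None
        current = parent


# Only touch sys.path when a marker was actually found.
project_root = find_upwards()
if project_root is not None and project_root not in sys.path:
    sys.path.append(project_root)
```

Checking for None avoids silently adding an empty or unrelated entry to sys.path when no marker file exists.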