Question: Should import statements always be at the top of a module?
PEP 8 states:
Imports are always put at the top of the file, just after any module comments and docstrings, and before module globals and constants.
However, if the class/method/function that I am importing is only used in rare cases, surely it is more efficient to do the import when it is needed?
Isn’t this:
class SomeClass(object):
    def not_often_called(self):
        from datetime import datetime
        self.datetime = datetime.now()
more efficient than this?
from datetime import datetime
class SomeClass(object):
    def not_often_called(self):
        self.datetime = datetime.now()
Answer 0
Module importing is quite fast, but not instant. This means that:
- Putting the imports at the top of the module is fine, because it’s a trivial cost that’s only paid once.
- Putting the imports within a function will cause calls to that function to take longer.
So if you care about efficiency, put the imports at the top. Only move them into a function if your profiling shows that would help (you did profile to see where best to improve performance, right??)
The best reasons I’ve seen to perform lazy imports are:
- Optional library support. If your code has multiple paths that use different libraries, don’t break if an optional library is not installed.
- In the __init__.py of a plugin, which might be imported but not actually used. Examples are Bazaar plugins, which use bzrlib’s lazy-loading framework.
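The optional-library case can be sketched roughly as follows. This is a minimal illustration rather than code from the answer: ujson stands in for any optional third-party package, and to_json is a hypothetical helper name.

```python
import json  # standard-library fallback, always available


def to_json(obj):
    """Serialize obj to JSON, preferring an optional backend if it is installed."""
    try:
        # Assumption: ujson is just an example of an optional dependency.
        import ujson as backend
    except ImportError:
        # The optional library is missing, so fall back instead of breaking.
        backend = json
    return backend.dumps(obj)


print(to_json({"answer": 42}))
```

Because the import lives inside the function, a missing optional package only matters on the code path that would have used it.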
Answer 1
Putting the import statement inside of a function can prevent circular dependencies.
For example, if you have two modules, X.py and Y.py, and they both need to import each other, you have a circular dependency: importing one of them can fail because the other sees it only partially initialized. If you move the import statement in one of the modules into the function that needs it, then it won’t try to import the other module until the function is called, and by then that module will already be imported, so the cycle causes no trouble. Read more here: effbot.org/zone/import-confusion.htm
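A small self-contained sketch of the deferred-import fix, under the assumption of two throwaway modules (demo_x and demo_y are hypothetical names, written to a temporary directory only so the example can run on its own):

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module sources, for illustration only.
X_SOURCE = '''\
from demo_y import double_base   # top-level import of the other module

BASE = 21

def answer():
    return double_base()
'''

Y_SOURCE = '''\
def double_base():
    # Deferred import: demo_x is fully initialized by the time this runs.
    from demo_x import BASE
    return 2 * BASE
'''

workdir = tempfile.mkdtemp()
Path(workdir, "demo_x.py").write_text(X_SOURCE)
Path(workdir, "demo_y.py").write_text(Y_SOURCE)
sys.path.insert(0, workdir)

demo_x = importlib.import_module("demo_x")
result = demo_x.answer()
print(result)
```

If demo_y instead imported BASE at its top level, importing demo_x would fail, because BASE is not yet defined when demo_y starts executing.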
Answer 2
I have adopted the practice of putting all imports in the functions that use them, rather than at the top of the module.
The benefit I get is the ability to refactor more reliably. When I move a function from one module to another, I know that the function will continue to work with all of its legacy of testing intact. If I have my imports at the top of the module, when I move a function, I find that I end up spending a lot of time getting the new module’s imports complete and minimal. A refactoring IDE might make this irrelevant.
There is a speed penalty as mentioned elsewhere. I have measured this in my application and found it to be insignificant for my purposes.
It is also nice to be able to see all module dependencies up front without resorting to search (e.g. grep). However, the reason I care about module dependencies is generally because I’m installing, refactoring, or moving an entire system comprising multiple files, not just a single module. In that case, I’m going to perform a global search anyway to make sure I have the system-level dependencies. So I have not found global imports to aid my understanding of a system in practice.
I usually put the import of sys inside the if __name__=='__main__' check and then pass arguments (like sys.argv[1:]) to a main() function. This allows me to use main in a context where sys has not been imported.
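A minimal sketch of that pattern might look like this (the body of main is made up for illustration):

```python
def main(args):
    """Return a summary line for the given argument list; never touches sys."""
    return "received %d argument(s): %s" % (len(args), ", ".join(args))


if __name__ == "__main__":
    # sys is only needed when the file runs as a script.
    import sys
    print(main(sys.argv[1:]))
```

Another module or a test can then call main(["a", "b"]) directly, without sys being involved at all.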
Answer 3
Most of the time this would be useful for clarity and sensible to do but it’s not always the case. Below are a couple of examples of circumstances where module imports might live elsewhere.
Firstly, you could have a module with a unit test of the form:
if __name__ == '__main__':
    import foo
    aa = foo.xyz() # initiate something for the test
Secondly, you might have a requirement to conditionally import some different module at runtime.
if [condition]:
    import foo as plugin_api
else:
    import bar as plugin_api
xx = plugin_api.Plugin()
[...]
There are probably other situations where you might place imports in other parts in the code.
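As a runnable illustration of the conditional-import idea, here is a sketch that picks between two interchangeable standard-library modules. The platform check and the name path_api are assumptions chosen for the example, not taken from the answer:

```python
import sys

# Pick one of two interchangeable standard-library modules at runtime.
if sys.platform == "win32":
    import ntpath as path_api
else:
    import posixpath as path_api

# The rest of the code only ever refers to the shared alias.
joined = path_api.join("plugins", "example.py")
print(joined)
```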
Answer 4
The first variant is indeed more efficient than the second when the function is called either zero or one times. With the second and subsequent invocations, however, the “import every call” approach is actually less efficient. See this link for a lazy-loading technique that combines the best of both approaches by doing a “lazy import”.
But there are reasons other than efficiency why you might prefer one over the other. One approach makes the module’s dependencies much clearer to someone reading the code. They also have very different failure characteristics: if there’s no “datetime” module, the top-level import (the second variant) will fail at load time, while the function-level import (the first variant) won’t fail until the method is called.
Added Note: In IronPython, imports can be quite a bit more expensive than in CPython because the code is basically being compiled as it’s being imported.
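The link to the lazy-loading technique is not preserved here, so the following is not necessarily what the answer pointed to. As one possible sketch, the standard library's importlib.util.LazyLoader can defer a module's execution until its first attribute access (lazy_import is a helper name chosen for this example, and colorsys is just a convenient pure-Python module):

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access."""
    spec = importlib.util.find_spec(name)  # None if the module cannot be found
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up laziness; the module body has not run yet
    return module


colorsys = lazy_import("colorsys")          # cheap: nothing executed yet
hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)    # first attribute access triggers the real load
print(hsv)
```

The import still sits at the top of the file for readers, while the loading cost is only paid if the module is actually used.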
Answer 5
Curt makes a good point: the second version is clearer and will fail at load time rather than later, and unexpectedly.
Normally I don’t worry about the efficiency of loading modules, since it’s (a) pretty fast, and (b) mostly only happens at startup.
If you have to load heavyweight modules at unexpected times, it probably makes more sense to load them dynamically with the __import__ function, and be sure to catch ImportError exceptions, and handle them in a reasonable manner.
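A rough sketch of loading by name with error handling. It uses importlib.import_module, the documented convenience wrapper, rather than calling __import__ directly as the answer suggests; load_optional and the missing module name are hypothetical:

```python
import importlib


def load_optional(name):
    """Import a module by name at runtime; return None if it is unavailable."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Handle the failure in whatever way suits the application.
        return None


json_mod = load_optional("json")
missing = load_optional("surely_not_an_installed_module_xyz")
print(json_mod is not None, missing is None)
```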
Answer 6
I wouldn’t worry about the efficiency of loading the module up front too much. The memory taken up by the module won’t be very big (assuming it’s modular enough) and the startup cost will be negligible.
In most cases you want to load the modules at the top of the source file. For somebody reading your code, it makes it much easier to tell what function or object came from what module.
One good reason to import a module elsewhere in the code is if it’s used in a debugging statement.
For example:
do_something_with_x(x)
I could debug this with:
from pprint import pprint
pprint(x)
do_something_with_x(x)
Of course, the other reason to import modules elsewhere in the code is if you need to dynamically import them. This is because you pretty much don’t have any choice.
Answer 7
It’s a tradeoff that only the programmer can decide to make.
Case 1 saves some memory and startup time by not importing the datetime module (and doing whatever initialization it might require) until needed. Note that doing the import ‘only when called’ also means doing it ‘every time when called’, so each call after the first one is still incurring the additional overhead of doing the import.
Case 2 saves some execution time and latency by importing datetime beforehand so that not_often_called() will return more quickly when it is called, and also by not incurring the overhead of an import on every call.
Besides efficiency, it’s easier to see module dependencies up front if the import statements are … up front. Hiding them down in the code can make it more difficult to easily find what modules something depends on.
Personally, I generally follow the PEP, except for things like unit tests that I don’t want always loaded because I know they aren’t going to be used outside of test code.
Answer 8
Here’s an example where all the imports are at the very top (this is the only time I’ve needed to do this). I want to be able to terminate a subprocess on both Un*x and Windows.
import os
# ...
try:
    kill = os.kill # will raise AttributeError on Windows
    from signal import SIGTERM
    def terminate(process):
        kill(process.pid, SIGTERM)
except (AttributeError, ImportError):
    try:
        from win32api import TerminateProcess # use win32api if available
        def terminate(process):
            TerminateProcess(int(process._handle), -1)
    except ImportError:
        def terminate(process):
            raise NotImplementedError # define a dummy function
(On review: what John Millikin said.)
Answer 9
This is like many other optimizations: you sacrifice some readability for speed. As John mentioned, if you’ve done your profiling homework, found this to be a sufficiently useful change, and you need the extra speed, then go for it. It’d probably be good to put a note up with all the other imports:
from foo import bar
from baz import qux
# Note: datetime is imported in SomeClass below
Answer 10
Module initialization only occurs once, on the first import. If the module in question is from the standard library, then you will likely import it from other modules in your program as well. For a module as prevalent as datetime, it is also likely a dependency for a slew of other standard libraries. The import statement would cost very little then, since the module initialization would have happened already. All it is doing at this point is binding the existing module object to the local scope.
Couple that information with the argument for readability and I would say that it is best to have the import statement at module scope.
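The "initialized once, then only re-bound" behaviour can be checked with a short sketch (the function and variable names are made up for the example):

```python
import sys
import datetime as top_level_datetime


def import_again():
    # Already initialized: this only finds the entry in sys.modules
    # and binds it to a local name.
    import datetime as local_datetime
    return local_datetime


same_object = import_again() is top_level_datetime
cached = "datetime" in sys.modules
print(same_object, cached)
```

Both names refer to the very same module object, so the repeated import does no loading work.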
Answer 11
Just to complete Moe’s answer and the original question:
When we have to deal with circular dependencies we can do some “tricks”. Assume we’re working with modules a.py and b.py that contain x() and y(), respectively. Then:
- We can move one of the from imports to the bottom of the module.
- We can move one of the from imports inside the function or method that actually requires the import (this isn’t always possible, as you may use it from several places).
- We can change one of the two from imports to be an import that looks like: import a
So, to conclude: if you aren’t dealing with circular dependencies and doing some kind of trick to avoid them, then it’s better to put all your imports at the top, for the reasons already explained in other answers to this question. And please, when doing these “tricks”, include a comment; it’s always welcome! :)
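The last trick in the list (a plain import instead of a from import) can be sketched as below. The modules circ_a and circ_b are hypothetical stand-ins for a.py and b.py, written to a temporary directory only to keep the example self-contained:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module sources, for illustration only.
A_SOURCE = '''\
import circ_b            # plain import: only needs the module object to exist

def x():
    return "x from circ_a"

def call_y():
    return circ_b.y()    # attribute is looked up at call time
'''

B_SOURCE = '''\
import circ_a            # circ_a is only partially initialized here, which is fine

def y():
    return "y from circ_b"

def call_x():
    return circ_a.x()
'''

workdir = tempfile.mkdtemp()
Path(workdir, "circ_a.py").write_text(A_SOURCE)
Path(workdir, "circ_b.py").write_text(B_SOURCE)
sys.path.insert(0, workdir)

circ_a = importlib.import_module("circ_a")
circ_b = sys.modules["circ_b"]   # loaded as a side effect of importing circ_a
results = (circ_a.call_y(), circ_b.call_x())
print(results)
```

This works because neither module touches the other's attributes until a function is called, by which point both are fully initialized.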
Answer 12
In addition to the excellent answers already given, it’s worth noting that the placement of imports is not merely a matter of style. Sometimes a module has implicit dependencies that need to be imported or initialized first, and a top-level import could lead to violations of the required order of execution.
This issue often comes up in Apache Spark’s Python API, where you need to initialize the SparkContext before importing any pyspark packages or modules. It’s best to place pyspark imports in a scope where the SparkContext is guaranteed to be available.
Answer 13
I was surprised not to see actual cost numbers for the repeated load-checks posted already, although there are many good explanations of what to expect.
If you import at the top, you take the load hit no matter what. That’s pretty small, but commonly in the milliseconds, not nanoseconds.
If you import within a function(s), then you only take the hit for loading if and when one of those functions is first called. As many have pointed out, if that doesn’t happen at all, you save the load time. But if the function(s) get called a lot, you take a repeated though much smaller hit (for checking that it has been loaded; not for actually re-loading). On the other hand, as @aaronasterling pointed out you also save a little because importing within a function lets the function use slightly-faster local variable lookups to identify the name later (http://stackoverflow.com/questions/477096/python-import-coding-style/4789963#4789963).
Here are the results of a simple test that imports a few things from inside a function. The times reported (in Python 2.7.14 on a 2.3 GHz Intel Core i7) are shown below (the 2nd call taking more than later calls seems consistent, though I don’t know why).
0 foo: 14429.0924 µs
1 foo: 63.8962 µs
2 foo: 10.0136 µs
3 foo: 7.1526 µs
4 foo: 7.8678 µs
0 bar: 9.0599 µs
1 bar: 6.9141 µs
2 bar: 7.1526 µs
3 bar: 7.8678 µs
4 bar: 7.1526 µs
The code:
from __future__ import print_function
from time import time
def foo():
    import collections
    import re
    import string
    import math
    import subprocess
    return
def bar():
    import collections
    import re
    import string
    import math
    import subprocess
    return
t0 = time()
for i in xrange(5):
    foo()
    t1 = time()
    print(" %2d foo: %12.4f \xC2\xB5s" % (i, (t1-t0)*1E6))
    t0 = t1
for i in xrange(5):
    bar()
    t1 = time()
    print(" %2d bar: %12.4f \xC2\xB5s" % (i, (t1-t0)*1E6))
    t0 = t1
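The script above targets Python 2 (xrange, byte-escaped µ). A rough Python 3 equivalent, assuming time.perf_counter as the timer and with time_calls as a helper name invented for this sketch, might be:

```python
from time import perf_counter


def import_inside():
    # Same modules as in the test above.
    import collections
    import re
    import string
    import math
    import subprocess


def time_calls(func, repeats=5):
    """Return the duration of each call to func, in microseconds."""
    durations = []
    for _ in range(repeats):
        start = perf_counter()
        func()
        durations.append((perf_counter() - start) * 1e6)
    return durations


timings = time_calls(import_inside)
for i, micros in enumerate(timings):
    print(" %2d import_inside: %12.4f µs" % (i, micros))
```

Actual numbers depend on the machine and on which modules are already loaded, so no figures are quoted here.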
Answer 14
I do not aspire to provide a complete answer, because others have already done this very well. I just want to mention one use case where I find it especially useful to import modules inside functions. My application uses Python packages and modules stored in a certain location as plugins. During application startup, the application walks through all the modules in the location and imports them, then it looks inside the modules and, if it finds mounting points for the plugins (in my case, a subclass of a certain base class having a unique ID), it registers them. The number of plugins is large (now dozens, but maybe hundreds in the future) and each of them is used quite rarely. Having imports of third-party libraries at the top of my plugin modules was a bit of a penalty during application startup. In particular, some third-party libraries are heavy to import (e.g. importing plotly even tries to connect to the internet and download something, which was adding about one second to startup). By optimizing imports in the plugins (calling them only in the functions where they are used), I managed to shrink the startup from 10 seconds to some 2 seconds. That is a big difference for my users.
So my answer is no, do not always put the imports at the top of your modules.
Answer 15
It’s interesting that not a single answer mentioned parallel processing so far, where it might be REQUIRED that the imports are in the function, when the serialized function code is what is being pushed around to other cores, e.g. like in the case of ipyparallel.
Answer 16
There can be a performance gain by importing variables/local scoping inside of a function. This depends on the usage of the imported thing inside the function. If you are looping many times and accessing a module global object, importing it as local can help.
test.py
X=10
Y=11
Z=12
def add(i):
    i = i + 10
runlocal.py
from test import add, X, Y, Z
def callme():
    x=X
    y=Y
    z=Z
    ladd=add
    for i in range(100000000):
        ladd(i)
        x+y+z
callme()
run.py
from test import add, X, Y, Z
def callme():
    for i in range(100000000):
        add(i)
        X+Y+Z
callme()
Timing both scripts with /usr/bin/time on Linux shows a small gain:
/usr/bin/time -f "\t%E real,\t%U user,\t%S sys" python run.py
0:17.80 real, 17.77 user, 0.01 sys
/tmp/test$ /usr/bin/time -f "\t%E real,\t%U user,\t%S sys" python runlocal.py
0:14.23 real, 14.22 user, 0.01 sys
real is wall clock. user is time in program. sys is time for system calls.
https://docs.python.org/3.5/reference/executionmodel.html#resolution-of-names
Answer 17
Readability
In addition to startup performance, there is a readability argument to be made for localizing import statements. For example, take Python line numbers 1283 through 1296 in my current first Python project:
listdata.append(['tk font version', font_version])
listdata.append(['Gtk version', str(Gtk.get_major_version())+"."+
                 str(Gtk.get_minor_version())+"."+
                 str(Gtk.get_micro_version())])
import xml.etree.ElementTree as ET
xmltree = ET.parse('/usr/share/gnome/gnome-version.xml')
xmlroot = xmltree.getroot()
result = []
for child in xmlroot:
    result.append(child.text)
listdata.append(['Gnome version', result[0]+"."+result[1]+"."+
                 result[2]+" "+result[3]])
If the import statement was at the top of file I would have to scroll up a long way, or press Home, to find out what ET was. Then I would have to navigate back to line 1283 to continue reading code.
Indeed even if the import statement was at the top of the function (or class) as many would place it, paging up and back down would be required.
Displaying the Gnome version number will rarely be done so the import at top of file introduces unnecessary startup lag.
Answer 18
I would like to mention a usecase of mine, very similar to those mentioned by @John Millikin and @V.K. :
Optional Imports
I do data analysis with Jupyter Notebook, and I use the same IPython notebook as a template for all analyses. On some occasions, I need to import Tensorflow to do some quick model runs, but sometimes I work in places where tensorflow isn’t set up or is slow to import. In those cases, I encapsulate my Tensorflow-dependent operations in a helper function, import tensorflow inside that function, and bind it to a button.
This way, I could do “restart-and-run-all” without having to wait for the import, or having to resume the rest of the cells when it fails.
Answer 19
This is a fascinating discussion. Like many others I had never even considered this topic. I got cornered into having to have the imports in the functions because of wanting to use the Django ORM in one of my libraries. I was having to call django.setup() before importing my model classes and because this was at the top of the file it was being dragged into completely non-Django library code because of the IoC injector construction.
I kind of hacked around a bit and ended up putting the django.setup() in the singleton constructor and the relevant import at the top of each class method. Now this worked fine but made me uneasy because the imports weren’t at the top and also I started worrying about the extra time hit of the imports. Then I came here and read with great interest everybody’s take on this.
I have a long C++ background and now use Python/Cython. My take on this is: why not put the imports in the function, unless it causes you a profiled bottleneck? It’s only like declaring space for variables just before you need them. The trouble is I have thousands of lines of code with all the imports at the top! So I think I will do it from now on and change the odd file here and there when I’m passing through and have the time.