from xml.etree import ElementTree as ET
tree = ET.parse(r"test.xml")
el1 = tree.findall("DEAL_LEVEL/PAID_OFF")  # returns an empty list: nothing matches without the namespace
el2 = tree.findall("{http://www.test.com}DEAL_LEVEL/{http://www.test.com}PAID_OFF")  # returns [<Element '{http://www.test.com}PAID_OFF' at 0xb78b90>]
Although this works, the namespace “{http://www.test.com}” makes it very inconvenient, because it has to be added in front of every tag.
How can I ignore the namespace when using methods such as “find” and “findall”?
Instead of modifying the XML document itself, it’s best to parse it and then modify the tags in the result. This way you can handle multiple namespaces and namespace aliases:
from io import StringIO  # for Python 2 import from StringIO instead
import xml.etree.ElementTree as ET
# instead of ET.fromstring(xml)
it = ET.iterparse(StringIO(xml))
for _, el in it:
    prefix, has_namespace, postfix = el.tag.partition('}')
    if has_namespace:
        el.tag = postfix  # strip all namespaces
root = it.root
Here’s an extension to nonagon’s answer, which also strips namespaces off attributes:
from io import StringIO  # for Python 2 import from StringIO instead
import xml.etree.ElementTree as ET
# instead of ET.fromstring(xml)
it = ET.iterparse(StringIO(xml))
for _, el in it:
    if '}' in el.tag:
        el.tag = el.tag.split('}', 1)[1]  # strip all namespaces
    for at in list(el.attrib.keys()):  # strip namespaces of attributes too
        if '}' in at:
            newat = at.split('}', 1)[1]
            el.attrib[newat] = el.attrib[at]
            del el.attrib[at]
root = it.root
UPDATE: added list() so the iterator works (needed for Python 3)
import xml.etree.ElementTree as ET
with DisableXmlNamespaces():
    tree = ET.parse("test.xml")
The beauty of this approach is that it does not change any behaviour for unrelated code outside the with block. I ended up creating this after getting errors in unrelated libraries, which also happened to use expat, when using the version by ericspod.
ElementTree tries to use Expat by calling ParserCreate() but provides no option to omit the namespace separator string. The above code causes the separator to be ignored, but be warned that this could break other things.
I might be late to this, but I don’t think re.sub is a good solution.
However, overriding xml.parsers.expat does not work on Python 3.x.
The main culprit is xml/etree/ElementTree.py; see the bottom of its source code:
# Import the C accelerators
try:
    # Element is going to be shadowed by the C implementation. We need to keep
    # the Python version of it accessible for some "creative" by external code
    # (see tests)
    _Element_Py = Element
    # Element, SubElement, ParseError, TreeBuilder, XMLParser
    from _elementtree import *
except ImportError:
    pass
Which is kinda sad.
The solution is to get rid of it first.
import _elementtree
try:
    del _elementtree.XMLParser
except AttributeError:
    # in case deleted twice
    pass
else:
    from xml.parsers import expat  # NOQA: F811
    oldcreate = expat.ParserCreate
    expat.ParserCreate = lambda encoding, sep: oldcreate(encoding, None)
Tested on Python 3.6.
The try statement is useful in case your code reloads or imports the module twice somewhere, which otherwise produces strange errors like
maximum recursion depth exceeded
AttributeError: XMLParser
btw damn the etree source code looks really messy.
Create an iterator to get both namespaces and a parsed tree object.
Iterate over the created iterator to get the namespaces dict that we can later pass to each find() or findall() call, as suggested by iMom0.
Return the parsed tree’s root element object and namespaces.
I think this is the best approach all around, as it involves no manipulation of either the source XML or the resulting parsed xml.etree.ElementTree output.
I’d also like to credit barny’s answer with providing an essential piece of this puzzle (that you can get the parsed root from the iterator). Until then I was actually traversing the XML tree twice in my application (once to get the namespaces, and a second time for the root).
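The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name parse_with_namespaces, the sample XML, and the choice to store the default namespace under the hypothetical prefix "default" are not from the original answer.

```python
import xml.etree.ElementTree as ET
from io import StringIO


def parse_with_namespaces(source):
    """Return (root, namespaces) from a single pass over the XML source."""
    namespaces = {}
    # "start-ns" events yield (prefix, uri) pairs for every xmlns declaration.
    it = ET.iterparse(source, events=("start-ns",))
    for _, (prefix, uri) in it:
        # Assumption: the default namespace (empty prefix) is exposed under
        # the made-up prefix "default" so it can be used in path expressions.
        namespaces[prefix or "default"] = uri
    # Once the iterator is exhausted, it.root holds the parsed root element.
    return it.root, namespaces


# Hypothetical sample document modelled on the tags in the question.
xml = (
    '<root xmlns="http://www.test.com">'
    "<DEAL_LEVEL><PAID_OFF>yes</PAID_OFF></DEAL_LEVEL>"
    "</root>"
)
root, ns = parse_with_namespaces(StringIO(xml))
matches = root.findall("default:DEAL_LEVEL/default:PAID_OFF", ns)
print(ns)
print([el.text for el in matches])
```

The tags still carry their namespaces, but the dict lets each find() or findall() call use short prefixes instead of the full URI.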
Datetime is a module that allows for handling of dates, times and datetimes (all of which are datatypes). This means that datetime is both a top-level module as well as being a type within that module. This is confusing.
Your error is probably based on the confusing naming of the module, and what either you or a module you’re using has already imported.
>>> from datetime import datetime
>>> datetime
<type 'datetime.datetime'>
>>> datetime.datetime(2001,5,1) # You shouldn't expect this to work
# as you imported the type, not the module
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'datetime.datetime' has no attribute 'datetime'
>>> datetime(2001,5,1)
datetime.datetime(2001, 5, 1, 0, 0)
I suspect you or one of the modules you’re using has imported like this:
from datetime import datetime.
For Python 3.3:
from datetime import datetime, timedelta
futuredate = datetime.now() + timedelta(days=10)
I ran into the same error. Maybe you have already imported the module with a plain import datetime, so change from datetime import datetime to just import datetime. It worked for me after I changed it back.
from datetime import datetime
import time
from calendar import timegm
d = datetime.utcnow()
d = d.strftime("%Y-%m-%dT%H:%M:%S.%fZ")
utc_time = time.strptime(d,"%Y-%m-%dT%H:%M:%S.%fZ")
epoch_time = timegm(utc_time)
I’ve run into a bit of a wall importing modules in a Python script. I’ll do my best to describe the error, why I run into it, and why I’m trying this particular approach to solve my problem (which I will describe in a second):
Let’s suppose I have a module in which I’ve defined some utility functions/classes, which refer to entities defined in the namespace into which this auxiliary module will be imported (let “a” be such an entity):
module1:
def f():
    print a
And then I have the main program, where “a” is defined, into which I want to import those utilities:
import module1
a=3
module1.f()
Executing the program will trigger the following error:
Traceback (most recent call last):
  File "Z:\Python\main.py", line 10, in <module>
    module1.f()
  File "Z:\Python\module1.py", line 3, in f
    print a
NameError: global name 'a' is not defined
Similar questions have been asked in the past (two days ago, d’uh) and several solutions have been suggested, however I don’t really think these fit my requirements. Here’s my particular context:
I’m trying to make a Python program which connects to a MySQL database server and displays/modifies data with a GUI. For cleanliness sake, I’ve defined the bunch of auxiliary/utility MySQL-related functions in a separate file. However they all have a common variable, which I had originally defined inside the utilities module, and which is the cursor object from MySQLdb module.
I later realised that the cursor object (which is used to communicate with the db server) should be defined in the main module, so that both the main module and anything that is imported into it can access that object.
End result would be something like this:
utilities_module.py:
def utility_1(args):
    code which references a variable named "cur"
def utility_n(args):
    etcetera
And my main module:
program.py:
import MySQLdb, Tkinter
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
And then, as soon as I try to call any of the utilities functions, it triggers the aforementioned “global name not defined” error.
A particular suggestion was to have a “from program import cur” statement in the utilities file, such as this:
utilities_module.py:
from program import cur
#rest of function definitions
program.py:
import Tkinter, MySQLdb
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
But that’s cyclic import or something like that and, bottom line, it crashes too. So my question is:
How in hell can I make the “cur” object, defined in the main module, visible to those auxiliary functions which are imported into it?
Thanks for your time and my deepest apologies if the solution has been posted elsewhere. I just can’t find the answer myself and I’ve got no more tricks in my book.
Globals in Python are global to a module, not across all modules. (Many people are confused by this, because in, say, C, a global is the same across all implementation files unless you explicitly make it static.)
There are different ways to solve this, depending on your actual use case.
Before even going down this path, ask yourself whether this really needs to be global. Maybe you really want a class, with f as an instance method, rather than just a free function? Then you could do something like this:
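A minimal sketch of that idea, assuming a hypothetical class named Utilities that receives the value when it is created (f and a come from the question; everything else is illustrative):

```python
class Utilities:
    """Bundles the shared value with the functions that need it."""

    def __init__(self, a):
        # The value that used to be a global is now instance state.
        self.a = a

    def f(self):
        # f reads the value from the instance instead of a module global.
        return self.a


utils = Utilities(3)
print(utils.f())  # 3
```

The main program creates the instance once and passes it around, so no module needs to reach into another module's globals.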
Don’t use a from import unless the variable is intended to be a constant. from shared_stuff import a would create a new a variable initialized to whatever shared_stuff.a referred to at the time of the import, and this new a variable would not be affected by assignments to shared_stuff.a.
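The rebinding behaviour described here can be seen in a small sketch. To keep it in one file, the shared_stuff module is simulated with types.ModuleType instead of a real shared_stuff.py, which is an assumption made purely for illustration:

```python
import sys
import types

# Stand-in for a real shared_stuff.py file containing "a = 1".
shared_stuff = types.ModuleType("shared_stuff")
shared_stuff.a = 1
sys.modules["shared_stuff"] = shared_stuff

from shared_stuff import a  # copies the current binding into this namespace

shared_stuff.a = 2  # rebinding the module attribute later...

print(a)               # 1: the from-import name still refers to the old value
print(shared_stuff.a)  # 2: attribute access through the module sees the change
```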
Or, in the rare case that you really do need it to be truly global everywhere, like a builtin, add it to the builtin module. The exact details differ between Python 2.x and 3.x. In 3.x, it works like this:
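As a rough sketch for Python 3 only, using the standard builtins module; the name a is taken from the question, and this is an illustration rather than a recommended pattern:

```python
import builtins

# Attach the value to the builtins namespace, the last place name lookup
# searches, so every module can see it without importing anything.
builtins.a = 3


def f():
    # "a" is not a global of this module; the lookup falls through to builtins.
    return a


print(f())  # 3
```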
A function uses the globals of the module it’s defined in. Instead of setting a = 3, for example, you should be setting module1.a = 3. So, if you want cur available as a global in utilities_module, set utilities_module.cur.
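A small sketch of this point, with module1 simulated in a single file via types.ModuleType (an assumption for illustration; in the question it is a real module1.py, and f prints a rather than returning it):

```python
import types

# Stand-in for module1.py, whose f() refers to a global named "a".
module1 = types.ModuleType("module1")
exec("def f():\n    return a\n", module1.__dict__)

a = 3  # a global of the *calling* module: f() cannot see this one

try:
    module1.f()
except NameError as exc:
    print("before:", exc)

module1.a = 3  # set the global inside module1's own namespace instead
print("after:", module1.f())
```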
A better solution: don’t use globals. Pass the variables you need into the functions that need it, or create a class to bundle all the data together, and pass it when initializing the instance.
This post is just an observation about Python behaviour I encountered. Maybe the advice you read above won’t work for you if you did the same thing I did below.
Namely, I have a module which contains global/shared variables (as suggested above):
Then I had the main module which imports the shared stuff with:
import sharedstuff as shared
and some other modules that actually populated these arrays. These are called by the main module. When exiting these other modules I can clearly see that the arrays are populated. But when reading them back in the main module, they were empty. This was rather strange for me (well, I am new to Python). However, when I change the way I import the sharedstuff.py in the main module to:
The easiest solution to this particular problem would have been to add another function within the module that would have stored the cursor in a variable global to the module. Then all the other functions could use it as well.
module1:
cursor = None
def setCursor(cur):
    global cursor
    cursor = cur
def method(some, args):
    global cursor
    do_stuff(cursor, some, args)
Since I haven’t seen it in the answers above, I thought I would add my simple workaround, which is just to add a global_dict argument to the function requiring the calling module’s globals, and then pass the dict into the function when calling; e.g:
# external_module
def imported_function(global_dict=None):
    print(global_dict["a"])
# calling_module
a = 12
from external_module import imported_function
imported_function(global_dict=globals())
>>> 12
The OOP way of doing this would be to make your module a class instead of a set of unbound methods. Then you could use __init__ or a setter method to set the variables from the caller for use in the module methods.
In Python, a namespace package allows you to spread Python code among several projects. This is useful when you want to release related libraries as separate downloads. For example, with the directories Package-1 and Package-2 in PYTHONPATH,
On Python 3.3 you don’t have to do anything, just don’t put any __init__.py in your namespace package directories and it will just work. On pre-3.3, choose the pkgutil.extend_path() solution over the pkg_resources.declare_namespace() one, because it’s future-proof and already compatible with implicit namespace packages.
Python 3.3 introduces implicit namespace packages, see PEP 420.
This means there are now three types of object that can be created by an import foo:
A module represented by a foo.py file
A regular package, represented by a directory foo containing an __init__.py file
A namespace package, represented by one or more directories foo without any __init__.py files
Packages are modules too, but here I mean “non-package module” when I say “module”.
First it scans sys.path for a module or regular package. If it succeeds, it stops searching and creates and initializes the module or package. If it found no module or regular package, but it found at least one directory, it creates and initializes a namespace package.
Modules and regular packages have __file__ set to the .py file they were created from. Regular and namespace packages have __path__ set to the directory or directories they were created from.
When you do import foo.bar, the above search happens first for foo, then if a package was found, the search for bar is done with foo.__path__ as the search path instead of sys.path. If foo.bar is found, foo and foo.bar are created and initialized.
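The search described above can be tried out with a throwaway layout. In this sketch the package name nspkg_demo, the directories path1 and path2, and the module contents are all made up for illustration; it assumes Python 3.3 or later:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build two separate roots, each holding part of the same package.
# Neither package directory contains an __init__.py.
base = Path(tempfile.mkdtemp())
for root_name, module_name in (("path1", "foo"), ("path2", "bar")):
    pkg_dir = base / root_name / "nspkg_demo"
    pkg_dir.mkdir(parents=True)
    (pkg_dir / (module_name + ".py")).write_text("NAME = %r\n" % module_name)
    sys.path.insert(0, str(base / root_name))

importlib.invalidate_caches()

import nspkg_demo.foo
import nspkg_demo.bar

# Both submodules import even though they live in different directories.
print(nspkg_demo.foo.NAME, nspkg_demo.bar.NAME)
# __path__ lists every directory that contributed to the namespace package.
print(len(list(nspkg_demo.__path__)))
```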
So how do regular packages and namespace packages mix? Normally they don’t, but the old pkgutil explicit namespace package method has been extended to include implicit namespace packages.
If you have an existing regular package that has an __init__.py like this:
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
… the legacy behavior is to add any other regular packages on the searched path to its __path__. But in Python 3.3, it also adds namespace packages.
So you can have the following directory structure:
… and as long as the two __init__.py have the extend_path lines (and path1, path2 and path3 are in your sys.path) import package.foo, import package.bar and import package.baz will all work.
pkg_resources.declare_namespace(__name__) has not been updated to include implicit namespace packages.
This is an old question, but someone recently commented on my blog that my posting about namespace packages was still relevant, so thought I would link to it here as it provides a practical example of how to make it go:
The __import__("pkg_resources").declare_namespace(__name__) trick pretty much drives the management of plugins in TiddlyWeb and thus far seems to be working out.
You have your Python namespace concepts back to front: it is not possible in Python to put packages into modules. Packages contain modules, not the other way around.
A Python package is simply a folder containing an __init__.py file. A module is any other file in a package (or directly on the PYTHONPATH) that has a .py extension. So in your example you have two packages but no modules defined. If you consider that a package is a file system folder and a module is a file, then you see why packages contain modules and not the other way around.
So in your example assuming Package-1 and Package-2 are folders on the file system that you have put on the Python path you can have the following:
You now have one package namespace with two modules, module1 and module2. Unless you have a good reason, you should probably put the modules in the folder and have only that on the Python path, like below:
I have a fairly complex “product” I’m getting ready to build using Django. I’m going to avoid using the terms “project” and “application” in this context, because I’m not clear on their specific meaning in Django.
Projects can have many apps. Apps can be shared among many projects. Fine.
I’m not reinventing the blog or forum – I don’t see any portion of my product being reusable in any context. Intuitively, I would call this one “application.” Do I then do all my work in a single “app” folder?
If so… in terms of Django’s project.app namespace, my inclination is to use myproduct.myproduct, but of course this isn’t allowed (but the application I’m building is my project, and my project is an application!). I’m therefore led to believe that perhaps I’m supposed to approach Django by building one app per “significant” model, but I don’t know where to draw the boundaries in my schema to separate it into apps – I have a lot of models with relatively complex relationships.
and so on. Would it help if I said views.py doesn’t have to be called views.py? Provided you can name, on the python path, a function (usually package.package.views.function_name) it will get handled. Simple as that. All this “project”/”app” stuff is just python packages.
Now, how are you supposed to do it? Or rather, how might I do it? Well, if you create a significant piece of reusable functionality, like say a markup editor, that’s when you create a “top level app” which might contain widgets.py, fields.py, context_processors.py etc – all things you might want to import.
Similarly, if you can create something like a blog in a format that is pretty generic across installs, you can wrap it up in an app, with its own template, static content folder etc, and configure an instance of a django project to use that app’s content.
There are no hard and fast rules saying you must do this, but it is one of the goals of the framework. The fact that everything, templates included, allows you to include from some common base means your blog should fit snugly into any other setup, simply by looking after its own part.
However, to address your actual concern, yes, nothing says you can’t work with the top level project folder. That’s what apps do and you can do it if you really want to. I tend not to, however, for several reasons:
Django’s default setup doesn’t do it.
Often, I want to create a main app, so I create one, usually called website. However, at a later date I might want to develop original functionality just for this site. With a view to making it removable (whether or not I ever do) I tend to then create a separate directory. This also means I can drop said functionality just by unlinking that package from the config and removing the folder, rather than through a complex hunt for the right URLs to delete from a global urls.py file.
Very often, even when I want to make something independent, it needs somewhere to live whilst I look after it / make it independent. Basically the above case, but for stuff I do intend to make generic.
My top level folder often contains a few other things, including but not limited to wsgi scripts, sql scripts etc.
django’s management extensions rely on subdirectories. So it makes sense to name packages appropriately.
In short, the reason there is a convention is the same as any other convention – it helps when it comes to others working with your project. If I see fields.py I immediately expect code in it to subclass django’s field, whereas if I see inputtypes.py I might not be so clear on what that means without looking at it.
Once you graduate from using startproject and startapp, there’s nothing to stop you from combining a “project” and “app” in the same Python package. A project is really nothing more than a settings module, and an app is really nothing more than a models module—everything else is optional.
For small sites, it’s entirely reasonable to have something like:
Try to answer the question: “What does my application do?”. If you cannot answer in a single sentence, then maybe you can split it into several apps with cleaner logic.
I read this thought somewhere soon after I started working with Django, and I find that I ask myself this question quite often and it helps me.
Your apps don’t have to be reusable, they can depend on each other, but they should do one thing.
If so… in terms of Django’s project.app namespace, my inclination is to use myproduct.myproduct, but of course this isn’t allowed
There is no such thing as not allowed. It’s your project; no one is restricting you. It is advisable to keep a reasonable name.
I don’t see any portion of my product being reusable in any context. Intuitively, I would call this one “application.” Do I then do all my work in a single “app” folder?
In a typical Django project there are many apps (the contrib apps) which are used in practically every project.
Let us say that your project does only one task and has only a single app (I would name it main, as the project revolves around it and it is hardly pluggable). This project still generally uses some other apps too.
Now, if your project uses just the one app (INSTALLED_APPS='myproduct') and you wonder what the use is of structuring it as project.app, I think you should consider some points:
There are many other things that the code other than the app in a project handles (base static files, base templates, settings….i.e. provides the base).
In the general project.app approach django automatically defines sql schema from models.
Your project would be much easier to be built with the conventional approach.
You may define some different names for urls, views and other files as you wish, but I don’t see the need.
You might need to add some applications in the future, which is really easy with the conventional Django project layout; otherwise it may become equally or more difficult and tedious to do.
As far as most of the work being done in the app is concerned, I think that is the case with most of django projects.
Imagine that you are creating a big dynamic web app based on JavaScript.
You can then create a Django app named e.g. “FrontEnd”; in this app you will display content.
Then you create some backend apps, e.g. an app named “Comments” that will store user comments. The “Comments” app will not display anything itself. It will just be an API for the AJAX requests of your dynamic JS website.
In this way you can always reuse your “Comments” app. You can make it open source without opening source of whole project. And you keep clean logic of your project.
It’s a list of public objects of that module, as interpreted by import *. It overrides the default of hiding everything that begins with an underscore.
Linked to, but not explicitly mentioned here, is exactly when __all__ is used. It is a list of strings defining what symbols in a module will be exported when from <module> import * is used on the module.
For example, the following code in a foo.py explicitly exports the symbols bar and baz:
from foo import *
print(bar)
print(baz)
# The following will trigger an exception, as "waz" is not exported by the module
print(waz)
If the __all__ above is commented out, this code will then execute to completion, as the default behaviour of import * is to import all symbols that do not begin with an underscore, from the given namespace.
NOTE: __all__ affects the from <module> import * behavior only. Members that are not mentioned in __all__ are still accessible from outside the module and can be imported with from <module> import <member>.
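Both behaviours can be checked in one self-contained sketch. The module foo_demo and its contents are hypothetical stand-ins, built in memory with types.ModuleType rather than as a file, and the wildcard import is run through exec into a separate dict so the imported names can be inspected:

```python
import sys
import types

# In-memory stand-in for a module file that declares __all__.
foo_demo = types.ModuleType("foo_demo")
exec(
    "__all__ = ['bar', 'baz']\n"
    "waz = 5\n"
    "bar = 10\n"
    "def baz():\n"
    "    return 'baz'\n",
    foo_demo.__dict__,
)
sys.modules["foo_demo"] = foo_demo

# Run the wildcard import in its own namespace and see what arrived.
namespace = {}
exec("from foo_demo import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['bar', 'baz']

# A name left out of __all__ is still importable explicitly.
from foo_demo import waz
print(waz)  # 5
```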
I keep seeing the variable __all__ set in different __init__.py files.
What does this do?
What does __all__ do?
It declares the semantically “public” names from a module. If there is a name in __all__, users are expected to use it, and they can have the expectation that it will not change.
It will also have programmatic effects:
import *
__all__ in a module, e.g. module.py:
__all__ = ['foo', 'Bar']
means that when you import * from the module, only those names in the __all__ are imported:
from module import * # imports foo and Bar
Documentation tools
Documentation and code autocompletion tools may (in fact, should) also inspect the __all__ to determine what names to show as available from a module.
The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path.
In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable.
So the __init__.py can declare the __all__ for a package.
Managing an API:
A package is typically made up of modules that may import one another, but that are necessarily tied together with an __init__.py file. That file is what makes the directory an actual Python package. For example, say you have the following files in a package:
And you can easily add things to your API that you can manage at the subpackage level instead of the subpackage’s module level. If you want to add a new name to the API, you simply update the __init__.py, e.g. in module_2:
from .Bar_implementation import *
from .Baz_implementation import *
__all__ = ['Bar', 'Baz']
And if you’re not ready to publish Baz in the top level API, in your top level __init__.py you could have:
from .module_1 import * # also constrained by __all__'s
from .module_2 import * # in the __init__.py's
__all__ = ['foo', 'Bar'] # further constraining the names advertised
and if your users are aware of the availability of Baz, they can use it:
import package
package.Baz()
but if they don’t know about it, other tools (like pydoc) won’t inform them.
You can later change that when Baz is ready for prime time:
from .module_1 import *
from .module_2 import *
__all__ = ['foo', 'Bar', 'Baz']
Prefixing _ versus __all__:
By default, Python will export all names that do not start with an _. You certainly could rely on this mechanism. Some packages in the Python standard library, in fact, do rely on this, but to do so, they alias their imports, for example, in ctypes/__init__.py:
import os as _os, sys as _sys
Using the _ convention can be more elegant because it removes the redundancy of naming the names again. But it adds the redundancy for imports (if you have a lot of them) and it is easy to forget to do this consistently – and the last thing you want is to have to indefinitely support something you intended to only be an implementation detail, just because you forgot to prefix an _ when naming a function.
I personally write an __all__ early in my development lifecycle for modules so that others who might use my code know what they should use and not use.
Most packages in the standard library also use __all__.
When avoiding __all__ makes sense
It makes sense to stick to the _ prefix convention in lieu of __all__ when:
You’re still in early development mode and have no users, and are constantly tweaking your API.
Maybe you do have users, but you have unittests that cover the API, and you’re still actively adding to the API and tweaking in development.
An export decorator
The downside of using __all__ is that you have to write the names of functions and classes being exported twice – and the information is kept separate from the definitions. We could use a decorator to solve this problem.
I got the idea for such an export decorator from David Beazley’s talk on packaging. This implementation seems to work well in CPython’s traditional importer. If you have a special import hook or system, I do not guarantee it, but if you adopt it, it is fairly trivial to back out – you’ll just need to manually add the names back into the __all__
So in, for example, a utility library, you would define the decorator:
import sys
def export(fn):
    mod = sys.modules[fn.__module__]
    if hasattr(mod, '__all__'):
        mod.__all__.append(fn.__name__)
    else:
        mod.__all__ = [fn.__name__]
    return fn
and then, where you would define an __all__, you do this:
$ cat > main.py
from lib import export
__all__ = []  # optional - we create a list if __all__ is not there.
@export
def foo(): pass
@export
def bar():
    'bar'
def main():
    print('main')
if __name__ == '__main__':
    main()
And this works fine whether run as main or imported by another function.
$ cat > run.py
import main
main.main()
$ python run.py
main
And API provisioning with import * will work too:
$ cat > run.py
from main import *
foo()
bar()
main() # expected to error here, not exported
$ python run.py
Traceback (most recent call last):
  File "run.py", line 4, in <module>
    main() # expected to error here, not exported
NameError: name 'main' is not defined
All other answers refer to modules. The original question explicitly mentioned __all__ in __init__.py files, so this is about Python packages.
Generally, __all__ only comes into play when the from xxx import * variant of the import statement is used. This applies to packages as well as to modules.
The behaviour for modules is explained in the other answers. The exact behaviour for packages is described here in detail.
In short, __all__ on package level does approximately the same thing as for modules, except it deals with modules within the package (in contrast to specifying names within the module). So __all__ specifies all modules that shall be loaded and imported into the current namespace when you use from package import *.
The big difference is that when you omit the declaration of __all__ in a package’s __init__.py, the statement from package import * will not import any submodules at all (with exceptions explained in the documentation, see link above).
On the other hand, if you omit __all__ in a module, the “starred import” will import all names (not starting with an underscore) defined in the module.
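The package-level difference can be sketched with two throwaway packages. The names star_demo, star_demo_all, effects and VERSION are hypothetical, and the wildcard imports are run through exec into separate dicts so the results can be inspected:

```python
import importlib
import sys
import tempfile
from pathlib import Path

base = Path(tempfile.mkdtemp())

# Package without __all__: __init__.py only defines a plain name.
plain = base / "star_demo"
plain.mkdir()
(plain / "__init__.py").write_text("VERSION = '1.0'\n")
(plain / "effects.py").write_text("ECHO = True\n")

# Package whose __init__.py lists a submodule in __all__.
listed = base / "star_demo_all"
listed.mkdir()
(listed / "__init__.py").write_text("__all__ = ['effects']\n")
(listed / "effects.py").write_text("ECHO = True\n")

sys.path.insert(0, str(base))
importlib.invalidate_caches()

without_all = {}
exec("from star_demo import *", without_all)
with_all = {}
exec("from star_demo_all import *", with_all)

# Without __all__, only names defined in __init__.py arrive; the submodule
# is not loaded. With __all__, the listed submodule is imported.
print("effects" in without_all, "VERSION" in without_all)
print("effects" in with_all)
```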
Help on module module1:
NAME
module1
FILE
module1.py
DATAa = 'A'
b = 'B'
c = 'C'
$ pydoc module2
Help on module module2:
NAME
module2
FILE
module2.py
DATA
    __all__ = ['a', 'b']
    a = 'A'
    b = 'B'
I declare __all__ in all my modules and prefix internal details with an underscore; both really help when using things you’ve never used before in live interpreter sessions.
A package is a directory with an __init__.py file. A package usually contains modules.
MODULES
""" cheese.py - an example module """
__all__ = ['swiss', 'cheddar']
swiss = 4.99
cheddar = 3.99
gouda = 10.99
__all__ lets humans know the “public” features of a module.[@AaronHall] Also, pydoc recognizes them.[@Longpoke]
from module import *
See how swiss and cheddar are brought into the local namespace, but not gouda:
>>> from cheese import *
>>> swiss, cheddar
(4.99, 3.99)
>>> gouda
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'gouda' is not defined
Without __all__, any symbol (that doesn’t start with an underscore) would have been available.
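The same comparison can be made without touching the file system. The sketch below is only an illustration: it registers an in-memory module under the made-up name demo_mod and reports which names a starred import would bind, with and without __all__.

```python
import sys
import types


def star_import(module_source, module_name="demo_mod"):
    """Build a throwaway in-memory module (hypothetical name) from source
    and return the names that `from <module> import *` binds."""
    mod = types.ModuleType(module_name)
    exec(module_source, mod.__dict__)
    sys.modules[module_name] = mod
    try:
        namespace = {}
        exec(f"from {module_name} import *", namespace)
    finally:
        del sys.modules[module_name]
    namespace.pop("__builtins__", None)
    return sorted(namespace)


CHEESE = "swiss = 4.99\ncheddar = 3.99\ngouda = 10.99\n_secret = 0\n"

without_all = star_import(CHEESE)
with_all = star_import("__all__ = ['swiss', 'cheddar']\n" + CHEESE)
print(without_all)  # ['cheddar', 'gouda', 'swiss']
print(with_all)     # ['cheddar', 'swiss']
```

In both cases the underscore-prefixed _secret stays out; only the version with __all__ also hides gouda.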
In the __init__.py file of a package, __all__ is a list of strings with the names of public modules or other objects. Those features are available to wildcard imports. As with modules, __all__ customizes the * when wildcard-importing from the package.[@MartinStettner]
The default case, asterisk with no __all__ for a package, is complicated, because the obvious behavior would be expensive: to use the file system to search for all modules in the package. Instead, in my reading of the docs, only the objects defined in __init__.py are imported:
If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py. It also includes any submodules of the package that were explicitly loaded by previous import statements.
Wildcard imports … should be avoided as they [confuse] readers and many automated tools.
The public names defined by a module are determined by checking the module’s namespace for a variable named __all__; if defined, it must be a sequence of strings which are names defined or imported by that module. The names given in __all__ are all considered public and are required to exist. If __all__ is not defined, the set of public names includes all names found in the module’s namespace which do not begin with an underscore character (“_”). __all__ should contain the entire public API. It is intended to avoid accidentally exporting items that are not part of the API (such as library modules which were imported and used within the module).
PEP 8 uses similar wording, although it also makes it clear that imported names are not part of the public API when __all__ is absent:
To better support introspection, modules should explicitly declare the names in their public API using the __all__ attribute. Setting __all__ to an empty list indicates that the module has no public API.
[…]
Imported names should always be considered an implementation detail. Other modules must not rely on indirect access to such imported names unless they are an explicitly documented part of the containing module’s API, such as os.path or a package’s __init__ module that exposes functionality from submodules.
The import statement uses the following convention: if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered.
Code that is inside a module body (but not in the body of a function or class) may use an asterisk (*) in a from statement:
from foo import *
The * requests that all attributes of module foo (except those beginning with underscores) be bound as global variables in the importing module. When foo has an attribute __all__, the attribute’s value is the list of the names that are bound by this type of from statement.
If foo is a package and its __init__.py defines a list named __all__, it is taken to be the list of submodule names that should be imported when from foo import * is encountered. If __all__ is not defined, the statement from foo import * imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py.
Note that __all__ doesn’t have to be a list. As per the documentation on the import statement, if defined, __all__ must be a sequence of strings which are names defined or imported by the module. So you may as well use a tuple to save some memory and CPU cycles. Just don’t forget a comma in case the module defines a single public name:
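A small sketch of the trailing-comma pitfall follows. The variable names and the string only_public_name are placeholders chosen for this example; in a real module the value would be assigned to __all__.

```python
# Hypothetical public name, for illustration only.
single_ok = ("only_public_name",)   # trailing comma: a one-element tuple
single_bad = ("only_public_name")   # no comma: just a parenthesized string

print(type(single_ok).__name__)   # tuple
print(type(single_bad).__name__)  # str
```

Without the comma the parentheses are only grouping, so the value is a plain string rather than a sequence containing one name.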
(Let’s hope that these variables are meant for use inside one module only.) The conventions are about the same as those for functions.
Modules that are designed for use via from M import * should use the __all__ mechanism to prevent exporting globals, or use the older convention of prefixing such globals with an underscore (which you might want to do to indicate these globals are “module non-public”).
PEP 8 provides coding conventions for the Python code comprising the standard library in the main Python distribution. The more closely you follow it, the closer you are to the original intent.
Whenever the Python interpreter reads a source file, it does two things:
it sets a few special variables like __name__, and then
it executes all of the code found in the file.
Let’s see how this works and how it relates to your question about the __name__ checks we always see in Python scripts.
Code Sample
Let’s use a slightly different code sample to explore how imports and scripts work. Suppose the following is in a file called foo.py.
# Suppose this is foo.py.

print("before import")
import math

print("before functionA")
def functionA():
    print("Function A")

print("before functionB")
def functionB():
    print("Function B {}".format(math.sqrt(100)))

print("before __name__ guard")
if __name__ == '__main__':
    functionA()
    functionB()
print("after __name__ guard")
Special Variables
When the Python interpreter reads a source file, it first defines a few special variables. In this case, we care about the __name__ variable.
When Your Module Is the Main Program
If you are running your module (the source file) as the main program, e.g.
python foo.py
the interpreter will assign the hard-coded string "__main__" to the __name__ variable, i.e.
# It's as if the interpreter inserts this at the top
# of your module when run as the main program.
__name__ = "__main__"
When Your Module Is Imported By Another
On the other hand, suppose some other module is the main program and it imports your module. This means there’s a statement like this in the main program, or in some other module the main program imports:
# Suppose this is in some other main program.
import foo
The interpreter will search for your foo.py file (along with searching for a few other variants), and prior to executing that module, it will assign the name "foo" from the import statement to the __name__ variable, i.e.
# It's as if the interpreter inserts this at the top
# of your module when it's imported from another module.
__name__ = "foo"
Executing the Module’s Code
After the special variables are set up, the interpreter executes all the code in the module, one statement at a time. You may want to open another window on the side with the code sample so you can follow along with this explanation.
Always
It prints the string "before import" (without quotes).
It loads the math module and assigns it to a variable called math. This is equivalent to replacing import math with the following (note that __import__ is a low-level function in Python that takes a string and triggers the actual import):
# Find and load a module given its string name, "math",
# then assign it to a local variable called math.
math = __import__("math")
It prints the string "before functionA".
It executes the def block, creating a function object, then assigning that function object to a variable called functionA.
It prints the string "before functionB".
It executes the second def block, creating another function object, then assigning it to a variable called functionB.
It prints the string "before __name__ guard".
Only When Your Module Is the Main Program
If your module is the main program, then it will see that __name__ was indeed set to "__main__" and it calls the two functions, printing the strings "Function A" and "Function B 10.0".
Only When Your Module Is Imported by Another
(instead) If your module is not the main program but was imported by another one, then __name__ will be "foo", not "__main__", and it’ll skip the body of the if statement.
Always
It will print the string "after __name__ guard" in both situations.
Summary
In summary, here’s what’d be printed in the two cases:
# What gets printed if foo is the main program
before import
before functionA
before functionB
before __name__ guard
Function A
Function B 10.0
after __name__ guard
# What gets printed if foo is imported as a regular module
before import
before functionA
before functionB
before __name__ guard
after __name__ guard
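Both listings can be reproduced from a single process. The sketch below is an approximation, not a real import: it writes the foo.py source to a temporary file and uses the standard library's runpy.run_path to execute it twice, once with __name__ set to "__main__" and once with it set to "foo". The helper name run_foo is made up for this example.

```python
import contextlib
import io
import os
import runpy
import tempfile

FOO_SOURCE = '''\
print("before import")
import math

print("before functionA")
def functionA():
    print("Function A")

print("before functionB")
def functionB():
    print("Function B {}".format(math.sqrt(100)))

print("before __name__ guard")
if __name__ == '__main__':
    functionA()
    functionB()
print("after __name__ guard")
'''


def run_foo(run_name):
    """Execute FOO_SOURCE with __name__ set to run_name; return printed lines."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo.py")
        with open(path, "w") as handle:
            handle.write(FOO_SOURCE)
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            runpy.run_path(path, run_name=run_name)
    return captured.getvalue().splitlines()


as_script = run_foo("__main__")   # like `python foo.py`
as_module = run_foo("foo")        # approximates `import foo`
print(as_script)
print(as_module)
```

The only difference between the two runs is the value of __name__, and that alone decides whether the two function calls inside the guard happen.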
Why Does It Work This Way?
You might naturally wonder why anybody would want this. Well, sometimes you want to write a .py file that can be both used by other programs and/or modules as a module, and can also be run as the main program itself. Examples:
Your module is a library, but you want to have a script mode where it runs some unit tests or a demo.
Your module is only used as a main program, but it has some unit tests, and the testing framework works by importing .py files like your script and running special test functions. You don’t want it to try running the script just because it’s importing the module.
Your module is mostly used as a main program, but it also provides a programmer-friendly API for advanced users.
Beyond those examples, it’s elegant that running a script in Python is just setting up a few magic variables and importing the script. “Running” the script is a side effect of importing the script’s module.
Food for Thought
Question: Can I have multiple __name__ checking blocks? Answer: it’s strange to do so, but the language won’t stop you.
Suppose the following is in foo2.py. What happens if you say python foo2.py on the command-line? Why?
# Suppose this is foo2.py.

def functionA():
    print("a1")
    from foo2 import functionB
    print("a2")
    functionB()
    print("a3")

def functionB():
    print("b")

print("t1")
if __name__ == "__main__":
    print("m1")
    functionA()
    print("m2")
print("t2")
Now, figure out what will happen if you remove the __name__ check in foo3.py:
# Suppose this is foo3.py.

def functionA():
    print("a1")
    from foo3 import functionB
    print("a2")
    functionB()
    print("a3")

def functionB():
    print("b")

print("t1")
print("m1")
functionA()
print("m2")
print("t2")
What will this do when used as a script? When imported as a module?
# Suppose this is in foo4.py
__name__ = "__main__"

def bar():
    print("bar")

print("before __name__ guard")
if __name__ == "__main__":
    bar()
print("after __name__ guard")
When your script is run by passing it as a command to the Python interpreter,
python myscript.py
all of the code that is at indentation level 0 gets executed. Functions and classes that are defined are, well, defined, but none of their code gets run. Unlike other languages, there’s no main() function that gets run automatically – the main() function is implicitly all the code at the top level.
In this case, the top-level code is an if block. __name__ is a built-in variable which evaluates to the name of the current module. However, if a module is being run directly (as in myscript.py above), then __name__ instead is set to the string "__main__". Thus, you can test whether your script is being run directly or being imported by something else by testing
if __name__ == "__main__":
...
If your script is being imported into another module, its various function and class definitions will be imported and its top-level code will be executed, but the code in the then-body of the if clause above won’t get run as the condition is not met. As a basic example, consider the following two scripts:
# file one.py
def func():
    print("func() in one.py")

print("top-level in one.py")

if __name__ == "__main__":
    print("one.py is being run directly")
else:
    print("one.py is being imported into another module")

# file two.py
import one

print("top-level in two.py")
one.func()

if __name__ == "__main__":
    print("two.py is being run directly")
else:
    print("two.py is being imported into another module")
Now, if you invoke the interpreter as
python one.py
The output will be
top-level in one.py
one.py is being run directly
If you run two.py instead:
python two.py
You get
top-level in one.py
one.py is being imported into another module
top-level in two.py
func() in one.py
two.py is being run directly
Thus, when module one gets loaded, its __name__ equals "one" instead of "__main__".
The simplest explanation for the __name__ variable (imho) is the following:
Create the following files.
# a.py
import b
and
# b.py
print("Hello World from %s!" % __name__)
if __name__ == '__main__':
    print("Hello World again from %s!" % __name__)
Running them will get you this output:
$ python a.py
Hello World from b!
As you can see, when a module is imported, Python sets globals()['__name__'] in this module to the module’s name. Also, upon import all the code in the module is being run. As the if statement evaluates to False this part is not executed.
$ python b.py
Hello World from __main__!
Hello World again from __main__!
As you can see, when a file is executed, Python sets globals()['__name__'] in this file to "__main__". This time, the if statement evaluates to True and is being run.
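To see the imported case from inside one process, the following sketch writes a small b.py to a temporary directory and loads it with importlib under the name "b". The helper load_as and the variables seen_name and guard_passed are invented for this illustration.

```python
import importlib.util
import os
import tempfile


def load_as(path, name):
    """Load the file at `path` as a module called `name` (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "b.py")
    with open(path, "w") as handle:
        # The module records what it saw in __name__ while being executed.
        handle.write("seen_name = __name__\n"
                     "guard_passed = (__name__ == '__main__')\n")
    b = load_as(path, "b")

print(b.seen_name)     # b
print(b.guard_passed)  # False
```

Loaded this way, the module sees its own name rather than "__main__", so a guard written inside it would be skipped.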
The global variable, __name__, in the module that is the entry point to your program, is '__main__'. Otherwise, it’s the name you import the module by.
So, code under the if block will only run if the module is the entry point to your program.
It allows the code in the module to be importable by other modules, without executing the code block beneath on import.
Why do we need this?
Developing and Testing Your Code
Say you’re writing a Python script designed to be used as a module:
def do_important():
    """This function does something very important"""
You could test the module by adding this call of the function to the bottom:
do_important()
and running it (on a command prompt) with something like:
~$ python important.py
The Problem
However, if you want to import the module to another script:
import important
On import, the do_important function would be called, so you’d probably comment out your function call, do_important(), at the bottom.
# do_important() # I must remember to uncomment to execute this!
And then you’ll have to remember whether or not you’ve commented out your test function call. And this extra complexity would mean you’re likely to forget, making your development process more troublesome.
A Better Way
The __name__ variable points to the namespace wherever the Python interpreter happens to be at the moment.
Inside an imported module, it’s the name of that module.
But inside the primary module (or an interactive Python session, i.e. the interpreter’s Read, Eval, Print Loop, or REPL) you are running everything from its "__main__".
So if you check before executing:
if __name__ == "__main__":
    do_important()
With the above, your code will only execute when you’re running it as the primary module (or intentionally call it from another script).
An Even Better Way
There’s a Pythonic way to improve on this, though.
What if we want to run this business process from outside the module?
If we put the code we want to exercise as we develop and test in a function like this and then do our check for '__main__' immediately after:
def main():
    """business logic for when running this module as the primary one!"""
    setup()
    foo = do_important()
    bar = do_even_more_important(foo)
    for baz in bar:
        do_super_important(baz)
    teardown()

# Here's our payoff idiom!
if __name__ == '__main__':
    main()
We now have a final function for the end of our module that will run if we run the module as the primary module.
It will allow the module and its functions and classes to be imported into other scripts without running the main function, and will also allow the module (and its functions and classes) to be called when running from a different '__main__' module, i.e.
This module represents the (otherwise anonymous) scope in which the
interpreter’s main program executes — commands read either from
standard input, from a script file, or from an interactive prompt. It
is this environment in which the idiomatic “conditional script” stanza
causes a script to run:
if __name__ == '__main__':
    main()
if __name__ == "__main__" is the part that runs when the script is run from (say) the command line using a command like python myscript.py.
__name__ is a global variable (in Python, global actually means on the module level) that exists in all namespaces. It is typically the module’s name (as a str type).
As the only special case, however, in whatever Python process you run, as in mycode.py:
python mycode.py
the otherwise anonymous global namespace is assigned the value of '__main__' to its __name__.
Thus, ending mycode.py with the if __name__ == '__main__': main() stanza shown above, when it is the primary, entry-point module that is run by a Python process, will cause your script’s uniquely defined main function to run.
Another benefit of using this construct: you can also import your code as a module in another script and then run the main function if and when your program decides:
import mycode
# ... any amount of other code
mycode.main()
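One common way to make such a main function convenient to call from other code is to let it accept its arguments explicitly and return a status code. This is a sketch of that convention under assumed names (the greeting behaviour and the usage message are invented), not something the answer above prescribes.

```python
import sys


def main(argv=None):
    """Hypothetical entry point: greet each name passed as an argument."""
    if argv is None:
        # Fall back to the real command line when run as a script.
        argv = sys.argv[1:]
    if not argv:
        print("usage: mycode.py NAME [NAME ...]")
        return 2
    for name in argv:
        print("hello, {}".format(name))
    return 0


if __name__ == "__main__":
    # Runs only when executed directly; scripts often wrap this call
    # in sys.exit(...) to pass the status code to the shell.
    main()
```

An importing program can then call main(["alice", "bob"]) with its own arguments and inspect the returned code, without going through the command line.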
There are lots of different takes here on the mechanics of the code in question, the “How”, but for me none of it made sense until I understood the “Why”. This should be especially helpful for new programmers.
Take file “ab.py”:
def a():
    print('A function in ab file')

a()
And a second file “xy.py”:
import ab

def main():
    print('main function: this is where the action is')

def x():
    print('peripheral task: might be useful in other projects')

x()
if __name__ == "__main__":
    main()
What is this code actually doing?
When you execute xy.py, you import ab. The import statement runs the module immediately on import, so ab‘s operations get executed before the remainder of xy‘s. Once finished with ab, it continues with xy.
The interpreter keeps track of which scripts are running with __name__. When you run a script – no matter what you’ve named it – the interpreter calls it "__main__", making it the master or ‘home’ script that gets returned to after running an external script.
Any other script that’s called from this "__main__" script is assigned its filename, minus the .py extension, as its __name__ (e.g., __name__ == "ab"). Hence, the line if __name__ == "__main__": is the interpreter’s test to determine if it’s interpreting/parsing the ‘home’ script that was initially executed, or if it’s temporarily peeking into another (external) script. This gives the programmer flexibility to have the script behave differently if it’s executed directly vs. called externally.
Let’s step through the above code to understand what’s happening, focusing first on the unindented lines and the order they appear in the scripts. Remember that function – or def – blocks don’t do anything by themselves until they’re called. What the interpreter might say if it mumbled to itself:
Open xy.py as the ‘home’ file; call it "__main__" in the __name__ variable.
Import and open the file with __name__ == "ab".
Oh, a function. I’ll remember that.
Ok, function a(); I just learned that. Printing ‘A function in ab file’.
End of file; back to "__main__"!
Oh, a function. I’ll remember that.
Another one.
Function x(); ok, printing ‘peripheral task: might be useful in other projects’.
What’s this? An if statement. Well, the condition has been met (the variable __name__ has been set to "__main__"), so I’ll enter the main() function and print ‘main function: this is where the action is’.
The bottom two lines mean: “If this is the "__main__" or ‘home’ script, execute the function called main()”. That’s why you’ll see a def main(): block up top, which contains the main flow of the script’s functionality.
Why implement this?
Remember what I said earlier about import statements? When you import a module it doesn’t just ‘recognize’ it and wait for further instructions – it actually runs all the executable operations contained within the script. So, putting the meat of your script into the main() function effectively quarantines it, putting it in isolation so that it won’t immediately run when imported by another script.
Again, there will be exceptions, but common practice is that main() doesn’t usually get called externally. So you may be wondering one more thing: if we’re not calling main(), why are we calling the script at all? It’s because many people structure their scripts with standalone functions that are built to be run independent of the rest of the code in the file. They’re then later called somewhere else in the body of the script. Which brings me to this:
But the code works without it
Yes, that’s right. These separate functions can be called from an in-line script that’s not contained inside a main() function. If you’re accustomed (as I am, in my early learning stages of programming) to building in-line scripts that do exactly what you need, and to figuring it all out again whenever you need that operation again, then you’re not used to this kind of internal structure in your code, because it’s more complicated to build and not as intuitive to read.
But that’s a script that probably can’t have its functions called externally, because if it did it would immediately start calculating and assigning variables. And chances are if you’re trying to re-use a function, your new script is related closely enough to the old one that there will be conflicting variables.
In splitting out independent functions, you gain the ability to re-use your previous work by calling them into another script. For example, “example.py” might import “xy.py” and call x(), making use of the ‘x’ function from “xy.py”. (Maybe it’s capitalizing the third word of a given text string; creating a NumPy array from a list of numbers and squaring them; or detrending a 3D surface. The possibilities are limitless.)
(As an aside, this question contains an answer by @kindall that finally helped me to understand – the why, not the how. Unfortunately it’s been marked as a duplicate of this one, which I think is a mistake.)
When there are certain statements in our module (M.py) we want to be executed when it’ll be running as main (not imported), we can place those statements (test-cases, print statements) under this if block.
As by default (when module running as main, not imported) the __name__ variable is set to "__main__", and when it’ll be imported the __name__ variable will get a different value, most probably the name of the module ('M').
This is helpful in running different variants of a modules together, and separating their specific input & output statements and also if there are any test-cases.
In short, use this if __name__ == "__main__" block to prevent (certain) code from being run when the module is imported.
Put simply, __name__ is a variable defined for each script that defines whether the script is being run as the main module or it is being run as an imported module.
Script1's name is script1
Script 2's name: __main__
As you can see, __name__ tells us which code is the ‘main’ module.
This is great, because you can just write code and not have to worry about structural issues like in C/C++, where, if a file does not implement a ‘main’ function then it cannot be compiled as an executable and if it does, it cannot then be used as a library.
Say you write a Python script that does something great and you implement a boatload of functions that are useful for other purposes. If I want to use them I can just import your script and use them without executing your program (given that your code only executes within the if __name__ == "__main__": context). Whereas in C/C++ you would have to portion out those pieces into a separate module that then includes the file. Picture the situation below;
The arrows are import links. For three modules each trying to include the previous modules code there are six files (nine, counting the implementation files) and five links. This makes it difficult to include other code into a C project unless it is compiled specifically as a library. Now picture it for Python:
You write a module, and if someone wants to use your code they just import it and the __name__ variable can help to separate the executable portion of the program from the library part.
...
<Block A>
if __name__ == '__main__':
<Block B>
...
Blocks A and B are run when we are running x.py.
But just block A (and not B) is run when we are running another module, y.py for example, in which x.py is imported and the code is run from there (like when a function in x.py is called from y.py).
When you run Python interactively the local __name__ variable is assigned a value of __main__. Likewise, when you execute a Python module from the command line, rather than importing it into another module, its __name__ attribute is assigned a value of __main__, rather than the actual name of the module. In this way, modules can look at their own __name__ value to determine for themselves how they are being used, whether as support for another program or as the main application executed from the command line. Thus, the following idiom is quite common in Python modules:
if __name__ == '__main__':
    # Do something appropriate here, like calling a
    # main() function defined elsewhere in this module.
    main()
else:
    # Do nothing. This module has been imported by another
    # module that wants to make use of the functions,
    # classes and other useful bits it has defined.
    pass
It checks whether the __name__ attribute of the Python script is "__main__". In other words, if the program itself is executed, the attribute will be "__main__", so the guarded code will be executed (in this case the main() function).
However, if your Python script is used by a module, only the code outside of the if statement will be executed. So if __name__ == "__main__" is used just to check whether the program is being run directly or used as a module, and therefore decides whether to run the code.
Before explaining anything about if __name__ == '__main__' it is important to understand what __name__ is and what it does.
What is __name__?
__name__ is a DunderAlias – can be thought of as a global variable (accessible from modules) and works in a similar way to global.
It is a string (global as mentioned above) as indicated by type(__name__) (yielding <class 'str'>), and is an inbuilt standard for both Python 3 and Python 2 versions.
Where:
It can not only be used in scripts but can also be found in both the interpreter and modules/packages.
Interpreter:
>>> print(__name__)
__main__
>>>
Script:
test_file.py:
print(__name__)
Resulting in __main__
Module or package:
somefile.py:
def somefunction():
    print(__name__)
test_file.py:
import somefile
somefile.somefunction()
Resulting in somefile
Notice that when used in a package or module, __name__ takes the name of the file. The path of the actual module or package path is not given, but has its own DunderAlias __file__, that allows for this.
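For a module that lives inside a package, __name__ is the full dotted module name rather than just the file name. A quick check with a standard-library module (chosen here purely as an example) shows the difference between __name__ and __file__:

```python
import os
import xml.etree.ElementTree as ET

module_name = ET.__name__
file_name = os.path.basename(ET.__file__)

print(module_name)  # xml.etree.ElementTree
print(file_name)    # the file on disk, typically ElementTree.py
```

The .py extension never appears in __name__; the location on disk is only available through __file__.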
You should see that __name__ in the main file (or program) will always be __main__, while in a module or package, or anything that is running off some other Python script, it will be the name of the file it originated from.
Practice:
Being a variable means that its value can be overwritten (“can” does not mean “should”); overwriting the value of __name__ will result in a lack of readability. So do not do it, for any reason. If you need a variable, define a new variable.
It is always assumed that the value of __name__ is __main__ or the name of the file. Once again, changing this default value will cause more confusion than good, causing problems further down the line.
It is considered good practice in general to include the if __name__ == '__main__' in scripts.
Now to answer if __name__ == '__main__':
Now we know the behaviour of __name__ things become clearer:
An if is a flow control statement whose block of code executes if the condition given is true. We have seen that __name__ can take either __main__ or the name of the file it has been imported from.
This means that if __name__ is equal to __main__ then the file must be the main file and must actually be running (or it is the interpreter), not a module or package imported into the script.
If indeed __name__ does take the value of __main__ then whatever is in that block of code will execute.
This tells us that if the file running is the main file (or you are running from the interpreter directly) then that condition must execute. If it is a package then it should not, and the value will not be __main__.
Modules:
__name__ can also be used in modules to define the name of a module
Variants:
It is also possible to do other, less common but useful things with __name__, some I will show here:
Executing only if the file is a module or package:
if __name__ != '__main__':
    # Do some useful things
    pass
Running one condition if the file is the main one and another if it is not:
if __name__ == '__main__':
    # Execute something
    pass
else:
    # Do some useful things
    pass
You can also use it to provide runnable help functions/utilities on packages and modules without the elaborate use of libraries.
It also allows modules to be run from the command line as main scripts, which can be also very useful.
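As a rough sketch of a module that doubles as a small command-line utility, the example below uses argparse from the standard library. The names square and main, and the default value of 3, are assumptions made for illustration only.

```python
import argparse


def square(n):
    """The reusable part: other modules can import and call this."""
    return n * n


def main(argv=None):
    """Command-line entry point; argv=None means 'use sys.argv[1:]'."""
    parser = argparse.ArgumentParser(description="Print the square of an integer.")
    # The argument is optional here (arbitrary default of 3) to keep the sketch simple.
    parser.add_argument("number", type=int, nargs="?", default=3)
    args = parser.parse_args(argv)
    print(square(args.number))
    return 0


if __name__ == "__main__":
    # Only runs when the file is executed directly, not when it is imported.
    main()
```

Importing this file gives you square() with no side effects, while running it directly parses the command line. Passing argv explicitly, as in main(["5"]), also makes the entry point easy to call from other code.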
I think it’s best to break the answer in depth and in simple words:
__name__: Every module in Python has a special attribute called __name__.
It is a built-in variable that returns the name of the module.
__main__: Like other programming languages, Python too has an execution entry point, i.e., main. '__main__' is the name of the scope in which top-level code executes. Basically you have two ways of using a Python module: Run it directly as a script, or import it. When a module is run as a script, its __name__ is set to __main__.
Thus, the value of the __name__ attribute is set to __main__ when the module is run as the main program. Otherwise the value of __name__ is set to contain the name of the module.
It is a special case for when a Python file is called from the command line. This is typically used to call a “main()” function or execute other appropriate startup code, like command-line argument handling for instance.
It could be written in several ways. Another is:
def some_function_for_instance_main():
    dosomething()

__name__ == '__main__' and some_function_for_instance_main()
I am not saying you should use this in production code, but it serves to illustrate that there is nothing “magical” about if __name__ == '__main__'. It is a good convention for invoking a main function in Python files.
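The reason the one-liner works is ordinary short-circuit evaluation of and. Here is a small sketch that passes the name in as a parameter so that both outcomes can be seen in one run; guard and calls are made-up names.

```python
calls = []


def main():
    calls.append("main ran")
    return True


def guard(name):
    # Same shape as `__name__ == '__main__' and main()`:
    # the right-hand side is evaluated only if the comparison is true.
    return name == "__main__" and main()


print(guard("some_module"), calls)
print(guard("__main__"), calls)
```

With a name other than __main__ the comparison is False and main() is never called; with __main__ the call happens. The if statement expresses the same check in a more readable form.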
There are a number of variables that the system (Python interpreter) provides for source files (modules). You can get their values anytime you want, so, let us focus on the __name__ variable/attribute:
When Python loads a source code file, it executes all of the code found in it. (Note that it doesn’t call all of the methods and functions defined in the file, but it does define them.)
Before the interpreter executes the source code file though, it defines a few special variables for that file; __name__ is one of those special variables that Python automatically defines for each source code file.
If Python is loading this source code file as the main program (i.e. the file you run), then it sets the special __name__ variable for this file to have a value “__main__”.
If this is being imported from another module, __name__ will be set to that module’s name.
The code block under if __name__ == "__main__": will be executed only when you run the module directly; the code block will not execute if another module is calling/importing it, because the value of __name__ will not be equal to "__main__" in that particular instance.
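To see a few of the special variables that the interpreter defines for a module, you can inspect any imported module. The sketch below uses the standard json package purely as an example; describe_module is a made-up helper.

```python
import json


def describe_module(module):
    """Collect a few of the special attributes Python sets on a module."""
    return {
        "name": module.__name__,
        "file": getattr(module, "__file__", None),  # built-in modules may lack it
        "package": module.__package__,
    }


info = describe_module(json)
for key, value in info.items():
    print(key, "=", value)
```

For an imported module, name is the module's own name (here json) rather than __main__.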
Hope this helps out.
Answer 16
if __name__ == "__main__": basically checks for the top-level script environment; it tells the interpreter, “I have the highest priority and should be executed first.”
'__main__' is the name of the scope in which top-level code executes. A module’s __name__ is set equal to '__main__' when read from standard input, a script, or from an interactive prompt.
if __name__ == "__main__":
    # Execute only if run as a script
    main()
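The statement that __name__ is '__main__' for code read from the command line or from standard input can be checked with a short sketch. It starts two child interpreters using the subprocess module; the helper name top_level_name is made up for illustration.

```python
import subprocess
import sys


def top_level_name(args, stdin_text=None):
    """Run a child Python interpreter and return what it prints."""
    completed = subprocess.run(
        [sys.executable, *args],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# Code passed with -c runs as top-level code.
from_command = top_level_name(["-c", "print(__name__)"])
# "-" tells the interpreter to read the program from standard input.
from_stdin = top_level_name(["-"], stdin_text="print(__name__)\n")

print(from_command)
print(from_stdin)
```

Both child interpreters print __main__.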
I have read through many of the answers on this page. I would say that if you already know the concept, you will certainly understand those answers; otherwise, you may still be confused.
In short, you need to know several points:
1. The import a statement actually runs everything that can be run in a.py.
2. Because of point 1, you may not want everything in a.py to run when importing it.
3. To solve the problem in point 2, Python allows you to put in a condition check.
4. __name__ is an implicit variable in all .py modules; when a.py is imported, the value of __name__ in the a.py module is set to its module name, “a”; when a.py is run directly using “python a.py”, which means a.py is the entry point, the value of __name__ in the a.py module is set to the string “__main__”.
5. Based on how Python sets the variable __name__ for each module, do you know how to achieve point 3? The answer is fairly easy, right? Put in an if condition: if __name__ == "__main__": ...; you can even use if __name__ == "a", depending on your functional need.
The important thing that is special about Python is point 4! The rest is just basic logic.
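Points 1 to 5 can be tried out without creating any files. The sketch below executes the same source text with different values of __name__ by passing a custom globals dictionary to exec. This only imitates what the import system does, and SOURCE and run_with_name are made-up names.

```python
# Stand-in for the contents of a.py.
SOURCE = '''
executed = ["top-level code"]
if __name__ == "__main__":
    executed.append("main guard")
if __name__ == "a":
    executed.append("imported-as-a guard")
'''


def run_with_name(name):
    """Execute SOURCE as if its module were called `name`."""
    namespace = {"__name__": name}
    exec(SOURCE, namespace)
    return namespace["executed"]


print(run_with_name("__main__"))
print(run_with_name("a"))
print(run_with_name("b"))
```

The top-level code runs every time (point 1), and each guarded block runs only when __name__ has the matching value (points 4 and 5).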
Answer 18
Consider:
print(__name__)
The output of the above is __main__.
if __name__ == "__main__":
    print("direct method")
The above condition is true and prints “direct method”. If this file is imported by another file, it does not print “direct method”, because on import __name__ is set to the name of the imported module rather than "__main__".
Answer 19
You can make the file usable as a script as well as an importable module.
fibo.py (a module named fibo)
# Other modules can IMPORT this MODULE to use the function fib
def fib(n):  # write Fibonacci series up to n
    a, b = 0, 1
    while b < n:
        print(b, end=' ')
        a, b = b, a+b
    print()

# This allows the file to be used as a SCRIPT
if __name__ == "__main__":
    import sys
    fib(int(sys.argv[1]))
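A possible variation on fibo.py, shown only as a sketch: return the series as a list so that importing code can reuse the values, and fall back to a default limit when no command-line argument is given. The names fib_list and main and the default of 100 are assumptions, not part of the original module.

```python
import sys


def fib_list(limit):
    """Return the Fibonacci numbers below `limit` as a list (hypothetical helper)."""
    series = []
    a, b = 0, 1
    while b < limit:
        series.append(b)
        a, b = b, a + b
    return series


def main(argv):
    # Fall back to an arbitrary default limit when no argument is given.
    limit = int(argv[1]) if len(argv) > 1 else 100
    print(*fib_list(limit))
    return 0


if __name__ == "__main__":
    # Script use, e.g.: python fibo.py 50
    main(sys.argv)
```

Another module can then write from fibo import fib_list and work with the returned list, while running the file directly still prints the series.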
The reason for if __name__ == "__main__": main() is primarily to avoid the import lock problems that would arise from having code directly imported. You want main() to run if your file was directly invoked (that’s the __name__ == "__main__" case), but if your code was imported then the importer has to enter your code from the true main module to avoid import lock problems.
A side-effect is that you automatically sign on to a methodology that supports multiple entry points. You can run your program using main() as the entry point, but you don’t have to. While setup.py expects main(), other tools use alternate entry points. For example, to run your file as a gunicorn process, you define an app() function instead of a main(). Just as with setup.py, gunicorn imports your code, so you don’t want it to do anything while it’s being imported (because of the import lock issue).
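Here is a rough sketch of one file offering two entry points, assuming that app is a plain WSGI callable (the kind of object a WSGI server such as gunicorn can be pointed at). The names build_greeting, main and app, and the response text, are made up for illustration.

```python
def build_greeting(name="world"):
    """Shared logic used by both entry points."""
    return f"Hello, {name}!"


def main():
    """Entry point for running the file directly."""
    print(build_greeting())
    return 0


def app(environ, start_response):
    """Entry point for a WSGI server: a minimal WSGI application callable."""
    body = build_greeting("web").encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


if __name__ == "__main__":
    # Nothing above runs at import time; only direct invocation reaches main().
    main()
```

Importing the module only defines the functions, so whichever tool imports it can choose its own entry point.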
This answer is for Java programmers learning Python.
Every Java file typically contains one public class. You can use that class in two ways:
Call the class from other files. You just have to import it in the calling program.
Run the class stand alone, for testing purposes.
For the latter case, the class should contain a public static void main() method. In Python this purpose is served by the globally defined label '__main__'.
Answer 22
The code under if __name__ == '__main__': will only be executed if the module is invoked as a script.
As an example consider the following module my_test_module.py:
# my_test_module.py
print('This is going to be printed out, no matter what')

if __name__ == '__main__':
    print('This is going to be printed out, only if user invokes the module as a script')
1st possibility: Import my_test_module.py in another module
# main.py
import my_test_module

if __name__ == '__main__':
    print('Hello from main.py')
Now if you invoke main.py:
python main.py
>> 'This is going to be printed out, no matter what'
>> 'Hello from main.py'
Note that only the top-level print() statement in my_test_module is executed.
2nd possibility: Invoke my_test_module.py as a script
Now if you run my_test_module.py as a Python script, both print() statements will be executed:
python my_test_module.py
>>> 'This is going to be printed out, no matter what'
>>> 'This is going to be printed out, only if user invokes the module as a script'
Every module in Python has an attribute called __name__. The value of the __name__ attribute is __main__ when the module is run directly, like python my_module.py. Otherwise (like when you say import my_module) the value of __name__ is the name of the module.
Small example to explain in short.
# Script test.py
apple = 42

def hello_world():
    print("I am inside hello_world")

if __name__ == "__main__":
    print("Value of __name__ is: ", __name__)
    print("Going to call hello_world")
    hello_world()
We can execute this directly as
python test.py
Output
Value of __name__ is: __main__
Going to call hello_world
I am inside hello_world
Now suppose we call the above script from another script
#script external_calling.py
import test
print(test.apple)
test.hello_world()
print(test.__name__)
When you execute this
python external_calling.py
Output
42
I am inside hello_world
test
So, the above shows that when you import test from another script, the if __name__ == "__main__" block in test.py will not execute.
All the answers have pretty much explained the functionality. But I will provide one example of its usage which might help clarify the concept further.
Assume that you have two Python files, a.py and b.py. Now, a.py imports b.py. We run the a.py file, where the “import b” statement is executed first. Before the rest of the a.py code runs, the code in the file b.py must run completely.
In the b.py code there is some code that is exclusive to that file b.py, and we don’t want any other file (other than the b.py file) that has imported b.py to run it.
So that is what this line of code checks. Only if b.py itself is the main file being run, which in this case it is not (a.py is the main file running), does that code get executed.
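Answer 27
Create a file, a.py:
print(__name__)  # It will print out __main__
__name__ is always equal to __main__ when that file is run directly, indicating that it is the main file.
Create another file, b.py, in the same directory:
import a  # Prints a
Run it. It will print a, i.e., the name of the file that was imported.
So, to show two different behaviors of the same file, this is a commonly used trick:
# Code to be run when imported into another python file
if __name__ == '__main__':
    # Code to be run only when run directly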