Python 实用宝典

Question 1

Is there a straightforward way to find all the modules that are part of a python package? I’ve found this old discussion, which is not really conclusive, but I’d love to have a definite answer before I roll out my own solution based on os.listdir().

Question 2

Yes, you want something based on pkgutil or similar — this way you can treat all packages alike regardless if they are in eggs or zips or so (where os.listdir won’t help).

import pkgutil

# this is the package we are inspecting -- for example 'email' from stdlib
import email

package = email
for importer, modname, ispkg in pkgutil.iter_modules(package.__path__):
    print "Found submodule %s (is a package: %s)" % (modname, ispkg)

How to import them too? You can just use __import__ as normal:

import pkgutil

# this is the package we are inspecting -- for example 'email' from stdlib
import email

package = email
prefix = package.__name__ + "."
for importer, modname, ispkg in pkgutil.iter_modules(package.__path__, prefix):
    print "Found submodule %s (is a package: %s)" % (modname, ispkg)
    module = __import__(modname, fromlist="dummy")
    print "Imported", module

Question 3

The right tool for this job is pkgutil.walk_packages.

To list all the modules on your system:

import pkgutil
for importer, modname, ispkg in pkgutil.walk_packages(path=None, onerror=lambda x: None):
    print(modname)

Be aware that walk_packages imports all subpackages, but not submodules.

If you wish to list all submodules of a certain package then you can use something like this:

import pkgutil
import scipy
package=scipy
for importer, modname, ispkg in pkgutil.walk_packages(path=package.__path__,
                                                      prefix=package.__name__+'.',
                                                      onerror=lambda x: None):
    print(modname)

iter_modules only lists the modules which are one-level deep. walk_packages gets all the submodules. In the case of scipy, for example, walk_packages returns

scipy.stats.stats

while iter_modules only returns

scipy.stats

The documentation on pkgutil (http://docs.python.org/library/pkgutil.html) does not list all the interesting functions defined in /usr/lib/python2.6/pkgutil.py.

Perhaps this means the functions are not part of the “public” interface and are subject to change.

However, at least as of Python 2.6 (and perhaps earlier versions?) pkgutil comes with a walk_packages method which recursively walks through all the modules available.

Question 4

This works for me:

import types

for key, obj in nltk.__dict__.iteritems():
    if type(obj) is types.ModuleType: 
        print key

Question 5

I was looking for a way to reload all submodules that I’m editing live in my package. It is a combination of the answers/comments above, so I’ve decided to post it here as an answer rather than a comment.

package=yourPackageName
import importlib
import pkgutil
for importer, modname, ispkg in pkgutil.walk_packages(path=package.__path__, prefix=package.__name__+'.', onerror=lambda x: None):
    try:
        modulesource = importlib.import_module(modname)
        reload(modulesource)
        print("reloaded: {}".format(modname))
    except Exception as e:
        print('Could not load {} {}'.format(modname, e))

Question 6

Here’s one way, off the top of my head:

>>> import os
>>> filter(lambda i: type(i) == type(os), [getattr(os, j) for j in dir(os)])
[<module 'UserDict' from '/usr/lib/python2.5/UserDict.pyc'>, <module 'copy_reg' from '/usr/lib/python2.5/copy_reg.pyc'>, <module 'errno' (built-in)>, <module 'posixpath' from '/usr/lib/python2.5/posixpath.pyc'>, <module 'sys' (built-in)>]

It could certainly be cleaned up and improved.

EDIT: Here’s a slightly nicer version:

>>> [m[1] for m in filter(lambda a: type(a[1]) == type(os), os.__dict__.items())]
[<module 'copy_reg' from '/usr/lib/python2.5/copy_reg.pyc'>, <module 'UserDict' from '/usr/lib/python2.5/UserDict.pyc'>, <module 'posixpath' from '/usr/lib/python2.5/posixpath.pyc'>, <module 'errno' (built-in)>, <module 'sys' (built-in)>]
>>> [m[0] for m in filter(lambda a: type(a[1]) == type(os), os.__dict__.items())]
['_copy_reg', 'UserDict', 'path', 'errno', 'sys']

NOTE: This will also find modules that might not necessarily be located in a subdirectory of the package, if they’re pulled in in its __init__.py file, so it depends on what you mean by “part of” a package.

Question 7

As far as I know, Python has 3 ways of finding out what operating system is running on:

os.name
sys.platform
platform.system()

Knowing this information is often useful in conditional imports, or using functionality that differs between platforms (e.g. time.clock() on Windows v.s. time.time() on UNIX).

My question is, why 3 different ways of doing this? When should one way be used and not another? Which way is the ‘best’ (most future-proof or least likely to accidentally exclude a particular system which your program can actually run on)?

It seems like sys.platform is more specific than os.name, allowing you to distinguish win32 from cygwin (as opposed to just nt), and linux2 from darwin (as opposed to just posix). But if that’s so, that what about the difference between sys.platform and platform.system()?

For example, which is better, this:

import sys
if sys.platform == 'linux2':
    # Do Linux-specific stuff

or this? :

import platform
if platform.system() == 'Linux':
    # Do Linux-specific stuff

For now I’ll be sticking to sys.platform, so this question isn’t particularly urgent, but I would be very grateful for some clarification regarding this.

Question 8

Dived a bit into the source code.

The output of sys.platform and os.name are determined at compile time. platform.system() determines the system type at run time.

sys.platform is specified as a compiler define during the build configuration.
os.name checks whether certain os specific modules are available (e.g. posix, nt, …)
platform.system() actually runs uname and potentially several other functions to determine the system type at run time.

My suggestion:

Use os.name to check whether it’s a posix-compliant system.
Use sys.platform to check whether it’s a linux, cygwin, darwin, atheos, etc.
Use platform.system() if you don’t believe the other sources.

Question 9

There is a thin line difference between platform.system() and sys.platform and interestingly for most cases platform.system() degenerates to sys.platform

Here is what the Source Python2.7\Lib\Platform.py\system says

def system():

    """ Returns the system/OS name, e.g. 'Linux', 'Windows' or 'Java'.

        An empty string is returned if the value cannot be determined.

    """
    return uname()[0]

def uname():
    # Get some infos from the builtin os.uname API...
    try:
        system,node,release,version,machine = os.uname()
    except AttributeError:
        no_os_uname = 1

    if no_os_uname or not filter(None, (system, node, release, version, machine)):
        # Hmm, no there is either no uname or uname has returned
        #'unknowns'... we'll have to poke around the system then.
        if no_os_uname:
            system = sys.platform
            release = ''
            version = ''
            node = _node()
            machine = ''

Also per the documentation

os.uname()

Return a 5-tuple containing information identifying the current operating system. The tuple contains 5 strings: (sysname, nodename, release, version, machine). Some systems truncate the nodename to 8 characters or to the leading component; a better way to get the hostname is socket.gethostname() or even socket.gethostbyaddr(socket.gethostname()).
Availability: recent flavors of Unix.

Question 10

From sys.platform docs:

os.name has a coarser granularity
os.uname() gives system-dependent version information
The platform module provides detailed checks for the system’s identity

Often the “best” future-proof way to test whether some functionality is available is just to try to use it and use a fallback if it fails.

what about the difference between sys.platform and platform.system()?

platform.system() returns a normalized value that it might get from several sources: os.uname(), sys.platform, ver command (on Windows).

Question 11

It depends on whether you prefer raising exception or trying anything on an untested system and whether your code is so high level or so low level that it can or can’t work on a similar untested system (e.g. untested Mac – ‘posix’ or on embedded ARM systems). More pythonic is to not enumerate all known systems but to test possible relevant properties. (e.g. it is considered important the endianess of the system but unimportant multiprocessing properties.)

os.name is a sufficient resolution for the correct usage of os module. Possible values are ‘posix’, ‘nt’, ‘os2’, ‘ce’, ‘java’ or ‘riscos’ in Python 2.7, while only the ‘posix’, ‘nt’ and ‘java’ are used since Python 3.4.
sys.platform is a finer resolution. It is recommended to use if sys.platform.startswith('linux') idiom because “linux2” means a Linux kernel version 2.xx or 3. Older kernels are currently never used. In Python 3.3 are all Linux systems simple ‘linux’.

I do not know the specifics of “Mac” and “Java” systems and so I can not use the results of very good method platform.system() for branching, but I would use advantages of the platform module for messages and error logging.

Question 12

I believe the platform module is probably preferred for new code. The others existed before it. It is an evolution, and the others remain for backwards compatibility.

Question 13

The python style guide suggests to group imports like this:

Imports should be grouped in the following order:

standard library imports

related third party imports

local application/library specific imports

However, it does not mention anything how the two different ways of imports should be laid out:

from foo import bar
import foo

There are multiple ways to sort them (let’s assume all those import belong to the same group):

first from..import, then import

from g import gg
from x import xx
import abc
import def
import x

first import, then from..import

import abc
import def
import x
from g import gg
from x import xx

alphabetic order by module name, ignoring the kind of import

import abc
import def
from g import gg
import x
from xx import xx

PEP8 does not mention the preferred order for this and the “cleanup imports” features some IDEs have probably just do whatever the developer of that feature preferred.

I’m looking for another PEP clarifying this or a relevant comment/email from the BDFL (or another Python core developer). Please don’t post subjective answers stating your own preference.

Question 14

Imports are generally sorted alphabetically and described in various places beside PEP 8.

Alphabetically sorted modules are quicker to read and searchable. After all python is all about readability. Also It is easier to verify that something is imported, and avoids duplicated imports

There is nothing available in PEP 8 regarding sorting.So its all about choice what you use.

According to few references from reputable sites and repositories also popularity, Alphabetical ordering is the way.

for eg like this:

import httplib
import logging
import random
import StringIO
import time
import unittest
from nova.api import openstack
from nova.auth import users
from nova.endpoint import cloud

OR

import a_standard
import b_standard

import a_third_party
import b_third_party

from a_soc import f
from a_soc import g
from b_soc import d

Reddit official repository also states that, In general PEP-8 import ordering should be used. However there are a few additions which is

for each imported group the order of imports should be:
import <package>.<module> style lines in alphabetical order
from <package>.<module> import <symbol> style in alphabetical order

References:

PS: the isort utility automatically sorts your imports.

Question 15

According to the CIA’s internal coding conventions (part of the WikiLeaks Vault 7 leak), python imports should be grouped into three groups:

Standard library imports
Third-party imports
Application-specific imports

Imports should be ordered lexicographically within these groups, ignoring case:

import foo
from foo import bar
from foo.bar import baz
from foo.bar import Quux
from Foob import ar

Question 16

The PEP 8 says nothing about it indeed. There’s no convention for this point, and it doesn’t mean the Python community need to define one absolutely. A choice can be better for a project but the worst for another… It’s a question of preferences for this, since each solutions has pro and cons. But if you want to follow conventions, you have to respect the principal order you quoted:

standard library imports
related third party imports
local application/library specific imports

For example, Google recommend in this page that import should be sorted lexicographically, in each categories (standard/third parties/yours). But at Facebook, Yahoo and whatever, it’s maybe another convention…

Question 17

I highly recommend reorder-python-imports. It follows the 2nd option of the accepted answer and also integrates into pre-commit, which is super helpful.

Question 18

All import x statements should be sorted by the value of x and all from x import y statements should be sorted by the value of x in alphabetical order and the sorted groups of from x import y statements must follow the sorted group of import x statements.

import abc
import def
import x
from g import gg
from x import xx
from z import a

Question 19

I feel like the accepted answer is a bit too verbose. Here is TLDR:

Within each grouping, imports should be sorted lexicographically, ignoring case, according to each module’s full package path

Google code style guide

So, the third option is correct:

import abc
import def
from g import yy  # changed gg->yy for illustrative purposes
import x
from xx import xx

Question 20

Given a string of a Python class, e.g. my_package.my_module.MyClass, what is the best possible way to load it?

In other words I am looking for a equivalent Class.forName() in Java, function in Python. It needs to work on Google App Engine.

Preferably this would be a function that accepts the FQN of the class as a string, and returns a reference to the class:

my_class = load_class('my_package.my_module.MyClass')
my_instance = my_class()

Question 21

From the python documentation, here’s the function you want:

def my_import(name):
    components = name.split('.')
    mod = __import__(components[0])
    for comp in components[1:]:
        mod = getattr(mod, comp)
    return mod

The reason a simple __import__ won’t work is because any import of anything past the first dot in a package string is an attribute of the module you’re importing. Thus, something like this won’t work:

__import__('foo.bar.baz.qux')

You’d have to call the above function like so:

my_import('foo.bar.baz.qux')

Or in the case of your example:

klass = my_import('my_package.my_module.my_class')
some_object = klass()

EDIT: I was a bit off on this. What you’re basically wanting to do is this:

from my_package.my_module import my_class

The above function is only necessary if you have a empty fromlist. Thus, the appropriate call would be like this:

mod = __import__('my_package.my_module', fromlist=['my_class'])
klass = getattr(mod, 'my_class')

Question 22

If you don’t want to roll your own, there is a function available in the pydoc module that does exactly this:

from pydoc import locate
my_class = locate('my_package.my_module.MyClass')

The advantage of this approach over the others listed here is that locate will find any python object at the provided dotted path, not just an object directly within a module. e.g. my_package.my_module.MyClass.attr.

If you’re curious what their recipe is, here’s the function:

def locate(path, forceload=0):
    """Locate an object by name or dotted path, importing as necessary."""
    parts = [part for part in split(path, '.') if part]
    module, n = None, 0
    while n < len(parts):
        nextmodule = safeimport(join(parts[:n+1], '.'), forceload)
        if nextmodule: module, n = nextmodule, n + 1
        else: break
    if module:
        object = module
    else:
        object = __builtin__
    for part in parts[n:]:
        try:
            object = getattr(object, part)
        except AttributeError:
            return None
    return object

It relies on pydoc.safeimport function. Here are the docs for that:

"""Import a module; handle errors; return None if the module isn't found.

If the module *is* found but an exception occurs, it's wrapped in an
ErrorDuringImport exception and reraised.  Unlike __import__, if a
package path is specified, the module at the end of the path is returned,
not the package at the beginning.  If the optional 'forceload' argument
is 1, we reload the module from disk (unless it's a dynamic extension)."""

Question 23

import importlib

module = importlib.import_module('my_package.my_module')
my_class = getattr(module, 'MyClass')
my_instance = my_class()

Question 24

def import_class(cl):
    d = cl.rfind(".")
    classname = cl[d+1:len(cl)]
    m = __import__(cl[0:d], globals(), locals(), [classname])
    return getattr(m, classname)

Question 25

If you’re using Django you can use this. Yes i’m aware OP did not ask for django, but i ran across this question looking for a Django solution, didn’t find one, and put it here for the next boy/gal that looks for it.

# It's available for v1.7+
# https://github.com/django/django/blob/stable/1.7.x/django/utils/module_loading.py
from django.utils.module_loading import import_string

Klass = import_string('path.to.module.Klass')
func = import_string('path.to.module.func')
var = import_string('path.to.module.var')

Keep in mind, if you want to import something that doesn’t have a ., like re or argparse use:

re = __import__('re')

Question 26

Here is to share something I found on __import__ and importlib while trying to solve this problem.

I am using Python 3.7.3.

When I try to get to the class d in module a.b.c,

mod = __import__('a.b.c')

The mod variable refer to the top namespace a.

So to get to the class d, I need to

mod = getattr(mod, 'b') #mod is now module b
mod = getattr(mod, 'c') #mod is now module c
mod = getattr(mod, 'd') #mod is now class d

If we try to do

mod = __import__('a.b.c')
d = getattr(mod, 'd')

we are actually trying to look for a.d.

When using importlib, I suppose the library has done the recursive getattr for us. So, when we use importlib.import_module, we actually get a handle on the deepest module.

mod = importlib.import_module('a.b.c') #mod is module c
d = getattr(mod, 'd') #this is a.b.c.d

Question 27

OK, for me that is the way it worked (I am using Python 2.7):

a = __import__('file_to_import', globals(), locals(), ['*'], -1)
b = a.MyClass()

Then, b is an instance of class ‘MyClass’

Question 28

If you happen to already have an instance of your desired class, you can use the ‘type’ function to extract its class type and use this to construct a new instance:

class Something(object):
    def __init__(self, name):
        self.name = name
    def display(self):
        print(self.name)

one = Something("one")
one.display()
cls = type(one)
two = cls("two")
two.display()

Question 29

module = __import__("my_package/my_module")
the_class = getattr(module, "MyClass")
obj = the_class()

Question 30

In Google App Engine there is a webapp2 function called import_string. For more info see here:https://webapp-improved.appspot.com/api/webapp2.html

So,

import webapp2
my_class = webapp2.import_string('my_package.my_module.MyClass')

For example this is used in the webapp2.Route where you can either use a handler or a string.

Question 31

I have answered a question regarding absolute imports in Python, which I thought I understood based on reading the Python 2.5 changelog and accompanying PEP. However, upon installing Python 2.5 and attempting to craft an example of properly using from __future__ import absolute_import, I realize things are not so clear.

Straight from the changelog linked above, this statement accurately summarized my understanding of the absolute import change:

Let’s say you have a package directory like this:
pkg/
pkg/__init__.py
pkg/main.py
pkg/string.py
This defines a package named pkg containing the pkg.main and pkg.string submodules.

Consider the code in the main.py module. What happens if it executes the statement import string? In Python 2.4 and earlier, it will first look in the package’s directory to perform a relative import, finds pkg/string.py, imports the contents of that file as the pkg.string module, and that module is bound to the name "string" in the pkg.main module’s namespace.

So I created this exact directory structure:

$ ls -R
.:
pkg/

./pkg:
__init__.py  main.py  string.py

__init__.py and string.py are empty. main.py contains the following code:

import string
print string.ascii_uppercase

As expected, running this with Python 2.5 fails with an AttributeError:

$ python2.5 pkg/main.py
Traceback (most recent call last):
  File "pkg/main.py", line 2, in <module>
    print string.ascii_uppercase
AttributeError: 'module' object has no attribute 'ascii_uppercase'

However, further along in the 2.5 changelog, we find this (emphasis added):

In Python 2.5, you can switch import‘s behaviour to absolute imports using a from __future__ import absolute_import directive. This absolute-import behaviour will become the default in a future version (probably Python 2.7). Once absolute imports are the default, import string will always find the standard library’s version.

I thus created pkg/main2.py, identical to main.py but with the additional future import directive. It now looks like this:

from __future__ import absolute_import
import string
print string.ascii_uppercase

Running this with Python 2.5, however… fails with an AttributeError:

$ python2.5 pkg/main2.py
Traceback (most recent call last):
  File "pkg/main2.py", line 3, in <module>
    print string.ascii_uppercase
AttributeError: 'module' object has no attribute 'ascii_uppercase'

This pretty flatly contradicts the statement that import string will always find the std-lib version with absolute imports enabled. What’s more, despite the warning that absolute imports are scheduled to become the “new default” behavior, I hit this same problem using both Python 2.7, with or without the __future__ directive:

$ python2.7 pkg/main.py
Traceback (most recent call last):
  File "pkg/main.py", line 2, in <module>
    print string.ascii_uppercase
AttributeError: 'module' object has no attribute 'ascii_uppercase'

$ python2.7 pkg/main2.py
Traceback (most recent call last):
  File "pkg/main2.py", line 3, in <module>
    print string.ascii_uppercase
AttributeError: 'module' object has no attribute 'ascii_uppercase'

as well as Python 3.5, with or without (assuming the print statement is changed in both files):

$ python3.5 pkg/main.py
Traceback (most recent call last):
  File "pkg/main.py", line 2, in <module>
    print(string.ascii_uppercase)
AttributeError: module 'string' has no attribute 'ascii_uppercase'

$ python3.5 pkg/main2.py
Traceback (most recent call last):
  File "pkg/main2.py", line 3, in <module>
    print(string.ascii_uppercase)
AttributeError: module 'string' has no attribute 'ascii_uppercase'

I have tested other variations of this. Instead of string.py, I have created an empty module — a directory named string containing only an empty __init__.py — and instead of issuing imports from main.py, I have cd‘d to pkg and run imports directly from the REPL. Neither of these variations (nor a combination of them) changed the results above. I cannot reconcile this with what I have read about the __future__ directive and absolute imports.

It seems to me that this is easily explicable by the following (this is from the Python 2 docs but this statement remains unchanged in the same docs for Python 3):

sys.path

(…)

As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first.

So what am I missing? Why does the __future__ statement seemingly not do what it says, and what is the resolution of this contradiction between these two sections of documentation, as well as between described and actual behavior?

Question 32

The changelog is sloppily worded. from __future__ import absolute_import does not care about whether something is part of the standard library, and import string will not always give you the standard-library module with absolute imports on.

from __future__ import absolute_import means that if you import string, Python will always look for a top-level string module, rather than current_package.string. However, it does not affect the logic Python uses to decide what file is the string module. When you do

python pkg/script.py

pkg/script.py doesn’t look like part of a package to Python. Following the normal procedures, the pkg directory is added to the path, and all .py files in the pkg directory look like top-level modules. import string finds pkg/string.py not because it’s doing a relative import, but because pkg/string.py appears to be the top-level module string. The fact that this isn’t the standard-library string module doesn’t come up.

To run the file as part of the pkg package, you could do

python -m pkg.script

In this case, the pkg directory will not be added to the path. However, the current directory will be added to the path.

You can also add some boilerplate to pkg/script.py to make Python treat it as part of the pkg package even when run as a file:

if __name__ == '__main__' and __package__ is None:
    __package__ = 'pkg'

However, this won’t affect sys.path. You’ll need some additional handling to remove the pkg directory from the path, and if pkg‘s parent directory isn’t on the path, you’ll need to stick that on the path too.

Question 33

The difference between absolute and relative imports come into play only when you import a module from a package and that module imports an other submodule from that package. See the difference:

$ mkdir pkg
$ touch pkg/__init__.py
$ touch pkg/string.py
$ echo 'import string;print(string.ascii_uppercase)' > pkg/main1.py
$ python2
Python 2.7.9 (default, Dec 13 2014, 18:02:08) [GCC] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg.main1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pkg/main1.py", line 1, in <module>
    import string;print(string.ascii_uppercase)
AttributeError: 'module' object has no attribute 'ascii_uppercase'
>>> 
$ echo 'from __future__ import absolute_import;import string;print(string.ascii_uppercase)' > pkg/main2.py
$ python2
Python 2.7.9 (default, Dec 13 2014, 18:02:08) [GCC] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg.main2
ABCDEFGHIJKLMNOPQRSTUVWXYZ
>>>

In particular:

$ python2 pkg/main2.py
Traceback (most recent call last):
  File "pkg/main2.py", line 1, in <module>
    from __future__ import absolute_import;import string;print(string.ascii_uppercase)
AttributeError: 'module' object has no attribute 'ascii_uppercase'
$ python2
Python 2.7.9 (default, Dec 13 2014, 18:02:08) [GCC] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkg.main2
ABCDEFGHIJKLMNOPQRSTUVWXYZ
>>> 
$ python2 -m pkg.main2
ABCDEFGHIJKLMNOPQRSTUVWXYZ

Note that python2 pkg/main2.py has a different behaviour then launching python2 and then importing pkg.main2 (which is equivalent to using the -m switch).

If you ever want to run a submodule of a package always use the -m switch which prevents the interpreter for chaining the sys.path list and correctly handles the semantics of the submodule.

Also, I much prefer using explicit relative imports for package submodules since they provide more semantics and better error messages in case of failure.

Question 34

It is recommended to not to use import * in Python.

Can anyone please share the reason for that, so that I can avoid it doing next time?

Question 35

Because it puts a lot of stuff into your namespace (might shadow some other object from previous import and you won’t know about it).
Because you don’t know exactly what is imported and can’t easily find from which module a certain thing was imported (readability).
Because you can’t use cool tools like pyflakes to statically detect errors in your code.

Question 36

According to the Zen of Python:

Explicit is better than implicit.

… can’t argue with that, surely?

Question 37

You don’t pass **locals() to functions, do you?

Since Python lacks an “include” statement, and the self parameter is explicit, and scoping rules are quite simple, it’s usually very easy to point a finger at a variable and tell where that object comes from — without reading other modules and without any kind of IDE (which are limited in the way of introspection anyway, by the fact the language is very dynamic).

The import * breaks all that.

Also, it has a concrete possibility of hiding bugs.

import os, sys, foo, sqlalchemy, mystuff
from bar import *

Now, if the bar module has any of the “os“, “mystuff“, etc… attributes, they will override the explicitly imported ones, and possibly point to very different things. Defining __all__ in bar is often wise — this states what will implicitly be imported – but still it’s hard to trace where objects come from, without reading and parsing the bar module and following its imports. A network of import * is the first thing I fix when I take ownership of a project.

Don’t misunderstand me: if the import * were missing, I would cry to have it. But it has to be used carefully. A good use case is to provide a facade interface over another module. Likewise, the use of conditional import statements, or imports inside function/class namespaces, requires a bit of discipline.

I think in medium-to-big projects, or small ones with several contributors, a minimum of hygiene is needed in terms of statical analysis — running at least pyflakes or even better a properly configured pylint — to catch several kind of bugs before they happen.

Of course since this is python — feel free to break rules, and to explore — but be wary of projects that could grow tenfold, if the source code is missing discipline it will be a problem.

Question 38

It is OK to do from ... import * in an interactive session.

Question 39

That is because you are polluting the namespace. You will import all the functions and classes in your own namespace, which may clash with the functions you define yourself.

Furthermore, I think using a qualified name is more clear for the maintenance task; you see on the code line itself where a function comes from, so you can check out the docs much more easily.

In module foo:

def myFunc():
    print 1

In your code:

from foo import *

def doThis():
    myFunc() # Which myFunc is called?

def myFunc():
    print 2

Question 40

http://docs.python.org/tutorial/modules.html

Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code.

Question 41

Say you have the following code in a module called foo:

import ElementTree as etree

and then in your own module you have:

from lxml import etree
from foo import *

You now have a difficult-to-debug module that looks like it has lxml’s etree in it, but really has ElementTree instead.

Question 42

These are all good answers. I’m going to add that when teaching new people to code in Python, dealing with import * is very difficult. Even if you or they didn’t write the code, it’s still a stumbling block.

I teach children (about 8 years old) to program in Python to manipulate Minecraft. I like to give them a helpful coding environment to work with (Atom Editor) and teach REPL-driven development (via bpython). In Atom I find that the hints/completion works just as effectively as bpython. Luckily, unlike some other statistical analysis tools, Atom is not fooled by import *.

However, lets take this example… In this wrapper they from local_module import * a bunch modules including this list of blocks. Let’s ignore the risk of namespace collisions. By doing from mcpi.block import * they make this entire list of obscure types of blocks something that you have to go look at to know what is available. If they had instead used from mcpi import block, then you could type walls = block. and then an autocomplete list would pop up.

Question 43

Understood the valid points people put here. However, I do have one argument that, sometimes, “star import” may not always be a bad practice:

When I want to structure my code in such a way that all the constants go to a module called const.py:
- If I do import const, then for every constant, I have to refer it as const.SOMETHING, which is probably not the most convenient way.
- If I do from const import SOMETHING_A, SOMETHING_B ..., then obviously it’s way too verbose and defeats the purpose of the structuring.
- Thus I feel in this case, doing a from const import * may be a better choice.

Question 44

It is a very BAD practice for two reasons:

Code Readability
Risk of overriding the variables/functions etc

For point 1: Let’s see an example of this:

from module1 import *
from module2 import *
from module3 import *

a = b + c - d

Here, on seeing the code no one will get idea regarding from which module b, c and d actually belongs.

On the other way, if you do it like:

#                   v  v  will know that these are from module1
from module1 import b, c   # way 1
import module2             # way 2

a = b + c - module2.d
#            ^ will know it is from module2

It is much cleaner for you, and also the new person joining your team will have better idea.

For point 2: Let say both module1 and module2 have variable as b. When I do:

from module1 import *
from module2 import *

print b  # will print the value from module2

Here the value from module1 is lost. It will be hard to debug why the code is not working even if b is declared in module1 and I have written the code expecting my code to use module1.b

If you have same variables in different modules, and you do not want to import entire module, you may even do:

from module1 import b as mod1b
from module2 import b as mod2b

Question 45

As a test, I created a module test.py with 2 functions A and B, which respectively print “A 1” and “B 1”. After importing test.py with:

import test

. . . I can run the 2 functions as test.A() and test.B(), and “test” shows up as a module in the namespace, so if I edit test.py I can reload it with:

import importlib
importlib.reload(test)

But if I do the following:

from test import *

there is no reference to “test” in the namespace, so there is no way to reload it after an edit (as far as I can tell), which is a problem in an interactive session. Whereas either of the following:

import test
import test as tt

will add “test” or “tt” (respectively) as module names in the namespace, which will allow re-loading.

If I do:

from test import *

the names “A” and “B” show up in the namespace as functions. If I edit test.py, and repeat the above command, the modified versions of the functions do not get reloaded.

And the following command elicits an error message.

importlib.reload(test)    # Error - name 'test' is not defined

If someone knows how to reload a module loaded with “from module import *”, please post. Otherwise, this would be another reason to avoid the form:

from module import *

Question 46

As suggested in the docs, you should (almost) never use import * in production code.

While importing * from a module is bad, importing * from a package is probably even worse.

By default, from package import * imports whatever names are defined by the package’s __init__.py, including any submodules of the package that were loaded by previous import statements.

If a package’s __init__.py code defines a list named __all__, it is taken to be the list of submodule names that should be imported when from package import * is encountered.

Now consider this example (assuming there’s no __all__ defined in sound/effects/__init__.py):

# anywhere in the code before import *
import sound.effects.echo
import sound.effects.surround

# in your module
from sound.effects import *

The last statement will import the echo and surround modules into the current namespace (possibly overriding previous definitions) because they are defined in the sound.effects package when the import statement is executed.

Question 47

Module A includes import B at its top. However under test conditions I’d like to mock B in A (mock A.B) and completely refrain from importing B.

In fact, B isn’t installed in the test environment on purpose.

A is the unit under test. I have to import A with all its functionality. B is the module I need to mock. But how can I mock B within A and stop A from importing the real B, if the first thing A does is import B?

(The reason B isn’t installed is that I use pypy for quick testing and unfortunately B isn’t compatible with pypy yet.)

How could this be done?

Question 48

You can assign to sys.modules['B'] before importing A to get what you want:

test.py:

import sys
sys.modules['B'] = __import__('mock_B')
import A

print(A.B.__name__)

A.py:

import B

Note B.py does not exist, but when running test.py no error is returned and print(A.B.__name__) prints mock_B. You still have to create a mock_B.py where you mock B‘s actual functions/variables/etc. Or you can just assign a Mock() directly:

test.py:

import sys
sys.modules['B'] = Mock()
import A

Question 49

The builtin __import__ can be mocked with the ‘mock’ library for more control:

# Store original __import__
orig_import = __import__
# This will be the B module
b_mock = mock.Mock()

def import_mock(name, *args):
    if name == 'B':
        return b_mock
    return orig_import(name, *args)

with mock.patch('__builtin__.__import__', side_effect=import_mock):
    import A

Say A looks like:

import B

def a():
    return B.func()

A.a() returns b_mock.func() which can be mocked also.

b_mock.func.return_value = 'spam'
A.a()  # returns 'spam'

Note for Python 3: As stated in the changelog for 3.0, __builtin__ is now named builtins:

Renamed module __builtin__ to builtins (removing the underscores, adding an ‘s’).

The code in this answer works fine if you replace __builtin__ by builtins for Python 3.

Question 50

How to mock an import, (mock A.B)?

Module A includes import B at its top.

Easy, just mock the library in sys.modules before it gets imported:

if wrong_platform():
    sys.modules['B'] = mock.MagicMock()

and then, so long as A doesn’t rely on specific types of data being returned from B’s objects:

import A

should just work.

You can also mock `import A.B`:

This works even if you have submodules, but you’ll want to mock each module. Say you have this:

from foo import This, That, andTheOtherThing
from foo.bar import Yada, YadaYada
from foo.baz import Blah, getBlah, boink

To mock, simply do the below before the module that contains the above is imported:

sys.modules['foo'] = MagicMock()
sys.modules['foo.bar'] = MagicMock()
sys.modules['foo.baz'] = MagicMock()

(My experience: I had a dependency that works on one platform, Windows, but didn’t work on Linux, where we run our daily tests. So I needed to mock the dependency for our tests. Luckily it was a black box, so I didn’t need to set up a lot of interaction.)

Mocking Side Effects

Addendum: Actually, I needed to simulate a side-effect that took some time. So I needed an object’s method to sleep for a second. That would work like this:

sys.modules['foo'] = MagicMock()
sys.modules['foo.bar'] = MagicMock()
sys.modules['foo.baz'] = MagicMock()
# setup the side-effect:
from time import sleep

def sleep_one(*args): 
    sleep(1)

# this gives us the mock objects that will be used
from foo.bar import MyObject 
my_instance = MyObject()
# mock the method!
my_instance.method_that_takes_time = mock.MagicMock(side_effect=sleep_one)

And then the code takes some time to run, just like the real method.

Question 51

I realize I’m a bit late to the party here, but here’s a somewhat insane way to automate this with the mock library:

(here’s an example usage)

import contextlib
import collections
import mock
import sys

def fake_module(**args):
    return (collections.namedtuple('module', args.keys())(**args))

def get_patch_dict(dotted_module_path, module):
    patch_dict = {}
    module_splits = dotted_module_path.split('.')

    # Add our module to the patch dict
    patch_dict[dotted_module_path] = module

    # We add the rest of the fake modules in backwards
    while module_splits:
        # This adds the next level up into the patch dict which is a fake
        # module that points at the next level down
        patch_dict['.'.join(module_splits[:-1])] = fake_module(
            **{module_splits[-1]: patch_dict['.'.join(module_splits)]}
        )
        module_splits = module_splits[:-1]

    return patch_dict

with mock.patch.dict(
    sys.modules,
    get_patch_dict('herp.derp', fake_module(foo='bar'))
):
    import herp.derp
    # prints bar
    print herp.derp.foo

The reason this is so ridiculously complicated is when an import occurs python basically does this (take for example from herp.derp import foo)

Does sys.modules['herp'] exist? Else import it. If still not ImportError
Does sys.modules['herp.derp'] exist? Else import it. If still not ImportError
Get attribute foo of sys.modules['herp.derp']. Else ImportError
foo = sys.modules['herp.derp'].foo

There are some downsides to this hacked together solution: If something else relies on other stuff in the module path this kind of screws it over. Also this only works for stuff that is being imported inline such as

def foo():
    import herp.derp

or

def foo():
    __import__('herp.derp')

Question 52

Aaron Hall’s answer works for me. Just want to mention one important thing,

if in A.py you do

from B.C.D import E

then in test.py you must mock every module along the path, otherwise you get ImportError

sys.modules['B'] = mock.MagicMock()
sys.modules['B.C'] = mock.MagicMock()
sys.modules['B.C.D'] = mock.MagicMock()

Question 53

I found fine way to mock the imports in Python. It’s Eric’s Zaadi solution found here which I just use inside my Django application.

I’ve got class SeatInterface which is interface to Seat model class. So inside my seat_interface module I have such an import:

from ..models import Seat

class SeatInterface(object):
    (...)

I wanted to create isolated tests for SeatInterface class with mocked Seat class as FakeSeat. The problem was – how tu run tests offline, where Django application is down. I had below error:

ImproperlyConfigured: Requested setting BASE_DIR, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.

Ran 1 test in 0.078s

FAILED (errors=1)

The solution was:

import unittest
from mock import MagicMock, patch

class FakeSeat(object):
    pass

class TestSeatInterface(unittest.TestCase):

    def setUp(self):
        models_mock = MagicMock()
        models_mock.Seat.return_value = FakeSeat
        modules = {'app.app.models': models_mock}
        patch.dict('sys.modules', modules).start()

    def test1(self):
        from app.app.models_interface.seat_interface import SeatInterface

And then test magically runs OK :)

.
Ran 1 test in 0.002s

OK

Question 54

If you do an import ModuleB you are really calling the builtin method __import__ as:

ModuleB = __import__('ModuleB', globals(), locals(), [], -1)

You could overwrite this method by importing the __builtin__ module and make a wrapper around the __builtin__.__import__method. Or you could play with the NullImporter hook from the imp module. Catching the exception and Mock your module/class in the except-block.

Pointer to the relevant docs:

docs.python.org: __import__

Accessing Import internals with the imp Module

I hope this helps. Be HIGHLY adviced that you step into the more arcane perimeters of python programming and that a) solid understanding what you really want to achieve and b)thorough understanding of the implications is important.

Question 55

In Python, is it possible to define an alias for an imported module?

For instance:

import a_ridiculously_long_module_name

…so that is has an alias of ‘short_name’.

Question 56

import a_ridiculously_long_module_name as short_name

also works for

import module.submodule.subsubmodule as short_name

Question 57

Check here

import module as name

or

from relative_module import identifier as name

Question 58

If you’ve done:

import long_module_name

you can also give it an alias by:

lmn = long_module_name

There’s no reason to do it this way in code, but I sometimes find it useful in the interactive interpreter.

Question 59

Yes, modules can be imported under an alias name. using as keyword. See

import math as ilovemaths # here math module is imported under an alias name
print(ilovemaths.sqrt(4))  # Using the sqrt() function

Question 60

from MODULE import TAGNAME as ALIAS

Question 61

According to the official documentation, os.path is a module. Thus, what is the preferred way of importing it?

# Should I always import it explicitly?
import os.path

Or…

# Is importing os enough?
import os

Please DON’T answer “importing os works for me”. I know, it works for me too right now (as of Python 2.6). What I want to know is any official recommendation about this issue. So, if you answer this question, please post your references.

Question 62

os.path works in a funny way. It looks like os should be a package with a submodule path, but in reality os is a normal module that does magic with sys.modules to inject os.path. Here’s what happens:

When Python starts up, it loads a bunch of modules into sys.modules. They aren’t bound to any names in your script, but you can access the already-created modules when you import them in some way.
- sys.modules is a dict in which modules are cached. When you import a module, if it already has been imported somewhere, it gets the instance stored in sys.modules.
os is among the modules that are loaded when Python starts up. It assigns its path attribute to an os-specific path module.
It injects sys.modules['os.path'] = path so that you’re able to do “import os.path” as though it was a submodule.

I tend to think of os.path as a module I want to use rather than a thing in the os module, so even though it’s not really a submodule of a package called os, I import it sort of like it is one and I always do import os.path. This is consistent with how os.path is documented.

Incidentally, this sort of structure leads to a lot of Python programmers’ early confusion about modules and packages and code organization, I think. This is really for two reasons

If you think of os as a package and know that you can do import os and have access to the submodule os.path, you may be surprised later when you can’t do import twisted and automatically access twisted.spread without importing it.
It is confusing that os.name is a normal thing, a string, and os.path is a module. I always structure my packages with empty __init__.py files so that at the same level I always have one type of thing: a module/package or other stuff. Several big Python projects take this approach, which tends to make more structured code.

Question 63

As per PEP-20 by Tim Peters, “Explicit is better than implicit” and “Readability counts”. If all you need from the os module is under os.path, import os.path would be more explicit and let others know what you really care about.

Likewise, PEP-20 also says “Simple is better than complex”, so if you also need stuff that resides under the more-general os umbrella, import os would be preferred.

Question 64

Definitive answer: import os and use os.path. do not import os.path directly.

From the documentation of the module itself:

>>> import os
>>> help(os.path)
...
Instead of importing this module directly, import os and refer to
this module as os.path.  The "os.path" name is an alias for this
module on Posix systems; on other systems (e.g. Mac, Windows),
os.path provides the same operations in a manner specific to that
platform, and is an alias to another module (e.g. macpath, ntpath).
...

问题：列出所有属于python软件包的模块吗？

回答 0

回答 1

回答 2

回答 3

回答 4

问题：何时使用os.name，sys.platform或platform.system？

回答 0

回答 1

回答 2

回答 3

回答 4

问题：排序Python`import x`和`from x import y`语句的正确方法是什么？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

问题：如何动态加载Python类

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

问题：从__future__导入absolute_import实际起什么作用？

系统路径

sys.path

回答 0

回答 1

问题：为什么“导入*”不好？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

问题：如何模拟导入

回答 0

回答 1

回答 2

如何模拟导入（模拟AB）？

您还可以嘲笑import A.B：

模拟副作用

How to mock an import, (mock A.B)?

You can also mock import A.B:

Mocking Side Effects

回答 3

回答 4

回答 5

回答 6

问题：您可以在Python中为导入的模块定义别名吗？

回答 0

回答 1

回答 2

回答 3

回答 4

问题：我应该使用`import os.path`还是`import os`？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

问题：如何导入上面目录中的Python类？

回答 0

回答 1

回答 2

回答 3

问题：从future导入absolute_import实际起什么作用？

您还可以嘲笑`import A.B`：

You can also mock `import A.B`: