The first line of the Rationale section of PEP 338 says:
Python 2.4 adds the command line switch -m to allow modules to be located using the Python module namespace for execution as scripts. The motivating examples were standard library modules such as pdb and profile, and the Python 2.4 implementation is fine for this limited purpose.
So you can specify any module in Python’s search path this way, not just files in the current directory. You’re correct that, for a plain script in the current directory, python -m mymod1 mymod2.py args has exactly the same effect as python mymod1.py mymod2.py args. The first line of the Scope of this proposal section states:
In Python 2.4, a module located using -m is executed just as if its filename had been provided on the command line.
With -m more is possible, like working with modules which are part of a package, etc. That’s what the rest of PEP 338 is about. Read it for more info.
Despite this question having been asked and answered several times (e.g., here, here, here, and here) in my opinion no existing answer fully or concisely captures all the implications of the -m flag. Therefore, the following will attempt to improve on what has come before.
Introduction (TLDR)
The -m flag does a lot of things, not all of which will be needed all the time. In short it can be used to: (1) execute python code from the command line via modulename rather than filename (2) add a directory to sys.path for use in import resolution and (3) execute python code that contains relative imports from the command line.
Preliminaries
To explain the -m flag we first need to explain a little terminology.
Python’s primary organizational unit is known as a module. Modules come in one of two flavors: code modules and package modules. A code module is any file that contains Python executable code. A package module is a directory that contains other modules (either code modules or package modules). The most common type of code module is a *.py file, while the most common type of package module is a directory containing an __init__.py file.
Python allows modules to be uniquely identified in two distinct ways: modulename and filename. In general, modules are identified by modulename in Python code (e.g., import <modulename>) and by filename on the command line (e.g., python <filename>). All python interpreters are able to convert modulenames to filenames by following the same few, well-defined rules. These rules hinge on the sys.path variable. By altering this variable one can change how Python resolves modulenames into filenames (for more on how this is done see PEP 302).
All modules (both code and package) can be executed (i.e., code associated with the module will be evaluated by the Python interpreter). Depending on the execution method (and module type) what code gets evaluated, and when, can change quite a bit. For example, if one executes a package module via python -m <modulename> then the package’s __init__.py will be evaluated followed by its __main__.py. On the other hand, if one executes that same package module via import <modulename> then only the package’s __init__.py will be executed.
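For example, given a hypothetical package laid out like this (the name pkg and the print calls are invented purely for illustration), the different execution methods behave as follows:

# Hypothetical layout:
#   pkg/__init__.py   contains: print("init")
#   pkg/__main__.py   contains: print("main")

import pkg              # evaluates pkg/__init__.py only -> prints "init"

# From a shell:
#   python -m pkg       # evaluates __init__.py then __main__.py -> prints "init" then "main"
#   python pkg          # evaluates __main__.py only -> prints "main"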
Historical Development of -m
The -m flag was first introduced in Python 2.4.1. Initially its only purpose was to provide an alternative means of identifying the python module to execute from the command line. That is, if we knew both the <filename> and <modulename> for a module then the following two commands were equivalent: python <filename> <args> and python -m <modulename> <args>. One constraint with this iteration, according to PEP 338, was that -m only worked with top level modulenames (i.e., modules that could be found directly on sys.path without any intervening package modules).
With the completion of PEP 338 the -m feature was extended to support <modulename> representations beyond the top level. This meant names such as http.server were now fully supported. This extension also meant that each parent package in modulename was now evaluated (i.e., all parent package __init__.py files were evaluated) in addition to the module referenced by the modulename itself.
The final major feature enhancement for -m came with PEP 366. With this upgrade -m gained the ability to support not only absolute imports but also explicit relative imports when executing modules. This was achieved by changing -m so that it set the __package__ variable to the parent module of the given modulename (in addition to everything else it already did).
Use Cases
There are two notable use cases for the -m flag:
To execute modules from the command line for which one may not know their filename. This use case takes advantage of the fact that the Python interpreter knows how to convert modulenames to filenames. This is particularly advantageous when one wants to run stdlib or 3rd-party modules from the command line. For example, very few people know the filename for the http.server module but most people do know its modulename, so we can execute it from the command line using python -m http.server.
To execute a local package containing absolute or relative imports without needing to install it. This use case is detailed in PEP 338 and leverages the fact that the current working directory is added to sys.path rather than the module’s directory. This use case is very similar to using pip install -e . to install a package in develop/edit mode.
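As a rough sketch of this second use case (the project layout and names below are invented), running the package with -m from the project root makes its relative imports work without installing anything:

# Hypothetical layout, run from the directory that contains mypkg/:
#   mypkg/__init__.py
#   mypkg/helpers.py      contains: def greet(): print("hello")
#   mypkg/__main__.py     contains the two lines below

from .helpers import greet   # relative import resolves under "python -m mypkg"
greet()

# python -m mypkg              -> prints "hello"
# python mypkg/__main__.py     -> fails with "attempted relative import with no known parent package"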
Shortcomings
With all the enhancements made to -m over the years it still has one major shortcoming: it can only execute modules written in Python (i.e., *.py). For example, if -m is used to execute a C compiled code module the following error will be produced: No code object available for <modulename> (see here for more details).
Detailed Comparisons
Effects of module execution via import statement (i.e., import <modulename>):
sys.path is not modified in any way
__name__ is set to the absolute form of <modulename>
__package__ is set to the immediate parent package in <modulename>
__init__.py is evaluated for all packages (including its own for package modules)
__main__.py is not evaluated for package modules; the code is evaluated for code modules
Effects of module execution via command line (i.e., python <filename>):
sys.path is modified to include the directory containing <filename>
__name__ is set to '__main__'
__package__ is set to None
__init__.py is not evaluated for any package (including its own for package modules)
__main__.py is evaluated for package modules; the code is evaluated for code modules.
Effects of module execution via command line with the -m flag (i.e., python -m <modulename>):
sys.path is modified to include the current directory
__name__ is set to '__main__'
__package__ is set to the immediate parent package in <modulename>
__init__.py is evaluated for all packages (including its own for package modules)
__main__.py is evaluated for package modules; the code is evaluated for code modules
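To see these differences directly, one can drop a small probe module into a package and run it each way (a minimal sketch; mypkg and probe are invented names):

# Hypothetical file: mypkg/probe.py
print("__name__    =", __name__)
print("__package__ =", __package__)

# Per the comparisons above, one should observe roughly:
#   import mypkg.probe        -> __name__ == "mypkg.probe", __package__ == "mypkg"
#   python mypkg/probe.py     -> __name__ == "__main__",    __package__ is None
#   python -m mypkg.probe     -> __name__ == "__main__",    __package__ == "mypkg"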
Conclusion
The -m flag is, at its simplest, a means to execute python scripts from the command line by using modulenames rather than filenames. The real power of -m, however, is in its ability to combine the power of import statements (e.g., support for explicit relative imports and automatic package __init__ evaluation) with the convenience of the command line.
I have a Python program I’m building that can be run in either of 2 ways: the first is to call “python main.py” which prompts the user for input in a friendly manner and then runs the user input through the program. The other way is to call “python batch.py -file-” which will pass over all the friendly input gathering and run an entire file’s worth of input through the program in a single go.
The problem is that when I run “batch.py” it imports some variables/methods/etc from “main.py”, and when it runs this code:
import main
at the first line of the program, it immediately errors because it tries to run the code in “main.py”.
How can I stop Python from running the code contained in the “main” module which I’m importing?
Because this is just how Python works – keywords such as class and def are not declarations. Instead, they are real live statements which are executed. If they were not executed your module would be .. empty :-)
Anyway, the idiomatic approach is:
# stuff to run always here such as class/def
def main():
    pass

if __name__ == "__main__":
    # stuff only to run when not called via 'import' here
    main()
Due to the way Python works, it is necessary for it to run your modules when it imports them.
To prevent code in the module from being executed when imported, but only when run directly, you can guard it with this if:
if __name__ == "__main__":
    # this won't be run when imported
You may want to put this code in a main() method, so that you can either execute the file directly, or import the module and call the main(). For example, assume this is in the file foo.py.
def main():
    print "Hello World"

if __name__ == "__main__":
    main()
This program can be run either by going python foo.py, or from another Python script:
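For instance, from another script or an interactive session (a minimal sketch):

import foo
foo.main()    # prints "Hello World"; nothing ran automatically when foo was imported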
Use the if __name__ == '__main__' idiom — __name__ is a special variable whose value is '__main__' if the module is being run as a script, and the module name if it’s imported. So you’d do something like
# imports
# class/function definitions

if __name__ == '__main__':
    # code here will only run when you invoke 'python main.py'
Unfortunately, you don’t. That is part of how the import syntax works and it is important that it does so — remember def is actually something executed, if Python did not execute the import, you’d be, well, stuck without functions.
Since you probably have access to the file, though, you might be able to look and see what causes the error. It might be possible to modify your environment to prevent the error from happening.
Put the code inside a function and it won’t run until you call the function. You should have a main function in your main.py, with the statement:
if __name__ == '__main__':
    main()
Then, if you call python main.py the main() function will run. If you import main.py, it will not. Also, you should probably rename main.py to something else for clarity’s sake.
There was a Python enhancement proposal PEP 299 which aimed to replace the if __name__ == '__main__': idiom with def __main__:, but it was rejected. It’s still a good read to know what to keep in mind when using if __name__ == '__main__':.
Although you cannot use import without running the code, there is quite a swift way in which you can input your variables: by using numpy.savez, which stores variables as numpy arrays in a .npz file. Afterwards you can load the variables using numpy.load.
Try just importing the functions needed from main.py? So,
from main import SomeFunction
It could be that you’ve named a function in batch.py the same as one in main.py, and when you import main.py the program runs the main.py function instead of the batch.py function; doing the above should fix that. I hope.
I have a file with some probabilities for different values e.g.:
1 0.1
2 0.05
3 0.05
4 0.2
5 0.4
6 0.2
I would like to generate random numbers using this distribution. Is there an existing module that handles this? It’s fairly simple to code on your own (build the cumulative distribution function, generate a random value in [0,1] and pick the corresponding value) but it seems like this should be a common problem and probably someone has created a function/module for it.
I need this because I want to generate a list of birthdays (which do not follow any distribution in the standard random module).
scipy.stats.rv_discrete might be what you want. You can supply your probabilities via the values parameter. You can then use the rvs() method of the distribution object to generate random numbers.
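A minimal sketch of that approach, using the values and probabilities from the question:

from scipy import stats

xk = (1, 2, 3, 4, 5, 6)                    # the values
pk = (0.1, 0.05, 0.05, 0.2, 0.4, 0.2)      # their probabilities
custom = stats.rv_discrete(name='custom', values=(xk, pk))

print(custom.rvs(size=10))                 # e.g. [5 4 5 1 6 5 5 4 2 5]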
As pointed out by Eugene Pakhomov in the comments, you can also pass a p keyword parameter to numpy.random.choice(), e.g.
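For example, a short sketch with the question’s distribution:

import numpy as np

values = [1, 2, 3, 4, 5, 6]
probs = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]
print(np.random.choice(values, size=10, p=probs))   # e.g. [5 4 6 5 1 5 4 5 6 2]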
Since Python 3.6, there’s a solution for this in Python’s standard library, namely random.choices.
Example usage: let’s set up a population and weights matching those in the OP’s question:
>>> from random import choices
>>> population = [1, 2, 3, 4, 5, 6]
>>> weights = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]
Now choices(population, weights) generates a single sample:
>>> choices(population, weights)
4
The optional keyword-only argument k allows one to request more than one sample at once. This is valuable because there’s some preparatory work that random.choices has to do every time it’s called, prior to generating any samples; by generating many samples at once, we only have to do that preparatory work once. Here we generate a million samples, and use collections.Counter to check that the distribution we get roughly matches the weights we gave.
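A sketch of that check (the counts below are one illustrative run and will differ every time):

>>> from collections import Counter
>>> Counter(choices(population, weights, k=10**6))
Counter({5: 400104, 4: 200297, 6: 199883, 1: 99804, 2: 50029, 3: 49883})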
An advantage of generating the list using the CDF is that you can use binary search. While you need O(n) time and space for preprocessing, you can get k numbers in O(k log n). Since normal Python lists are inefficient, you can use the array module.
If you insist on constant space, you can do the following; O(n) time, O(1) space.
import random

def random_distr(l):
    r = random.uniform(0, 1)
    s = 0
    for item, prob in l:
        s += prob
        if s >= r:
            return item
    return item  # Might occur because of floating point inaccuracies
Maybe it is kind of late. But you can use numpy.random.choice(), passing the p parameter:
val = numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])
(OK, I know you are asking for shrink-wrap, but maybe those home-grown solutions just weren’t succinct enough for your liking. :-)
import random

pdf = [(1, 0.1), (2, 0.05), (3, 0.05), (4, 0.2), (5, 0.4), (6, 0.2)]
cdf = [(i, sum(p for j,p in pdf if j < i)) for i,_ in pdf]
R = max(i for r in [random.random()] for i,c in cdf if c <= r)
I pseudo-confirmed that this works by eyeballing the output of this expression:
sorted(max(i for r in [random.random()] for i,c in cdf if c <= r)
for _ in range(1000))
I wrote a solution for drawing random samples from a custom continuous distribution.
I needed this for a similar use-case to yours (i.e. generating random dates with a given probability distribution).
You just need the function random_custDist and the line samples=random_custDist(x0,x1,custDist=custDist,size=1000). The rest is decoration ^^.
import numpy as np

# function
def random_custDist(x0, x1, custDist, size=None, nControl=10**6):
    # generate a list of size random samples, obeying the distribution custDist
    # suggests random samples between x0 and x1 and accepts the suggestion with probability custDist(x)
    # custDist does not need to be normalized. Add this condition to increase performance.
    # Best performance for max_{x in [x0,x1]} custDist(x) = 1
    samples = []
    nLoop = 0
    while len(samples) < size and nLoop < nControl:
        x = np.random.uniform(low=x0, high=x1)
        prop = custDist(x)
        assert prop >= 0 and prop <= 1
        if np.random.uniform(low=0, high=1) <= prop:
            samples += [x]
        nLoop += 1
    return samples

# call
x0 = 2007
x1 = 2019
def custDist(x):
    if x < 2010:
        return .3
    else:
        return (np.exp(x-2008)-1)/(np.exp(2019-2007)-1)
samples = random_custDist(x0, x1, custDist=custDist, size=1000)
print(samples)

# plot
import matplotlib.pyplot as plt

# hist
bins = np.linspace(x0, x1, int(x1-x0+1))
hist = np.histogram(samples, bins)[0]
hist = hist/np.sum(hist)
plt.bar((bins[:-1]+bins[1:])/2, hist, width=.96, label='sample distribution')

# dist
grid = np.linspace(x0, x1, 100)
discCustDist = np.array([custDist(x) for x in grid])  # discrete version
discCustDist *= 1/(grid[1]-grid[0])/np.sum(discCustDist)
plt.plot(grid, discCustDist, label='custom distribution (custDist)', color='C1', linewidth=4)

# decoration
plt.legend(loc=3, bbox_to_anchor=(1, 0))
plt.show()
The performance of this solution is improvable for sure, but I prefer readability.
Make a list of items based on their weights:
items = [1, 2, 3, 4, 5, 6]
probabilities= [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]
# if the list of probs is normalized (sum(probs) == 1), omit this part
prob = sum(probabilities) # find sum of probs, to normalize them
c = (1.0)/prob # a multiplier to make a list of normalized probs
probabilities = map(lambda x: c*x, probabilities)
print probabilities
ml = max(probabilities, key=lambda x: len(str(x)) - str(x).find('.'))
ml = len(str(ml)) - str(ml).find('.') -1
amounts = [ int(x*(10**ml)) for x in probabilities]
itemsList = list()
for i in range(0, len(items)):  # iterate through original items
    itemsList += items[i:i+1]*amounts[i]
# choose from itemsList randomly
print itemsList
An optimization may be to normalize amounts by the greatest common divisor, to make the target list smaller.
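A sketch of that optimization (Python 3; the variable names follow the code above):

from math import gcd
from functools import reduce

g = reduce(gcd, amounts)                 # e.g. gcd of [10, 5, 5, 20, 40, 20] is 5
amounts = [a // g for a in amounts]      # [2, 1, 1, 4, 8, 4] -> itemsList becomes 5x shorter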
import random

distribution = [(1, 0.2), (2, 0.3), (3, 0.5)]

def pick_value(distribution):  # wrapped in a function (name added here) so the return statements are valid
    # init distribution
    dlist = []
    sumchance = 0
    for value, chance in distribution:
        sumchance += chance
        dlist.append((value, sumchance))
    assert sumchance == 1.0  # not a good assert because of float equality

    # get random value
    r = random.random()

    # for small distributions use linear search
    if len(distribution) < 64:  # don't know exact speed limit
        for value, sumchance in dlist:
            if r < sumchance:
                return value
    else:
        # (not implemented) binary search algorithm
        pass
from __future__ import division
import random
from collections import Counter

def num_gen(num_probs):
    # calculate minimum probability to normalize
    min_prob = min(prob for num, prob in num_probs)
    lst = []
    for num, prob in num_probs:
        # keep appending num to lst, proportional to its probability in the distribution
        for _ in range(int(prob/min_prob)):
            lst.append(num)
    # all elems in lst occur proportional to their distribution probabilities
    while True:
        # pick a random index from lst
        ind = random.randint(0, len(lst)-1)
        yield lst[ind]
Verification:
gen = num_gen([(1, 0.1),
               (2, 0.05),
               (3, 0.05),
               (4, 0.2),
               (5, 0.4),
               (6, 0.2)])

lst = []
times = 10000
for _ in range(times):
    lst.append(next(gen))

# Verify the created distribution:
for item, count in Counter(lst).iteritems():
    print '%d has %f probability' % (item, count/times)
1 has 0.099737 probability
2 has 0.050022 probability
3 has 0.049996 probability
4 has 0.200154 probability
5 has 0.399791 probability
6 has 0.200300 probability
Based on other solutions, you generate a cumulative distribution (as integer or float, whatever you like), then you can use bisect to make it fast.
This is a simple example (I used integers here).
import bisect
import random

l = [(20, 'foo'), (60, 'banana'), (10, 'monkey'), (10, 'monkey2')]

def get_cdf(l):
    ret = []
    c = 0
    for i in l: c += i[0]; ret.append((c, i[1]))
    return ret

def get_random_item(cdf):
    return cdf[bisect.bisect_left(cdf, (random.randint(0, cdf[-1][0]),))][1]

cdf = get_cdf(l)
for i in range(100): print get_random_item(cdf),
The get_cdf function converts the weights from 20, 60, 10, 10 into 20, 20+60, 20+60+10, 20+60+10+10.
Now we pick a random number up to 20+60+10+10 using random.randint, then we use bisect to get the actual value in a fast way.
None of these answers is particularly clear or simple.
Here is a clear, simple method that is guaranteed to work.
accumulate_normalize_values takes a dictionary p that maps symbols to probabilities OR frequencies. It outputs a usable list of tuples from which to do selection.
def accumulate_normalize_values(p):
    pi = p.items() if isinstance(p, dict) else p
    accum_pi = []
    accum = 0
    for i in pi:
        accum_pi.append((i[0], i[1] + accum))
        accum += i[1]
    if accum == 0:
        raise Exception("You are about to explode the universe. Continue ? Y/N ")
    normed_a = []
    for a in accum_pi:
        normed_a.append((a[0], a[1] * 1.0 / accum))
    return normed_a
The accumulation step turns each symbol into an interval between itself and the previous symbol’s probability or frequency (or 0 in the case of the first symbol). These intervals can be used to select from (and thus sample the provided distribution) by simply stepping through the list until the random number in the interval 0.0 -> 1.0 (prepared earlier) is less than or equal to the current symbol’s interval end-point.
The normalization releases us from the need to make sure everything sums to some value. After normalization the “vector” of probabilities sums to 1.0.
The rest of the code for selection and for generating an arbitrarily long sample from the distribution is below:
def select(symbol_intervals, random):
    print symbol_intervals, random
    i = 0
    while random > symbol_intervals[i][1]:
        i += 1
        if i >= len(symbol_intervals):
            raise Exception("What did you DO to that poor list?")
    return symbol_intervals[i][0]

def gen_random(alphabet, length, probabilities=None):
    from random import random
    from itertools import repeat
    if probabilities is None:
        probabilities = dict(zip(alphabet, repeat(1.0)))
    elif len(probabilities) > 0 and isinstance(probabilities[0], (int, long, float)):
        probabilities = dict(zip(alphabet, probabilities))  # ordered
    usable_probabilities = accumulate_normalize_values(probabilities)
    gen = []
    while len(gen) < length:
        gen.append(select(usable_probabilities, random()))
    return gen
Usage :
>>> gen_random (['a','b','c','d'],10,[100,300,400,200])
['d', 'b', 'b', 'a', 'c', 'c', 'b', 'c', 'c', 'c'] #<--- some of the time
Just call the following function with your ‘weights’ array (assuming the indices as the corresponding items) and the number of samples needed. This function can be easily modified to handle ordered pairs.
Returns indexes (or items) sampled/picked (with replacement) using their respective probabilities:
import random

def resample(weights, n):
    beta = 0
    # Caveat: Assign max weight to max*2 for best results
    max_w = max(weights) * 2
    # Pick an item uniformly at random, to start with
    current_item = random.randint(0, n-1)
    result = []
    for i in range(n):
        beta += random.uniform(0, max_w)
        while weights[current_item] < beta:
            beta -= weights[current_item]
            current_item = (current_item + 1) % n  # cyclic
        else:
            result.append(current_item)
    return result
A short note on the concept used in the while loop.
We subtract the current item’s weight from beta (a cumulative value constructed uniformly at random) and increment the current index, in order to find the item whose weight matches the value of beta.
I want to define a constant that should be available in all of the submodules of a package. I’ve thought that the best place would be in the __init__.py file of the root package. But I don’t know how to do this. Suppose I have a few subpackages and each with several modules. How can I access that variable from these modules?
Of course, if this is totally wrong, and there is a better alternative, I’d like to know it.
You should be able to put them in __init__.py. This is done all the time.
mypackage/__init__.py:
MY_CONSTANT = 42
mypackage/mymodule.py:
from mypackage import MY_CONSTANT
print "my constant is", MY_CONSTANT
Then, import mymodule:
>>> from mypackage import mymodule
my constant is 42
Still, if you do have constants, it would be reasonable (best practices, probably) to put them in a separate module (constants.py, config.py, …) and then if you want them in the package namespace, import them.
mypackage/__init__.py:
from mypackage.constants import *
Still, this doesn’t automatically include the constants in the namespaces of the package modules. Each of the modules in the package will still have to import constants explicitly either from mypackage or from mypackage.constants.
You cannot do that. You will have to explicitly import your constants into each individual module’s namespace. The best way to achieve this is to define your constants in a “config” module and import it everywhere you require it:
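A minimal sketch of that layout (the package, module, and constant names are invented):

# mypackage/config.py
MY_CONSTANT = 42
API_URL = "https://example.com/api"

# mypackage/sub/module.py
from mypackage import config

def connect():
    print("connecting to", config.API_URL)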
You can define global variables from anywhere, but it is a really bad idea. Import the __builtin__ module and modify or add attributes to this module, and suddenly you have new builtin constants or functions. In fact, when my application installs gettext, I get the _() function in all my modules, without importing anything. So this is possible, but of course only for Application-type projects, not for reusable packages or modules.
And I guess no one would recommend this practice anyway. What’s wrong with a namespace? Said application has the version module, so that I have “global” variables available like version.VERSION, version.PACKAGE_NAME etc.
Just wanted to add that constants can be employed using a config.ini file and parsed in the script using the configparser library. This way you could have constants for multiple circumstances. For instance if you had parameter constants for two separate url requests just label them like so:
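A sketch of what that could look like (the section and key names are made up):

# config.ini (hypothetical contents):
#   [request_a]
#   url = https://example.com/a
#   timeout = 30
#
#   [request_b]
#   url = https://example.com/b
#   timeout = 60

import configparser

config = configparser.ConfigParser()
config.read("config.ini")

URL_A = config["request_a"]["url"]
TIMEOUT_A = config.getint("request_a", "timeout")
URL_B = config["request_b"]["url"]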
I found the documentation on the Python website very helpful. I am not sure if there are any differences between Python 2 and 3 so here are the links to both:
How can I implement the equivalent of a __getattr__ on a class, on a module?
Example
When calling a function that does not exist in a module’s statically defined attributes, I wish to create an instance of a class in that module, and invoke the method on it with the same name as failed in the attribute lookup on the module.
class A(object):
    def salutation(self, accusative):
        print "hello", accusative

# note this function is intentionally on the module, and not the class above
def __getattr__(mod, name):
    return getattr(A(), name)

if __name__ == "__main__":
    # i hope here to have my __getattr__ function above invoked, since
    # salutation does not exist in the current namespace
    salutation("world")
Which gives:
matt@stanley:~/Desktop$ python getattrmod.py
Traceback (most recent call last):
File "getattrmod.py", line 9, in <module>
salutation("world")
NameError: name 'salutation' is not defined
Recently some historical features have made a comeback, the module __getattr__ among them, and so the existing hack (a module replacing itself with a class in sys.modules at import time) should be no longer necessary.
In Python 3.7+, you just use the one obvious way. To customize attribute access on a module, define a __getattr__ function at the module level which should accept one argument (name of attribute), and return the computed value or raise an AttributeError:
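For example, a minimal sketch adapted to the question’s class A (the module name is invented, and this hooks attribute access on the module from the outside only):

# getattrmod.py  (Python 3.7+)
class A:
    def salutation(self, accusative):
        print("hello", accusative)

def __getattr__(name):
    # module-level __getattr__ (PEP 562): called only when normal lookup on the module fails
    return getattr(A(), name)

# From another module or the REPL:
#   import getattrmod
#   getattrmod.salutation("world")   # -> hello world
# Note: a bare salutation("world") inside getattrmod itself would still raise NameError,
# because module __getattr__ only intercepts attribute access on the module object.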
This will also allow hooks into “from” imports, i.e. you can return dynamically generated objects for statements such as from my_module import whatever.
On a related note, along with the module getattr you may also define a __dir__ function at module level to respond to dir(my_module). See PEP 562 for details.
There are two basic problems you are running into here:
__xxx__ methods are only looked up on the class
TypeError: can't set attributes of built-in/extension type 'module'
(1) means any solution would have to also keep track of which module was being examined, otherwise every module would then have the instance-substitution behavior; and (2) means that (1) isn’t even possible… at least not directly.
Fortunately, sys.modules is not picky about what goes there so a wrapper will work, but only for module access (i.e. import somemodule; somemodule.salutation('world'); for same-module access you pretty much have to yank the methods from the substitution class and add them to globals() either with a custom method on the class (I like using .export()) or with a generic function (such as those already listed as answers). One thing to keep in mind: if the wrapper is creating a new instance each time, and the globals solution is not, you end up with subtly different behavior. Oh, and you don’t get to use both at the same time — it’s one or the other.
There is actually a hack that is occasionally used and recommended: a
module can define a class with the desired functionality, and then at
the end, replace itself in sys.modules with an instance of that class
(or with the class, if you insist, but that’s generally less useful).
E.g.:
This works because the import machinery is actively enabling this
hack, and as its final step pulls the actual module out of
sys.modules, after loading it. (This is no accident. The hack was
proposed long ago and we decided we liked it enough to support it in the
import machinery.)
So the established way to accomplish what you want is to create a single class in your module, and as the last act of the module replace sys.modules[__name__] with an instance of your class — and now you can play with __getattr__/__setattr__/__getattribute__ as needed.
Note 1: If you use this functionality then anything else in the module, such as globals, other functions, etc., will be lost when the sys.modules assignment is made — so make sure everything needed is inside the replacement class.
Note 2: To support from module import * you must have __all__ defined in the class; for example:
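A minimal sketch of that pattern (module and method names are invented; the fallback behaviour is purely illustrative):

# mymodule.py
import sys

class _ModuleInterface:
    __all__ = ('salutation',)            # needed so "from mymodule import *" works

    def salutation(self, accusative):
        print("hello", accusative)

    def __getattr__(self, name):
        # any unknown attribute falls back to a generic greeting
        return lambda accusative: print(name, accusative)

# Last act of the module: replace the module object with an instance of the class.
sys.modules[__name__] = _ModuleInterface()

# Elsewhere:
#   import mymodule
#   mymodule.salutation("world")   # -> hello world
#   mymodule.farewell("world")     # -> farewell world (handled by __getattr__)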
class A(object):
    ....

# The implicit global instance
a = A()

def salutation(*arg, **kw):
    a.salutation(*arg, **kw)
Why? So that the implicit global instance is visible.
For examples, look at the random module, which creates an implicit global instance to slightly simplify the use cases where you want a “simple” random number generator.
Similar to what @Håvard S proposed, in a case where I needed to implement some magic on a module (like __getattr__), I would define a new class that inherits from types.ModuleType and put that in sys.modules (probably replacing the module where my custom ModuleType was defined).
See the main __init__.py file of Werkzeug for a fairly robust implementation of this.
This is a bit hacky, but…
import types

class A(object):
    def salutation(self, accusative):
        print "hello", accusative

    def farewell(self, greeting, accusative):
        print greeting, accusative

def AddGlobalAttribute(classname, methodname):
    print "Adding " + classname + "." + methodname + "()"
    def genericFunction(*args):
        return globals()[classname]().__getattribute__(methodname)(*args)
    globals()[methodname] = genericFunction

# set up the global namespace

x = 0   # X and Y are here to add them implicitly to globals, so
y = 0   # globals does not change as we iterate over it.

toAdd = []

def isCallableMethod(classname, methodname):
    someclass = globals()[classname]()
    something = someclass.__getattribute__(methodname)
    return callable(something)

for x in globals():
    print "Looking at", x
    if isinstance(globals()[x], (types.ClassType, type)):
        print "Found Class:", x
        for y in dir(globals()[x]):
            if y.find("__") == -1:  # hack to ignore default methods
                if isCallableMethod(x, y):
                    if y not in globals():  # don't override existing global names
                        toAdd.append((x, y))

for x in toAdd:
    AddGlobalAttribute(*x)

if __name__ == "__main__":
    salutation("world")
    farewell("goodbye", "world")
This works by iterating over all the objects in the global namespace. If the item is a class, it iterates over the class attributes. If the attribute is callable, it adds it to the global namespace as a function.
It ignores all attributes which contain “__”.
I wouldn’t use this in production code, but it should get you started.
Here’s my own humble contribution — a slight embellishment of @Håvard S’s highly rated answer, but a bit more explicit (so it might be acceptable to @S.Lott, even though probably not good enough for the OP):
Create your module file that has your classes. Import the module. Run getattr on the module you just imported. You can do a dynamic import using __import__ and pull the module from sys.modules.
I’m starting to learn python and loving it. I work on a Mac mainly as well as Linux. I’m finding that on Linux (Ubuntu 9.04 mostly) when I install a python module using apt-get it works fine. I can import it with no trouble.
On the Mac, I’m used to using Macports to install all the Unixy stuff. However, I’m finding that most of the python modules I install with it are not being seen by python. I’ve spent some time playing around with PATH settings and using python_select. Nothing has really worked and at this point I’m not really understanding, instead I’m just poking around.
I get the impression that Macports isn’t universally loved for managing python modules. I’d like to start fresh using a more “accepted” (if that’s the right word) approach.
So, I was wondering, what is the method that Mac python developers use to manage their modules?
Bonus questions:
Do you use Apple’s python, or some other version?
Do you compile everything from source or is there a package manger that works well (Fink?).
The most popular way to manage python packages (if you’re not using your system package manager) is to use setuptools and easy_install. It is probably already installed on your system. Use it like this:
easy_install django
easy_install uses the Python Package Index which is an amazing resource for python developers. Have a look around to see what packages are available.
A better option is pip, which is gaining traction, as it attempts to fix a lot of the problems associated with easy_install. Pip uses the same package repository as easy_install, it just works better. Really the only time you need to use easy_install is for this command:
easy_install pip
After that, use:
pip install django
At some point you will probably want to learn a bit about virtualenv. If you do a lot of python development on projects with conflicting package requirements, virtualenv is a godsend. It will allow you to have completely different versions of various packages, and switch between them easily depending on your needs.
Regarding which python to use, sticking with Apple’s python will give you the least headaches, but if you need a newer version (Leopard is 2.5.1 I believe), I would go with the macports python 2.6.
Your question is already three years old and there are some details not covered in other answers:
Most people I know use HomeBrew or MacPorts, I prefer MacPorts because of its clean cut of what is a default Mac OS X environment and my development setup. Just move out your /opt folder and test your packages with a normal user Python environment
MacPorts is only portable within Mac, but with easy_install or pip you will learn how to setup your environment in any platform (Win/Mac/Linux/Bsd…). Furthermore it will always be more up to date and with more packages
I personally let MacPorts handle my Python modules to keep everything updated. Like any other high level package manager (ie: apt-get) it is much better for the heavy lifting of modules with lots of binary dependencies. There is no way I would build my Qt bindings (PySide) with easy_install or pip. Qt is huge and takes a lot to compile. As soon as you want a Python package that needs a library used by non Python programs, try to avoid easy_install or pip
At some point you will find that there are some packages missing within MacPorts. I do not believe that MacPorts will ever give you the whole CheeseShop. For example, recently I needed the Elixir module, but MacPorts only offers py25-elixir and py26-elixir, no py27 version. In cases like these you have:
pip-2.7 install --user elixir
( make sure you always type pip-(version) )
That will build an extra Python library in your home dir. Yes, Python will work with more than one library location: one controlled by MacPorts and a user local one for everything missing within MacPorts.
Now notice that I favor pip over easy_install. There is a good reason you should avoid setuptools and easy_install. Here is a good explanation and I try to keep away from them. One very useful feature of pip is giving you a list of all the modules (along with their versions) that you installed with MacPorts, easy_install and pip itself:
pip-2.7 freeze
If you already started using easy_install, don’t worry, pip can recognize everything done already by easy_install and even upgrade the packages installed with it.
If you are a developer keep an eye on virtualenv for controlling different setups and combinations of module versions. Other answers mention it already, what is not mentioned so far is the Tox module, a tool for testing that your package installs correctly with different Python versions.
Although I usually do not have version conflicts, I like to have virtualenv to set up a clean environment and get a clear view of my packages dependencies. That way I never forget any dependencies in my setup.py
If you go for MacPorts be aware that multiple versions of the same package are not selected anymore like the old Debian style with an extra python_select package (it is still there for compatibility). Now you have the select command to choose which Python version will be used (you can even select the Apple installed ones):
$ port select python
Available versions for python:
none
python25-apple
python26-apple
python27 (active)
python27-apple
python32
$ port select python python32
Add tox on top of it and your programs should be really portable
Please see Python OS X development environment. The best way is to use MacPorts. Download and install MacPorts, then install Python via MacPorts by typing the following commands in the Terminal:
sudo port install python26 python_select
sudo port select --set python python26
OR
sudo port install python30 python_select
sudo port select --set python python30
Use the first set of commands to install Python 2.6 and the second set to install Python 3.0. Then use:
sudo port install py26-packagename
OR
sudo port install py30-packagename
In the above commands, replace packagename with the name of the package, for example:
sudo port install py26-setuptools
These commands will automatically install the package (and its dependencies) for the given Python version.
For a full list of available packages for Python, type:
port list | grep py26-
OR
port list | grep py30-
Which command you use depends on which version of Python you chose to install.
I use MacPorts to install Python and any third-party modules tracked by MacPorts into /opt/local, and I install any manually installed modules (those not in the MacPorts repository) into /usr/local, and this has never caused any problems. I think you may be confused as to the use of certain MacPorts scripts and environment variables.
MacPorts python_select is used to select the “current” version of Python, but it has nothing to do with modules. This allows you to, e.g., install both Python 2.5 and Python 2.6 using MacPorts, and switch between installs.
The $PATH environment variables does not affect what Python modules are loaded. $PYTHONPATH is what you are looking for. $PYTHONPATH should point to directories containing Python modules you want to load. In my case, my $PYTHONPATH variable contains /usr/local/lib/python26/site-packages. If you use MacPorts’ Python, it sets up the other proper directories for you, so you only need to add additional paths to $PYTHONPATH. But again, $PATH isn’t used at all when Python searches for modules you have installed.
$PATH is used to find executables, so if you install MacPorts’ Python, make sure /opt/local/bin is in your $PATH.
There’s nothing wrong with using a MacPorts Python installation. If you are installing python modules from MacPorts but then not seeing them, that likely means you are not invoking the MacPorts python you installed to. In a terminal shell, you can use absolute paths to invoke the various Pythons that may be installed. For example:
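For instance (the script name and the exact paths are placeholders; they depend on your OS release and what you have installed):

/usr/bin/python2.5 myscript.py          # Apple-supplied system Python
/opt/local/bin/python2.5 myscript.py    # MacPorts Python under /opt/local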
To get the right python by default requires ensuring your shell $PATH is set properly to ensure that the right executable is found first. Another solution is to define shell aliases to the various pythons.
A python.org (MacPython) installation is fine, too, as others have suggested. easy_install can help but, again, because each Python instance may have its own easy_install command, make sure you are invoking the right easy_install.
If you use Python from MacPorts, it has its own easy_install located at: /opt/local/bin/easy_install-2.6 (for py26, that is). It’s not the same one as simply calling easy_install directly, even if you used python_select to change your default python command.
Have you looked into easy_install at all? It won’t synchronize your macports or anything like that, but it will automatically download the latest package and all necessary dependencies, i.e.
easy_install nose
for the nose unit testing package, or
easy_install trac
for the trac bug tracker.
There’s a bit more information on their EasyInstall page too.
When you install modules with MacPorts, it does not go into Apple’s version of Python. Instead those modules are installed onto the MacPorts version of Python selected.
You can change which version of Python is used by default using a MacPorts port called python_select. Instructions here.
Also, there’s easy_install. Which will use python to install python modules.
The __debug__ variable is handy in part because it affects every module. If I want to create another variable that works the same way, how would I do it?
The variable (let’s be original and call it ‘foo’) doesn’t have to be truly global, in the sense that if I change foo in one module, it is updated in others. I’d be fine if I could set foo before importing other modules and then they would see the same value for it.
I don’t endorse this solution in any way, shape or form. But if you add a variable to the __builtin__ module, it will be accessible as if a global from any other module that includes __builtin__ — which is all of them, by default.
a.py contains
print foo
b.py contains
import __builtin__
__builtin__.foo = 1
import a
The result is that “1” is printed.
Edit: The __builtin__ module is available as the local symbol __builtins__ — that’s the reason for the discrepancy between two of these answers. Also note that __builtin__ has been renamed to builtins in python3.
Define a module ( call it “globalbaz” ) and have the variables defined inside it. All the modules using this “pseudoglobal” should import the “globalbaz” module, and refer to it using “globalbaz.var_name”
This works regardless of the place of the change, you can change the variable before or after the import. The imported module will use the latest value. (I tested this in a toy example)
For clarification, globalbaz.py looks just like this:
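A minimal sketch (the variable name is a placeholder):

# globalbaz.py -- nothing but the shared names and their defaults
var_name = None

# any other module:
import globalbaz
globalbaz.var_name = "some value"    # this assignment is seen by every other importer
print(globalbaz.var_name)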
I believe that there are plenty of circumstances in which it does make sense and it simplifies programming to have some globals that are known across several (tightly coupled) modules. In this spirit, I would like to elaborate a bit on the idea of having a module of globals which is imported by those modules which need to reference them.
When there is only one such module, I name it “g”. In it, I assign default values for every variable I intend to treat as global. In each module that uses any of them, I do not use “from g import var”, as this only results in a local variable which is initialized from g only at the time of the import. I make most references in the form g.var, and the “g.” serves as a constant reminder that I am dealing with a variable that is potentially accessible to other modules.
If the value of such a global variable is to be used frequently in some function in a module, then that function can make a local copy: var = g.var. However, it is important to realize that assignments to var are local, and global g.var cannot be updated without referencing g.var explicitly in an assignment.
Note that you can also have multiple such globals modules shared by different subsets of your modules to keep things a little more tightly controlled. The reason I use short names for my globals modules is to avoid cluttering up the code too much with occurrences of them. With only a little experience, they become mnemonic enough with only 1 or 2 characters.
It is still possible to make an assignment to, say, g.x when x was not already defined in g, and a different module can then access g.x. However, even though the interpreter permits it, this approach is not so transparent, and I do avoid it. There is still the possibility of accidentally creating a new variable in g as a result of a typo in the variable name for an assignment. Sometimes an examination of dir(g) is useful to discover any surprise names that may have arisen by such accident.
You can already do this with module-level variables. Modules are the same no matter what module they’re being imported from. So you can make the variable a module-level variable in whatever module it makes sense to put it in, and access it or assign to it from other modules. It would be better to call a function to set the variable’s value, or to make it a property of some singleton object. That way if you end up needing to run some code when the variable’s changed, you can do so without breaking your module’s external interface.
It’s not usually a great way to do things — using globals seldom is — but I think this is the cleanest way to do it.
I want to add an answer for a case in which the variable will not be found: circular imports may break the module behavior.

For example:

first.py

import second
var = 1

second.py

import first
print(first.var)  # will throw an error because the order of execution happens before var gets declared
This sounds like modifying the __builtin__ name space. To do it:
import __builtin__
__builtin__.foo = 'some-value'
Do not use the __builtins__ directly (notice the extra “s”) – apparently this can be a dictionary or a module. Thanks to ΤΖΩΤΖΙΟΥ for pointing this out, more can be found here.
Now foo is available for use everywhere.
I don’t recommend doing this generally, but the use of this is up to the programmer.
Assigning to it must be done as above, just setting foo = 'some-other-value' will only set it in the current namespace.
I use this for a couple built-in primitive functions that I felt were really missing. One example is a find function that has the same usage semantics as filter, map, reduce.
def builtin_find(f, x, d=None):
    for i in x:
        if f(i):
            return i
    return d

import __builtin__
__builtin__.find = builtin_find
Once this is run (for instance, by importing near your entry point) all your modules can use find() as though, obviously, it was built in.
find(lambda i: i < 0, [1, 3, 0, -5, -10]) # Yields -5, the first negative.
Note: You can do this, of course, with filter and another line to test for zero length, or with reduce in one sort of weird line, but I always felt it was weird.
I could achieve cross-module modifiable (or mutable) variables by using a dictionary:
# in myapp.__init__
Timeouts = {}  # cross-modules global mutable variables for testing purpose
Timeouts['WAIT_APP_UP_IN_SECONDS'] = 60

# in myapp.mod1
from myapp import Timeouts

def wait_app_up(project_name, port):
    # wait for app until Timeouts['WAIT_APP_UP_IN_SECONDS']
    # ...

# in myapp.test.test_mod1
from myapp import Timeouts

def test_wait_app_up_fail(self):
    timeout_bak = Timeouts['WAIT_APP_UP_IN_SECONDS']
    Timeouts['WAIT_APP_UP_IN_SECONDS'] = 3
    with self.assertRaises(hlp.TimeoutException) as cm:
        wait_app_up(PROJECT_NAME, PROJECT_PORT)
    self.assertEqual("Timeout while waiting for App to start", str(cm.exception))
    Timeouts['WAIT_JENKINS_UP_TIMEOUT_IN_SECONDS'] = timeout_bak
When launching test_wait_app_up_fail, the actual timeout duration is 3 seconds.
I wondered if it would be possible to avoid some of the disadvantages of using global variables (see e.g. http://wiki.c2.com/?GlobalVariablesAreBad) by using a class namespace rather than a global/module namespace to pass values of variables. The following code indicates that the two methods are essentially identical. There is a slight advantage in using class namespaces as explained below.
The following code fragments also show that attributes or variables may be dynamically created and deleted in both global/module namespaces and class namespaces.
wall.py
# Note no definition of global variables

class router:
    """ Empty class """
I call this module ‘wall’ since it is used to bounce variables off of. It will act as a space to temporarily define global variables and class-wide attributes of the empty class ‘router’.
The next module, source.py, imports wall and defines a single function sourcefn which defines a message and emits it by two different mechanisms, one via globals and one via the router class. Note that the variables wall.msg and wall.router.msg are defined here for the first time in their respective namespaces.
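A minimal sketch of source.py consistent with that description and with the output shown further below (the exact message text is assumed):

source.py

import wall

def sourcefn():
    msg = 'Hello world!'
    wall.msg = msg            # via the module/global namespace
    wall.router.msg = msg     # via the class namespace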
dest.py
import wall

def destfn():
    if hasattr(wall, 'msg'):
        print 'global: ' + wall.msg
        del wall.msg
    else:
        print 'global: ' + 'no message'
    if hasattr(wall.router, 'msg'):
        print 'router: ' + wall.router.msg
        del wall.router.msg
    else:
        print 'router: ' + 'no message'
This module defines a function destfn which uses the two different mechanisms to receive the messages emitted by source. It allows for the possibility that the variable ‘msg’ may not exist. destfn also deletes the variables once they have been displayed.
main.py
import source, dest
source.sourcefn()
dest.destfn() # variables deleted after this call
dest.destfn()
This module calls the previously defined functions in sequence. After the first call to dest.destfn the variables wall.msg and wall.router.msg no longer exist.
The output from the program is:
global: Hello world!
router: Hello world!
global: no message
router: no message
The above code fragments show that the module/global and the class/class variable mechanisms are essentially identical.
If a lot of variables are to be shared, namespace pollution can be managed either by using several wall-type modules, e.g. wall1, wall2 etc. or by defining several router-type classes in a single file. The latter is slightly tidier, so perhaps represents a marginal advantage for use of the class-variable mechanism.
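For instance (a sketch extending the wall module above; the class names are illustrative):
# wall.py
class router_app:
    """ Empty class: namespace for application-level shared values """

class router_test:
    """ Empty class: namespace for test-only shared values """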
Many third-party Python modules have an attribute which holds the version information for the module (usually something like module.VERSION or module.__version__), however some do not.
Particular examples of such modules are libxslt and libxml2.
I need to check that the correct versions of these modules are being used at runtime. Is there a way to do this?
A potential solution would be to read in the source at runtime, hash it, and then compare it to the hash of the known version, but that’s nasty.
I’d stay away from hashing. The version of libxslt being used might contain some type of patch that doesn’t affect your use of it.
As an alternative, I’d like to suggest that you don’t check at run time (don’t know if that’s a hard requirement or not). For the python stuff I write that has external dependencies (3rd party libraries), I write a script that users can run to check their python install to see if the appropriate versions of modules are installed.
For the modules that don’t have a defined ‘version’ attribute, you can inspect the interfaces they contain (classes and methods) and see whether they match the interface you expect. Then, in the actual code that you’re working on, assume that the 3rd-party modules have the interface you expect.
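A rough sketch of that kind of interface check (the helper and the attribute names below are illustrative, not a real API):
import importlib

def has_expected_interface(module_name, expected_names):
    """Return True if the named module exposes every attribute we rely on."""
    mod = importlib.import_module(module_name)
    return all(hasattr(mod, name) for name in expected_names)

# e.g. fail fast at startup if the installed bindings look wrong:
# assert has_expected_interface('libxml2', ['parseDoc', 'parseFile'])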
If you’re on python >=3.8 you can use a module from the built-in library for that. To check a package’s version (in this example lxml) run:
>>> from importlib.metadata import version
>>> version('lxml')
'4.3.1'
This functionality has been ported to older versions of python (<3.8) as well, but you need to install a separate library first:
pip install importlib_metadata
and then to check a package’s version (in this example lxml) run:
>>> from importlib_metadata import version
>>> version('lxml')
'4.3.1'
Keep in mind that this works only for distributions that were actually installed (for example via pip), not for modules that merely sit on the import path. Also, you must pass the package (distribution) name as the argument to the version function, rather than the name of the module that the package provides (although the two are usually the same).
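For example (assuming Pillow, the maintained fork of PIL, is installed), the distribution name is what counts, not the import name:
from importlib.metadata import version, PackageNotFoundError

version('Pillow')    # works: 'Pillow' is the name of the installed distribution
try:
    version('PIL')   # 'PIL' is only the module that Pillow provides...
except PackageNotFoundError:
    pass             # ...so this raises PackageNotFoundError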
I found it quite unreliable to use the various tools available (including the best one, pkg_resources, mentioned in another answer), as most of them do not cover all cases. For example:
built-in modules
modules not installed but just added to the python path (by your IDE for example)
two versions of the same module available (one in python path superseding the one installed)
Since we needed a reliable way to get the version of any package, module or submodule, I ended up writing getversion. It is quite simple to use:
from getversion import get_module_version
import foo
version, details = get_module_version(foo)
In my case, Python was unable to find it because I’d put the code inside a module with hyphens, e.g. my-module. When I changed it to my_module it worked.
The following doesn’t solve the OP’s problem, but the title and error are exactly what I faced.
If your project has a setup.py script in it, you can install the package you are in with python3 -m pip install -e ., python3 setup.py install, or python3 setup.py develop, and the package will be installed but remain editable (so changes to the code are picked up when importing it). If it doesn’t have a setup.py, you’ll need to create one (or otherwise make the project installable) first.
In any case, the problem the OP faced may no longer occur on current Python versions.
file one.py:
def function():
    print("output")
file two.py:
#!/usr/bin/env python3
import one
one.function()
chmod +x two.py # To allow execution of the python file
./two.py # Only works if you have a python shebang
Command line output: output
Other solutions seem ‘dirty’
In the case of the OP with two test files, modifying them to work is probably fine. However, in other real scenarios, the methods listed in the other answers are probably not recommended. They require you to modify the Python code or restrict your flexibility (for example, by running the Python file from a specific directory), and they generally introduce annoyances. What if you’ve just cloned a project and this happens? It probably already works for other people, and making code changes is unnecessary. The chosen answer also wants people to run a script from a specific folder to make it work, which can be a source of long-term annoyance. It also suggests adding your specific Python folder to PATH (which can be done through Python or the command line). Again, what happens if you rename or move the folder in a few months? You have to hunt down this page again, eventually discover that you need to set the path (and that you did exactly this a few months ago), and update it (sure, you could set sys.path programmatically, but that can still be flaky). Many sources of great annoyance.
The python interpreter has a -m module option that “Runs library module module as a script”.
With this python code a.py:
if __name__ == "__main__":
    print __package__
    print __name__
I tested python -m a to get
"" <-- Empty String
__main__
whereas python a.py returns
None <-- None
__main__
To me, those two invocations seem to be the same except that __package__ is not None when invoked with the -m option.
Interestingly, with python -m runpy a, I get the same output as with python -m a once the module has been compiled to a.pyc.
What’s the (practical) difference between these invocations? Any pros and cons between them?
Also, David Beazley’s Python Essential Reference explains it as “The -m option runs a library module as a script which executes inside the __main__ module prior to the execution of the main script“. What does it mean?
When you use the -m command-line flag, Python will import a module or package for you, then run it as a script. When you don’t use the -m flag, the file you named is run as just a script.
The distinction is important when you try to run a package. There is a big difference between:
python foo/bar/baz.py
and
python -m foo.bar.baz
as in the latter case, foo.bar is imported and relative imports will work correctly with foo.bar as the starting point.
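A sketch of the demonstration being referred to (the package layout and the contents of baz.py are inferred from the shell session further down):
$ mkdir -p test/foo/bar
$ touch test/foo/__init__.py test/foo/bar/__init__.py
$ printf 'print(__package__)\nprint(__name__)\n' > test/foo/bar/baz.py
$ PYTHONPATH=test python test/foo/bar/baz.py
None
__main__
$ PYTHONPATH=test python -m foo.bar.baz
foo.bar
__main__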
As a result, Python has to actually care about packages when using the -m switch. A normal script can never be a package, so __package__ is set to None.
But run a package or module inside a package with -m and now there is at least the possibility of a package, so the __package__ variable is set to a string value; in the above demonstration it is set to 'foo.bar', for plain modules not inside a package it is set to an empty string.
As for the __main__ module, Python imports scripts being run as it would import regular modules. A new module object is created to hold the global namespace and is stored in sys.modules['__main__']. This is what the __name__ variable refers to; it is a key in that structure.
For packages, you can create a __main__.py module inside and have that run when running python -m package_name; in fact that is the only way you can run a package as a script:
$ PYTHONPATH=test python -m foo.bar
python: No module named foo.bar.__main__; 'foo.bar' is a package and cannot be directly executed
$ cp test/foo/bar/baz.py test/foo/bar/__main__.py
$ PYTHONPATH=test python -m foo.bar
foo.bar
__main__
So, when naming a package for running with -m, Python looks for a __main__ module contained in that package and executes that as a script. Its name is then still set to '__main__' and the module object is still stored in sys.modules['__main__'].
$ python -m timeit '"-".join(str(n) for n in range(100))'
10000 loops, best of 3: 40.3 usec per loop
$ python -m timeit '"-".join([str(n) for n in range(100)])'
10000 loops, best of 3: 33.4 usec per loop
$ python -m timeit '"-".join(map(str, range(100)))'
10000 loops, best of 3: 25.2 usec per loop
The results are pretty much the same when you have a script, but when you develop a package, without the -m flag there’s no way to get the imports to work correctly if you want to run a subpackage or module in the package as the main entry point to your program (and believe me, I’ve tried).
From the docs for the -m option:
Search sys.path for the named module and execute its contents as the __main__ module.
and
As with the -c option, the current directory will be added to the start of sys.path.
so
python -m pdb
is roughly equivalent to
python /usr/lib/python3.5/pdb.py
(assuming you don’t have a package or script in your current directory called pdb.py)
Explanation:
Behavior is made “deliberately similar to” scripts.
Many standard library modules contain code that is invoked when they are executed as a script; some Python code is intended to be run as a module. An example is the timeit module (I think this example is better than the one in the command-line option docs):
$ python -m timeit '"-".join(str(n) for n in range(100))'
10000 loops, best of 3: 40.3 usec per loop
$ python -m timeit '"-".join([str(n) for n in range(100)])'
10000 loops, best of 3: 33.4 usec per loop
$ python -m timeit '"-".join(map(str, range(100)))'
10000 loops, best of 3: 25.2 usec per loop
The -m command line option – python -m modulename will find a module
in the standard library, and invoke it. For example, python -m pdb
is equivalent to python /usr/lib/python2.4/pdb.py
Follow-up Question
Also, David Beazley’s Python Essential Reference explains it as “The
-m option runs a library module as a script which executes inside the __main__ module prior to the execution of the main script”.
It means any module you can look up with an import statement can be run as the entry point of the program – if it has a code block, usually near the end, with if __name__ == '__main__':.
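For example, a minimal module of that shape (the name mytool is purely illustrative) can be run either as python mytool.py or as python -m mytool:
# mytool.py
def main():
    print('doing the actual work')

if __name__ == '__main__':
    main()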
-m without adding the current directory to the path:
A comment elsewhere on this page says:
That the -m option also adds the current directory to sys.path, is obviously a security issue (see: preload attack). This behavior is similar to library search order in Windows (before it had been hardened recently). It’s a pity that Python does not follow the trend and does not offer a simple way to disable adding . to sys.path
Well, the following demonstrates the possible issue (on Windows, remove the quotes):
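A sketch of such a preload attack, using pdb.py as the shadowing file (any importable name would do):
$ echo 'print("this local file shadows the standard library module")' > pdb.py
$ python -m pdb
this local file shadows the standard library module
$ rm pdb.py
Since Python 3.4 the -I flag can be used to lock this down; from the docs: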
Run Python in isolated mode. This also implies -E and -s. In isolated mode sys.path contains neither the script’s directory nor the user’s site-packages directory. All PYTHON* environment variables are ignored, too. Further restrictions may be imposed to prevent the user from injecting malicious code.
The main reason to run a module (or package) as a script with -m is to simplify deployment, especially on Windows. You can install scripts in the same place in the Python library where modules normally go – instead of polluting PATH or global executable directories such as ~/.local (the per-user scripts directory is ridiculously hard to find in Windows).
Then you just type -m and Python finds the script automagically. For example, python -m pip will find the correct pip for the same instance of the Python interpreter which executes it. Without -m, if the user has several Python versions installed, which one would the “global” pip belong to?
If the user prefers “classic” entry points for command-line scripts, these can easily be added as small scripts somewhere in PATH, or pip can create them at install time via the entry_points parameter in setup.py.
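A minimal sketch of such an entry point in setup.py (all names are illustrative):
from setuptools import setup

setup(
    name='mytool',
    version='0.1',
    py_modules=['mytool'],
    entry_points={
        'console_scripts': [
            'mytool = mytool:main',  # installs a 'mytool' command that calls mytool.main()
        ],
    },
)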
So just check for __name__ == '__main__' and ignore other, less reliable implementation details.