Category archive: Q&A

django MultiValueDictKeyError error, how do I deal with it?

Question: django MultiValueDictKeyError error, how do I deal with it?

I’m trying to save a object to my database, but it’s throwing a MultiValueDictKeyError error.

The problems lies within the form, the is_private is represented by a checkbox. If the check box is NOT selected, obviously nothing is passed. This is where the error gets chucked.

How do I properly deal with this exception, and catch it?

The line is

is_private = request.POST['is_private']

Answer 0

Use the MultiValueDict’s get method. This is also present on standard dicts and is a way to fetch a value while providing a default if it does not exist.

is_private = request.POST.get('is_private', False)

Generally,

my_var = dict.get(<key>, <default>)
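
For context, a minimal view sketch applying this to the checkbox case (the view name and redirect target are made up for the example):

from django.shortcuts import redirect

def save_entry(request):
    # An unchecked checkbox is simply absent from request.POST,
    # so fall back to False instead of indexing the key directly.
    is_private = request.POST.get('is_private', False)
    # ... create and save the object using is_private ...
    return redirect('/')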

Answer 1

Choose what is best for you:

1

is_private = request.POST.get('is_private', False)

If the is_private key is present in request.POST, the is_private variable will be set to its value; if not, it will be set to False.

2

if 'is_private' in request.POST:
    is_private = request.POST['is_private']
else:
    is_private = False

3

from django.utils.datastructures import MultiValueDictKeyError
try:
    is_private = request.POST['is_private']
except MultiValueDictKeyError:
    is_private = False

Answer 2

You get that because you’re trying to get a key from a dictionary when it’s not there. You need to test if it is in there first.

Try:

is_private = 'is_private' in request.POST

or

is_private = 'is_private' in request.POST and request.POST['is_private']

depending on the values you’re using.
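
Note that a checked checkbox with no value attribute is submitted as the string 'on', so if you want an actual boolean, an explicit comparison works (a small sketch, not from the original answer):

# 'on' is the browser default for a checked box without a value attribute
is_private = request.POST.get('is_private') == 'on'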


Answer 3

Why didn’t you try to define is_private in your models as default=False?

class Foo(models.Model):
    is_private = models.BooleanField(default=False)

Answer 4

Another thing to remember is that request.POST['keyword'] refers to the form element identified by the HTML name attribute, here keyword.

So, if your form is:

<form action="/login/" method="POST">
  <input type="text" name="keyword" placeholder="Search query">
  <input type="number" name="results" placeholder="Number of results">
</form>

then, request.POST['keyword'] and request.POST['results'] will contain the value of the input elements keyword and results, respectively.
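
A matching view could then read both fields, using .get() to guard against missing keys (a sketch; the view name and defaults are hypothetical):

from django.http import HttpResponse

def search(request):
    # .get() avoids a MultiValueDictKeyError when a field is missing
    keyword = request.POST.get('keyword', '')
    results = int(request.POST.get('results', '10'))
    return HttpResponse('searching %r, showing %d results' % (keyword, results))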


Answer 5

First check whether the request object has an ‘is_private’ key. In most cases this MultiValueDictKeyError occurs because the key is missing from the dictionary-like request object. (A dictionary is an “associative array” of key/value pairs.)

In other words: request.GET and request.POST are dictionary-like objects containing all the request parameters. This is specific to Django.

The get() method returns the value for the given key if the key is in the dictionary. If the key is not present, it returns the default value, which is None unless you pass one.

You can handle this error like this:

is_private = request.POST.get('is_private', False)

Answer 6

For me, this error occurred in my django project because of the following:

  1. I inserted a new hyperlink in my home.html present in templates folder of my project as below:

    <input type="button" value="About" onclick="location.href='{% url 'about' %}'">
  2. In views.py, I had the following definitions of count and about:

   def count(request):
       fulltext = request.GET['fulltext']
       wordlist = fulltext.split()
       worddict = {}
       for word in wordlist:
           if word in worddict:
               worddict[word] += 1
           else:
               worddict[word] = 1
       # sort by count, descending, once the tally is complete
       worddict = sorted(worddict.items(), key=operator.itemgetter(1), reverse=True)
       return render(request, 'count.html', {'fulltext': fulltext, 'count': len(wordlist), 'worddict': worddict})

   def about(request): 
       return render(request,"about.html")
  3. In urls.py, I had the following url patterns:
    urlpatterns = [
        path('admin/', admin.site.urls),
        path('',views.homepage,name="home"),
        path('eggs',views.eggs),
        path('count/',views.count,name="count"),
        path('about/',views.count,name="about"),
    ]

As can be seen in no. 3 above, in the last url pattern I was incorrectly calling views.count whereas I needed to call views.about. The line fulltext = request.GET['fulltext'] in the count function of views.py (which was mistakenly called because of the wrong entry in urlpatterns) threw the MultiValueDictKeyError exception.

Then I changed the last url pattern in urls.py to the correct one, i.e. path('about/',views.about,name="about"), and everything worked fine.

In general, a programmer new to Django can make the mistake I made: wiring a URL to the wrong view function, one that expects a different set of parameters, or that passes a different set of objects in its render call, than the intended view.

Hope this helps some programmers who are new to Django.


SQLAlchemy: unique across multiple columns

Question: SQLAlchemy: unique across multiple columns

Let’s say that I have a class that represents locations. Locations “belong” to customers. Locations are identified by a unicode 10 character code. The “location code” should be unique among the locations for a specific customer.

The two below fields in combination should be unique:
customer_id = Column(Integer, ForeignKey('customers.customer_id'))
location_code = Column(Unicode(10))

So if i have two customers, customer “123” and customer “456”. They both can have a location called “main” but neither could have two locations called main.

I can handle this in the business logic but I want to make sure there is no way to easily add the requirement in sqlalchemy. The unique=True option seems to only work when applied to a specific field and it would cause the entire table to only have a unique code for all locations.


Answer 0

Extract from the documentation of the Column:

unique – When True, indicates that this column contains a unique constraint, or if index is True as well, indicates that the Index should be created with the unique flag. To specify multiple columns in the constraint/index or to specify an explicit name, use the UniqueConstraint or Index constructs explicitly.

As these belong to a Table and not to a mapped class, you declare them in the table definition or, if using declarative, via __table_args__:

# version1: table definition
mytable = Table('mytable', meta,
    # ...
    Column('customer_id', Integer, ForeignKey('customers.customer_id')),
    Column('location_code', Unicode(10)),

    UniqueConstraint('customer_id', 'location_code', name='uix_1')
    )
# or the index, which will ensure uniqueness as well
Index('myindex', mytable.c.customer_id, mytable.c.location_code, unique=True)


# version2: declarative
class Location(Base):
    __tablename__ = 'locations'
    id = Column(Integer, primary_key = True)
    customer_id = Column(Integer, ForeignKey('customers.customer_id'), nullable=False)
    location_code = Column(Unicode(10), nullable=False)
    __table_args__ = (UniqueConstraint('customer_id', 'location_code', name='_customer_location_uc'),
                     )

Answer 1

from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()

class Location(Base):
    __table_args__ = (
        # this can be db.PrimaryKeyConstraint if you want it to be a primary key
        db.UniqueConstraint('customer_id', 'location_code'),
    )
    customer_id = Column(Integer, ForeignKey('customers.customer_id'))
    location_code = Column(Unicode(10))

Python argparse: default value or specified value

Question: Python argparse: default value or specified value

I would like to have a optional argument that will default to a value if only the flag is present with no value specified, but store a user-specified value instead of the default if the user specifies a value. Is there already an action available for this?

An example:

python script.py --example
# args.example would equal a default value of 1
python script.py --example 2
# args.example would equal a default value of 2

I can create an action, but wanted to see if there was an existing way to do this.


Answer 0

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--example', nargs='?', const=1, type=int)
args = parser.parse_args()
print(args)

% test.py 
Namespace(example=None)
% test.py --example
Namespace(example=1)
% test.py --example 2
Namespace(example=2)

  • nargs='?' means 0-or-1 arguments
  • const=1 sets the default when there are 0 arguments
  • type=int converts the argument to int

If you want test.py to set example to 1 even if no --example is specified, then include default=1. That is, with

parser.add_argument('--example', nargs='?', const=1, type=int, default=1)

then

% test.py 
Namespace(example=1)

Answer 1

Actually, you only need to use the default argument to add_argument as in this test.py script:

import argparse

if __name__ == '__main__':

    parser = argparse.ArgumentParser()
    parser.add_argument('--example', default=1)
    args = parser.parse_args()
    print(args.example)

test.py --example
% 1
test.py --example 2
% 2

Details are here.


Answer 2

The difference between:

parser.add_argument("--debug", help="Debug", nargs='?', type=int, const=1, default=7)

and

parser.add_argument("--debug", help="Debug", nargs='?', type=int, const=1)

is thus:

myscript.py => debug is 7 (from default) in the first case and “None” in the second

myscript.py --debug => debug is 1 in each case

myscript.py --debug 2 => debug is 2 in each case
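
A quick, self-contained way to verify this (a sketch, not from the original answer) is to feed argv lists to parse_args directly:

import argparse

with_default = argparse.ArgumentParser()
with_default.add_argument("--debug", nargs='?', type=int, const=1, default=7)

without_default = argparse.ArgumentParser()
without_default.add_argument("--debug", nargs='?', type=int, const=1)

print(with_default.parse_args([]))                # Namespace(debug=7)
print(without_default.parse_args([]))             # Namespace(debug=None)
print(with_default.parse_args(["--debug"]))       # Namespace(debug=1)
print(with_default.parse_args(["--debug", "2"]))  # Namespace(debug=2)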


What is the difference between lists enclosed by square brackets and parentheses in Python?

Question: What is the difference between lists enclosed by square brackets and parentheses in Python?

>>> x=[1,2]
>>> x[1]
2
>>> x=(1,2)
>>> x[1]
2

Are they both valid? Is one preferred for some reason?


Answer 0

Square brackets are lists while parentheses are tuples.

A list is mutable, meaning you can change its contents:

>>> x = [1,2]
>>> x.append(3)
>>> x
[1, 2, 3]

while tuples are not:

>>> x = (1,2)
>>> x
(1, 2)
>>> x.append(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'

The other main difference is that a tuple is hashable, meaning that you can use it as a key to a dictionary, among other things. For example:

>>> x = (1,2)
>>> y = [1,2]
>>> z = {}
>>> z[x] = 3
>>> z
{(1, 2): 3}
>>> z[y] = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

Note that, as many people have pointed out, you can add tuples together. For example:

>>> x = (1,2)
>>> x += (3,)
>>> x
(1, 2, 3)

However, this does not mean tuples are mutable. In the example above, a new tuple is constructed by adding together the two tuples as arguments. The original tuple is not modified. To demonstrate this, consider the following:

>>> x = (1,2)
>>> y = x
>>> x += (3,)
>>> x
(1, 2, 3)
>>> y
(1, 2)

Whereas, if you were to construct this same example with a list, y would also be updated:

>>> x = [1, 2]
>>> y = x
>>> x += [3]
>>> x
[1, 2, 3]
>>> y
[1, 2, 3]
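
One way to make this concrete (a small check, not from the original answer) is to compare object identities before and after +=:

x = (1, 2)
before = id(x)
x += (3,)
print(id(x) == before)  # False: += built a brand-new tuple

y = [1, 2]
before = id(y)
y += [3]
print(id(y) == before)  # True: the list was extended in place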

Answer 1

One interesting difference :

lst = [1]
print lst              # prints [1]
print type(lst)        # prints <type 'list'>

notATuple = (1)
print notATuple        # prints 1
print type(notATuple)  # prints <type 'int'>
                       # ^^ instead of tuple (expected)

A comma must be included in a tuple even if it contains only a single value. e.g. (1,) instead of (1).


Answer 2

They are not lists, they are a list and a tuple. You can read about tuples in the Python tutorial. While you can mutate lists, this is not possible with tuples.

In [1]: x = (1, 2)

In [2]: x[0] = 3
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/home/user/<ipython console> in <module>()

TypeError: 'tuple' object does not support item assignment

Answer 3

Another way brackets and parentheses differ is that square brackets can describe a list comprehension, e.g. [x for x in y]

Whereas the corresponding parenthetic syntax specifies a tuple generator: (x for x in y)

You can get a tuple comprehension using: tuple(x for x in y)

See: Why is there no tuple comprehension in Python?
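
For instance, a trivial sketch:

y = [1, 2, 3]
squares = tuple(x * x for x in y)  # a generator expression materialized as a tuple
print(squares)                     # (1, 4, 9)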


Answer 4

The first is a list, the second is a tuple. Lists are mutable, tuples are not.

Take a look at the Data Structures section of the tutorial, and the Sequence Types section of the documentation.


Answer 5

Comma-separated items enclosed by ( and ) are tuples, those enclosed by [ and ] are lists.


Accessing class variables from a list comprehension within the class definition

Question: Accessing class variables from a list comprehension within the class definition

How do you access other class variables from a list comprehension within the class definition? The following works in Python 2 but fails in Python 3:

class Foo:
    x = 5
    y = [x for i in range(1)]

Python 3.2 gives the error:

NameError: global name 'x' is not defined

Trying Foo.x doesn’t work either. Any ideas on how to do this in Python 3?

A slightly more complicated motivating example:

from collections import namedtuple
class StateDatabase:
    State = namedtuple('State', ['name', 'capital'])
    db = [State(*args) for args in [
        ['Alabama', 'Montgomery'],
        ['Alaska', 'Juneau'],
        # ...
    ]]

In this example, apply() would have been a decent workaround, but it is sadly removed from Python 3.


Answer 0

Class scope and list, set or dictionary comprehensions, as well as generator expressions do not mix.

The why; or, the official word on this

In Python 3, list comprehensions were given a proper scope (local namespace) of their own, to prevent their local variables bleeding over into the surrounding scope (see Python list comprehension rebind names even after scope of comprehension. Is this right?). That’s great when using such a list comprehension in a module or in a function, but in classes, scoping is a little, uhm, strange.

This is documented in pep 227:

Names in class scope are not accessible. Names are resolved in the innermost enclosing function scope. If a class definition occurs in a chain of nested scopes, the resolution process skips class definitions.

and in the class compound statement documentation:

The class’s suite is then executed in a new execution frame (see section Naming and binding), using a newly created local namespace and the original global namespace. (Usually, the suite contains only function definitions.) When the class’s suite finishes execution, its execution frame is discarded but its local namespace is saved. [4] A class object is then created using the inheritance list for the base classes and the saved local namespace for the attribute dictionary.

Emphasis mine; the execution frame is the temporary scope.

Because the scope is repurposed as the attributes on a class object, allowing it to be used as a nonlocal scope as well leads to undefined behaviour; what would happen if a class method referred to x as a nested scope variable, then manipulates Foo.x as well, for example? More importantly, what would that mean for subclasses of Foo? Python has to treat a class scope differently as it is very different from a function scope.

Last, but definitely not least, the linked Naming and binding section in the Execution model documentation mentions class scopes explicitly:

The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods – this includes comprehensions and generator expressions since they are implemented using a function scope. This means that the following will fail:

class A:
     a = 42
     b = list(a + i for i in range(10))

So, to summarize: you cannot access the class scope from functions, list comprehensions or generator expressions enclosed in that scope; they act as if that scope does not exist. In Python 2, list comprehensions were implemented using a shortcut, but in Python 3 they got their own function scope (as they should have had all along) and thus your example breaks. Other comprehension types have their own scope regardless of Python version, so a similar example with a set or dict comprehension would break in Python 2.

# Same error, in Python 2 or 3
y = {x: x for i in range(1)}

The (small) exception; or, why one part may still work

There’s one part of a comprehension or generator expression that executes in the surrounding scope, regardless of Python version. That would be the expression for the outermost iterable. In your example, it’s the range(1):

y = [x for i in range(1)]
#               ^^^^^^^^

Thus, using x in that expression would not throw an error:

# Runs fine
y = [i for i in range(x)]

This only applies to the outermost iterable; if a comprehension has multiple for clauses, the iterables for inner for clauses are evaluated in the comprehension’s scope:

# NameError
y = [i for i in range(1) for j in range(x)]

This design decision was made in order to throw an error at genexp creation time instead of iteration time when creating the outermost iterable of a generator expression throws an error, or when the outermost iterable turns out not to be iterable. Comprehensions share this behavior for consistency.

Looking under the hood; or, way more detail than you ever wanted

You can see this all in action using the dis module. I’m using Python 3.3 in the following examples, because it adds qualified names that neatly identify the code objects we want to inspect. The bytecode produced is otherwise functionally identical to Python 3.2.

To create a class, Python essentially takes the whole suite that makes up the class body (so everything indented one level deeper than the class <name>: line), and executes that as if it were a function:

>>> import dis
>>> def foo():
...     class Foo:
...         x = 5
...         y = [x for i in range(1)]
...     return Foo
... 
>>> dis.dis(foo)
  2           0 LOAD_BUILD_CLASS     
              1 LOAD_CONST               1 (<code object Foo at 0x10a436030, file "<stdin>", line 2>) 
              4 LOAD_CONST               2 ('Foo') 
              7 MAKE_FUNCTION            0 
             10 LOAD_CONST               2 ('Foo') 
             13 CALL_FUNCTION            2 (2 positional, 0 keyword pair) 
             16 STORE_FAST               0 (Foo) 

  5          19 LOAD_FAST                0 (Foo) 
             22 RETURN_VALUE         

The first LOAD_CONST there loads a code object for the Foo class body, then makes that into a function, and calls it. The result of that call is then used to create the namespace of the class, its __dict__. So far so good.

The thing to note here is that the bytecode contains a nested code object; in Python, class definitions, functions, comprehensions and generators all are represented as code objects that contain not only bytecode, but also structures that represent local variables, constants, variables taken from globals, and variables taken from the nested scope. The compiled bytecode refers to those structures and the python interpreter knows how to access those given the bytecodes presented.

The important thing to remember here is that Python creates these structures at compile time; the class suite is a code object (<code object Foo at 0x10a436030, file "<stdin>", line 2>) that is already compiled.

Let’s inspect that code object that creates the class body itself; code objects have a co_consts structure:

>>> foo.__code__.co_consts
(None, <code object Foo at 0x10a436030, file "<stdin>", line 2>, 'Foo')
>>> dis.dis(foo.__code__.co_consts[1])
  2           0 LOAD_FAST                0 (__locals__) 
              3 STORE_LOCALS         
              4 LOAD_NAME                0 (__name__) 
              7 STORE_NAME               1 (__module__) 
             10 LOAD_CONST               0 ('foo.<locals>.Foo') 
             13 STORE_NAME               2 (__qualname__) 

  3          16 LOAD_CONST               1 (5) 
             19 STORE_NAME               3 (x) 

  4          22 LOAD_CONST               2 (<code object <listcomp> at 0x10a385420, file "<stdin>", line 4>) 
             25 LOAD_CONST               3 ('foo.<locals>.Foo.<listcomp>') 
             28 MAKE_FUNCTION            0 
             31 LOAD_NAME                4 (range) 
             34 LOAD_CONST               4 (1) 
             37 CALL_FUNCTION            1 (1 positional, 0 keyword pair) 
             40 GET_ITER             
             41 CALL_FUNCTION            1 (1 positional, 0 keyword pair) 
             44 STORE_NAME               5 (y) 
             47 LOAD_CONST               5 (None) 
             50 RETURN_VALUE         

The above bytecode creates the class body. The function is executed and the resulting locals() namespace, containing x and y is used to create the class (except that it doesn’t work because x isn’t defined as a global). Note that after storing 5 in x, it loads another code object; that’s the list comprehension; it is wrapped in a function object just like the class body was; the created function takes a positional argument, the range(1) iterable to use for its looping code, cast to an iterator. As shown in the bytecode, range(1) is evaluated in the class scope.

From this you can see that the only difference between a code object for a function or a generator, and a code object for a comprehension is that the latter is executed immediately when the parent code object is executed; the bytecode simply creates a function on the fly and executes it in a few small steps.

Python 2.x uses inline bytecode there instead, here is output from Python 2.7:

  2           0 LOAD_NAME                0 (__name__)
              3 STORE_NAME               1 (__module__)

  3           6 LOAD_CONST               0 (5)
              9 STORE_NAME               2 (x)

  4          12 BUILD_LIST               0
             15 LOAD_NAME                3 (range)
             18 LOAD_CONST               1 (1)
             21 CALL_FUNCTION            1
             24 GET_ITER            
        >>   25 FOR_ITER                12 (to 40)
             28 STORE_NAME               4 (i)
             31 LOAD_NAME                2 (x)
             34 LIST_APPEND              2
             37 JUMP_ABSOLUTE           25
        >>   40 STORE_NAME               5 (y)
             43 LOAD_LOCALS         
             44 RETURN_VALUE        

No code object is loaded, instead a FOR_ITER loop is run inline. So in Python 3.x, the list generator was given a proper code object of its own, which means it has its own scope.

However, the comprehension was compiled together with the rest of the python source code when the module or script was first loaded by the interpreter, and the compiler does not consider a class suite a valid scope. Any referenced variables in a list comprehension must look in the scope surrounding the class definition, recursively. If the variable wasn’t found by the compiler, it marks it as a global. Disassembly of the list comprehension code object shows that x is indeed loaded as a global:

>>> foo.__code__.co_consts[1].co_consts
('foo.<locals>.Foo', 5, <code object <listcomp> at 0x10a385420, file "<stdin>", line 4>, 'foo.<locals>.Foo.<listcomp>', 1, None)
>>> dis.dis(foo.__code__.co_consts[1].co_consts[2])
  4           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                12 (to 21) 
              9 STORE_FAST               1 (i) 
             12 LOAD_GLOBAL              0 (x) 
             15 LIST_APPEND              2 
             18 JUMP_ABSOLUTE            6 
        >>   21 RETURN_VALUE         

This chunk of bytecode loads the first argument passed in (the range(1) iterator), and just like the Python 2.x version uses FOR_ITER to loop over it and create its output.

Had we defined x in the foo function instead, x would be a cell variable (cells refer to nested scopes):

>>> def foo():
...     x = 2
...     class Foo:
...         x = 5
...         y = [x for i in range(1)]
...     return Foo
... 
>>> dis.dis(foo.__code__.co_consts[2].co_consts[2])
  5           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                12 (to 21) 
              9 STORE_FAST               1 (i) 
             12 LOAD_DEREF               0 (x) 
             15 LIST_APPEND              2 
             18 JUMP_ABSOLUTE            6 
        >>   21 RETURN_VALUE         

The LOAD_DEREF will indirectly load x from the code object cell objects:

>>> foo.__code__.co_cellvars               # foo function `x`
('x',)
>>> foo.__code__.co_consts[2].co_cellvars  # Foo class, no cell variables
()
>>> foo.__code__.co_consts[2].co_consts[2].co_freevars  # Refers to `x` in foo
('x',)
>>> foo().y
[2]

The actual referencing looks the value up from the current frame data structures, which were initialized from a function object’s .__closure__ attribute. Since the function created for the comprehension code object is discarded again, we do not get to inspect that function’s closure. To see a closure in action, we’d have to inspect a nested function instead:

>>> def spam(x):
...     def eggs():
...         return x
...     return eggs
... 
>>> spam(1).__code__.co_freevars
('x',)
>>> spam(1)()
1
>>> spam(1).__closure__
>>> spam(1).__closure__[0].cell_contents
1
>>> spam(5).__closure__[0].cell_contents
5

So, to summarize:

  • List comprehensions get their own code objects in Python 3, and there is no difference between code objects for functions, generators or comprehensions; comprehension code objects are wrapped in a temporary function object and called immediately.
  • Code objects are created at compile time, and any non-local variables are marked as either global or as free variables, based on the nested scopes of the code. The class body is not considered a scope for looking up those variables.
  • When executing the code, Python has only to look into the globals, or the closure of the currently executing object. Since the compiler didn’t include the class body as a scope, the temporary function namespace is not considered.

A workaround; or, what to do about it

If you were to create an explicit scope for the x variable, like in a function, you can use class-scope variables for a list comprehension:

>>> class Foo:
...     x = 5
...     def y(x):
...         return [x for i in range(1)]
...     y = y(x)
... 
>>> Foo.y
[5]

The ‘temporary’ y function can be called directly; we then replace it with its return value. Its scope is considered when resolving x:

>>> foo.__code__.co_consts[1].co_consts[2]
<code object y at 0x10a5df5d0, file "<stdin>", line 4>
>>> foo.__code__.co_consts[1].co_consts[2].co_cellvars
('x',)

Of course, people reading your code will scratch their heads over this a little; you may want to put a big fat comment in there explaining why you are doing this.

The best work-around is to just use __init__ to create an instance variable instead:

def __init__(self):
    self.y = [self.x for i in range(1)]

and avoid all the head-scratching, and questions to explain yourself. For your own concrete example, I would not even store the namedtuple on the class; either use the output directly (don’t store the generated class at all), or use a global:

from collections import namedtuple
State = namedtuple('State', ['name', 'capital'])

class StateDatabase:
    db = [State(*args) for args in [
       ('Alabama', 'Montgomery'),
       ('Alaska', 'Juneau'),
       # ...
    ]]

Answer 1

In my opinion it is a flaw in Python 3. I hope they change it.

Old Way (works in 2.7, throws NameError: name 'x' is not defined in 3+):

class A:
    x = 4
    y = [x+i for i in range(1)]

NOTE: simply scoping it with A.x would not solve it

New Way (works in 3+):

class A:
    x = 4
    y = (lambda x=x: [x+i for i in range(1)])()

Because the syntax is so ugly, I typically just initialize all my class variables in the constructor.


Answer 2

The accepted answer provides excellent information, but there appear to be a few other wrinkles here — differences between list comprehension and generator expressions. A demo that I played around with:

class Foo:

    # A class-level variable.
    X = 10

    # I can use that variable to define another class-level variable.
    Y = sum((X, X))

    # Works in Python 2, but not 3.
    # In Python 3, list comprehensions were given their own scope.
    try:
        Z1 = sum([X for _ in range(3)])
    except NameError:
        Z1 = None

    # Fails in both.
    # Apparently, generator expressions (that's what the entire argument
    # to sum() is) did have their own scope even in Python 2.
    try:
        Z2 = sum(X for _ in range(3))
    except NameError:
        Z2 = None

    # Workaround: put the computation in lambda or def.
    compute_z3 = lambda val: sum(val for _ in range(3))

    # Then use that function.
    Z3 = compute_z3(X)

    # Also worth noting: here I can refer to XS in the for-part of the
    # generator expression (Z4 works), but I cannot refer to XS in the
    # inner-part of the generator expression (Z5 fails).
    XS = [15, 15, 15, 15]
    Z4 = sum(val for val in XS)
    try:
        Z5 = sum(XS[i] for i in range(len(XS)))
    except NameError:
        Z5 = None

print(Foo.Z1, Foo.Z2, Foo.Z3, Foo.Z4, Foo.Z5)

Answer 3

This is a bug in Python. Comprehensions are advertised as being equivalent to for loops, but this is not true in classes. At least up to Python 3.6.6, in a comprehension used in a class, only one variable from outside the comprehension is accessible inside the comprehension, and it must be used as the outermost iterator. In a function, this scope limitation does not apply.

To illustrate why this is a bug, let’s return to the original example. This fails:

class Foo:
    x = 5
    y = [x for i in range(1)]

But this works:

def Foo():
    x = 5
    y = [x for i in range(1)]

The limitation is stated at the end of this section in the reference guide.


Answer 4

Since the outermost iterator is evaluated in the surrounding scope we can use zip together with itertools.repeat to carry the dependencies over to the comprehension’s scope:

import itertools as it

class Foo:
    x = 5
    y = [j for i, j in zip(range(3), it.repeat(x))]

One can also use nested for loops in the comprehension and include the dependencies in the outermost iterable:

class Foo:
    x = 5
    y = [j for j in (x,) for i in range(3)]

For the specific example of the OP:

from collections import namedtuple
import itertools as it

class StateDatabase:
    State = namedtuple('State', ['name', 'capital'])
    db = [State(*args) for State, args in zip(it.repeat(State), [
        ['Alabama', 'Montgomery'],
        ['Alaska', 'Juneau'],
        # ...
    ])]

String formatting in Python with multiple arguments (e.g. '%s … %s')

Question: String formatting in Python with multiple arguments (e.g. '%s … %s')

I have a string that looks like '%s in %s' and I want to know how to separate the arguments so that they are two different %s. My mind coming from Java came up with this:

'%s in %s' % unicode(self.author),  unicode(self.publication)

But this doesn’t work so how does it look in Python?


Answer 0

Mark Cidade’s answer is right – you need to supply a tuple.

However from Python 2.6 onwards you can use format instead of %:

'{0} in {1}'.format(unicode(self.author,'utf-8'),  unicode(self.publication,'utf-8'))

Usage of % for formatting strings is no longer encouraged.

This method of string formatting is the new standard in Python 3.0, and should be preferred to the % formatting described in String Formatting Operations in new code.


Answer 1

If you’re using more than one argument it has to be in a tuple (note the extra parentheses):

'%s in %s' % (unicode(self.author),  unicode(self.publication))

As EOL points out, the unicode() function usually assumes ascii encoding as a default, so if you have non-ASCII characters, it’s safer to explicitly pass the encoding:

'%s in %s' % (unicode(self.author, 'utf-8'),  unicode(self.publication, 'utf-8'))

And as of Python 3.0, it’s preferred to use the str.format() syntax instead:

'{0} in {1}'.format(unicode(self.author,'utf-8'),unicode(self.publication,'utf-8'))

Answer 2

On a tuple/mapping object for multiple argument format

The following is excerpt from the documentation:

Given format % values, % conversion specifications in format are replaced with zero or more elements of values. The effect is similar to using sprintf() in the C language.

If format requires a single argument, values may be a single non-tuple object. Otherwise, values must be a tuple with exactly the number of items specified by the format string, or a single mapping object (for example, a dictionary).



On str.format instead of %

A newer alternative to % operator is to use str.format. Here’s an excerpt from the documentation:

str.format(*args, **kwargs)

Perform a string formatting operation. The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument. Returns a copy of the string where each replacement field is replaced with the string value of the corresponding argument.

This method is the new standard in Python 3.0, and should be preferred to % formatting.



Examples

Here are some usage examples:

>>> '%s for %s' % ("tit", "tat")
tit for tat

>>> '{} and {}'.format("chicken", "waffles")
chicken and waffles

>>> '%(last)s, %(first)s %(last)s' % {'first': "James", 'last': "Bond"}
Bond, James Bond

>>> '{last}, {first} {last}'.format(first="James", last="Bond")
Bond, James Bond



Answer 3

You must just put the values into parentheses:

'%s in %s' % (unicode(self.author),  unicode(self.publication))

Here, unicode(self.author) will be placed into the first %s, and unicode(self.publication) into the second.

Note: You should favor str.format() over the % notation. More info here.


Answer 4

There is a significant problem with some of the answers posted so far: unicode() decodes from the default encoding, which is often ASCII; in fact, unicode() tries to make “sense” of the bytes it is given by converting them into characters. Thus, the following code, which is essentially what is recommended by previous answers, fails on my machine:

# -*- coding: utf-8 -*-
author = 'éric'
print '{0}'.format(unicode(author))

gives:

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print '{0}'.format(unicode(author))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)

The failure comes from the fact that author does not contain only ASCII bytes (i.e. with values in [0; 127]), and unicode() decodes from ASCII by default (on many machines).

A robust solution is to explicitly give the encoding used in your fields; taking UTF-8 as an example:

u'{0} in {1}'.format(unicode(self.author, 'utf-8'), unicode(self.publication, 'utf-8'))

(or without the initial u, depending on whether you want a Unicode result or a byte string).

At this point, one might want to consider having the author and publication fields be Unicode strings, instead of decoding them during formatting.


Answer 5

For Python 2 you can also do this:

'%(author)s in %(publication)s'%{'author':unicode(self.author),
                                  'publication':unicode(self.publication)}

which is handy if you have a lot of arguments to substitute (particularly if you are doing internationalisation)

Python2.6 onwards supports .format()

'{author} in {publication}'.format(author=self.author,
                                   publication=self.publication)

回答 6

您还可以像下面这样干净、简单地使用它(但这是错误的!因为正如 Mark Byers 所说,您应该使用 format):

print 'This is my %s formatted with %d arguments' % ('string', 2)

You could also use it in a clean and simple way (but wrong! because you should use format, like Mark Byers said) by doing:

print 'This is my %s formatted with %d arguments' % ('string', 2)

回答 7

为了完整起见:Python 3.6 在 PEP-498 中引入了 f-string。这种字符串可以

“使用最小的语法将表达式嵌入字符串字面量中”。

这意味着对于您的示例,您还可以使用:

f'{self.author} in {self.publication}'

For completeness, in python 3.6 f-string are introduced in PEP-498. These strings make it possible to

embed expressions inside string literals, using a minimal syntax.

That would mean that for your example you could also use:

f'{self.author} in {self.publication}'
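A runnable illustration of that (my addition, with standalone variables standing in for self):

author, publication = 'James', 'The Times'
print(f'{author} in {publication}')            # James in The Times
print(f'{author.upper()} in {publication!r}')  # arbitrary expressions and conversions work too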

在组对象上应用vs变换

问题:在组对象上应用vs变换

考虑以下数据帧:

     A      B         C         D
0  foo    one  0.162003  0.087469
1  bar    one -1.156319 -1.526272
2  foo    two  0.833892 -1.666304
3  bar  three -2.026673 -0.322057
4  foo    two  0.411452 -0.954371
5  bar    two  0.765878 -0.095968
6  foo    one -0.654890  0.678091
7  foo  three -1.789842 -1.130922

以下命令起作用:

> df.groupby('A').apply(lambda x: (x['C'] - x['D']))
> df.groupby('A').apply(lambda x: (x['C'] - x['D']).mean())

但以下任何一项均无效:

> df.groupby('A').transform(lambda x: (x['C'] - x['D']))
ValueError: could not broadcast input array from shape (5) into shape (5,3)

> df.groupby('A').transform(lambda x: (x['C'] - x['D']).mean())
 TypeError: cannot concatenate a non-NDFrame object

为什么?文档中的示例似乎表明,在分组上调用 transform 可以进行按行的操作处理:

# Note that the following suggests row-wise operation (x.mean is the column mean)
zscore = lambda x: (x - x.mean()) / x.std()
transformed = ts.groupby(key).transform(zscore)

换句话说,我以为 transform 本质上是 apply 的一种特定形式(即不做聚合的那种)。我哪里错了?

供参考,以下是上面原始数据帧的构造:

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo'],
                   'B' : ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C' : randn(8), 'D' : randn(8)})

Consider the following dataframe:

     A      B         C         D
0  foo    one  0.162003  0.087469
1  bar    one -1.156319 -1.526272
2  foo    two  0.833892 -1.666304
3  bar  three -2.026673 -0.322057
4  foo    two  0.411452 -0.954371
5  bar    two  0.765878 -0.095968
6  foo    one -0.654890  0.678091
7  foo  three -1.789842 -1.130922

The following commands work:

> df.groupby('A').apply(lambda x: (x['C'] - x['D']))
> df.groupby('A').apply(lambda x: (x['C'] - x['D']).mean())

but none of the following work:

> df.groupby('A').transform(lambda x: (x['C'] - x['D']))
ValueError: could not broadcast input array from shape (5) into shape (5,3)

> df.groupby('A').transform(lambda x: (x['C'] - x['D']).mean())
 TypeError: cannot concatenate a non-NDFrame object

Why? The example on the documentation seems to suggest that calling transform on a group allows one to do row-wise operation processing:

# Note that the following suggests row-wise operation (x.mean is the column mean)
zscore = lambda x: (x - x.mean()) / x.std()
transformed = ts.groupby(key).transform(zscore)

In other words, I thought that transform is essentially a specific type of apply (the one that does not aggregate). Where am I wrong?

For reference, below is the construction of the original dataframe above:

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                          'foo', 'bar', 'foo', 'foo'],
                   'B' : ['one', 'one', 'two', 'three',
                         'two', 'two', 'one', 'three'],
                   'C' : randn(8), 'D' : randn(8)})

回答 0

apply 和 transform 之间的两个主要区别

groupby 的 transform 与 apply 方法之间有两个主要区别。

  • 输入:
    • apply 将每个组的所有列作为一个 DataFrame 隐式传递给自定义函数。
    • 而 transform 则把每个组的每一列分别作为 Series 传递给自定义函数。
  • 输出:
    • 传递给 apply 的自定义函数可以返回标量,也可以返回 Series 或 DataFrame(或 numpy 数组,甚至 list)。
    • 传递给 transform 的自定义函数必须返回与组长度相同的序列(一维 Series、数组或列表)。

因此,transform 一次只处理一个 Series,而 apply 一次处理整个 DataFrame。

检查自定义函数

检查传给 apply 或 transform 的自定义函数所接收的输入,会很有帮助。

例子

让我们创建一些示例数据并检查组,以便您可以了解我在说什么:

import pandas as pd
import numpy as np
df = pd.DataFrame({'State':['Texas', 'Texas', 'Florida', 'Florida'], 
                   'a':[4,5,1,3], 'b':[6,10,3,11]})

     State  a   b
0    Texas  4   6
1    Texas  5  10
2  Florida  1   3
3  Florida  3  11

让我们创建一个简单的自定义函数,该函数打印出隐式传递的对象的类型,然后引发一个错误,以便可以停止执行。

def inspect(x):
    print(type(x))
    raise

现在,让我们把这个函数分别传给 groupby 的 apply 和 transform 方法,看看传给它的是什么对象:

df.groupby('State').apply(inspect)

<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
RuntimeError

如您所见,传入 inspect 函数的是一个 DataFrame。您可能想知道为什么 DataFrame 这个类型被打印了两次:pandas 会把第一个分组运行两次,以此判断是否存在更快的计算路径。这是一个无需担心的次要细节。

现在,让我们对 transform 做同样的事情:

df.groupby('State').transform(inspect)
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
RuntimeError

这次传入的是一个 Series,一个完全不同的 pandas 对象。

因此,transform 一次只能处理一个 Series,不可能同时作用于两列。所以,如果我们尝试在自定义函数中用 a 列减去 b 列,用 transform 就会报错。见下文:

def subtract_two(x):
    return x['a'] - x['b']

df.groupby('State').transform(subtract_two)
KeyError: ('a', 'occurred at index a')

当 pandas 试图查找不存在的 Series 索引 a 时,我们得到一个 KeyError。由于 apply 拿到的是完整的 DataFrame,用它就可以完成这个操作:

df.groupby('State').apply(subtract_two)

State     
Florida  2   -2
         3   -8
Texas    0   -2
         1   -5
dtype: int64

输出是一个Series,并且保留了原始索引,因此有些混乱,但是我们可以访问所有列。


显示传入的 pandas 对象

在自定义函数中显示整个 pandas 对象会更有帮助,这样您可以确切看到正在操作的是什么。您可以用 print 语句,但我喜欢用 IPython.display 模块中的 display 函数,这样 DataFrame 能在 Jupyter notebook 中以 HTML 形式漂亮地输出:

from IPython.display import display
def subtract_two(x):
    display(x)
    return x['a'] - x['b']

屏幕截图:(原图已丢失;截图展示的是 display(x) 渲染出的每个分组的 DataFrame)


变换必须返回与组大小相同的一维序列

另一个区别是transform必须返回与该组相同大小的一维序列。在这种特定情况下,每个组都有两行,因此transform必须返回两行的序列。如果没有,则会引发错误:

def return_three(x):
    return np.array([1, 2, 3])

df.groupby('State').transform(return_three)
ValueError: transform must return a scalar value for each group

该错误消息并不能真正说明问题。您必须返回与组长度相同的序列。因此,像这样的函数就可以工作:

def rand_group_len(x):
    return np.random.rand(len(x))

df.groupby('State').transform(rand_group_len)

          a         b
0  0.962070  0.151440
1  0.440956  0.782176
2  0.642218  0.483257
3  0.056047  0.238208

返回单个标量对象也适用于 transform

如果自定义函数只返回一个标量,transform 就会把它用于组中的每一行:

def group_sum(x):
    return x.sum()

df.groupby('State').transform(group_sum)

   a   b
0  9  16
1  9  16
2  4  14
3  4  14

Two major differences between apply and transform

There are two major differences between the transform and apply groupby methods.

  • Input:
  • apply implicitly passes all the columns for each group as a DataFrame to the custom function.
  • while transform passes each column for each group individually as a Series to the custom function.
  • Output:
  • The custom function passed to apply can return a scalar, or a Series or DataFrame (or numpy array or even list).
  • The custom function passed to transform must return a sequence (a one dimensional Series, array or list) the same length as the group.

So, transform works on just one Series at a time and apply works on the entire DataFrame at once.

Inspecting the custom function

It can help quite a bit to inspect the input to your custom function passed to apply or transform.

Examples

Let’s create some sample data and inspect the groups so that you can see what I am talking about:

import pandas as pd
import numpy as np
df = pd.DataFrame({'State':['Texas', 'Texas', 'Florida', 'Florida'], 
                   'a':[4,5,1,3], 'b':[6,10,3,11]})

     State  a   b
0    Texas  4   6
1    Texas  5  10
2  Florida  1   3
3  Florida  3  11

Let’s create a simple custom function that prints out the type of the implicitly passed object and then raises an error so that execution stops.

def inspect(x):
    print(type(x))
    raise

Now let’s pass this function to both the groupby apply and transform methods to see what object is passed to it:

df.groupby('State').apply(inspect)

<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.frame.DataFrame'>
RuntimeError

As you can see, a DataFrame is passed into the inspect function. You might be wondering why the type, DataFrame, got printed out twice. Pandas runs the first group twice. It does this to determine if there is a fast way to complete the computation or not. This is a minor detail that you shouldn’t worry about.

Now, let’s do the same thing with transform

df.groupby('State').transform(inspect)
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
RuntimeError

It is passed a Series – a totally different Pandas object.

So, transform is only allowed to work with a single Series at a time. It is impossible for it to act on two columns at the same time. So, if we try to subtract column b from column a inside of our custom function, we get an error with transform. See below:

def subtract_two(x):
    return x['a'] - x['b']

df.groupby('State').transform(subtract_two)
KeyError: ('a', 'occurred at index a')

We get a KeyError as pandas is attempting to find the Series index a which does not exist. You can complete this operation with apply as it has the entire DataFrame:

df.groupby('State').apply(subtract_two)

State     
Florida  2   -2
         3   -8
Texas    0   -2
         1   -5
dtype: int64

The output is a Series and a little confusing as the original index is kept, but we have access to all columns.


Displaying the passed pandas object

It can help even more to display the entire pandas object within the custom function, so you can see exactly what you are operating on. You can use print statements, but I like to use the display function from the IPython.display module so that DataFrames get nicely rendered as HTML in a Jupyter notebook:

from IPython.display import display
def subtract_two(x):
    display(x)
    return x['a'] - x['b']

Screenshot: (image not preserved; it showed each group's DataFrame as rendered by display(x))


Transform must return a single dimensional sequence the same size as the group

The other difference is that transform must return a single dimensional sequence the same size as the group. In this particular instance, each group has two rows, so transform must return a sequence of two rows. If it does not then an error is raised:

def return_three(x):
    return np.array([1, 2, 3])

df.groupby('State').transform(return_three)
ValueError: transform must return a scalar value for each group

The error message is not really descriptive of the problem. You must return a sequence the same length as the group. So, a function like this would work:

def rand_group_len(x):
    return np.random.rand(len(x))

df.groupby('State').transform(rand_group_len)

          a         b
0  0.962070  0.151440
1  0.440956  0.782176
2  0.642218  0.483257
3  0.056047  0.238208

Returning a single scalar object also works for transform

If you return just a single scalar from your custom function, then transform will use it for each of the rows in the group:

def group_sum(x):
    return x.sum()

df.groupby('State').transform(group_sum)

   a   b
0  9  16
1  9  16
2  4  14
3  4  14
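As a side note (my addition, not from the original answer): transform also accepts the name of a reduction as a string, which gives the same values as group_sum above and is usually faster because pandas can use its optimized implementation:

import pandas as pd
df = pd.DataFrame({'State': ['Texas', 'Texas', 'Florida', 'Florida'],
                   'a': [4, 5, 1, 3], 'b': [6, 10, 3, 11]})
print(df.groupby('State').transform('sum'))  # same result as group_sum above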

回答 1

由于我同样对 .transform 与 .apply 的区别感到困惑,我找到了一些能澄清这个问题的回答。例如,此答案就非常有帮助。

到目前为止我的结论是:.transform 会彼此隔离地处理各个 Series(列)。这意味着在你最后的两个调用中:

df.groupby('A').transform(lambda x: (x['C'] - x['D']))
df.groupby('A').transform(lambda x: (x['C'] - x['D']).mean())

你要求 .transform 从两列中取值,但它实际上并不会同时“看到”这两列(可以这么说)。transform 会逐一查看 DataFrame 的各列,然后返回由标量“组成”的 Series(或一组 Series),这些标量被重复 len(input_column) 次。

因此,.transform 用来构造结果 Series 的这个标量,是对输入 Series 应用某种归约函数得到的(并且一次只作用于一个 Series/列)。

考虑以下示例(在您的数据框上):

zscore = lambda x: (x - x.mean()) / x.std() # Note that it does not reference anything outside of 'x' and for transform 'x' is one column.
df.groupby('A').transform(zscore)

将生成:

       C      D
0  0.989  0.128
1 -0.478  0.489
2  0.889 -0.589
3 -0.671 -1.150
4  0.034 -0.285
5  1.149  0.662
6 -1.404 -0.907
7 -0.509  1.653

这与您一次只在一列上使用它完全相同:

df.groupby('A')['C'].transform(zscore)

生成:

0    0.989
1   -0.478
2    0.889
3   -0.671
4    0.034
5    1.149
6   -1.404
7   -0.509

请注意,在最后这个示例中换用 .apply(即 df.groupby('A')['C'].apply(zscore))效果完全相同,但如果你试图在整个 DataFrame 上使用它,就会失败:

df.groupby('A').apply(zscore)

给出错误:

ValueError: operands could not be broadcast together with shapes (6,) (2,)

那么 .transform 还有什么用处呢?最简单的情形是把归约函数的结果赋值回原始 DataFrame:

df['sum_C'] = df.groupby('A')['C'].transform(sum)
df.sort_values('A') # to clearly see the scalar ('sum') applies to the whole column of the group

生成:

     A      B      C      D  sum_C
1  bar    one  1.998  0.593  3.973
3  bar  three  1.287 -0.639  3.973
5  bar    two  0.687 -1.027  3.973
4  foo    two  0.205  1.274  4.373
2  foo    two  0.128  0.924  4.373
6  foo    one  2.113 -0.516  4.373
7  foo  three  0.657 -1.179  4.373
0  foo    one  1.270  0.201  4.373

用 .apply 做同样的事会让 sum_C 里都是 NaN,因为 .apply 返回的是一个归约后的 Series,pandas 不知道如何把它广播回去:

df.groupby('A')['C'].apply(sum)

给予:

A
bar    3.973
foo    4.373

还有一些情况会用 .transform 来过滤数据:

df[df.groupby(['B'])['D'].transform(sum) < -1]

     A      B      C      D
3  bar  three  1.287 -0.639
7  foo  three  0.657 -1.179

我希望这可以增加一些清晰度。

As I felt similarly confused with .transform operation vs. .apply I found a few answers shedding some light on the issue. This answer for example was very helpful.

My takeout so far is that .transform will work (or deal) with Series (columns) in isolation from each other. What this means is that in your last two calls:

df.groupby('A').transform(lambda x: (x['C'] - x['D']))
df.groupby('A').transform(lambda x: (x['C'] - x['D']).mean())

You asked .transform to take values from two columns, but ‘it’ actually does not ‘see’ both of them at the same time (so to speak). transform will look at the dataframe columns one by one and return a Series (or group of Series) ‘made’ of scalars which are repeated len(input_column) times.

So this scalar, which .transform uses to build the Series, is the result of some reduction function applied to an input Series (and only to ONE series/column at a time).

Consider this example (on your dataframe):

zscore = lambda x: (x - x.mean()) / x.std() # Note that it does not reference anything outside of 'x' and for transform 'x' is one column.
df.groupby('A').transform(zscore)

will yield:

       C      D
0  0.989  0.128
1 -0.478  0.489
2  0.889 -0.589
3 -0.671 -1.150
4  0.034 -0.285
5  1.149  0.662
6 -1.404 -0.907
7 -0.509  1.653

Which is exactly the same as if you used it on only one column at a time:

df.groupby('A')['C'].transform(zscore)

yielding:

0    0.989
1   -0.478
2    0.889
3   -0.671
4    0.034
5    1.149
6   -1.404
7   -0.509

Note that .apply in the last example (df.groupby('A')['C'].apply(zscore)) would work in exactly the same way, but it would fail if you tried using it on a dataframe:

df.groupby('A').apply(zscore)

gives error:

ValueError: operands could not be broadcast together with shapes (6,) (2,)

So where else is .transform useful? The simplest case is trying to assign the results of a reduction function back to the original dataframe.

df['sum_C'] = df.groupby('A')['C'].transform(sum)
df.sort_values('A') # to clearly see the scalar ('sum') applies to the whole column of the group

yielding:

     A      B      C      D  sum_C
1  bar    one  1.998  0.593  3.973
3  bar  three  1.287 -0.639  3.973
5  bar    two  0.687 -1.027  3.973
4  foo    two  0.205  1.274  4.373
2  foo    two  0.128  0.924  4.373
6  foo    one  2.113 -0.516  4.373
7  foo  three  0.657 -1.179  4.373
0  foo    one  1.270  0.201  4.373

Trying the same with .apply would give NaNs in sum_C, because .apply returns a reduced Series, which pandas does not know how to broadcast back:

df.groupby('A')['C'].apply(sum)

giving:

A
bar    3.973
foo    4.373

There are also cases when .transform is used to filter the data:

df[df.groupby(['B'])['D'].transform(sum) < -1]

     A      B      C      D
3  bar  three  1.287 -0.639
7  foo  three  0.657 -1.179

I hope this adds a bit more clarity.
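One more aside (my addition, not from the original answer): the same group-based filtering can also be written with the groupby filter method, which drops whole groups based on a predicate:

df[df.groupby(['B'])['D'].transform(sum) < -1]        # transform-based boolean mask
df.groupby('B').filter(lambda g: g['D'].sum() < -1)   # equivalent, using filter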


回答 2

我将使用一个非常简单的代码片段来说明不同之处:

test = pd.DataFrame({'id':[1,2,3,1,2,3,1,2,3], 'price':[1,2,3,2,3,1,3,1,2]})
grouping = test.groupby('id')['price']

DataFrame看起来像这样:

    id  price   
0   1   1   
1   2   2   
2   3   3   
3   1   2   
4   2   3   
5   3   1   
6   1   3   
7   2   1   
8   3   2   

该表中有 3 个客户 ID,每个客户各有三笔交易,分别支付了 1、2、3 美元。

现在,我想找到每个客户的最低付款额。有两种方法:

  1. 使用apply

    grouping.min()

返回结果看起来像这样:

id
1    1
2    1
3    1
Name: price, dtype: int64

pandas.core.series.Series # return type
Int64Index([1, 2, 3], dtype='int64', name='id') #The returned Series' index
# length is 3

  2. 使用 transform:

    grouping.transform(min)

返回结果看起来像这样:

0    1
1    1
2    1
3    1
4    1
5    1
6    1
7    1
8    1
Name: price, dtype: int64

pandas.core.series.Series # return type
RangeIndex(start=0, stop=9, step=1) # The returned Series' index
# length is 9    

这两个方法都返回一个 Series 对象,但第一个的长度为 3,第二个的长度为 9。

如果要回答 What is the minimum price paid by each customer,那么 apply 方法更合适。

如果要回答What is the difference between the amount paid for each transaction vs the minimum payment,则要使用transform,因为:

test['minimum'] = grouping.transform(min) # creates an extra column filled with minimum payment
test.price - test.minimum # returns the difference for each row

Apply 不能简单地在这里工作,因为它返回的是大小为3的Series,但是原始df的长度为9。您无法轻松地将其集成回原始df。

I am going to use a very simple snippet to illustrate the difference:

test = pd.DataFrame({'id':[1,2,3,1,2,3,1,2,3], 'price':[1,2,3,2,3,1,3,1,2]})
grouping = test.groupby('id')['price']

The DataFrame looks like this:

    id  price   
0   1   1   
1   2   2   
2   3   3   
3   1   2   
4   2   3   
5   3   1   
6   1   3   
7   2   1   
8   3   2   

There are 3 customer IDs in this table; each customer made three transactions and paid 1, 2, and 3 dollars.

Now, I want to find the minimum payment made by each customer. There are two ways of doing it:

  1. Using apply:

    grouping.min()

The return looks like this:

id
1    1
2    1
3    1
Name: price, dtype: int64

pandas.core.series.Series # return type
Int64Index([1, 2, 3], dtype='int64', name='id') #The returned Series' index
# length is 3
  2. Using transform:

    grouping.transform(min)

The return looks like this:

0    1
1    1
2    1
3    1
4    1
5    1
6    1
7    1
8    1
Name: price, dtype: int64

pandas.core.series.Series # return type
RangeIndex(start=0, stop=9, step=1) # The returned Series' index
# length is 9    

Both methods return a Series object, but the length of the first one is 3 and the length of the second one is 9.

If you want to answer What is the minimum price paid by each customer, then the apply method is the more suitable one to choose.

If you want to answer What is the difference between the amount paid for each transaction vs the minimum payment, then you want to use transform, because:

test['minimum'] = grouping.transform(min) # creates an extra column filled with minimum payment
test.price - test.minimum # returns the difference for each row

Apply does not work here simply because it returns a Series of size 3, but the original df’s length is 9. You cannot integrate it back to the original df easily.
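To make the transform route concrete, here is a self-contained sketch (my addition) that computes each payment's difference from that customer's minimum in one step:

import pandas as pd
test = pd.DataFrame({'id': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                     'price': [1, 2, 3, 2, 3, 1, 3, 1, 2]})
test['diff'] = test['price'] - test.groupby('id')['price'].transform(min)
print(test)  # 'diff' is 0 for the minimum payment, positive otherwise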


回答 3

tmp = df.groupby(['A'])['c'].transform('mean')

就好像

tmp1 = df.groupby(['A']).agg({'c':'mean'})
tmp = df['A'].map(tmp1['c'])

或者

tmp1 = df.groupby(['A'])['c'].mean()
tmp = df['A'].map(tmp1)
tmp = df.groupby(['A'])['c'].transform('mean')

is like

tmp1 = df.groupby(['A']).agg({'c':'mean'})
tmp = df['A'].map(tmp1['c'])

or

tmp1 = df.groupby(['A'])['c'].mean()
tmp = df['A'].map(tmp1)
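A quick check of this equivalence (my addition, on a made-up frame using the answer's column names):

import pandas as pd
df = pd.DataFrame({'A': ['x', 'x', 'y'], 'c': [1.0, 3.0, 5.0]})
tmp = df.groupby(['A'])['c'].transform('mean')
tmp2 = df['A'].map(df.groupby(['A'])['c'].mean())
assert (tmp == tmp2).all()  # both align the group mean to every row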

Python datetime-在使用strptime获取日,月,年之后设置固定的小时和分钟

问题:Python datetime-在使用strptime获取日,月,年之后设置固定的小时和分钟

我已经成功地使用以下代码将 26 Sep 2012 这样的格式转换为 26-09-2012:

datetime.strptime(request.POST['sample_date'],'%d %b %Y')

但是,我不知道如何将上述时间设置为11:59。有谁知道如何做到这一点?

请注意,这可以是将来的日期,也可以是任何随机的日期,而不仅仅是当前日期。

I’ve successfully converted something of 26 Sep 2012 format to 26-09-2012 using:

datetime.strptime(request.POST['sample_date'],'%d %b %Y')

However, I don’t know how to set the hour and minute of something like the above to 11:59. Does anyone know how to do this?

Note, this can be a future date or any random one, not just the current date.


回答 0

使用 datetime.replace:

from datetime import datetime
date = datetime.strptime('26 Sep 2012', '%d %b %Y')
newdate = date.replace(hour=11, minute=59)

Use datetime.replace:

from datetime import datetime
date = datetime.strptime('26 Sep 2012', '%d %b %Y')
newdate = date.replace(hour=11, minute=59)
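An alternative with the same effect (my addition, not part of the original answer) is datetime.combine, which builds the result from a date and a time object:

from datetime import datetime, time
date = datetime.strptime('26 Sep 2012', '%d %b %Y')
newdate = datetime.combine(date.date(), time(11, 59))  # 2012-09-26 11:59:00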

回答 1

datetime.replace() 是最佳选择。此外,它还提供了替换日、年和月的功能。

假设我们有一个datetime对象,日期表示为: "2017-05-04"

>>> from datetime import datetime
>>> date = datetime.strptime('2017-05-04',"%Y-%m-%d")
>>> print(date)
2017-05-04 00:00:00
>>> date = date.replace(minute=59, hour=23, second=59, year=2018, month=6, day=1)
>>> print(date)
2018-06-01 23:59:59

datetime.replace() will provide the best options. Also, it provides facility for replacing day, year, and month.

Suppose we have a datetime object and date is represented as: "2017-05-04"

>>> from datetime import datetime
>>> date = datetime.strptime('2017-05-04',"%Y-%m-%d")
>>> print(date)
2017-05-04 00:00:00
>>> date = date.replace(minute=59, hour=23, second=59, year=2018, month=6, day=1)
>>> print(date)
2018-06-01 23:59:59

Matplotlib(pyplot)savefig输出空白图像

问题:Matplotlib(pyplot)savefig输出空白图像

我正在尝试保存使用matplotlib创建的图;但是,图像保存为空白。

这是我的代码:

plt.subplot(121)
plt.imshow(dataStack, cmap=mpl.cm.bone)

plt.subplot(122)
y = copy.deepcopy(tumorStack)
y = np.ma.masked_where(y == 0, y)

plt.imshow(dataStack, cmap=mpl.cm.bone)
plt.imshow(y, cmap=mpl.cm.jet_r, interpolation='nearest')

if T0 is not None:
    plt.subplot(123)
    plt.imshow(T0, cmap=mpl.cm.bone)

    #plt.subplot(124)
    #Autozoom

#else:
    #plt.subplot(124)
    #Autozoom

plt.show()
plt.draw()
plt.savefig('tessstttyyy.png', dpi=100)

tessstttyyy.png为空白(也尝试使用.jpg)

I am trying to save plots I make using matplotlib; however, the images are saving blank.

Here is my code:

plt.subplot(121)
plt.imshow(dataStack, cmap=mpl.cm.bone)

plt.subplot(122)
y = copy.deepcopy(tumorStack)
y = np.ma.masked_where(y == 0, y)

plt.imshow(dataStack, cmap=mpl.cm.bone)
plt.imshow(y, cmap=mpl.cm.jet_r, interpolation='nearest')

if T0 is not None:
    plt.subplot(123)
    plt.imshow(T0, cmap=mpl.cm.bone)

    #plt.subplot(124)
    #Autozoom

#else:
    #plt.subplot(124)
    #Autozoom

plt.show()
plt.draw()
plt.savefig('tessstttyyy.png', dpi=100)

And tessstttyyy.png is blank (also tried with .jpg)


回答 0

首先,T0 is not None 什么时候成立?我会先测试这一点,然后再调整传给 plt.subplot() 的值;可以尝试 131、132 和 133 这些值,或者根据 T0 是否存在来选择。

其次,plt.show() 被调用之后会创建一个新图形。要解决这个问题,您可以:

  1. 在调用 plt.show() 之前先调用 plt.savefig('tessstttyyy.png', dpi=100)

  2. 在 show() 之前通过 plt.gcf()(“获取当前图形”)先拿到图形对象,之后随时可以在这个 Figure 对象上调用 savefig()

例如:

fig1 = plt.gcf()
plt.show()
plt.draw()
fig1.savefig('tessstttyyy.png', dpi=100)

在您的代码中,“tessstttyyy.png”为空白,因为它保存的是一个新图形,而该图形上什么都没画。

First, what happens when T0 is not None? I would test that, then I would adjust the values I pass to plt.subplot(); maybe try values 131, 132, and 133, or values that depend on whether or not T0 exists.

Second, after plt.show() is called, a new figure is created. To deal with this, you can

  1. Call plt.savefig('tessstttyyy.png', dpi=100) before you call plt.show()

  2. Save the figure before you show() by calling plt.gcf() for “get current figure”, then you can call savefig() on this Figure object at any time.

For example:

fig1 = plt.gcf()
plt.show()
plt.draw()
fig1.savefig('tessstttyyy.png', dpi=100)

In your code, ‘tessstttyyy.png’ is blank because it is saving the new figure, on which nothing has been plotted.
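A minimal, self-contained sketch of the plt.gcf() approach (my addition; the data is made up):

import matplotlib.pyplot as plt

plt.plot([1, 2, 3])
fig = plt.gcf()                      # grab the figure before show() replaces it
plt.show()
fig.savefig('example.png', dpi=100)  # still saves the plotted figure, not a blank one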


回答 1

plt.show() 应该在 plt.savefig() 之后调用

说明:plt.show() 会清空整个图形,因此之后的任何绘制都会发生在一张新的空白图形上

plt.show() should come after plt.savefig()

Explanation: plt.show() clears the whole thing, so anything afterwards will happen on a new empty figure


回答 2

调整函数调用的顺序解决了我的问题:

  • 首先保存图形
  • 然后显示图形

如下:

plt.savefig('heatmap.png')

plt.show()

Changing the order of the functions fixed the problem for me:

  • first Save the plot
  • then Show the plot

as following:

plt.savefig('heatmap.png')

plt.show()

回答 3

在 show() 之前调用 savefig,这对我有用。

fig ,ax = plt.subplots(figsize = (4,4))
sns.barplot(x='sex', y='tip', color='g', ax=ax,data=tips)
sns.barplot(x='sex', y='tip', color='b', ax=ax,data=tips)
ax.legend(['Male','Female'], facecolor='w')

plt.savefig('figure.png')
plt.show()

Calling savefig before show() worked for me.

fig ,ax = plt.subplots(figsize = (4,4))
sns.barplot(x='sex', y='tip', color='g', ax=ax,data=tips)
sns.barplot(x='sex', y='tip', color='b', ax=ax,data=tips)
ax.legend(['Male','Female'], facecolor='w')

plt.savefig('figure.png')
plt.show()

回答 4

让我给出一个更详细的例子:

import numpy as np
import matplotlib.pyplot as plt


def draw_result(lst_iter, lst_loss, lst_acc, title):
    plt.plot(lst_iter, lst_loss, '-b', label='loss')
    plt.plot(lst_iter, lst_acc, '-r', label='accuracy')

    plt.xlabel("n iteration")
    plt.legend(loc='upper left')
    plt.title(title)
    plt.savefig(title+".png")  # should be called before plt.show

    plt.show()


def test_draw():
    lst_iter = range(100)
    lst_loss = [0.01 * i + 0.01 * i ** 2 for i in range(100)]
    # lst_loss = np.random.randn(1, 100).reshape((100, ))
    lst_acc = [0.01 * i - 0.01 * i ** 2 for i in range(100)]
    # lst_acc = np.random.randn(1, 100).reshape((100, ))
    draw_result(lst_iter, lst_loss, lst_acc, "sgd_method")


if __name__ == '__main__':
    test_draw()

Let me give a more detailed example:

import numpy as np
import matplotlib.pyplot as plt


def draw_result(lst_iter, lst_loss, lst_acc, title):
    plt.plot(lst_iter, lst_loss, '-b', label='loss')
    plt.plot(lst_iter, lst_acc, '-r', label='accuracy')

    plt.xlabel("n iteration")
    plt.legend(loc='upper left')
    plt.title(title)
    plt.savefig(title+".png")  # should be called before plt.show

    plt.show()


def test_draw():
    lst_iter = range(100)
    lst_loss = [0.01 * i + 0.01 * i ** 2 for i in range(100)]
    # lst_loss = np.random.randn(1, 100).reshape((100, ))
    lst_acc = [0.01 * i - 0.01 * i ** 2 for i in range(100)]
    # lst_acc = np.random.randn(1, 100).reshape((100, ))
    draw_result(lst_iter, lst_loss, lst_acc, "sgd_method")


if __name__ == '__main__':
    test_draw()


用多重继承调用父类__init__,正确的方法是什么?

问题:用多重继承调用父类__init__,正确的方法是什么?

假设我有多个继承方案:

class A(object):
    # code for A here

class B(object):
    # code for B here

class C(A, B):
    def __init__(self):
        # What's the right code to write here to ensure 
        # A.__init__ and B.__init__ get called?

编写 C 的 __init__ 有两种典型方法:

  1. (老式) ParentClass.__init__(self)
  2. (较新的样式) super(DerivedClass, self).__init__()

但是,无论哪种情况,如果父类(A 和 B)没有遵循相同的约定,代码都无法正常工作(某些 __init__ 可能被漏掉,或被多次调用)。

那么,到底怎样才是正确的方法呢?说“保持一致,二选一并坚持下去”很容易,但如果 A 或 B 来自第三方库呢?有没有一种方法能确保所有父类构造函数都被调用(以正确的顺序,而且只调用一次)?

编辑:为了说明我的意思,如果我这样写:

class A(object):
    def __init__(self):
        print("Entering A")
        super(A, self).__init__()
        print("Leaving A")

class B(object):
    def __init__(self):
        print("Entering B")
        super(B, self).__init__()
        print("Leaving B")

class C(A, B):
    def __init__(self):
        print("Entering C")
        A.__init__(self)
        B.__init__(self)
        print("Leaving C")

然后我得到:

Entering C
Entering A
Entering B
Leaving B
Leaving A
Entering B
Leaving B
Leaving C

请注意,Binit会被调用两次。如果我做:

class A(object):
    def __init__(self):
        print("Entering A")
        print("Leaving A")

class B(object):
    def __init__(self):
        print("Entering B")
        super(B, self).__init__()
        print("Leaving B")

class C(A, B):
    def __init__(self):
        print("Entering C")
        super(C, self).__init__()
        print("Leaving C")

然后我得到:

Entering C
Entering A
Leaving A
Leaving C

请注意,B 的 init 永远不会被调用。因此,除非我了解/控制我所继承的类(A 和 B)的 init,否则我似乎无法为我正在编写的类(C)做出安全的选择。

Say I have a multiple inheritance scenario:

class A(object):
    # code for A here

class B(object):
    # code for B here

class C(A, B):
    def __init__(self):
        # What's the right code to write here to ensure 
        # A.__init__ and B.__init__ get called?

There’s two typical approaches to writing C‘s __init__:

  1. (old-style) ParentClass.__init__(self)
  2. (newer-style) super(DerivedClass, self).__init__()

However, in either case, if the parent classes (A and B) don’t follow the same convention, then the code will not work correctly (some may be missed, or get called multiple times).

So what’s the correct way again? It’s easy to say “just be consistent, follow one or the other”, but if A or B are from a 3rd party library, what then? Is there an approach that can ensure that all parent class constructors get called (and in the correct order, and only once)?

Edit: to see what I mean, if I do:

class A(object):
    def __init__(self):
        print("Entering A")
        super(A, self).__init__()
        print("Leaving A")

class B(object):
    def __init__(self):
        print("Entering B")
        super(B, self).__init__()
        print("Leaving B")

class C(A, B):
    def __init__(self):
        print("Entering C")
        A.__init__(self)
        B.__init__(self)
        print("Leaving C")

Then I get:

Entering C
Entering A
Entering B
Leaving B
Leaving A
Entering B
Leaving B
Leaving C

Note that B‘s init gets called twice. If I do:

class A(object):
    def __init__(self):
        print("Entering A")
        print("Leaving A")

class B(object):
    def __init__(self):
        print("Entering B")
        super(B, self).__init__()
        print("Leaving B")

class C(A, B):
    def __init__(self):
        print("Entering C")
        super(C, self).__init__()
        print("Leaving C")

Then I get:

Entering C
Entering A
Leaving A
Leaving C

Note that B‘s init never gets called. So it seems that unless I know/control the init’s of the classes I inherit from (A and B) I cannot make a safe choice for the class I’m writing (C).


回答 0

两种方式都可以正常工作。使用 super() 的方式可为子类带来更大的灵活性。

在直接调用的方式中,C.__init__ 可以同时调用 A.__init__ 和 B.__init__。

使用 super() 时,需要按协作式多重继承来设计这些类:C 调用 super,这会执行 A 的代码;A 的代码再调用 super,从而执行 B 的代码。请参阅 http://rhettinger.wordpress.com/2011/05/26/super-considered-super,以详细了解 super 能做什么。

[针对后来编辑补充的问题的回应]

因此,似乎除非我知道/控制我从(A和B)继承的类的初始化,否则我无法对我正在编写的类(C)做出安全的选择。

上面引用的文章展示了如何通过在 A 和 B 外面添加包装器类来处理这种情况。标题为“How to Incorporate a Non-cooperative Class”的部分提供了一个完整的可运行示例。

人们可能希望多重继承更简单,让你轻松地把 Car 和 Airplane 类组合成 FlyingCar,但现实是,单独设计的组件通常需要适配器或包装器,才能像我们希望的那样无缝地组装在一起 :-)

另一个想法:如果您不喜欢用多重继承来组合功能,可以改用组合(composition),从而完全控制在什么场合调用哪些方法。

Both ways work fine. The approach using super() leads to greater flexibility for subclasses.

In the direct call approach, C.__init__ can call both A.__init__ and B.__init__.

When using super(), the classes need to be designed for cooperative multiple inheritance where C calls super, which invokes A‘s code which will also call super which invokes B‘s code. See http://rhettinger.wordpress.com/2011/05/26/super-considered-super for more detail on what can be done with super.

[Response question as later edited]

So it seems that unless I know/control the init’s of the classes I inherit from (A and B) I cannot make a safe choice for the class I’m writing (C).

The referenced article shows how to handle this situation by adding a wrapper class around A and B. There is a worked-out example in the section titled “How to Incorporate a Non-cooperative Class”.

One might wish that multiple inheritance were easier, letting you effortlessly compose Car and Airplane classes to get a FlyingCar, but the reality is that separately designed components often need adapters or wrappers before fitting together as seamlessly as we would like :-)

One other thought: if you’re unhappy with composing functionality using multiple inheritance, you can use composition for complete control over which methods get called on which occasions.


回答 1

您问题的答案取决于一个非常重要的方面:您的基类是否设计用于多重继承?

有3种不同的方案:

  1. 基类是不相关的独立类。

    如果您的基类是能够独立运行的独立实体,并且彼此之间不认识,则它们不是为多重继承设计的。例:

    class Foo:
        def __init__(self):
            self.foo = 'foo'
    
    class Bar:
        def __init__(self, bar):
            self.bar = bar
    

    重要:请注意,Foo 和 Bar 都没有调用 super().__init__()!这正是您的代码无法正常工作的原因。由于菱形继承在 Python 中的工作方式,基类为 object 的类不应调用 super().__init__()。正如您所注意到的,那样做会破坏多重继承,因为您最终调用的是另一个类的 __init__,而不是 object.__init__()。(免责声明:在 object 的直接子类中避免 super().__init__() 是我个人的建议,绝不是 Python 社区达成的共识。有些人更喜欢在每个类中都使用 super,理由是如果某个类的行为不符合预期,您总可以再写一个适配器。)

    这也意味着您永远不应编写直接继承自 object 却没有 __init__ 方法的类。完全不定义 __init__ 方法与调用 super().__init__() 的效果相同。如果您的类直接继承自 object,请确保像下面这样添加一个空的构造函数:

    class Base(object):
        def __init__(self):
            pass
    

    无论如何,在这种情况下,您将必须手动调用每个父构造函数。有两种方法可以做到这一点:

    • 不带 super

      class FooBar(Foo, Bar):
          def __init__(self, bar='bar'):
              Foo.__init__(self)  # explicit calls without super
              Bar.__init__(self, bar)
      
    • super

      class FooBar(Foo, Bar):
          def __init__(self, bar='bar'):
              super().__init__()  # this calls all constructors up to Foo
              super(Foo, self).__init__(bar)  # this calls all constructors after Foo up
                                              # to Bar
      

    这两种方法各有优缺点。如果使用 super,您的类将支持依赖注入;另一方面,它也更容易出错。例如,如果您更改了 Foo 和 Bar 的顺序(像 class FooBar(Bar, Foo) 那样),就必须更新 super 调用来与之匹配。不用 super 则不必担心这一点,而且代码可读性更高。

  2. 其中一个类是 mixin。

    mixin 是专门为与多重继承一起使用而设计的类。这意味着我们不必手动调用两个父构造函数,因为 mixin 会自动为我们调用第二个构造函数。由于这次只需调用一个构造函数,我们可以用 super 来调用,从而避免硬编码父类的名称。

    例:

    class FooMixin:
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)  # forwards all unused arguments
            self.foo = 'foo'
    
    class Bar:
        def __init__(self, bar):
            self.bar = bar
    
    class FooBar(FooMixin, Bar):
        def __init__(self, bar='bar'):
            super().__init__(bar)  # a single call is enough to invoke
                                   # all parent constructors
    
            # NOTE: `FooMixin.__init__(self, bar)` would also work, but isn't
            # recommended because we don't want to hard-code the parent class.
    

    这里的重要细节是:

    • mixin 调用 super().__init__(),并把它接收到的所有参数传递下去。
    • 子类首先从 mixin 继承:class FooBar(FooMixin, Bar)。如果基类的顺序写错,mixin 的构造函数将永远不会被调用。
  3. 所有基类均设计用于协作继承。

    专为协作继承而设计的类非常类似于 mixin:它们把所有未使用的参数传递给下一个类。和之前一样,我们只需调用 super().__init__(),所有父构造函数就会被链式调用。

    例:

    class CoopFoo:
        def __init__(self, **kwargs):
            super().__init__(**kwargs)  # forwards all unused arguments
            self.foo = 'foo'
    
    class CoopBar:
        def __init__(self, bar, **kwargs):
            super().__init__(**kwargs)  # forwards all unused arguments
            self.bar = bar
    
    class CoopFooBar(CoopFoo, CoopBar):
        def __init__(self, bar='bar'):
            super().__init__(bar=bar)  # pass all arguments on as keyword
                                       # arguments to avoid problems with
                                       # positional arguments and the order
                                       # of the parent classes
    

    在这种情况下,父类的顺序无关紧要。即使先从 CoopBar 继承,代码也照样工作。但这只是因为所有参数都作为关键字参数传递;使用位置参数很容易弄错参数顺序,因此协作式类习惯上只接受关键字参数。

    这也是我前面提到的规则的一个例外:CoopFoo 和 CoopBar 都继承自 object,但它们仍然调用 super().__init__()。如果不这样做,就不存在协作继承了。

底线:正确的实现取决于您从其继承的类。

构造函数是类的公共接口的一部分。如果一个类被设计成 mixin 或用于协作继承,就必须在文档中写明。如果文档中只字未提,则可以安全地假定该类不是为协作式多重继承设计的。

The answer to your question depends on one very important aspect: Are your base classes designed for multiple inheritance?

There are 3 different scenarios:

  1. The base classes are unrelated, standalone classes.

    If your base classes are separate entities that are capable of functioning independently and they don’t know each other, they’re not designed for multiple inheritance. Example:

    class Foo:
        def __init__(self):
            self.foo = 'foo'
    
    class Bar:
        def __init__(self, bar):
            self.bar = bar
    

    Important: Notice that neither Foo nor Bar calls super().__init__()! This is why your code didn’t work correctly. Because of the way diamond inheritance works in python, classes whose base class is object should not call super().__init__(). As you’ve noticed, doing so would break multiple inheritance because you end up calling another class’s __init__ rather than object.__init__(). (Disclaimer: Avoiding super().__init__() in object-subclasses is my personal recommendation and by no means an agreed-upon consensus in the python community. Some people prefer to use super in every class, arguing that you can always write an adapter if the class doesn’t behave as you expect.)

    This also means that you should never write a class that inherits from object and doesn’t have an __init__ method. Not defining a __init__ method at all has the same effect as calling super().__init__(). If your class inherits directly from object, make sure to add an empty constructor like so:

    class Base(object):
        def __init__(self):
            pass
    

    Anyway, in this situation, you will have to call each parent constructor manually. There are two ways to do this:

    • Without super

      class FooBar(Foo, Bar):
          def __init__(self, bar='bar'):
              Foo.__init__(self)  # explicit calls without super
              Bar.__init__(self, bar)
      
    • With super

      class FooBar(Foo, Bar):
          def __init__(self, bar='bar'):
              super().__init__()  # this calls all constructors up to Foo
              super(Foo, self).__init__(bar)  # this calls all constructors after Foo up
                                              # to Bar
      

    Each of these two methods has its own advantages and disadvantages. If you use super, your class will support dependency injection. On the other hand, it’s easier to make mistakes. For example if you change the order of Foo and Bar (like class FooBar(Bar, Foo)), you’d have to update the super calls to match. Without super you don’t have to worry about this, and the code is much more readable.

  2. One of the classes is a mixin.

    A mixin is a class that’s designed to be used with multiple inheritance. This means we don’t have to call both parent constructors manually, because the mixin will automatically call the 2nd constructor for us. Since we only have to call a single constructor this time, we can do so with super to avoid having to hard-code the parent class’s name.

    Example:

    class FooMixin:
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)  # forwards all unused arguments
            self.foo = 'foo'
    
    class Bar:
        def __init__(self, bar):
            self.bar = bar
    
    class FooBar(FooMixin, Bar):
        def __init__(self, bar='bar'):
            super().__init__(bar)  # a single call is enough to invoke
                                   # all parent constructors
    
            # NOTE: `FooMixin.__init__(self, bar)` would also work, but isn't
            # recommended because we don't want to hard-code the parent class.
    

    The important details here are:

    • The mixin calls super().__init__() and passes through any arguments it receives.
    • The subclass inherits from the mixin first: class FooBar(FooMixin, Bar). If the order of the base classes is wrong, the mixin’s constructor will never be called.
  3. All base classes are designed for cooperative inheritance.

    Classes designed for cooperative inheritance are a lot like mixins: They pass through all unused arguments to the next class. Like before, we just have to call super().__init__() and all parent constructors will be chain-called.

    Example:

    class CoopFoo:
        def __init__(self, **kwargs):
            super().__init__(**kwargs)  # forwards all unused arguments
            self.foo = 'foo'
    
    class CoopBar:
        def __init__(self, bar, **kwargs):
            super().__init__(**kwargs)  # forwards all unused arguments
            self.bar = bar
    
    class CoopFooBar(CoopFoo, CoopBar):
        def __init__(self, bar='bar'):
            super().__init__(bar=bar)  # pass all arguments on as keyword
                                       # arguments to avoid problems with
                                       # positional arguments and the order
                                       # of the parent classes
    

    In this case, the order of the parent classes doesn’t matter. We might as well inherit from CoopBar first, and the code would still work the same. But that’s only true because all arguments are passed as keyword arguments. Using positional arguments would make it easy to get the order of the arguments wrong, so it’s customary for cooperative classes to accept only keyword arguments.

    This is also an exception to the rule I mentioned earlier: Both CoopFoo and CoopBar inherit from object, but they still call super().__init__(). If they didn’t, there would be no cooperative inheritance.

Bottom line: The correct implementation depends on the classes you’re inheriting from.

The constructor is part of a class’s public interface. If the class is designed as a mixin or for cooperative inheritance, that must be documented. If the docs don’t mention anything of the sort, it’s safe to assume that the class isn’t designed for cooperative multiple inheritance.


回答 2

如果你能控制 A 和 B 的源代码,那么两种方式(“新式”或“旧式”)都行得通。否则,可能需要使用适配器类。

可访问的源代码:正确使用“新样式”

class A(object):
    def __init__(self):
        print("-> A")
        super(A, self).__init__()
        print("<- A")

class B(object):
    def __init__(self):
        print("-> B")
        super(B, self).__init__()
        print("<- B")

class C(A, B):
    def __init__(self):
        print("-> C")
        # Use super here, instead of explicit calls to __init__
        super(C, self).__init__()
        print("<- C")
>>> C()
-> C
-> A
-> B
<- B
<- A
<- C

在此,方法解析顺序(MRO)规定如下:

  • C(A, B) 规定先 A 后 B。MRO 是 C -> A -> B -> object。
  • super(A, self).__init__() 沿着始于 C.__init__ 的 MRO 链继续,执行 B.__init__。
  • super(B, self).__init__() 沿着始于 C.__init__ 的 MRO 链继续,执行 object.__init__。

可以说,这种情况就是为多重继承而设计的。

可访问的源代码:正确使用“旧样式”

class A(object):
    def __init__(self):
        print("-> A")
        print("<- A")

class B(object):
    def __init__(self):
        print("-> B")
        # Don't use super here.
        print("<- B")

class C(A, B):
    def __init__(self):
        print("-> C")
        A.__init__(self)
        B.__init__(self)
        print("<- C")
>>> C()
-> C
-> A
<- A
-> B
<- B
<- C

在此,MRO无关紧要,因为A.__init__B.__init__被显式调用。class C(B, A):也会一样工作。

尽管这种情况不像前一种那样按“新式”写法为多重继承而设计,多重继承仍然是可行的。


现在,如果AB是从第三方库-即你有过的源代码没有控制AB?简短的答案:您必须设计一个实现必要super调用的适配器类,然后使用一个空类来定义MRO(请参阅Raymond Hettinger上的文章super -尤其是“如何合并非合作类”一节)。

第三方父类:A 没有实现 super;B 实现了

class A(object):
    def __init__(self):
        print("-> A")
        print("<- A")

class B(object):
    def __init__(self):
        print("-> B")
        super(B, self).__init__()
        print("<- B")

class Adapter(object):
    def __init__(self):
        print("-> C")
        A.__init__(self)
        super(Adapter, self).__init__()
        print("<- C")

class C(Adapter, B):
    pass
>>> C()
-> C
-> A
<- A
-> B
<- B
<- C

Adapter实现super是为了C定义MRO,该MRO在super(Adapter, self).__init__()执行时起作用。

如果反过来呢?

第三方父类:A 实现了 super;B 没有

class A(object):
    def __init__(self):
        print("-> A")
        super(A, self).__init__()
        print("<- A")

class B(object):
    def __init__(self):
        print("-> B")
        print("<- B")

class Adapter(object):
    def __init__(self):
        print("-> C")
        super(Adapter, self).__init__()
        B.__init__(self)
        print("<- C")

class C(Adapter, A):
    pass
>>> C()
-> C
-> A
<- A
-> B
<- B
<- C

此处的模式相同,只是 Adapter.__init__ 中的执行顺序对调了:先调用 super,再进行显式调用。请注意,涉及第三方父类的每种情况都需要各自独立的适配器类。

因此,除非我了解/控制我所继承的类(A 和 B)的 init,否则我似乎无法为我正在编写的类(C)做出安全的选择。

虽然你可以用适配器类来处理无法控制 A 和 B 源代码的情况,但确实:要做到这一点,你必须知道父类的 init 是如何实现 super 的(如果实现了的话)。

Either approach (“new style” or “old style”) will work if you have control over the source code for A and B. Otherwise, use of an adapter class might be necessary.

Source code accessible: Correct use of “new style”

class A(object):
    def __init__(self):
        print("-> A")
        super(A, self).__init__()
        print("<- A")

class B(object):
    def __init__(self):
        print("-> B")
        super(B, self).__init__()
        print("<- B")

class C(A, B):
    def __init__(self):
        print("-> C")
        # Use super here, instead of explicit calls to __init__
        super(C, self).__init__()
        print("<- C")
>>> C()
-> C
-> A
-> B
<- B
<- A
<- C

Here, method resolution order (MRO) dictates the following:

  • C(A, B) dictates A first, then B. MRO is C -> A -> B -> object.
  • super(A, self).__init__() continues along the MRO chain initiated in C.__init__ to B.__init__.
  • super(B, self).__init__() continues along the MRO chain initiated in C.__init__ to object.__init__.

You could say that this case is designed for multiple inheritance.

Source code accessible: Correct use of “old style”

class A(object):
    def __init__(self):
        print("-> A")
        print("<- A")

class B(object):
    def __init__(self):
        print("-> B")
        # Don't use super here.
        print("<- B")

class C(A, B):
    def __init__(self):
        print("-> C")
        A.__init__(self)
        B.__init__(self)
        print("<- C")
>>> C()
-> C
-> A
<- A
-> B
<- B
<- C

Here, MRO does not matter, since A.__init__ and B.__init__ are called explicitly. class C(B, A): would work just as well.

Although this case is not “designed” for multiple inheritance in the new style as the previous one was, multiple inheritance is still possible.


Now, what if A and B are from a third party library – i.e., you have no control over the source code for A and B? The short answer: You must design an adapter class that implements the necessary super calls, then use an empty class to define the MRO (see Raymond Hettinger’s article on super – especially the section, “How to Incorporate a Non-cooperative Class”).

Third-party parents: A does not implement super; B does

class A(object):
    def __init__(self):
        print("-> A")
        print("<- A")

class B(object):
    def __init__(self):
        print("-> B")
        super(B, self).__init__()
        print("<- B")

class Adapter(object):
    def __init__(self):
        print("-> C")
        A.__init__(self)
        super(Adapter, self).__init__()
        print("<- C")

class C(Adapter, B):
    pass
>>> C()
-> C
-> A
<- A
-> B
<- B
<- C

Class Adapter implements super so that C can define the MRO, which comes into play when super(Adapter, self).__init__() is executed.

And what if it’s the other way around?

Third-party parents: A implements super; B does not

class A(object):
    def __init__(self):
        print("-> A")
        super(A, self).__init__()
        print("<- A")

class B(object):
    def __init__(self):
        print("-> B")
        print("<- B")

class Adapter(object):
    def __init__(self):
        print("-> C")
        super(Adapter, self).__init__()
        B.__init__(self)
        print("<- C")

class C(Adapter, A):
    pass
>>> C()
-> C
-> A
<- A
-> B
<- B
<- C

Same pattern here, except the order of execution is switched in Adapter.__init__; super call first, then explicit call. Notice that each case with third-party parents requires a unique adapter class.

So it seems that unless I know/control the init’s of the classes I inherit from (A and B) I cannot make a safe choice for the class I’m writing (C).

Although you can handle the cases where you don’t control the source code of A and B by using an adapter class, it is true that you must know how the init’s of the parent classes implement super (if at all) in order to do so.


回答 3

正如雷蒙德(Raymond)在回答中所说的那样,直接调用A.__init__B.__init__可以正常工作,并且您的代码易于阅读。

但是,这样做没有利用 C 与这些类之间的继承关联。利用这种关联可以带来更好的一致性,并使日后的重构更容易、更不易出错。示例如下:

class C(A, B):
    def __init__(self):
        print("entering c")
        for base_class in C.__bases__:  # (A, B)
             base_class.__init__(self)
        print("leaving c")

As Raymond said in his answer, a direct call to A.__init__ and B.__init__ works fine, and your code would be readable.

However, it does not use the inheritance link between C and those classes. Exploiting that link gives you more consistency and makes eventual refactorings easier and less error-prone. An example of how to do that:

class C(A, B):
    def __init__(self):
        print("entering c")
        for base_class in C.__bases__:  # (A, B)
             base_class.__init__(self)
        print("leaving c")

回答 4

本文有助于解释协作式多重继承:

http://www.artima.com/weblogs/viewpost.jsp?thread=281127

它提到了有用的 mro() 方法,可以显示方法解析顺序。在你的第二个例子中,当你在 A 中调用 super 时,super 调用会沿着 MRO 继续;顺序中的下一个类是 B,这就是 B 的 init 第一次被调用的原因。

这是来自 Python 官方站点的一篇更偏技术性的文章:

http://www.python.org/download/releases/2.3/mro/

This article helps to explain cooperative multiple inheritance:

http://www.artima.com/weblogs/viewpost.jsp?thread=281127

It mentions the useful method mro() that shows you the method resolution order. In your 2nd example, where you call super in A, the super call continues on in MRO. The next class in the order is B, this is why B‘s init is called the first time.

Here’s a more technical article from the official python site:

http://www.python.org/download/releases/2.3/mro/
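To see the MRO for the question's hierarchy (my addition), mro() can be called on the class itself:

class A(object): pass
class B(object): pass
class C(A, B): pass

print(C.mro())  # [C, A, B, object] (module prefixes omitted)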


回答 5

如果你要同时继承来自第三方库的多个类,那么答案是否定的:不存在一种无需了解基类如何编写、就能正确调用基类 __init__ 方法(或任何其他方法)的万能做法。

super 使我们可以编写旨在协作实现方法的类,让它们成为复杂多重继承树的一部分,而类的作者无需预先了解这棵树。但无法用它从“可能用、也可能不用 super”的任意类中正确继承。

本质上,一个类应该用 super 还是用直接调用基类的方式来被子类化,是该类“公共接口”的一部分,应当照此写入文档。如果你按照库作者预期的方式使用第三方库,并且库的文档合理,文档通常会告诉你子类化特定类时需要做什么。如果没有,你就得查看所要子类化的类的源代码,看它们的基类调用约定是什么。如果你以库作者没有预料到的方式组合一个或多个第三方库中的多个类,那么可能根本无法始终如一地调用超类方法;如果类 A 属于一个使用 super 的层次结构,而类 B 属于一个不使用 super 的层次结构,那么两种选择都无法保证奏效。你必须针对每个具体情况找出恰好行得通的策略。

If you are multiply sub-classing classes from third party libraries, then no, there is no blind approach to calling the base class __init__ methods (or any other methods) that actually works regardless of how the base classes are programmed.

super makes it possible to write classes designed to cooperatively implement methods as part of complex multiple inheritance trees which need not be known to the class author. But there’s no way to use it to correctly inherit from arbitrary classes that may or may not use super.

Essentially, whether a class is designed to be sub-classed using super or with direct calls to the base class is a property which is part of the class’ “public interface”, and it should be documented as such. If you’re using third-party libraries in the way that the library author expected and the library has reasonable documentation, it would normally tell you what you are required to do to subclass particular things. If not, then you’ll have to look at the source code for the classes you’re sub-classing and see what their base-class-invocation convention is. If you’re combining multiple classes from one or more third-party libraries in a way that the library authors didn’t expect, then it may not be possible to consistently invoke super-class methods at all; if class A is part of a hierarchy using super and class B is part of a hierarchy that doesn’t use super, then neither option is guaranteed to work. You’ll have to figure out a strategy that happens to work for each particular case.