Python 实用宝典

Question 1

Note: This question is for informational purposes only. I am interested to see how deep into Python’s internals it is possible to go with this.

Not very long ago, a discussion began inside a certain question regarding whether the strings passed to print statements could be modified during or after the call to print. For example, consider the function:

def print_something():
    print('This cat was scared.')

Now, when print_something is run, the terminal should display:

This dog was scared.

Notice the word “cat” has been replaced by the word “dog”. Something somewhere somehow was able to modify those internal buffers to change what was printed. Assume this is done without the original code author’s explicit permission (hence, hacking/hijacking).

This comment from the wise @abarnert, in particular, got me thinking:

There are a couple of ways to do that, but they’re all very ugly, and should never be done. The least ugly way is to probably replace the code object inside the function with one with a different co_consts list. Next is probably reaching into the C API to access the str’s internal buffer. […]

So, it looks like this is actually possible.

Here’s my naive way of approaching this problem:

>>> import inspect
>>> exec(inspect.getsource(print_something).replace('cat', 'dog'))
>>> print_something()
This dog was scared.

Of course, exec is bad, but that doesn’t really answer the question, because it does not actually modify anything during or after the call to print.

How would it be done as @abarnert has explained it?

Question 2

First, there’s actually a much less hacky way. All we want to do is change what print prints, right?

_print = print
def print(*args, **kw):
    args = (arg.replace('cat', 'dog') if isinstance(arg, str) else arg
            for arg in args)
    _print(*args, **kw)

Or, similarly, you can monkeypatch sys.stdout instead of print.

Also, nothing wrong with the exec … getsource … idea. Well, of course there’s plenty wrong with it, but less than what follows here…

But if you do want to modify the function object’s code constants, we can do that.

If you really want to play around with code objects for real, you should use a library like bytecode (when it’s finished) or byteplay (until then, or for older Python versions) instead of doing it manually. Even for something this trivial, the CodeType initializer is a pain; if you actually need to do stuff like fixing up lnotab, only a lunatic would do that manually.

Also, it goes without saying that not all Python implementations use CPython-style code objects. This code will work in CPython 3.7, and probably all versions back to at least 2.2 with a few minor changes (and not the code-hacking stuff, but things like generator expressions), but it won’t work with any version of IronPython.

import types

def print_function():
    print ("This cat was scared.")

def main():
    # A function object is a wrapper around a code object, with
    # a bit of extra stuff like default values and closure cells.
    # See inspect module docs for more details.
    co = print_function.__code__
    # A code object is a wrapper around a string of bytecode, with a
    # whole bunch of extra stuff, including a list of constants used
    # by that bytecode. Again see inspect module docs. Anyway, inside
    # the bytecode for print_function (which you can read by typing
    # dis.dis(print_function) in your REPL), there's going to be an
    # instruction like LOAD_CONST 1 to load the string literal onto
    # the stack to pass to the print function, and that works by just
    # reading co.co_consts[1]. So, that's what we want to change.
    consts = tuple(c.replace("cat", "dog") if isinstance(c, str) else c
                   for c in co.co_consts)
    # Unfortunately, code objects are immutable, so we have to create
    # a new one, copying over everything except for co_consts, which
    # we'll replace. And the initializer has a zillion parameters.
    # Try help(types.CodeType) at the REPL to see the whole list.
    co = types.CodeType(
        co.co_argcount, co.co_kwonlyargcount, co.co_nlocals,
        co.co_stacksize, co.co_flags, co.co_code,
        consts, co.co_names, co.co_varnames, co.co_filename,
        co.co_name, co.co_firstlineno, co.co_lnotab,
        co.co_freevars, co.co_cellvars)
    print_function.__code__ = co
    print_function()

main()
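
On CPython 3.8 and later, the positional CodeType call above no longer matches the constructor signature (a co_posonlyargcount parameter was added in 3.8, and later releases changed the signature further). Code objects also gained a replace() method in 3.8, which returns a copy with selected fields swapped. A minimal sketch of the same constant patch using it follows; the helper name patch_consts is made up for this illustration.

```python
def print_function():
    print("This cat was scared.")

def patch_consts(func, old, new):
    """Rebind func.__code__ to a copy whose string constants have old -> new."""
    co = func.__code__
    consts = tuple(c.replace(old, new) if isinstance(c, str) else c
                   for c in co.co_consts)
    # CodeType.replace() exists on CPython 3.8+; it returns a new code
    # object and leaves the original one untouched.
    func.__code__ = co.replace(co_consts=consts)

patch_consts(print_function, "cat", "dog")
print_function()  # prints: This dog was scared.
```

This still swaps the whole code object rather than mutating anything in place, so it is the same "least ugly" approach, just without the long constructor call.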

What could go wrong with hacking up code objects? Mostly just segfaults, RuntimeErrors that eat up the whole stack, more normal RuntimeErrors that can be handled, or garbage values that will probably just raise a TypeError or AttributeError when you try to use them. For examples, try creating a code object with just a RETURN_VALUE with nothing on the stack (bytecode b'S\0' for 3.6+, b'S' before), or with an empty tuple for co_consts when there’s a LOAD_CONST 0 in the bytecode, or with varnames decremented by 1 so the highest LOAD_FAST actually loads a freevar/cellvar cell. For some real fun, if you get the lnotab wrong enough, your code will only segfault when run in the debugger.

Using bytecode or byteplay won’t protect you from all of those problems, but they do have some basic sanity checks, and nice helpers that let you do things like insert a chunk of code and let it worry about updating all offsets and labels so you can’t get it wrong, and so on. (Plus, they keep you from having to type in that ridiculous 6-line constructor, and having to debug the silly typos that come from doing so.)

Now on to #2.

I mentioned that code objects are immutable. And of course the consts are a tuple, so we can’t change that directly. And the thing in the const tuple is a string, which we also can’t change directly. That’s why I had to build a new string to build a new tuple to build a new code object.

But what if you could change a string directly?

Well, deep enough under the covers, everything is just a pointer to some C data, right? If you’re using CPython, there’s a C API to access the objects, and you can use ctypes to access that API from within Python itself, which is such a terrible idea that they put a pythonapi right there in the stdlib’s ctypes module. :) The most important trick you need to know is that id(x) is the actual pointer to x in memory (as an int).
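
To make the id(x) trick concrete, here is a small sketch that turns the integer back into the object it came from. It relies on CPython behaviour only (other implementations do not promise that id() is an address), and it is safe here only because the object is still alive.

```python
import ctypes

message = "This cat was scared."

# CPython only: id() returns the object's address, so casting that
# integer back to a py_object yields the very same object.
address = id(message)
recovered = ctypes.cast(address, ctypes.py_object).value

print(recovered is message)  # prints: True
```

Doing the same with the address of an object that has already been freed is one of the quicker ways to get the segfaults mentioned above.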

Unfortunately, the C API for strings won’t let us safely get at the internal storage of an already-frozen string. So screw safely, let’s just read the header files and find that storage ourselves.

If you’re using CPython 3.4 – 3.7 (it’s different for older versions, and who knows for the future), a string literal from a module that’s made of pure ASCII is going to be stored using the compact ASCII format, which means the struct ends early and the buffer of ASCII bytes follows immediately in memory. This will break (as in probably segfault) if you put a non-ASCII character in the string, or certain kinds of non-literal strings, but you can read up on the other 4 ways to access the buffer for different kinds of strings.

To make things slightly easier, I’m using the superhackyinternals project off my GitHub. (It’s intentionally not pip-installable because you really shouldn’t be using this except to experiment with your local build of the interpreter and the like.)

import ctypes
import internals # https://github.com/abarnert/superhackyinternals/blob/master/internals.py

def print_function():
    print ("This cat was scared.")

def main():
    for c in print_function.__code__.co_consts:
        if isinstance(c, str):
            idx = c.find('cat')
            if idx != -1:
                # Too much to explain here; just guess and learn to
                # love the segfaults...
                p = internals.PyUnicodeObject.from_address(id(c))
                assert p.compact and p.ascii
                addr = id(c) + internals.PyUnicodeObject.utf8_length.offset
                buf = (ctypes.c_int8 * 3).from_address(addr + idx)
                buf[:3] = b'dog'

    print_function()

main()

If you want to play with this stuff, int is a whole lot simpler under the covers than str. And it’s a lot easier to guess what you can break by changing the value of 2 to 1, right? Actually, forget imagining, let’s just do it (using the types from superhackyinternals again):

>>> n = 2
>>> pn = PyLongObject.from_address(id(n))
>>> pn.ob_digit[0]
2
>>> pn.ob_digit[0] = 1
>>> 2
1
>>> n * 3
3
>>> i = 10
>>> while i < 40:
...     i *= 2
...     print(i)
10
10
10

… pretend that code box has an infinite-length scrollbar.

I tried the same thing in IPython, and the first time I tried to evaluate 2 at the prompt, it went into some kind of uninterruptible infinite loop. Presumably it’s using the number 2 for something in its REPL loop, while the stock interpreter isn’t?

Question 3

Monkey-patch `print`

print is a builtin function so it will use the print function defined in the builtins module (or __builtin__ in Python 2). So whenever you want to modify or change the behavior of a builtin function you can simply reassign the name in that module.

This process is called monkey-patching.

# Store the real print function in another variable otherwise
# it will be inaccessible after being modified.
_print = print  

# Actual implementation of the new print
def custom_print(*args, **options):
    _print('custom print called')
    _print(*args, **options)

# Change the print function globally
import builtins
builtins.print = custom_print

After that every print call will go through custom_print, even if the print is in an external module.

However you don’t really want to print additional text, you want to change the text that is printed. One way to go about that is to replace it in the string that would be printed:

_print = print  

def custom_print(*args, **options):
    # Get the desired separator or the default whitespace
    sep = options.pop('sep', ' ')
    # Create the final string (convert non-string arguments first)
    printed_string = sep.join(str(arg) for arg in args)
    # Modify the final string
    printed_string = printed_string.replace('cat', 'dog')
    # Call the default print function
    _print(printed_string, **options)

import builtins
builtins.print = custom_print

And indeed if you run:

>>> def print_something():
...     print('This cat was scared.')
>>> print_something()
This dog was scared.

Or if you write that to a file:

test_file.py

def print_something():
    print('This cat was scared.')

print_something()

and import it:

>>> import test_file
This dog was scared.
>>> test_file.print_something()
This dog was scared.

So it really works as intended.

However, in case you only temporarily want to monkey-patch print you could wrap this in a context-manager:

import builtins

class ChangePrint(object):
    def __init__(self):
        self.old_print = print

    def __enter__(self):
        def custom_print(*args, **options):
            # Get the desired separator or the default whitespace
            sep = options.pop('sep', ' ')
            # Create the final string (convert non-string arguments first)
            printed_string = sep.join(str(arg) for arg in args)
            # Modify the final string
            printed_string = printed_string.replace('cat', 'dog')
            # Call the default print function
            self.old_print(printed_string, **options)

        builtins.print = custom_print

    def __exit__(self, *args, **kwargs):
        builtins.print = self.old_print

So when you run that it depends on the context what is printed:

>>> with ChangePrint() as x:
...     test_file.print_something()
... 
This dog was scared.
>>> test_file.print_something()
This cat was scared.

So that’s how you could “hack” print by monkey-patching.

Modify the target instead of the `print`

If you look at the signature of print you’ll notice a file argument, which is sys.stdout by default. Note that this is a dynamic default argument (print really looks up sys.stdout every time it is called), unlike normal default arguments in Python. So if you change sys.stdout, print will write to the new target. Even more conveniently, Python provides a redirect_stdout function (from Python 3.4 on, but it’s easy to create an equivalent function for earlier Python versions).

The downside is that it won’t work for print statements that don’t print to sys.stdout and that creating your own stdout isn’t really straightforward.

import io
import sys

class CustomStdout(object):
    def __init__(self, *args, **kwargs):
        self.current_stdout = sys.stdout

    def write(self, string):
        self.current_stdout.write(string.replace('cat', 'dog'))

However this also works:

>>> import contextlib
>>> with contextlib.redirect_stdout(CustomStdout()):
...     test_file.print_something()
... 
This dog was scared.
>>> test_file.print_something()
This cat was scared.
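
The remark above that an equivalent of redirect_stdout is easy to write for earlier Python versions can be sketched roughly as follows. The name redirect_stdout_compat is made up for this illustration, and the sketch ignores details such as reentrancy that the standard-library version handles.

```python
import io
import sys
from contextlib import contextmanager

@contextmanager
def redirect_stdout_compat(target):
    """Temporarily point sys.stdout at target (hypothetical helper name)."""
    previous = sys.stdout
    sys.stdout = target
    try:
        yield target
    finally:
        # Restore the previous stream even if the body raised.
        sys.stdout = previous

buffer = io.StringIO()
with redirect_stdout_compat(buffer):
    print('This cat was scared.')

captured = buffer.getvalue().replace('cat', 'dog')
print(captured, end='')  # prints: This dog was scared.
```

This works for the same reason the CustomStdout class does: print looks up sys.stdout at call time, so swapping the attribute is enough.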

Summary

Some of these points have already been mentioned by @abarnert, but I wanted to explore these options in more detail, especially how to modify print across modules (using builtins/__builtin__) and how to make that change only temporary (using context managers).

Question 4

A simple way to capture all output from a print function and then process it is to change the output stream to something else, e.g. a file.

I’ll use PHP naming conventions (ob_start, ob_get_contents, …).

from functools import partial
output_buffer = None
print_orig = print
def ob_start(fname="print.txt"):
    global print
    global output_buffer
    # Open the file first so that partial() binds a real file object
    output_buffer = open(fname, 'w')
    print = partial(print_orig, file=output_buffer)
def ob_end():
    global print
    global output_buffer
    # Closing the file flushes everything that was captured
    output_buffer.close()
    print = print_orig
def ob_get_contents(fname="print.txt"):
    with open(fname, 'r') as f:
        return f.read()

Usage:

print ("Hi John")
ob_start()
print ("Hi John")
ob_end()
print (ob_get_contents().replace("Hi", "Bye"))

Would print

Hi John
Bye John

Question 5

Let’s combine this with frame introspection!

import sys

_print = print

def print(*args, **kw):
    frame = sys._getframe(1)
    _print(frame.f_code.co_name)
    _print(*args, **kw)

def greetly(name, greeting = "Hi"):
    print(f"{greeting}, {name}!")

class Greeter:
    def __init__(self, greeting = "Hi"):
        self.greeting = greeting
    def greet(self, name):
        print(f"{self.greeting}, {name}!")

You’ll find this trick prefaces every greeting with the calling function or method. This might be very useful for logging or debugging; especially as it lets you “hijack” print statements in third party code.
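
A short usage sketch of the trick described above, with the definitions repeated so the block runs on its own. The only deliberate change is that the original print is fetched through the builtins module, which is an assumption made here to keep the example working wherever it is pasted.

```python
import builtins
import sys

_print = builtins.print

def print(*args, **kw):
    # Frame 0 is this wrapper; frame 1 is whoever called print().
    frame = sys._getframe(1)
    _print(frame.f_code.co_name)
    _print(*args, **kw)

def greetly(name, greeting="Hi"):
    print(f"{greeting}, {name}!")

class Greeter:
    def __init__(self, greeting="Hi"):
        self.greeting = greeting
    def greet(self, name):
        print(f"{self.greeting}, {name}!")

greetly("World")
# greetly
# Hi, World!
Greeter("Hello").greet("World")
# greet
# Hello, World!
```

Note that sys._getframe is a CPython implementation detail, so this is a debugging aid rather than something to rely on in portable code.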

Question 6

I was looking at the source of sorted_containers and was surprised to see this line:

self._load, self._twice, self._half = load, load * 2, load >> 1

Here load is an integer. Why use a bit shift in one place and multiplication in another? It seems reasonable that bit shifting may be faster than integer division by 2, but why not replace the multiplication by a shift as well? I benchmarked the following cases:

1. (times, divide)
2. (shift, shift)
3. (times, shift)
4. (shift, divide)

and found that #3 is consistently faster than other alternatives:

# self._load, self._twice, self._half = load, load * 2, load >> 1

import random
import timeit
import pandas as pd

x = random.randint(10 ** 3, 10 ** 6)

def test_naive():
    a, b, c = x, 2 * x, x // 2

def test_shift():
    a, b, c = x, x << 1, x >> 1    

def test_mixed():
    a, b, c = x, x * 2, x >> 1    

def test_mixed_swapped():
    a, b, c = x, x << 1, x // 2

def observe(k):
    print(k)
    return {
        'naive': timeit.timeit(test_naive),
        'shift': timeit.timeit(test_shift),
        'mixed': timeit.timeit(test_mixed),
        'mixed_swapped': timeit.timeit(test_mixed_swapped),
    }

def get_observations():
    return pd.DataFrame([observe(k) for k in range(100)])

The question:

Is my test valid? If so, why is (multiply, shift) faster than (shift, shift)?

I run Python 3.5 on Ubuntu 14.04.

Edit

Above is the original statement of the question. Dan Getz provides an excellent explanation in his answer.

For the sake of completeness, here are sample illustrations for larger x when multiplication optimizations do not apply.

Question 7

This seems to be because multiplication of small numbers is optimized in CPython 3.5, in a way that left shifts by small numbers are not. Positive left shifts always create a larger integer object to store the result, as part of the calculation, while for multiplications of the sort you used in your test, a special optimization avoids this and creates an integer object of the correct size. This can be seen in the source code of Python’s integer implementation.

Because integers in Python are arbitrary-precision, they are stored as arrays of integer “digits”, with a limit on the number of bits per integer digit. So in the general case, operations involving integers are not single operations, but instead need to handle the case of multiple “digits”. In pyport.h, this bit limit is defined as 30 bits on a 64-bit platform, or 15 bits otherwise. (I’ll just call this 30 from here on to keep the explanation simple. But note that if you were using Python compiled for 32-bit, your benchmark’s result would depend on whether x was less than 32,768 or not.)
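
The digit width of the running interpreter can be checked from Python itself through sys.int_info. The sketch below also estimates how many internal digits a value needs; digit_count is a made-up helper that derives the count from bit_length rather than reading the object's real size field.

```python
import sys

bits = sys.int_info.bits_per_digit  # 30 on typical 64-bit builds, else 15

def digit_count(n):
    """Number of internal digits CPython needs for a nonzero int n."""
    # Ceiling division of the bit length by the digit width.
    return max(1, -(-abs(n).bit_length() // bits))

x = 10 ** 6
print(bits, digit_count(x), digit_count(x << bits))
```

On a 30-bit build the benchmark's x (at most 10 ** 6) fits in one digit, which is the single-digit case the explanation here is concerned with.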

When an operation’s inputs and outputs stay within this 30-bit limit, the operation can be handled in an optimized way instead of the general way. The beginning of the integer multiplication implementation is as follows:

static PyObject *
long_mul(PyLongObject *a, PyLongObject *b)
{
    PyLongObject *z;

    CHECK_BINOP(a, b);

    /* fast path for single-digit multiplication */
    if (Py_ABS(Py_SIZE(a)) <= 1 && Py_ABS(Py_SIZE(b)) <= 1) {
        stwodigits v = (stwodigits)(MEDIUM_VALUE(a)) * MEDIUM_VALUE(b);
#ifdef HAVE_LONG_LONG
        return PyLong_FromLongLong((PY_LONG_LONG)v);
#else
        /* if we don't have long long then we're almost certainly
           using 15-bit digits, so v will fit in a long.  In the
           unlikely event that we're using 30-bit digits on a platform
           without long long, a large v will just cause us to fall
           through to the general multiplication code below. */
        if (v >= LONG_MIN && v <= LONG_MAX)
            return PyLong_FromLong((long)v);
#endif
    }

So when multiplying two integers where each fits in a 30-bit digit, this is done as a direct multiplication by the CPython interpreter, instead of working with the integers as arrays. (MEDIUM_VALUE() called on a positive integer object simply gets its first 30-bit digit.) If the result fits in a single 30-bit digit, PyLong_FromLongLong() will notice this in a relatively small number of operations, and create a single-digit integer object to store it.

In contrast, left shifts are not optimized this way, and every left shift deals with the integer being shifted as an array. In particular, if you look at the source code for long_lshift(), in the case of a small but positive left shift, a 2-digit integer object is always created, if only to have its length truncated to 1 later: (my comments in /*** ***/)

static PyObject *
long_lshift(PyObject *v, PyObject *w)
{
    /*** ... ***/

    wordshift = shiftby / PyLong_SHIFT;   /*** zero for small w ***/
    remshift  = shiftby - wordshift * PyLong_SHIFT;   /*** w for small w ***/

    oldsize = Py_ABS(Py_SIZE(a));   /*** 1 for small v > 0 ***/
    newsize = oldsize + wordshift;
    if (remshift)
        ++newsize;   /*** here newsize becomes at least 2 for w > 0, v > 0 ***/
    z = _PyLong_New(newsize);

    /*** ... ***/
}

Integer division

You didn’t ask about the worse performance of integer floor division compared to right shifts, because that fit your (and my) expectations. But dividing a small positive number by another small positive number is not as optimized as small multiplications, either. Every // computes both the quotient and the remainder using the function long_divrem(). This remainder is computed for a small divisor with a multiplication, and is stored in a newly-allocated integer object, which in this situation is immediately discarded.

Question 8

How does tf.app.run() work in Tensorflow translate demo?

In tensorflow/models/rnn/translate/translate.py, there is a call to tf.app.run(). How is it being handled?

if __name__ == "__main__":
    tf.app.run()

Question 9

if __name__ == "__main__":

means the current file is being run as a script rather than imported as a module.

tf.app.run()

As you can see through the file app.py

def run(main=None, argv=None):
  """Runs the program with an optional 'main' function and 'argv' list."""
  f = flags.FLAGS

  # Extract the args from the optional `argv` list.
  args = argv[1:] if argv else None

  # Parse the known flags from that list, or from the command
  # line otherwise.
  # pylint: disable=protected-access
  flags_passthrough = f._parse_flags(args=args)
  # pylint: enable=protected-access

  main = main or sys.modules['__main__'].main

  # Call the main function, passing through any arguments
  # to the final program.
  sys.exit(main(sys.argv[:1] + flags_passthrough))

Let’s break it down line by line:

flags_passthrough = f._parse_flags(args=args)

This ensures that the arguments you pass on the command line are valid, e.g. python my_model.py --data_dir='...' --max_iteration=10000. This feature is implemented on top of Python’s standard argparse module.

main = main or sys.modules['__main__'].main

The first main on the right-hand side of = is the first argument of the current function run(main=None, argv=None), while sys.modules['__main__'] refers to the file currently being run (e.g. my_model.py).

So there are two cases:

1. You don’t have a main function in my_model.py. Then you have to call tf.app.run(my_main_running_function).
2. You have a main function in my_model.py. (This is mostly the case.)

Last line:

sys.exit(main(sys.argv[:1] + flags_passthrough))

ensures your main(argv) or my_main_running_function(argv) function is called with parsed arguments properly.
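
The flow described above (parse the known flags, then hand the remaining arguments to main) can be imitated without TensorFlow. The sketch below is a simplified stand-in built on argparse. The names run_sketch and my_main are invented for illustration, and unlike the real run() it returns its results instead of calling sys.exit().

```python
import argparse
import sys

def run_sketch(main, argv=None, parser=None):
    """Simplified stand-in for the run() shown above (not TensorFlow code)."""
    argv = sys.argv if argv is None else argv
    parser = parser or argparse.ArgumentParser()
    # Known flags are parsed; anything unrecognised is passed through.
    flags, passthrough = parser.parse_known_args(argv[1:])
    # The real function wraps this call in sys.exit(); returning the
    # values keeps the sketch easy to inspect.
    return flags, main(argv[:1] + passthrough)

def my_main(argv):
    return argv

parser = argparse.ArgumentParser()
parser.add_argument("--max_iteration", type=int, default=1)
flags, seen = run_sketch(
    my_main, ["my_model.py", "--max_iteration=10", "extra"], parser)
print(flags.max_iteration, seen)  # prints: 10 ['my_model.py', 'extra']
```

The main function only ever sees the program name plus whatever the flag parser did not consume, which matches the sys.argv[:1] + flags_passthrough expression in the original.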

Question 10

It’s just a very quick wrapper that handles flag parsing and then dispatches to your own main. See the code.

Question 11

There is nothing special in tf.app. This is just a generic entry point script, which

Runs the program with an optional ‘main’ function and ‘argv’ list.

It has nothing to do with neural networks and it just calls the main function, passing through any arguments to it.

Question 12

In simple terms, the job of tf.app.run() is to first set the global flags for later usage like:

from tensorflow.python.platform import flags
f = flags.FLAGS

and then run your custom main function with a set of arguments.

For example, in the TensorFlow NMT codebase, the entry point for training/inference starts here (see the code below):

if __name__ == "__main__":
  nmt_parser = argparse.ArgumentParser()
  add_arguments(nmt_parser)
  FLAGS, unparsed = nmt_parser.parse_known_args()
  tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)

After parsing the arguments using argparse, with tf.app.run() you run the function “main” which is defined like:

def main(unused_argv):
  default_hparams = create_hparams(FLAGS)
  train_fn = train.train
  inference_fn = inference.inference
  run_main(FLAGS, default_hparams, train_fn, inference_fn)

So, after setting the flags for global use, tf.app.run() simply runs that main function that you pass to it with argv as its parameters.

P.S.: As Salvador Dali’s answer says, it’s just good software engineering practice, I guess, although I’m not sure whether TensorFlow runs the main function in any more optimized way than a normal CPython call would.

Question 13

Google code depends a lot on global flags being accessed in libraries, binaries, and Python scripts, so tf.app.run() parses those flags to create global state in a FLAGS (or similar) variable and then calls the Python main() as it should.

If they didn’t have this call to tf.app.run(), users might forget to do the FLAGS parsing, leaving these libraries, binaries, and scripts without access to the FLAGS they need.

Question 14

2.0 Compatible Answer: If you want to use tf.app.run() in TensorFlow 2.0, use

tf.compat.v1.app.run()

or use tf_upgrade_v2 to convert 1.x code to 2.0.

Question 15

Python 3.2 introduced Concurrent Futures, which appear to be some advanced combination of the older threading and multiprocessing modules.

What are the advantages and disadvantages of using this for CPU bound tasks over the older multiprocessing module?

This article suggests they’re much easier to work with – is that the case?

Question 16

I wouldn’t call concurrent.futures more “advanced” – it’s a simpler interface that works very much the same regardless of whether you use multiple threads or multiple processes as the underlying parallelization gimmick.

So, like virtually all instances of “simpler interface”, much the same trade-offs are involved: it has a shallower learning curve, in large part just because there’s so much less available to be learned; but, because it offers fewer options, it may eventually frustrate you in ways the richer interfaces won’t.

So far as CPU-bound tasks go, that’s way too under-specified to say much meaningful. For CPU-bound tasks under CPython, you need multiple processes rather than multiple threads to have any chance of getting a speedup. But how much (if any) of a speedup you get depends on the details of your hardware, your OS, and especially on how much inter-process communication your specific tasks require. Under the covers, all inter-process parallelization gimmicks rely on the same OS primitives – the high-level API you use to get at those isn’t a primary factor in bottom-line speed.

Edit: example

Here’s the final code shown in the article you referenced, but I’m adding an import statement needed to make it work:

from concurrent.futures import ProcessPoolExecutor
def pool_factorizer_map(nums, nprocs):
    # Let the executor divide the work among processes by using 'map'.
    with ProcessPoolExecutor(max_workers=nprocs) as executor:
        return {num:factors for num, factors in
                                zip(nums,
                                    executor.map(factorize_naive, nums))}

Here’s exactly the same thing using multiprocessing instead:

import multiprocessing as mp
def mp_factorizer_map(nums, nprocs):
    with mp.Pool(nprocs) as pool:
        return {num:factors for num, factors in
                                zip(nums,
                                    pool.map(factorize_naive, nums))}

Note that the ability to use multiprocessing.Pool objects as context managers was added in Python 3.3.

As for which one is easier to work with, they’re essentially identical.

One difference is that Pool supports so many different ways of doing things that you may not realize how easy it can be until you’ve climbed quite a way up the learning curve.

Again, all those different ways are both a strength and a weakness. They’re a strength because the flexibility may be required in some situations. They’re a weakness because of “preferably only one obvious way to do it”. A project sticking exclusively (if possible) to concurrent.futures will probably be easier to maintain over the long run, due to the lack of gratuitous novelty in how its minimal API can be used.
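
To illustrate the point that the executor interface is the same whichever kind of worker backs it, here is a self-contained sketch. factorize_naive is a hypothetical stand-in, since the article's version is not reproduced here, and factorizer_map takes the executor class as a parameter (an assumption made for this example).

```python
from concurrent.futures import ThreadPoolExecutor

def factorize_naive(n):
    """Trial-division factorization (hypothetical stand-in)."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def factorizer_map(nums, executor_cls, workers=2):
    # Any Executor subclass works here; only the class changes.
    with executor_cls(max_workers=workers) as executor:
        return dict(zip(nums, executor.map(factorize_naive, nums)))

result = factorizer_map([12, 7, 30], ThreadPoolExecutor)
print(result)  # prints: {12: [2, 2, 3], 7: [7], 30: [2, 3, 5]}
```

For CPU-bound work under CPython you would pass ProcessPoolExecutor instead, with the call placed under an if __name__ == "__main__": guard so that worker processes can import the module safely.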

Question 17

What does the / mean in Python 3.4’s help output for range before the closing parenthesis?

>>> help(range)
Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |  
 |  Return a virtual sequence of numbers from start to stop by step.
 |  
 |  Methods defined here:
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.

                                        ...

Question 18

It signifies the end of the positional only parameters, parameters you cannot use as keyword parameters. Before Python 3.8, such parameters could only be specified in the C API.

It means the key argument to __contains__ can only be passed in by position (range(5).__contains__(3)), not as a keyword argument (range(5).__contains__(key=3)), something you can do with positional arguments in pure-python functions.
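
The difference is easy to observe directly. This small sketch calls the method both ways and records what happens; the exact wording of the error message varies between Python versions, so only the exception type is shown.

```python
r = range(5)

# Passing the argument by position works.
by_position = r.__contains__(3)

# Passing it by keyword is rejected with a TypeError.
try:
    r.__contains__(key=3)
except TypeError as exc:
    keyword_error = exc
else:
    keyword_error = None

print(by_position, type(keyword_error).__name__)  # prints: True TypeError
```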

Also see the Argument Clinic documentation:

To mark all parameters as positional-only in Argument Clinic, add a / on a line by itself after the last parameter, indented the same as the parameter lines.

and the (very recent) addition to the Python FAQ:

A slash in the argument list of a function denotes that the parameters prior to it are positional-only. Positional-only parameters are the ones without an externally-usable name. Upon calling a function that accepts positional-only parameters, arguments are mapped to parameters based solely on their position.

The syntax is now part of the Python language specification, as of version 3.8, see PEP 570 – Python Positional-Only Parameters. Before PEP 570, the syntax was already reserved for possible future inclusion in Python, see PEP 457 – Syntax For Positional-Only Parameters.

Positional-only parameters can lead to cleaner and clearer APIs, make pure-Python implementations of otherwise C-only modules more consistent and easier to maintain, and because positional-only parameters require very little processing, they lead to faster Python code.

Question 19

I asked this question myself. :) Found out that / was originally proposed by Guido here.

Alternative proposal: how about using ‘/’ ? It’s kind of the opposite of ‘*’ which means “keyword argument”, and ‘/’ is not a new character.

Then his proposal won.

Heh. If that’s true, my ‘/’ proposal wins:
 def foo(pos_only, /, pos_or_kw, *, kw_only): ...

I think the most relevant document covering this is PEP 570, where the Recap section looks nice.

Recap

The use case will determine which parameters to use in the function definition:
 def f(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):
As guidance:

Use positional-only if names do not matter or have no meaning, and there are only a few arguments which will always be passed in the same order. Use keyword-only when names have meaning and the function definition is more understandable by being explicit with names.

If the function ends with /

def foo(p1, p2, /)

This means all of the function’s arguments are positional-only.

Question 20

A forward slash (/) indicates that all parameters before it are positional-only. The positional-only parameters feature was added in Python 3.8 after PEP 570 was accepted. The notation was initially defined in PEP 457 – Notation For Positional-Only Parameters.

Parameters before the forward slash (/) in a function definition are positional-only, and parameters after the slash can be of any kind the syntax allows. When the function is called, arguments are mapped to positional-only parameters solely by position. Passing positional-only parameters by keyword (name) is invalid.

Let’s take the following example:

def foo(a, b, / , x, y):
   print("positional ", a, b)
   print("positional or keyword", x, y)

In the function definition above, parameters a and b are positional-only, while x and y can be either positional or keyword.

The following function calls are valid:

foo(40, 20, 99, 39)
foo(40, 3.14, "hello", y="world")
foo(1.45, 3.14, x="hello", y="world")

But the following function call is not valid and raises a TypeError, since a and b are passed as keyword arguments instead of positional arguments:

foo(a=1.45, b=3.14, x=1, y=4)

TypeError: foo() got some positional-only arguments passed as keyword arguments: 'a, b'

Many built-in functions in Python accept positional-only arguments where passing arguments by keyword doesn’t make sense. For example, the built-in function len accepts exactly one positional-only argument; writing the call as len(obj="hello world") would only impair readability. Check help(len):

>>> help(len)
Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.

Positional-only parameters make underlying C/library functions easier to maintain. They allow the names of positional-only parameters to be changed in the future without the risk of breaking client code that uses the API.

Last but not least, positional-only parameters allow their names to be reused in variable-length keyword arguments. Check the following example:

>>> def f(a, b, /, **kwargs):
...     print(a, b, kwargs)
...
>>> f(10, 20, a=1, b=2, c=3)         # a and b are used in two ways
10 20 {'a': 1, 'b': 2, 'c': 3}

Positional-only parameters are explained in more detail in “Types of function arguments in python: Positional Only Parameters”.

The positional-only parameter syntax was officially added in Python 3.8; see “What’s New in Python 3.8 – positional-only arguments”.

Related PEP: PEP 570 — Python Positional-Only Parameters

Question 21

I have defined a class in a file named Object.py. When I try to inherit from this class in another file, calling the constructor throws an exception:

TypeError: module.__init__() takes at most 2 arguments (3 given)

This is my code:

import Object

class Visitor(Object):
    pass

instance = Visitor()  # this line throws the exception

What am I doing wrong?

Question 22

Your error is happening because Object is a module, not a class. So your inheritance is screwy.

Change your import statement to:

from Object import ClassName

and your class definition to:

class Visitor(ClassName):

or

change your class definition to:

class Visitor(Object.ClassName):
   etc
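The difference can be reproduced without any extra files. The sketch below builds a stand-in module object with types.ModuleType; the names Object and ClassName mirror the placeholders used in this answer and are hypothetical.

```python
import types

# Stand-in for "import Object": a module object that contains a class.
# Both names here are hypothetical placeholders.
Object = types.ModuleType("Object")

class ClassName:
    pass

Object.ClassName = ClassName

# Inheriting from the module itself fails as soon as the class statement runs
try:
    class Broken(Object):
        pass
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Inheriting from the class inside the module works
class Visitor(Object.ClassName):
    pass

instance = Visitor()
print(isinstance(instance, ClassName))
```

Python treats the type of the first base as the metaclass, so a module used as a base ends up being "called" with the class name, bases and namespace, which is where the odd argument-count message comes from.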

Question 23

Even after @Mickey Perlstein’s answer and his 3 hours of detective work, it still took me a few more minutes to apply this to my own mess. In case anyone else is like me and needs a little more help, here’s what was going on in my situation.

responses is a module
Response is a base class within the responses module
GeoJsonResponse is a new class derived from Response

Initial GeoJsonResponse class:

from pyexample.responses import Response

class GeoJsonResponse(Response):

    def __init__(self, geo_json_data):

Looks fine. No problems until you try to debug the thing, which is when you get a bunch of seemingly vague error messages like this:

from pyexample.responses import GeoJsonResponse
..\pyexample\responses\GeoJsonResponse.py:12: in (module)
    class GeoJsonResponse(Response):
E   TypeError: module() takes at most 2 arguments (3 given)

=================================== ERRORS ====================================
___________________ ERROR collecting tests/test_geojson.py ____________________
test_geojson.py:2: in (module)
    from pyexample.responses import GeoJsonResponse
..\pyexample\responses\GeoJsonResponse.py:12: in (module)
    class GeoJsonResponse(Response):
E   TypeError: module() takes at most 2 arguments (3 given)

ERROR: not found: \PyExample\tests\test_geojson.py::TestGeoJson::test_api_response
C:\Python37\lib\site-packages\aenum__init__.py:163
(no name 'PyExample\tests\test_geojson.py::TestGeoJson::test_api_response' in any of [])

The errors were doing their best to point me in the right direction, and @Mickey Perlstein’s answer was dead on, it just took me a minute to put it all together in my own context:

I was importing the module:

from pyexample.responses import Response

when I should have been importing the class:

from pyexample.responses.Response import Response

Hope this helps someone. (In my defense, it’s still pretty early.)

Question 24

You may also do the following in Python 3.6.1

from Object import Object as Parent

and your class definition to:

class Visitor(Parent):

Question 25

from Object import Object

or

from Class_Name import Class_name

If Object is a .py file.

Question 26

In my case, the problem was that I was referring to a module when I tried to extend a class.

import logging
class UserdefinedLogging(logging):

If you look at the documentation, you’ll see “logging” listed as a module.

In this specific case I simply had to inherit from a class inside the logging module (logging.Logger, for example) instead of from the module itself to create an extra class for logging.

Question 27

If you execute the following statement in Python 3.7, it will (from my testing) print b:

if None.__eq__("a"):
    print("b")

However, None.__eq__("a") evaluates to NotImplemented.

Naturally, "a".__eq__("a") evaluates to True, and "b".__eq__("a") evaluates to False.

I initially discovered this when testing the return value of a function, but didn’t return anything in the second case — so, the function returned None.

What’s going on here?

Question 28

This is a great example of why the __dunder__ methods should not be used directly as they are quite often not appropriate replacements for their equivalent operators; you should use the == operator instead for equality comparisons, or in this special case, when checking for None, use is (skip to the bottom of the answer for more information).

You’ve done

None.__eq__('a')
# NotImplemented

Which returns NotImplemented since the types being compared are different. Consider another example where two objects with different types are being compared in this fashion, such as 1 and 'a'. Doing (1).__eq__('a') is also not correct, and will return NotImplemented. The right way to compare these two values for equality would be

1 == 'a'
# False

What happens here is

1. First, (1).__eq__('a') is tried, which returns NotImplemented. This indicates that the operation is not supported, so
2. 'a'.__eq__(1) is called, which also returns the same NotImplemented. So,
3. The objects are treated as if they are not the same, and False is returned.

Here’s a nice little MCVE using some custom classes to illustrate how this happens:

class A:
    def __eq__(self, other):
        print('A.__eq__')
        return NotImplemented

class B:
    def __eq__(self, other):
        print('B.__eq__')
        return NotImplemented

class C:
    def __eq__(self, other):
        print('C.__eq__')
        return True

a = A()
b = B()
c = C()

print(a == b)
# A.__eq__
# B.__eq__
# False

print(a == c)
# A.__eq__
# C.__eq__
# True

print(c == a)
# C.__eq__
# True

Of course, that doesn’t explain why the operation returns true. This is because NotImplemented is actually a truthy value:

bool(None.__eq__("a"))
# True

Same as,

bool(NotImplemented)
# True

For more information on what values are considered truthy and falsy, see the docs section on Truth Value Testing, as well as this answer. It is worth noting here that NotImplemented is truthy, but it would have been a different story had the class defined a __bool__ or __len__ method that returned False or 0 respectively.

If you want the functional equivalent of the == operator, use operator.eq:

import operator
operator.eq(1, 'a')
# False

However, as mentioned earlier, for this specific scenario, where you are checking for None, use is:

var = 'a'
var is None
# False

var2 = None
var2 is None
# True

The functional equivalent of this is using operator.is_:

operator.is_(var2, None)
# True

None is a special object, and only 1 version exists in memory at any point of time. IOW, it is the sole singleton of the NoneType class (but the same object may have any number of references). The PEP8 guidelines make this explicit:

Comparisons to singletons like None should always be done with is or is not, never the equality operators.

In summary, for singletons like None, a reference check with is is more appropriate, although both == and is will work just fine.
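The three ways of comparing discussed in this answer can be put side by side in a short sketch. It deliberately avoids calling bool(NotImplemented), since newer Python versions deprecate using NotImplemented in a boolean context.

```python
# Calling the dunder method directly exposes the NotImplemented sentinel
direct = None.__eq__("a")
print(direct is NotImplemented)

# The == operator handles the fallback and yields a real bool here
via_operator = (None == "a")
print(via_operator)

# For None checks, identity is the idiomatic test
value = None
print(value is None)
```

Only the first form leaks NotImplemented to the caller; the other two always produce True or False.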

Question 29

The result you are seeing is caused by the fact that

None.__eq__("a") # evaluates to NotImplemented

and NotImplemented’s truth value is documented to be True:

https://docs.python.org/3/library/constants.html

Special value which should be returned by the binary special methods (e.g. __eq__(), __lt__(), __add__(), __rsub__(), etc.) to indicate that the operation is not implemented with respect to the other type; may be returned by the in-place binary special methods (e.g. __imul__(), __iand__(), etc.) for the same purpose. Its truth value is true.

If you call the __eq__() method manually rather than just using ==, you need to be prepared to deal with the possibility that it may return NotImplemented and that its truth value is true.

Question 30

As you already figured out, None.__eq__("a") evaluates to NotImplemented. However, if you try something like

if NotImplemented:
    print("Yes")
else:
    print("No")

the result is

Yes

This means that the truth value of NotImplemented is true.

Therefore the outcome in the question is obvious:

None.__eq__(something) yields NotImplemented

And bool(NotImplemented) evaluates to True

So the condition in if None.__eq__("a") is always true

Question 31

Why?

It returns a NotImplemented, yeah:

>>> None.__eq__('a')
NotImplemented
>>>

But if you look at this:

>>> bool(NotImplemented)
True
>>>

NotImplemented is actually a truthy value, which is why b is printed: anything truthy passes the if test, and anything falsy does not.

How to solve it?

You have to check explicitly whether it is True, so be more suspicious, as you see:

>>> NotImplemented == True
False
>>>

So you would do:

>>> if None.__eq__('a') == True:
    print('b')


>>>

And as you see, it doesn’t print anything.

Question 32

Is there a way to generate random letters in Python (like random.randint but for letters)? The range functionality of random.randint would be nice but having a generator that just outputs a random letter would be better than nothing.

Question 33

Simple:

>>> import string
>>> string.ascii_letters
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> import random
>>> random.choice(string.ascii_letters)
'j'

string.ascii_letters is a string constant containing the lowercase and uppercase ASCII letters. It is not locale-dependent.

random.choice returns a single, random element from a sequence.

Question 34

>>> import random
>>> import string
>>> random.choice(string.ascii_letters)
'g'

Question 35

>>> import random
>>> import string
>>> def random_char(y):
...     return ''.join(random.choice(string.ascii_letters) for x in range(y))
...
>>> print(random_char(5))
fxkea

This generates y random characters.

Question 36

>>> import random
>>> import string    
>>> random.choice(string.ascii_lowercase)
'b'

Question 37

Another way, for completeness:

>>> import random
>>> chr(random.randrange(97, 97 + 26))

This uses the fact that ASCII ‘a’ is 97 and that there are 26 letters in the alphabet.

When determining the upper and lower bounds of the random.randrange() call, remember that random.randrange() is exclusive on its upper bound, meaning it will only ever generate an integer up to 1 less than the provided value.
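To make the bound handling concrete, here is a minimal sketch that contrasts randrange (upper bound excluded) with randint (both ends included). The helper names random_lowercase and random_letter_between are made up for this illustration.

```python
import random

def random_lowercase(rng=random):
    # randrange excludes the upper bound, so "+ 1" is needed to reach 'z'
    return chr(rng.randrange(ord('a'), ord('z') + 1))

def random_letter_between(first, last, rng=random):
    # randint includes both endpoints, so no "+ 1" is needed here
    return chr(rng.randint(ord(first), ord(last)))

sample = [random_lowercase() for _ in range(1000)]
print(min(sample), max(sample))
print(random_letter_between('c', 'f'))
```

The second helper gives the "range" behaviour asked for in the question: any pair of characters can serve as the inclusive limits.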

Question 38

You can use this to get one or more random letter(s)

import random
import string
random.seed(10)
letters = string.ascii_lowercase
rand_letters = random.choices(letters,k=5) # where k is the number of required rand_letters

print(rand_letters)

['o', 'l', 'p', 'f', 'v']

Question 39

You can just make a list:

import random
list1=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
b=random.randint(0,7)
print(list1[b])

Question 40

import random

def randchar(a, b):
    # returns a random character between a and b, both inclusive
    return chr(random.randint(ord(a), ord(b)))

Question 41

import random
def guess_letter():
    return random.choice('abcdefghijklmnopqrstuvwxyz')

Question 42

import random
def Random_Alpha():
    l = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z']
    return l[random.randint(0,25)]

print(Random_Alpha())

Question 43

You can use numpy:

import numpy as np
# high is exclusive, so use 91 to include 'Z' (chr(90))
list(map(chr, np.random.randint(low=65, high=91, size=4)))

Question 44

import string
import random

KEY_LEN = 20

def base_str():
    # string.letters exists only in Python 2; ascii_letters works in both
    return string.ascii_letters + string.digits

def key_gen():
    keylist = [random.choice(base_str()) for i in range(KEY_LEN)]
    return "".join(keylist)

You can get random strings like this:

g9CtUljUWD9wtk1z07iF
ndPbI1DDn6UvHSQoDMtd
klMFY3pTYNVWsNJ6cs34
Qgr7OEalfhXllcFDGh2l

Question 45

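import string
from random import choice

def create_key(key_len):
    key = ''
    # string.letters is Python 2 only; ascii_letters works everywhere
    valid_characters_list = string.ascii_letters + string.digits
    for i in range(key_len):
        character = choice(valid_characters_list)
        key = key + character
    return key

def create_key_list(key_num, key_len):
    # duplicates are skipped, so fewer than key_num keys may be returned
    keys = []
    for i in range(key_num):
        key = create_key(key_len)
        if key not in keys:
            keys.append(key)
    return keys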

Question 46

All previous answers are correct. If you are looking for random characters of various types (i.e. alphanumeric and special characters), here is a script that I created to demonstrate several ways of generating them. It has three functions: one for letters, one for digits and one for special characters. The script simply generates passwords and is just an example of the various ways to generate random characters.

import string
import random
import sys

#make sure it's 3.7 or above
print(sys.version)

def create_str(str_length):
    return random.sample(string.ascii_letters, str_length)

def create_num(num_length):
    digits = []
    for i in range(num_length):
        # one random digit per iteration, so exactly num_length digits are produced
        digits.append(str(random.randint(0, 9)))

    return digits

def create_special_chars(special_length):
    stringSpecial = []
    for i in range(special_length):
        stringSpecial.append(random.choice('!$%&()*+,-.:;<=>?@[]^_`{|}~'))

    return stringSpecial

print("how many characters would you like to use ? (DO NOT USE LESS THAN 8)")
str_cnt = input()
print("how many digits would you like to use ? (DO NOT USE LESS THAN 2)")
num_cnt = input()
print("how many special characters would you like to use ? (DO NOT USE LESS THAN 1)")
s_chars_cnt = input()
password_values = create_str(int(str_cnt)) +create_num(int(num_cnt)) + create_special_chars(int(s_chars_cnt))

#shuffle/mix the values
random.shuffle(password_values)

print("generated password is: ")
print(''.join(password_values))

Question 47

Well, this is my answer! It works well. Just replace the 20 in the while condition with the number of random letters you want (Python 3).

import random

def key_gen():
    keylist = random.choice('abcdefghijklmnopqrstuvwxyz')
    return keylist

number = 0
list_item = ''
while number < 20:
    number = number + 1
    list_item = list_item + key_gen()

print(list_item)

Question 48

import string
import random

def random_char(y):
    return ''.join(random.choice(string.ascii_letters+string.digits+li) for x in range(y))
no=int(input("Enter the number of character for your password=  "))
li = random.choice('!@#$%^*&( )_+}{')
print(random_char(no)+li)

Question 49

My overly complicated piece of code:

import random

letter = (random.randint(1,26))
if letter == 1:
   print ('a')
elif letter == 2:
    print ('b')
elif letter == 3:
    print ('c')
elif letter == 4:
    print ('d')
elif letter == 5:
    print ('e')
elif letter == 6:
    print ('f')
elif letter == 7:
    print ('g')
elif letter == 8:
    print ('h')
elif letter == 9:
    print ('i')
elif letter == 10:
    print ('j')
elif letter == 11:
    print ('k')
elif letter == 12:
    print ('l')
elif letter == 13:
    print ('m')
elif letter == 14:
    print ('n')
elif letter == 15:
    print ('o')
elif letter == 16:
    print ('p')
elif letter == 17:
    print ('q')
elif letter == 18:
    print ('r')
elif letter == 19:
    print ('s')
elif letter == 20:
    print ('t')
elif letter == 21:
    print ('u')
elif letter == 22:
    print ('v')
elif letter == 23:
    print ('w')
elif letter == 24:
    print ('x')
elif letter == 25:
    print ('y')
elif letter == 26:
    print ('z')

It basically generates a random number from 1 to 26 and then converts it into its corresponding letter. This could definitely be improved, but I am only a beginner and I am proud of this piece of code.

Question 50

Maybe this can help you:

import random
e = ''
# 65..90 are the code points of 'A'..'Z' (64 would be '@')
for a in range(65, 91):
    h = random.randint(65, a)
    e += chr(h)
print(e)

Question 51

Place a python on the keyboard and let him roll over the letters until you find your preferred random combo. Just kidding!

import string  # This was a design above but failed to print. I remodeled it.
import random
irandom = random.choice(string.ascii_letters)
print(irandom)

Question 52

I have this program that calculates the time taken to answer a specific question and quits the while loop when the answer is incorrect, but I want to delete the last calculation so that I can call min() without it returning the wrong time. Sorry if this is confusing.

from time import time

q = input('What do you want to type? ')
a = ' '
record = []
while a != '':
    start = time()
    a = input('Type: ')
    end = time()
    v = end-start
    record.append(v)
    if a == q:
        print('Time taken to type name: {:.2f}'.format(v))
    else:
        break
for i in record:
    print('{:.2f} seconds.'.format(i))

Question 53

If I understood the question correctly, you can use the slicing notation to keep everything except the last item:

record = record[:-1]

But a better way is to delete the item directly:

del record[-1]

Note 1: Using record = record[:-1] does not really remove the last element; it assigns the sublist to record. This makes a difference if you run it inside a function and record is a parameter. With record = record[:-1] the original list (outside the function) is unchanged; with del record[-1] or record.pop() the list is changed. (as stated by @pltrdy in the comments)
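The difference described in Note 1 shows up as soon as the list is passed to a function. In this sketch the function names and the sample times are made up for illustration.

```python
def drop_last_by_slicing(record):
    # Rebinds the local name to a new list; the caller's list is untouched
    record = record[:-1]
    return record

def drop_last_in_place(record):
    # Mutates the list object the caller passed in
    del record[-1]

times = [1.2, 0.9, 3.4]
shorter = drop_last_by_slicing(times)
print(times)    # [1.2, 0.9, 3.4]
print(shorter)  # [1.2, 0.9]

drop_last_in_place(times)
print(times)    # [1.2, 0.9]
```

At module level, as in the question's script, both forms give the same visible result; the distinction matters mainly when other names refer to the same list.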

Note 2: The code could use some Python idioms. I highly recommend reading this:
Code Like a Pythonista: Idiomatic Python (via wayback machine archive).

Question 54

You should use this:

del record[-1]

The problem with

record = record[:-1]

is that it makes a copy of the list every time you remove an item, so it isn’t very efficient.

Question 55

list.pop() removes and returns the last element of the list.

Question 56

You need:

record = record[:-1]

before the for loop.

This will set record to the current record list but without the last item. You may, depending on your needs, want to ensure the list isn’t empty before doing this.

Question 57

If you do a lot with timing, I can recommend this little (20 line) context manager:

https://github.com/brouberol/timer-context-manager

Your code could then look like this:

#!/usr/bin/env python
# coding: utf-8

from timer import Timer

if __name__ == '__main__':
    a, record = None, []
    while not a == '':
        with Timer() as t: # everything in the block will be timed
            a = input('Type: ')
        record.append(t.elapsed_s)
    # drop the last item (makes a copy of the list):
    record = record[:-1] 
    # or just delete it:
    # del record[-1]

Just for reference, here’s the content of the Timer context manager in full:

from timeit import default_timer

class Timer(object):
    """ A timer as a context manager. """

    def __init__(self):
        self.timer = default_timer
        # measures wall clock time, not CPU time!
        # In Python 3.3+, default_timer is time.perf_counter
        # (older versions used time.time on Unix and time.clock on Windows)

    def __enter__(self):
        self.start = self.timer() # measure start time
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        self.end = self.timer() # measure end time
        self.elapsed_s = self.end - self.start # elapsed time, in seconds
        self.elapsed_ms = self.elapsed_s * 1000  # elapsed time, in milliseconds

Question 58

Simply use list.pop(). If you want to remove from the other end instead, use list.pop(0); popleft() is only available on collections.deque, not on lists.
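Plain lists have pop() but no popleft(); popleft() belongs to collections.deque. A short sketch of both, using made-up sample values:

```python
from collections import deque

record = [1.2, 0.9, 3.4]
last = record.pop()      # removes and returns the last item
first = record.pop(0)    # removes and returns the first item (O(n) for lists)
print(last, first, record)

queue = deque([1.2, 0.9, 3.4])
left = queue.popleft()   # deque supports efficient removal from the left
print(left, list(queue))
```

If items are removed from the front often, a deque is usually the better fit; for dropping only the last timing, as in the question, list.pop() is enough.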

Question 59

If you have a list of lists (tracked_output_sheet in my case), where you want to delete last element from each list, you can use the following code:

interim = []
for x in tracked_output_sheet:
    interim.append(x[:-1])
tracked_output_sheet = interim

Question 60

I’m trying to use Python to download the HTML source code of a website but I’m receiving this error.

Traceback (most recent call last):  
    File "C:\Users\Sergio.Tapia\Documents\NetBeansProjects\DICParser\src\WebDownload.py", line 3, in <module>
     file = urllib.urlopen("http://www.python.org")
AttributeError: 'module' object has no attribute 'urlopen'

I’m following the guide here: http://www.boddie.org.uk/python/HTML.html

import urllib

file = urllib.urlopen("http://www.python.org")
s = file.read()
f.close()

#I'm guessing this would output the html source code?
print(s)

I’m using Python 3.

Question 61

This works in Python 2.x.

For Python 3 look in the docs:

import urllib.request

with urllib.request.urlopen("http://www.python.org") as url:
    s = url.read()
    # I'm guessing this would output the html source code ?
    print(s)

Question 62

A Python 2+3 compatible solution is:

import sys
from contextlib import closing

if sys.version_info[0] == 3:
    from urllib.request import urlopen
else:
    # Not Python 3 - today, it is most likely to be Python 2
    # But note that this might need an update when Python 4
    # might be around one day
    from urllib import urlopen


# Your code where you can use urlopen
# closing() is used because the Python 2 response object
# does not support the with statement on its own
with closing(urlopen("http://www.python.org")) as url:
    s = url.read()

print(s)

Question 63

import urllib.request as ur
s = ur.urlopen("http://www.google.com")
sl = s.read()
print(sl)

In Python 3, urlopen lives in the urllib.request submodule, so importing just urllib is not enough here.

Question 64

To get ‘dataX = urllib.urlopen(url).read()’ working in Python 3 (this would have been correct for Python 2), you just need to change two little things.

1: The urllib statement itself (add the .request in the middle):

dataX = urllib.request.urlopen(url).read()

2: The import statement preceding it (change from ‘import urllib’ to):

import urllib.request

And it should work in Python 3 :)
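One detail that often surprises people after the switch: in Python 3, read() returns bytes, not str. The sketch below shows this with a data: URL, chosen here only so the example runs without network access; an http(s) URL is used the same way.

```python
import urllib.request

# A data: URL keeps this sketch offline; swap in an http(s) URL for real use
url = "data:text/plain;charset=utf-8,hello%20world"

with urllib.request.urlopen(url) as response:
    raw = response.read()          # bytes, not str

text = raw.decode("utf-8")
print(type(raw).__name__, text)
```

Decoding with UTF-8 is an assumption that fits this example; for a real page the charset should come from the response headers or the document itself.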
