Tag Archives: python-3.x

Is there a ceiling equivalent of the // operator in Python?

Question: Is there a ceiling equivalent of the // operator in Python?

I found out about the // operator in Python which in Python 3 does division with floor.

Is there an operator which divides with ceil instead? (I know about the / operator which in Python 3 does floating point division.)


Answer 0

There is no operator which divides with ceil. You need to import math and use math.ceil.
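
For example (a quick illustration; in Python 3, math.ceil returns an int):

>>> import math
>>> math.ceil(7 / 3)
3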


Answer 1

You can just do upside-down floor division:

def ceildiv(a, b):
    return -(-a // b)

This works because Python’s division operator does floor division (unlike in C, where integer division truncates the fractional part).

This also works with Python’s big integers, because there’s no (lossy) floating-point conversion.

Here’s a demonstration:

>>> from __future__ import division   # a/b is float division
>>> from math import ceil
>>> b = 3
>>> for a in range(-7, 8):
...     print(["%d/%d" % (a, b), int(ceil(a / b)), -(-a // b)])
... 
['-7/3', -2, -2]
['-6/3', -2, -2]
['-5/3', -1, -1]
['-4/3', -1, -1]
['-3/3', -1, -1]
['-2/3', 0, 0]
['-1/3', 0, 0]
['0/3', 0, 0]
['1/3', 1, 1]
['2/3', 1, 1]
['3/3', 1, 1]
['4/3', 2, 2]
['5/3', 2, 2]
['6/3', 2, 2]
['7/3', 3, 3]

Answer 2

You could do (x + (d-1)) // d when dividing x by d, i.e. (x + 4) // 5.
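
For instance:

>>> x, d = 7, 5
>>> (x + (d - 1)) // d    # same as ceil(7 / 5)
2
>>> (10 + 4) // 5         # exact division still comes out right
2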


Answer 3

Solution 1: Convert floor to ceiling with negation

def ceiling_division(n, d):
    return -(n // -d)

Reminiscent of the Penn & Teller levitation trick, this “turns the world upside down (with negation), uses plain floor division (where the ceiling and floor have been swapped), and then turns the world right-side up (with negation again)”.

Solution 2: Let divmod() do the work

def ceiling_division(n, d):
    q, r = divmod(n, d)
    return q + bool(r)

The divmod() function gives (a // b, a % b) for integers (this may be less reliable with floats due to round-off error). The step with bool(r) adds one to the quotient whenever there is a non-zero remainder.
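
A quick illustration of the remainder doing the work:

>>> divmod(7, 3)
(2, 1)
>>> 2 + bool(1)    # non-zero remainder rounds the quotient up
3
>>> divmod(6, 3)
(2, 0)
>>> 2 + bool(0)    # exact division leaves the quotient unchanged
2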

Solution 3: Adjust the numerator before the division

def ceiling_division(n, d):
    return (n + d - 1) // d

Translate the numerator upwards so that floor division rounds down to the intended ceiling. Note, this only works for integers.

Solution 4: Convert to floats to use math.ceil()

def ceiling_division(n, d):
    return math.ceil(n / d)

The math.ceil() code is easy to understand, but it converts from ints to floats and back. This isn’t very fast and it may have rounding issues. Also, it relies on Python 3 semantics where “true division” produces a float and where the ceil() function returns an integer.
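
To make the rounding concern concrete, here is one case (my illustration, not from the original answer) where the float round-trip loses exactness while the integer-only solutions keep it:

>>> import math
>>> n, d = 2**53 + 1, 1
>>> math.ceil(n / d)    # n / d becomes a float and drops the last bit
9007199254740992
>>> -(n // -d)          # pure integer arithmetic stays exact
9007199254740993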


Answer 4

You can always just do it inline as well

((foo - 1) // bar) + 1

In Python 3, this is just shy of an order of magnitude faster than forcing the float division and calling ceil(), provided you care about the speed. Which you shouldn’t, unless you’ve proven through usage that you need to.

>>> timeit.timeit("((5 - 1) // 4) + 1", number = 100000000)
1.7249219375662506
>>> timeit.timeit("ceil(5/4)", setup="from math import ceil", number = 100000000)
12.096064013894647

Answer 5

Note that math.ceil is limited to 53 bits of precision. If you are working with large integers, you may not get exact results.

The gmpy2 library provides a c_div function which uses ceiling rounding.
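
For illustration, a minimal use (assuming gmpy2 is installed):

>>> import gmpy2
>>> gmpy2.c_div(7, 3)   # ceiling division: ceil(7/3) == 3
mpz(3)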

Disclaimer: I maintain gmpy2.


Answer 6

Simple solution: a // b + 1 (note that this overcounts by one when b divides a exactly; the other answers handle that case).


Type hints in namedtuple

Question: Type hints in namedtuple

Consider the following piece of code:

from collections import namedtuple
point = namedtuple("Point", ("x:int", "y:int"))

The code above is just a way to demonstrate what I am trying to achieve. I would like to make a namedtuple with type hints.

Do you know an elegant way to achieve the intended result?


Answer 0

The preferred syntax for a typed named tuple since 3.6 is:

from typing import NamedTuple

class Point(NamedTuple):
    x: int
    y: int = 1  # Set default value

Point(3)  # -> Point(x=3, y=1)

Edit: Starting with Python 3.7, consider using dataclasses (your IDE may not yet support them for static type checking):

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int = 1  # Set default value

Point(3)  # -> Point(x=3, y=1)

Answer 1

You can use typing.NamedTuple

From the docs

Typed version of namedtuple.

>>> import typing
>>> Point = typing.NamedTuple("Point", [('x', int), ('y', int)])

This is present only in Python 3.5 onwards.
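
Instances behave like ordinary namedtuples; a small usage sketch:

>>> p = Point(1, 2)
>>> p.x + p.y
3
>>> p
Point(x=1, y=2)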


How can I convert a .py to .exe for Python?

Question: How can I convert a .py to .exe for Python?

I’m trying to convert a fairly simple Python program to an executable and couldn’t find what I was looking for, so I have a few questions (I’m running Python 3.6):

The methods of doing this that I have found so far are as follows:

  1. downloading an old version of Python and using pyinstaller/py2exe
  2. setting up a virtual environment in Python 3.6 that will allow me to do 1.
  3. downloading a Python to C++ converter and using that.

Here is what I’ve tried/what problems I’ve run into.

  • I installed pyinstaller before its required download (pypi-something), so it did not work. Even after downloading the prerequisite file, pyinstaller still does not recognize it.
  • If I’m setting up a virtualenv in Python 2.7, do I actually need to have Python 2.7 installed?
  • Similarly, the only python to C++ converters I see work only up until Python 3.5 – do I need to download and use this version if attempting this?

Answer 0

Steps to convert .py to .exe in Python 3.6

  1. Install Python 3.6.
  2. Install cx_Freeze (open your command prompt and type pip install cx_Freeze).
  3. Install idna (open your command prompt and type pip install idna).
  4. Write a .py program named myfirstprog.py.
  5. Create a new python file named setup.py in the current directory of your script.
  6. In the setup.py file, copy the code below and save it.
  7. With shift pressed, right click on the same directory, so you are able to open a command prompt window.
  8. In the prompt, type python setup.py build.
  9. If your script is error-free, there will be no problem creating the application.
  10. Check the newly created build folder. It has another folder in it. Within that folder you can find your application. Run it. Make yourself happy.

See the original script in my blog.

setup.py:

from cx_Freeze import setup, Executable

base = None    

executables = [Executable("myfirstprog.py", base=base)]

packages = ["idna"]
options = {
    'build_exe': {    
        'packages':packages,
    },    
}

setup(
    name = "<any name>",
    options = options,
    version = "<any number>",
    description = '<any description>',
    executables = executables
)

EDIT:

  • be sure that instead of myfirstprog.py you put your own .py file name, as created in step 4;
  • you should include every package imported in your .py in the packages list (ex: packages = ["idna", "os", "sys"]);
  • any name, any number, any description in the setup.py file should not remain the same; change them accordingly (ex: name = "<first_ever>", version = "0.11", description = '');
  • the imported packages must be installed before you start step 8.

Answer 1

Python 3.6 is supported by PyInstaller.

Open a cmd window in your Python folder (open a command window and use cd or while holding shift, right click it on Windows Explorer and choose ‘Open command window here’). Then just enter

pip install pyinstaller

And that’s it.

The simplest way to use it is by entering on your command prompt

pyinstaller file_name.py

For more details on how to use it, take a look at this question.


Answer 2

There is an open source project called auto-py-to-exe on GitHub. Actually it also just uses PyInstaller internally, but since it has a simple GUI that controls PyInstaller it may be a comfortable alternative. It can also output a standalone file, in contrast to other solutions. They also provide a video showing how to set it up.

GUI:

Output:


Answer 3

I can’t tell you what’s best, but a tool I have used with success in the past was cx_Freeze. They recently updated (on Jan. 7, ’17) to version 5.0.1 and it supports Python 3.6.

Here’s the pypi https://pypi.python.org/pypi/cx_Freeze

The documentation shows that there is more than one way to do it, depending on your needs. http://cx-freeze.readthedocs.io/en/latest/overview.html

I have not tried it out yet, so I’m going to point to a post where the simple way of doing it was discussed. Some things may or may not have changed though.

How do I use cx_freeze?


Answer 4

I’ve been using Nuitka and PyInstaller with my package, PySimpleGUI.

Nuitka There were issues getting tkinter to compile with Nuitka. One of the project contributors developed a script that fixed the problem.

If you’re not using tkinter it may “just work” for you. If you are using tkinter say so and I’ll try to get the script and instructions published.

PyInstaller I’m running 3.6 and PyInstaller is working great! The command I use to create my exe file is:

pyinstaller -wF myfile.py

The -wF will create a single EXE file. Because all of my programs have a GUI and I do not want the command window to show, the -w option will hide the command window.

This is about as close as you can get to running what looks like a Winforms program that was written in Python.

[Update 20-Jul-2019]

There is a PySimpleGUI GUI-based solution that uses PyInstaller. It’s called pysimplegui-exemaker and can be pip installed.

pip install PySimpleGUI-exemaker

To run it after installing:

python -m pysimplegui-exemaker.pysimplegui-exemaker


Answer 5

Now you can convert it by using PyInstaller. It works with even Python 3.

Steps:

  1. Fire up your PC
  2. Open command prompt
  3. Enter command pip install pyinstaller
  4. When it is installed, use the command ‘cd’ to go to the working directory.
  5. Run the command pyinstaller <filename>

How to get the latest file in a folder using python

Question: How to get the latest file in a folder using python

I need to get the latest file of a folder using python. While using the code:

max(files, key = os.path.getctime)

I am getting the below error:

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'a'


Answer 0

Whatever is assigned to the files variable is incorrect. Use the following code.

import glob
import os

list_of_files = glob.glob('/path/to/folder/*') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)

Answer 1

max(files, key = os.path.getctime)

is quite incomplete code. What is files? It probably is a list of file names, coming out of os.listdir().

But this list contains only the filename parts (a.k.a. “basenames”), because their path is common. In order to use it correctly, you have to combine each name with the path leading to it (which was used to obtain it).

Such as (untested):

def newest(path):
    files = os.listdir(path)
    paths = [os.path.join(path, basename) for basename in files]
    return max(paths, key=os.path.getctime)

Answer 2

I would suggest using glob.iglob() instead of glob.glob(), as it is more efficient.

glob.iglob() returns an iterator which yields the same values as glob() without actually storing them all simultaneously.

Which means glob.iglob() will be more efficient.

I mostly use the below code to find the latest file matching my pattern:

LatestFile = max(glob.iglob(fileNamePattern), key=os.path.getctime)


NOTE: There are variants of the max function. In the case of finding the latest file we will be using the below variant: max(iterable, *[, key, default])

which needs an iterable, so your first parameter should be iterable. In the case of finding the max of numbers, we can use the below variant: max(num1, num2, num3, *args[, key])
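
One caveat worth adding (my note, not part of the original answer): max() raises ValueError on an empty iterable, so if the folder might be empty, the default keyword of this variant avoids the crash:

LatestFile = max(glob.iglob(fileNamePattern), key=os.path.getctime, default=None)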


Answer 3

Try sorting the items by creation time. The example below sorts the files in a folder and gets the first element, which is the latest.

import glob
import os

files_path = os.path.join(folder, '*')
files = sorted(
    glob.iglob(files_path), key=os.path.getctime, reverse=True) 
print(files[0])

Answer 4

I lack the reputation to comment, but ctime from Marlon Abeykoons’ response did not give the correct result for me. Using mtime does the trick though (key=os.path.getmtime).

import glob
import os

list_of_files = glob.glob('/path/to/folder/*') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getmtime)
print(latest_file)

I found two answers for that problem:

python os.path.getctime max does not return latest
Difference between python – getmtime() and getctime() in unix system


Answer 5

(Edited to improve answer)

First define a function get_latest_file

def get_latest_file(path, *paths):
    fullpath = os.path.join(path, *paths)
    ...
get_latest_file('example', 'files','randomtext011.*.txt')

You may also use a docstring!

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)

If you use Python 3, you can use iglob instead.

Complete code to return the name of latest file:

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
    files = glob.glob(fullpath)  # You may use iglob in Python3
    if not files:                # I prefer using the negation
        return None                      # because it behaves like a shortcut
    latest_file = max(files, key=os.path.getctime)
    _, filename = os.path.split(latest_file)
    return filename

Answer 6

I tried to use the above suggestions and my program crashed; then I figured out that the file I was trying to identify was in use, and trying to use ‘os.path.getctime’ on it crashed. What finally worked for me was:

    files_before = glob.glob(os.path.join(my_path,'*'))
    **code where new file is created**
    new_file = set(files_before).symmetric_difference(set(glob.glob(os.path.join(my_path,'*'))))

This code gets the uncommon object between the two sets of file lists. It is not the most elegant, and if multiple files are created at the same time it will probably not be stable.


Answer 7

A much faster method on Windows (0.05s) is to call a bat script that does this:

get_latest.bat

@echo off
for /f %%i in ('dir \\directory\in\question /b/a-d/od/t:c') do set LAST=%%i
%LAST%

where \\directory\in\question is the directory you want to investigate.

get_latest.py

from subprocess import Popen, PIPE
p = Popen("get_latest.bat", shell=True, stdout=PIPE,)
stdout, stderr = p.communicate()
print(stdout, stderr)

If it finds a file, stdout is the path and stderr is None.

Use stdout.decode("utf-8").rstrip() to get the usable string representation of the file name.


Answer 8

I’ve been using this in Python 3, including pattern matching on the filename.

from pathlib import Path

def latest_file(path: Path, pattern: str = "*"):
    files = path.glob(pattern)
    return max(files, key=lambda x: x.stat().st_ctime)

How to print like printf in Python3?

Question: How to print like printf in Python3?

In Python 2 I used:

print "a=%d,b=%d" % (f(x,n),g(x,n))

I’ve tried:

print("a=%d,b=%d") % (f(x,n),g(x,n))

Answer 0

In Python2, print was a keyword which introduced a statement:

print "Hi"

In Python3, print is a function which may be invoked:

print ("Hi")

In both versions, % is an operator which requires a string on the left-hand side and a value or a tuple of values or a mapping object (like dict) on the right-hand side.
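
For example, the mapping form looks like this (a small illustration):

>>> print("a=%(a)d,b=%(b)d" % {"a": 1, "b": 2})
a=1,b=2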

So, your line ought to look like this:

print("a=%d,b=%d" % (f(x,n),g(x,n)))

Also, the recommendation for Python3 and newer is to use {}-style formatting instead of %-style formatting:

print('a={:d}, b={:d}'.format(f(x,n),g(x,n)))

Python 3.6 introduces yet another string-formatting paradigm: f-strings.

print(f'a={f(x,n):d}, b={g(x,n):d}')

Answer 1

The most recommended way is to use the format method. Read more about it here.

a, b = 1, 2

print("a={0},b={1}".format(a, b))

Answer 2

Simple printf() function from O’Reilly’s Python Cookbook.

import sys
def printf(format, *args):
    sys.stdout.write(format % args)

Example output:

i = 7
pi = 3.14159265359
printf("hi there, i=%d, pi=%.2f\n", i, pi)
# hi there, i=7, pi=3.14

Answer 3

Python 3.6 introduced f-strings for inline interpolation. What’s even nicer is it extended the syntax to also allow format specifiers with interpolation. Something I’ve been working on while I googled this (and came across this old question!):

print(f'{account:40s} ({ratio:3.2f}) -> AUD {splitAmount}')

PEP 498 has the details. And… it sorted my pet peeve with format specifiers in other langs — allows for specifiers that themselves can be expressions! Yay! See: Format Specifiers.


Answer 4

Simple Example:

print("foo %d, bar %d" % (1,2))


Answer 5

A simpler one.

def printf(format, *values):
    print(format % values )

Then:

printf("Hello, this is my name %s and my age %d", "Martin", 20)

Answer 6

Because your % is outside the print(...) parentheses, you’re trying to insert your variables into the result of your print call. print(...) returns None, so this won’t work, and there’s also the small matter of you already having printed your template by this time and time travel being prohibited by the laws of the universe we inhabit.

The whole thing you want to print, including the % and its operand, needs to be inside your print(...) call, so that the string can be built before it is printed.

print( "a=%d,b=%d" % (f(x,n), g(x,n)) )

I have added a few extra spaces to make it clearer (though they are not necessary and generally not considered good style).


Answer 7

In other words, printf is absent in python… I’m surprised! The best code is

def printf(format, *args):
    sys.stdout.write(format % args)

because this form allows you not to print \n; all the others don’t. That’s why print is a bad operator. Also, you need to write args in a special form. There are no disadvantages to the function above. It’s the standard, usual form of a printf function.


Answer 8

print("Name={}, balance={}".format(var_name, var_balance))

How do I install Python 3 on an AWS EC2 instance?

Question: How do I install Python 3 on an AWS EC2 instance?

I’m trying to install python 3.x on an AWS EC2 instance and:

sudo yum install python3

doesn’t work:

No package python3 available.

I’ve googled around and I can’t find anyone else who has this problem so I’m asking here. Do I have to manually download and install it?


Answer 0

If you do a

sudo yum list | grep python3

you will see that while they don’t have a “python3” package, they do have a “python34” package, or a more recent release, such as “python36”. Installing it is as easy as:

sudo yum install python34 python34-pip

Answer 1

Note: This may be obsolete for current versions of Amazon Linux 2 since late 2018 (see comments); you can now install it directly via yum install python3.

In Amazon Linux 2, there isn’t a python3[4-6] in the default yum repos; instead there’s the Amazon Extras Library.

sudo amazon-linux-extras install python3

If you want to set up isolated virtual environments with it, note that the yum install’d virtualenv tools don’t seem to work reliably:

virtualenv --python=python3 my_venv

Calling the venv module/tool is less finicky, and you could double check it’s what you want/expect with python3 --version beforehand.

python3 -m venv my_venv

Other things it can install (versions as of 18 Jan 18):

[ec2-user@x ~]$ amazon-linux-extras list
  0  ansible2   disabled  [ =2.4.2 ]
  1  emacs   disabled  [ =25.3 ]
  2  memcached1.5   disabled  [ =1.5.1 ]
  3  nginx1.12   disabled  [ =1.12.2 ]
  4  postgresql9.6   disabled  [ =9.6.6 ]
  5  python3=latest  enabled  [ =3.6.2 ]
  6  redis4.0   disabled  [ =4.0.5 ]
  7  R3.4   disabled  [ =3.4.3 ]
  8  rust1   disabled  [ =1.22.1 ]
  9  vim   disabled  [ =8.0 ]
 10  golang1.9   disabled  [ =1.9.2 ]
 11  ruby2.4   disabled  [ =2.4.2 ]
 12  nano   disabled  [ =2.9.1 ]
 13  php7.2   disabled  [ =7.2.0 ]
 14  lamp-mariadb10.2-php7.2   disabled  [ =10.2.10_7.2.0 ]

Answer 2

Here are the steps I used to manually install python3, for anyone else who wants to do it, as it’s not super straightforward. EDIT: It’s almost certainly easier to use the yum package manager (see other answers).

Note, you’ll probably want to do sudo yum groupinstall 'Development Tools' before doing this otherwise pip won’t install.

wget https://www.python.org/ftp/python/3.4.2/Python-3.4.2.tgz
tar zxvf Python-3.4.2.tgz
cd Python-3.4.2
sudo yum install gcc
./configure --prefix=/opt/python3
make
sudo yum install openssl-devel
sudo make install
sudo ln -s /opt/python3/bin/python3 /usr/bin/python3
python3  # should start the interpreter if it worked (quit() to exit)

Answer 3

EC2 (on the Amazon Linux AMI) currently supports python3.4 and python3.5.

sudo yum install python35
sudo yum install python35-pip

Answer 4

As of Amazon Linux version 2017.09 python 3.6 is now available:

sudo yum install python36 python36-virtualenv python36-pip

See the Release Notes for more info and other packages


Answer 5

Amazon Linux now supports python36.

python36-pip is not available, so you need to follow a different route.

sudo yum install python36 python36-devel python36-libs python36-tools

# If you like to have pip3.6:
curl -O https://bootstrap.pypa.io/get-pip.py
sudo python3 get-pip.py

Answer 6

As @NickT said, there’s no python3[4-6] in the default yum repos in Amazon Linux 2; as of today it uses 3.7, and looking at all the answers here we can say this will change over time.

I was looking for python3.6 on Amazon Linux 2, but amazon-linux-extras shows a lot of options and no python at all. In fact, you can try to find the version you know in the epel repo:

sudo amazon-linux-extras install epel

yum search python | grep "^python3..x8"

python34.x86_64 : Version 3 of the Python programming language aka Python 3000
python36.x86_64 : Interpreter of the Python programming language

Answer 7

Adding to all the answers already available for this question, I would like to add the steps I followed to install Python3 on an AWS EC2 instance running CentOS 7. You can find the entire details at this link.

https://aws-labs.com/install-python-3-centos-7-2/

First, we need to enable SCL. SCL is a community project that allows you to build, install, and use multiple versions of software on the same system, without affecting system default packages.

sudo yum install centos-release-scl

Now that we have SCL repository, we can install the python3

sudo yum install rh-python36

To access Python 3.6 you need to launch a new shell instance using the Software Collection scl tool:

scl enable rh-python36 bash

If you check the Python version now you’ll notice that Python 3.6 is the default version

python --version

It is important to point out that Python 3.6 is the default Python version only in this shell session. If you exit the session or open a new session from another terminal Python 2.7 will be the default Python version.

Now, Install the python development tools by typing:

sudo yum groupinstall 'Development Tools'

Now create a virtual environment so that the default python packages don’t get messed up.

mkdir ~/my_new_project
cd ~/my_new_project
python -m venv my_project_venv

To use this virtual environment,

source my_project_venv/bin/activate

Now, you have your virtual environment set up with python3.


Answer 8

On Debian derivatives such as Ubuntu, use apt. Check the apt repository for the versions of Python available to you. Then, run a command similar to the following, substituting the correct package name:

sudo apt-get install python3

On Red Hat and derivatives, use yum. Check the yum repository for the versions of Python available to you. Then, run a command similar to the following, substituting the correct package name:

sudo yum install python36

On SUSE and derivatives, use zypper. Check the repository for the versions of Python available to you. Then. run a command similar to the following, substituting the correct package name:

sudo zypper install python3

How to make Firefox headless programmatically in Selenium with python?

Question: How to make Firefox headless programmatically in Selenium with python?

I am running this code with python, selenium, and firefox but still get ‘head’ version of firefox:

binary = FirefoxBinary('C:\\Program Files (x86)\\Mozilla Firefox\\firefox.exe', log_file=sys.stdout)
binary.add_command_line_options('-headless')
self.driver = webdriver.Firefox(firefox_binary=binary)

I also tried some variations of binary:

binary = FirefoxBinary('C:\\Program Files\\Nightly\\firefox.exe', log_file=sys.stdout)
        binary.add_command_line_options("--headless")

Answer 0

To invoke the Firefox browser headlessly, you can set the headless property through the Options() class as follows:

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
driver = webdriver.Firefox(options=options, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("http://google.com/")
print ("Headless Firefox Initialized")
driver.quit()

There’s another way to accomplish headless mode. If you need to disable or enable headless mode in Firefox without changing the code, you can set the environment variable MOZ_HEADLESS to any value if you want Firefox to run headless, or not set it at all.

This is very useful when, for example, you are using continuous integration and you want to run the functional tests on the server but still be able to run the tests in normal mode on your PC.

$ MOZ_HEADLESS=1 python manage.py test # testing example in Django with headless Firefox

or

$ export MOZ_HEADLESS=1   # this way you only have to set it once
$ python manage.py test functional/tests/directory
$ unset MOZ_HEADLESS      # if you want to disable headless mode

Outro

How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?


Answer 1

The first answer doesn’t work anymore.

This worked for me:

from selenium.webdriver.firefox.options import Options as FirefoxOptions
from selenium import webdriver

options = FirefoxOptions()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
driver.get("http://google.com")

Answer 2

My answer:

set_headless(headless=True) is deprecated. 

https://seleniumhq.github.io/selenium/docs/api/py/webdriver_firefox/selenium.webdriver.firefox.options.html

options.headless = True

works for me


Answer 3

Just a note for people who may have found this later (and want the Java way of achieving this): FirefoxOptions is also capable of enabling headless mode:

FirefoxOptions firefoxOptions = new FirefoxOptions();
firefoxOptions.setHeadless(true);

Answer 4

I used the below code to set the driver type based on the need for headless / headed mode, for both Firefox and Chrome:

# Can pass browser type
# (Options / FFOption are the Chrome and Firefox options classes
# imported from selenium.webdriver)

if browser.lower() == 'chrome':
    driver = webdriver.Chrome(r'..\drivers\chromedriver')
elif browser.lower() == 'headless chrome':
    ch_Options = Options()
    ch_Options.add_argument('--headless')
    ch_Options.add_argument("--disable-gpu")
    driver = webdriver.Chrome(r'..\drivers\chromedriver', options=ch_Options)
elif browser.lower() == 'firefox':
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe')
elif browser.lower() == 'headless firefox':
    ff_option = FFOption()
    ff_option.add_argument('--headless')
    ff_option.add_argument("--disable-gpu")
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe', options=ff_option)
elif browser.lower() == 'ie':
    driver = webdriver.Ie(r'..\drivers\IEDriverServer')
else:
    raise Exception('Invalid Browser Type')

How does asyncio actually work?

Question: How does asyncio actually work?

This question is motivated by my another question: How to await in cdef?

There are tons of articles and blog posts on the web about asyncio, but they are all very superficial. I couldn’t find any information about how asyncio is actually implemented, and what makes I/O asynchronous. I was trying to read the source code, but it’s thousands of lines of not the highest grade C code, a lot of which deals with auxiliary objects, but most crucially, it is hard to connect between Python syntax and what C code it would translate into.

Asyncio’s own documentation is even less helpful. There’s no information there about how it works, only some guidelines about how to use it, which are also sometimes misleading / very poorly written.

I’m familiar with Go’s implementation of coroutines, and was kind of hoping that Python did the same thing. If that was the case, the code I came up with in the post linked above would have worked. Since it didn’t, I’m now trying to figure out why. My best guess so far is as follows, please correct me where I’m wrong:

  1. Procedure definitions of the form async def foo(): ... are actually interpreted as methods of a class inheriting coroutine.
  2. Perhaps, async def is actually split into multiple methods by await statements, where the object, on which these methods are called is able to keep track of the progress it made through the execution so far.
  3. If the above is true, then, essentially, execution of a coroutine boils down to calling methods of coroutine object by some global manager (loop?).
  4. The global manager is somehow (how?) aware of when I/O operations are performed by Python (only?) code and is able to choose one of the pending coroutine methods to execute after the current executing method relinquished control (hit on the await statement).

In other words, here’s my attempt at “desugaring” of some asyncio syntax into something more understandable:

async def coro(name):
    print('before', name)
    await asyncio.sleep()
    print('after', name)

asyncio.gather(coro('first'), coro('second'))

# translated from async def coro(name)
class Coro(coroutine):
    def before(self, name):
        print('before', name)

    def after(self, name):
        print('after', name)

    def __init__(self, name):
        self.name = name
        self.parts = self.before, self.after
        self.pos = 0

    def __call__(self):
        self.parts[self.pos](self.name)
        self.pos += 1

    def done(self):
        return self.pos == len(self.parts)


# translated from asyncio.gather()
class AsyncIOManager:

    def gather(*coros):
        while not every(c.done() for c in coros):
            coro = random.choice(coros)
            coro()

Should my guess prove correct: then I have a problem. How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended and I/O happens outside the interpreter? What exactly is meant by I/O? If my python procedure called C open() procedure, and it in turn sent interrupt to kernel, relinquishing control to it, how does Python interpreter know about this and is able to continue running some other code, while kernel code does the actual I/O and until it wakes up the Python procedure which sent the interrupt originally? How can Python interpreter in principle, be aware of this happening?


Answer 0

How does asyncio work?

Before answering this question we need to understand a few base terms, skip these if you already know any of them.

Generators

Generators are objects that allow us to suspend the execution of a python function. User-curated generators are implemented using the keyword yield. By creating a normal function containing the yield keyword, we turn that function into a generator:

>>> def test():
...     yield 1
...     yield 2
...
>>> gen = test()
>>> next(gen)
1
>>> next(gen)
2
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, calling next() on the generator causes the interpreter to load test’s frame, and return the yielded value. Calling next() again causes the frame to load again into the interpreter stack, and continues on to yield another value.

By the third time next() is called, our generator was finished, and StopIteration was thrown.

Communicating with a generator

A less-known feature of generators is the fact that you can communicate with them using two methods: send() and throw().

>>> def test():
...     val = yield 1
...     print(val)
...     yield 2
...     yield 3
...
>>> gen = test()
>>> next(gen)
1
>>> gen.send("abc")
abc
2
>>> gen.throw(Exception())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in test
Exception

Upon calling gen.send(), the value is passed as a return value from the yield keyword.

gen.throw() on the other hand, allows throwing Exceptions inside generators, with the exception raised at the same spot yield was called.

Returning values from generators

Returning a value from a generator results in the value being put inside the StopIteration exception. We can later on recover the value from the exception and use it for our needs.

>>> def test():
...     yield 1
...     return "abc"
...
>>> gen = test()
>>> next(gen)
1
>>> try:
...     next(gen)
... except StopIteration as exc:
...     print(exc.value)
...
abc

Behold, a new keyword: yield from

Python 3.4 came with the addition of a new keyword: yield from. What that keyword allows us to do is pass on any next(), send() and throw() into an inner-most nested generator. If the inner generator returns a value, it is also the return value of yield from:

>>> def inner():
...     inner_result = yield 2
...     print('inner', inner_result)
...     return 3
...
>>> def outer():
...     yield 1
...     val = yield from inner()
...     print('outer', val)
...     yield 4
...
>>> gen = outer()
>>> next(gen)
1
>>> next(gen) # Goes inside inner() automatically
2
>>> gen.send("abc")
inner abc
outer 3
4

I’ve written an article to further elaborate on this topic.

Putting it all together

Upon introducing the new keyword yield from in Python 3.4, we were now able to create generators inside generators that just like a tunnel, pass the data back and forth from the inner-most to the outer-most generators. This has spawned a new meaning for generators – coroutines.

Coroutines are functions that can be stopped and resumed while being run. In Python, they are defined using the async def keyword. Much like generators, they too use their own form of yield from, which is await. Before async and await were introduced in Python 3.5, we created coroutines in the exact same way generators were created (with yield from instead of await).

async def inner():
    return 1

async def outer():
    await inner()

Like every iterator or generator that implements the __iter__() method, coroutines implement __await__(), which allows them to continue every time await coro is called.
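
As a rough sketch (my own toy example, not asyncio’s internals), any object whose __await__() returns an iterator can be awaited; a bare yield suspends it once, much like a pending future:

import asyncio

class Ready:
    # A hand-rolled awaitable: __await__ must return an iterator.
    def __init__(self, value):
        self.value = value

    def __await__(self):
        yield                # suspend once; the event loop reschedules us
        return self.value    # becomes the result of `await Ready(...)`

async def main():
    print(await Ready(42))   # prints 42

asyncio.run(main())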

There’s a nice sequence diagram inside the Python docs that you should check out.

In asyncio, apart from coroutine functions, we have 2 important objects: tasks and futures.

Futures

Futures are objects that have the __await__() method implemented, and their job is to hold a certain state and result. The state can be one of the following:

  1. PENDING – future does not have any result or exception set.
  2. CANCELLED – future was cancelled using fut.cancel()
  3. FINISHED – future was finished, either by a result set using fut.set_result() or by an exception set using fut.set_exception()

The result, just as you have guessed, can be either a Python object that will be returned, or an exception that may be raised.

Another important feature of future objects is that they contain a method called add_done_callback(). This method allows functions to be called as soon as the future is done – whether it raised an exception or finished.
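A minimal sketch using asyncio's real Future API:

import asyncio

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()          # state: PENDING

    # called once the future is FINISHED (or CANCELLED)
    fut.add_done_callback(lambda f: print("done:", f.result()))

    loop.call_later(0.1, fut.set_result, 42)  # finish it shortly
    print(await fut)                    # suspends until set_result

asyncio.run(main())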

Tasks

Task objects are special futures, which wrap around coroutines, and communicate with the inner-most and outer-most coroutines. Every time a coroutine awaits a future, the future is passed all the way back to the task (just like in yield from), and the task receives it.

Next, the task binds itself to the future. It does so by calling add_done_callback() on the future. From now on, if the future is ever done, whether by being cancelled, passed an exception, or passed a Python object as a result, the task's callback will be called, and the task will rise back to life.

Asyncio

The final burning question we must answer is – how is the IO implemented?

Deep inside asyncio, we have an event loop. An event loop of tasks. The event loop’s job is to call tasks every time they are ready and coordinate all that effort into one single working machine.

The IO part of the event loop is built upon a single crucial function called select. Select is a blocking function, implemented by the operating system underneath, that allows waiting on sockets for incoming or outgoing data. Upon data being received it wakes up, and returns the sockets which received data, or the sockets which are ready for writing.

When you try to receive or send data over a socket through asyncio, what actually happens below is that the socket is first checked for any data that can be immediately read or sent. If its .send() buffer is full, or the .recv() buffer is empty, the socket is registered with the select function (by simply adding it to one of the lists, rlist for recv and wlist for send) and the appropriate function awaits a newly created future object, tied to that socket.

When all available tasks are waiting for futures, the event loop calls select and waits. When one of the sockets has incoming data, or its send buffer has drained, asyncio checks for the future object tied to that socket, and sets it to done.

Now all the magic happens. The future is set to done, the task that added itself before with add_done_callback() rises back to life, and calls .send() on the coroutine which resumes the inner-most coroutine (because of the await chain), and you read the newly received data from a nearby buffer it was spilled onto.

Method chain again, in case of recv():

  1. select.select waits.
  2. A ready socket, with data is returned.
  3. Data from the socket is moved into a buffer.
  4. future.set_result() is called.
  5. Task that added itself with add_done_callback() is now woken up.
  6. Task calls .send() on the coroutine which goes all the way into the inner-most coroutine and wakes it up.
  7. Data is being read from the buffer and returned to our humble user.

In summary, asyncio uses generator capabilities, that allow pausing and resuming functions. It uses yield from capabilities that allow passing data back and forth from the inner-most generator to the outer-most. It uses all of those in order to halt function execution while it’s waiting for IO to complete (by using the OS select function).

And the best of all? While one function is paused, another may run and interleave with the delicate fabric, which is asyncio.


回答 1

谈论async/awaitasyncio不是一回事。第一个是基本的低级构造(协程),而第二个是使用这些构造的库。相反,没有单一的最终答案。

下面是对 async/await 以及 asyncio 之类的库如何工作的一般说明。也就是说,在此之上可能还有其他技巧(确实有……),但是除非您自己构建它们,否则它们无关紧要。除非您已经懂得多到不需要问这样的问题,否则这些差异可以忽略不计。

1.简述协程与子例程

就像子例程(函数,过程,…)一样,协程(生成器,…)是调用堆栈和指令指针的抽象:有执行代码段的堆栈,每个执行段都是特定的指令。

def 与 async def 的区别只是为了清楚起见。实际的差别在于 return 与 yield。在此基础上,await 或 yield from 把这一差别从单个调用扩展到了整个堆栈。

1.1。子程序

子例程表示一个新的堆栈级别,用于保存局部变量,并且单次遍历其指令即可到达末尾。考虑这样的子例程:

def subfoo(bar):
     qux = 3
     return qux * bar

当您运行它时,这意味着

  1. 为 bar 和 qux 分配堆栈空间
  2. 递归执行第一个语句并跳转到下一个语句
  3. 一旦到达 return,将其值推入调用堆栈
  4. 清除堆栈(1.)和指令指针(2.)

值得注意的是,4. 表示子例程始终以相同的状态开始。函数本身专有的所有内容在完成后都会丢失。即使 return 后面还有指令,函数也无法恢复。

root -\
  :    \- subfoo --\
  :/--<---return --/
  |
  V

1.2。协程作为持久子例程

协程就像一个子例程,但是可以在不破坏其状态的情况下退出。考虑这样的协程:

 def cofoo(bar):
      qux = yield bar  # yield marks a break point
      return qux

当您运行它时,这意味着

  1. 为 bar 和 qux 分配堆栈空间
  2. 递归执行第一个语句并跳转到下一个语句
    1. 一旦到达 yield,将其值压入调用堆栈,但保存堆栈和指令指针
    2. 一旦再次进入 yield,恢复堆栈和指令指针,并将参数推入 qux
  3. 一旦到达 return,将其值推入调用堆栈
  4. 清除堆栈(1.)和指令指针(2.)

请注意新增的 2.1 和 2.2:协程可以在预定义的位置挂起并恢复。这类似于子例程在调用另一个子例程期间被暂停的方式。区别在于,活动的协程并不严格绑定到其调用堆栈;相反,挂起的协程是一个单独的、隔离的堆栈的一部分。

root -\
  :    \- cofoo --\
  :/--<+--yield --/
  |    :
  V    :

这意味着挂起的协程可以在堆栈之间自由存储或移动。任何有权访问协程的调用堆栈都可以决定恢复它。

1.3。遍历调用栈

到目前为止,我们的协程只能通过 yield 沿调用堆栈向下传递。子例程可以通过 return 和 () 沿调用堆栈向下和向上移动。为了完整起见,协程也需要一种沿调用堆栈向上的机制。考虑这样的协程:

def wrap():
    yield 'before'
    yield from cofoo()
    yield 'after'

当您运行它时,这意味着它仍然像子例程一样分配堆栈和指令指针。当它挂起时,仍然就像存储一个子例程。

然而,yield from 两者都做。它挂起 wrap 的堆栈和指令指针,并运行 cofoo。请注意,wrap 保持挂起状态,直到 cofoo 完全结束。每当 cofoo 挂起或被发送内容时,cofoo 都直接连接到调用堆栈。

1.4。协程一直向下

如前所述,yield from 允许跨越一个中间作用域把两个作用域连接起来。递归应用时,这意味着堆栈的顶部可以连接到堆栈的底部。

root -\
  :    \-> coro_a -yield-from-> coro_b --\
  :/ <-+------------------------yield ---/
  |    :
  :\ --+-- coro_a.send----------yield ---\
  :                             coro_b <-/

请注意,rootcoro_b不知道对方。这使得协程比回调更干净:协程仍然像子例程一样建立在1:1关系上。协程将暂停并恢复其整个现有执行堆栈,直到常规调用点为止。

值得注意的是,root可以恢复任意数量的协程。但是,它永远不能同时恢复多个。同一根的协程是并发的,但不是并行的!

1.5。Python的asyncawait

到目前为止,该解释已明确使用生成器的yieldyield from词汇-基本功能相同。新的Python3.5语法asyncawait主要是为了清楚起见。

def foo():  # subroutine?
     return None

def foo():  # coroutine?
     yield from foofoo()  # generator? coroutine?

async def foo():  # coroutine!
     await foofoo()  # coroutine!
     return None

需要 async for 和 async with 语句,是因为使用裸露的 for 和 with 语句会打断 yield from/await 链。

2.简单事件循环的剖析

就一个协程本身而言,它没有把控制权交给另一个协程的概念。它只能把控制权交还给协程堆栈底部的调用者。然后,这个调用者可以切换到另一个协程并运行它。

几个协程的根节点通常是一个事件循环:协程挂起时,会产生一个它希望在其上恢复的事件。反过来,事件循环能够高效地等待这些事件发生。这使它可以决定接下来运行哪个协程,或在恢复之前如何等待。

这种设计意味着循环能理解一组预定义的事件。几个协程相互 await,直到最终 await 一个事件。该事件可以通过 yield 出控制权直接与事件循环通信。

loop -\
  :    \-> coroutine --await--> event --\
  :/ <-+----------------------- yield --/
  |    :
  |    :  # loop waits for event to happen
  |    :
  :\ --+-- send(reply) -------- yield --\
  :        coroutine <--yield-- event <-/

关键是协程暂停允许事件循环和事件直接通信。中间协程堆栈不需要任何有关运行哪个循环或事件如何工作的知识。

2.1.1。及时事件

要处理的最简单事件是到达某个时间点。这也是线程代码的基本构件:线程反复 sleep,直到某个条件成立。但是,常规的 sleep 本身会阻塞执行,而我们希望其他协程不被阻塞。相反,我们想告诉事件循环何时应恢复当前协程堆栈。

2.1.2。定义事件

事件只是我们可以识别的值,无论是通过枚举、类型还是其他标识。我们可以用一个存储目标时间的简单类来定义它。除了存储事件信息之外,我们还可以允许直接 await 这个类的实例。

class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until: float):
        self.until = until

    # used whenever someone ``await``s an instance of this Event
    def __await__(self):
        # yield this Event to the loop
        yield self

    def __repr__(self):
        return '%s(until=%.1f)' % (self.__class__.__name__, self.until)

此类仅存储事件-它没有说明如何实际处理它。

唯一的特殊功能是 __await__,它是 await 关键字寻找的东西。实际上,它是一个迭代器,但不提供给常规迭代机制使用。

2.2.1。等待事件

现在我们有了一个事件,协程如何对它做出反应?我们应该能够通过 await 我们的事件来表达与 sleep 等价的操作。为了更清楚地看到发生了什么,我们把等待拆成两次,每次等待一半时间:

import time

async def asleep(duration: float):
    """await that ``duration`` seconds pass"""
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)

我们可以直接实例化并运行此协程。与生成器类似,使用 coroutine.send 运行协程,直到它 yield 出一个结果。

coroutine = asleep(100)
while True:
    print(coroutine.send(None))
    time.sleep(0.1)

这给了我们两个 AsyncSleep 事件,协程完成时则是一个 StopIteration。请注意,唯一的延迟来自循环中的 time.sleep!每个 AsyncSleep 仅存储相对当前时间的偏移量。

2.2.2。活动+睡眠

目前,我们有两种独立的机制可供使用:

  • AsyncSleep:可以从协程内部 yield 出来的事件
  • time.sleep:可以在不影响协程的情况下等待

值得注意的是,这两者是正交的:谁也不影响、不触发谁。因此,我们可以提出自己的 sleep 策略来满足 AsyncSleep 的延迟要求。

2.3。天真的事件循环

如果我们有几个协程,每个协程都可以告诉我们它想何时被唤醒。然后,我们可以等到其中第一个想要恢复的时间点,再等下一个,依此类推。值得注意的是,在每个时间点上我们只关心下一个是谁。

这样可以进行简单的调度:

  1. 按照所需的唤醒时间对协程进行排序
  2. 选择第一个想要唤醒的人
  3. 等到这个时间点
  4. 运行这个协程
  5. 从1开始重复。

一个简单的实现不需要任何高级概念。用一个 list 就能按唤醒时间对协程排序。等待就是常规的 time.sleep。运行协程的方式与之前一样,使用 coroutine.send。

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    # store wake-up-time and coroutines
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting:
        # 2. pick the first coroutine that wants to wake up
        until, coroutine = waiting.pop(0)
        # 3. wait until this point in time
        time.sleep(max(0.0, until - time.time()))
        # 4. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])

当然,这还有很大的改进空间。我们可以把等待队列换成堆,或者为事件建立调度表。我们还可以从 StopIteration 中获取返回值,并将其分配给协程。但是,基本原理保持不变。

2.4。合作等待

AsyncSleep 事件和 run 事件循环构成了定时事件的一个完整可用的实现。

async def sleepy(identifier: str = "coroutine", count=5):
    for i in range(count):
        print(identifier, 'step', i + 1, 'at %.2f' % time.time())
        await asleep(0.1)

run(*(sleepy("coroutine %d" % j) for j in range(5)))

这将在五个协程中的每个协程之间进行协作切换,每个协程暂停0.1秒。即使事件循环是同步的,它仍然可以在0.5秒而不是2.5秒内执行工作。每个协程保持状态并独立运行。

3. I / O事件循环

支持 sleep 的事件循环适用于轮询。但是,等待文件句柄上的 I/O 可以更高效地完成:操作系统实现了 I/O,因此知道哪些句柄已准备就绪。理想情况下,事件循环应支持显式的“准备好进行 I/O”事件。

3.1。该select呼叫

Python 已经有一个接口,可以向操作系统查询 I/O 句柄的就绪状态。用要读取或写入的句柄调用它时,它返回已准备好读取或写入的句柄:

readable, writeable, _ = select.select(rlist, wlist, xlist, timeout)

例如,我们可以 open 一个文件用于写入,并等待其准备就绪:

write_target = open('/tmp/foo', 'w')
readable, writeable, _ = select.select([], [write_target], [])

select返回后,writeable包含我们的打开文件。

3.2。基本I / O事件

AsyncSleep请求类似,我们需要为I / O定义一个事件。使用底层select逻辑,事件必须引用可读对象-例如open文件。另外,我们存储要读取的数据量。

class AsyncRead:
    def __init__(self, file, amount=1):
        self.file = file
        self.amount = amount
        self._buffer = ''

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self
            # we only get here if ``read`` should not block
            self._buffer += self.file.read(1)
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.file, self.amount, len(self._buffer)
        )

AsyncSleep我们一样,我们大多只是存储底层系统调用所需的数据。这次__await__可以恢复多次-直到我们的需求amount被阅读为止。另外,我们return的I / O结果不只是恢复。

3.3。使用读取的I / O增强事件循环

事件循环的基础仍然是先前定义的 run。首先,我们需要跟踪读取请求。这不再是一个排序的时间表;我们只是将读取请求映射到协程。

# new
waiting_read = {}  # type: Dict[file, coroutine]

由于select.select采用了超时参数,因此可以代替time.sleep

# old
time.sleep(max(0.0, until - time.time()))
# new
readable, _, _ = select.select(list(waiting_read), [], [])

这将为我们提供所有可读文件-如果有的话,我们将运行相应的协程。如果没有,我们已经等待了足够长的时间来运行当前的协程。

# new - reschedule waiting coroutine, run readable coroutine
if readable:
    waiting.append((until, coroutine))
    waiting.sort()
    coroutine = waiting_read[readable[0]]

最后,我们必须实际侦听读取请求。

# new
if isinstance(command, AsyncSleep):
    ...
elif isinstance(command, AsyncRead):
    ...

3.4。把它放在一起

上面的内容有些简化。如果总是有可读内容,我们需要做一些切换,以免饿死休眠的协程。我们还需要处理既没有可读内容也没有可等待内容的情况。但是,最终结果仍然只有 30 行左右。

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    waiting_read = {}  # type: Dict[file, coroutine]
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting or waiting_read:
        # 2. wait until the next coroutine may run or read ...
        try:
            until, coroutine = waiting.pop(0)
        except IndexError:
            until, coroutine = float('inf'), None
            readable, _, _ = select.select(list(waiting_read), [], [])
        else:
            readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
        # ... and select the appropriate one
        if readable and time.time() < until:
            if until and coroutine:
                waiting.append((until, coroutine))
                waiting.sort()
            coroutine = waiting_read.pop(readable[0])
        # 3. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension ...
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
        # ... or register reads
        elif isinstance(command, AsyncRead):
            waiting_read[command.file] = coroutine

3.5。协同I / O

AsyncSleepAsyncRead并且run实现已全功能的睡眠和/或读取。与相同sleepy,我们可以定义一个帮助程序来测试阅读:

async def ready(path, amount=1024*32):
    print('read', path, 'at', '%d' % time.time())
    with open(path, 'rb') as file:
        result = await AsyncRead(file, amount)
    print('done', path, 'at', '%d' % time.time())
    print('got', len(result), 'B')

run(sleepy('background', 5), ready('/dev/urandom'))

运行此命令,我们可以看到我们的I / O与等待的任务交错:

id background round 1
read /dev/urandom at 1530721148
id background round 2
id background round 3
id background round 4
id background round 5
done /dev/urandom at 1530721148
got 1024 B

4.非阻塞I / O

虽然文件上的 I/O 能把这个概念讲清楚,但它并不真正适合 asyncio 这样的库:select 调用对文件总是立即返回,而 open 和 read 都可能无限期地阻塞。这会阻塞事件循环的所有协程,这很糟糕。像 aiofiles 这样的库使用线程和同步来伪造文件上的非阻塞 I/O 和事件。

但是,套接字确实允许无阻塞的I / O-并且它们固有的延迟使其变得更加关键。在事件循环中使用时,可以包装等待数据和重试而不会阻塞任何内容。

4.1。非阻塞I / O事件

与我们的 AsyncRead 类似,我们可以为套接字定义一个挂起并读取的事件。我们不使用文件,而是使用套接字,并且该套接字必须是非阻塞的。另外,我们的 __await__ 使用 socket.recv 代替 file.read。

class AsyncRecv:
    def __init__(self, connection, amount=1, read_buffer=1024):
        assert not connection.getblocking(), 'connection must be non-blocking for async recv'
        self.connection = connection
        self.amount = amount
        self.read_buffer = read_buffer
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self.connection.recv(self.read_buffer)
            except BlockingIOError:
                yield self
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.connection, self.amount, len(self._buffer)
        )

与 AsyncRead 相比,__await__ 执行的是真正的非阻塞 I/O。当有数据时,它总是读取。当没有可用数据时,它总是挂起。这意味着事件循环仅在我们执行有用工作时才被占用。

4.2。解除阻塞事件循环

就事件循环而言,变化不大。要监听的事件仍然与文件的情况相同:由 select 标记为就绪的文件描述符。

# old
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
# new
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
elif isinstance(command, AsyncRecv):
    waiting_read[command.connection] = coroutine

在这一点上,AsyncRead 和 AsyncRecv 显然是同一种事件。我们可以轻松地将它们重构为一个带有可交换 I/O 组件的事件。实际上,事件循环、协程和事件把调度器、任意中间代码和实际 I/O 清晰地分开了。

4.3。非阻塞I / O的丑陋一面

原则上,你此时应该做的是把 read 的逻辑复制成 AsyncRecv 的 recv。但是,现在这丑陋得多:当函数会在内核内部阻塞、转而把控制权交还给你时,你必须处理这些提前返回。例如,打开连接就比打开文件的代码长得多:

# file
file = open(path, 'rb')
# non-blocking socket
connection = socket.socket()
connection.setblocking(False)
# open without blocking - retry on failure
try:
    connection.connect((url, port))
except BlockingIOError:
    pass

长话短说,剩下的就是几十行异常处理。此时事件和事件循环已经起作用。

id background round 1
read localhost:25000 at 1530783569
read /dev/urandom at 1530783569
done localhost:25000 at 1530783569 got 32768 B
id background round 2
id background round 3
id background round 4
done /dev/urandom at 1530783569 got 4096 B
id background round 5

附录

github上的示例代码

Talking about async/await and asyncio is not the same thing. The first is a fundamental, low-level construct (coroutines) while the latter is a library using these constructs. Conversely, there is no single ultimate answer.

The following is a general description of how async/await and asyncio-like libraries work. That is, there may be other tricks on top (there are…) but they are inconsequential unless you build them yourself. The difference should be negligible unless you already know enough to not have to ask such a question.

1. Coroutines versus subroutines in a nut shell

Just like subroutines (functions, procedures, …), coroutines (generators, …) are an abstraction of call stack and instruction pointer: there is a stack of executing code pieces, and each is at a specific instruction.

The distinction of def versus async def is merely for clarity. The actual difference is return versus yield. From this, await or yield from take the difference from individual calls to entire stacks.

1.1. Subroutines

A subroutine represents a new stack level to hold local variables, and a single traversal of its instructions to reach an end. Consider a subroutine like this:

def subfoo(bar):
     qux = 3
     return qux * bar

When you run it, that means

  1. allocate stack space for bar and qux
  2. recursively execute the first statement and jump to the next statement
  3. once at a return, push its value to the calling stack
  4. clear the stack (1.) and instruction pointer (2.)

Notably, 4. means that a subroutine always starts at the same state. Everything exclusive to the function itself is lost upon completion. A function cannot be resumed, even if there are instructions after return.

root -\
  :    \- subfoo --\
  :/--<---return --/
  |
  V

1.2. Coroutines as persistent subroutines

A coroutine is like a subroutine, but can exit without destroying its state. Consider a coroutine like this:

 def cofoo(bar):
      qux = yield bar  # yield marks a break point
      return qux

When you run it, that means

  1. allocate stack space for bar and qux
  2. recursively execute the first statement and jump to the next statement
    1. once at a yield, push its value to the calling stack but store the stack and instruction pointer
    2. once calling into yield, restore stack and instruction pointer and push arguments to qux
  3. once at a return, push its value to the calling stack
  4. clear the stack (1.) and instruction pointer (2.)

Note the addition of 2.1 and 2.2 – a coroutine can be suspended and resumed at predefined points. This is similar to how a subroutine is suspended during calling another subroutine. The difference is that the active coroutine is not strictly bound to its calling stack. Instead, a suspended coroutine is part of a separate, isolated stack.

root -\
  :    \- cofoo --\
  :/--<+--yield --/
  |    :
  V    :

This means that suspended coroutines can be freely stored or moved between stacks. Any call stack that has access to a coroutine can decide to resume it.
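As a minimal sketch, a suspended coroutine (here a plain generator) is just an object; any code holding a reference may resume it:

def cofoo(bar):
    qux = yield bar
    return qux

coro = cofoo(3)        # nothing runs yet
print(next(coro))      # runs to the yield, prints 3
stored = [coro]        # store the suspended coroutine anywhere

try:
    stored.pop().send(5)   # any holder may resume it
except StopIteration as e:
    print(e.value)         # 5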

1.3. Traversing the call stack

So far, our coroutine only goes down the call stack with yield. A subroutine can go down and up the call stack with return and (). For completeness, coroutines also need a mechanism to go up the call stack. Consider a coroutine like this:

def wrap():
    yield 'before'
    yield from cofoo()
    yield 'after'

When you run it, that means it still allocates the stack and instruction pointer like a subroutine. When it suspends, that still is like storing a subroutine.

However, yield from does both. It suspends stack and instruction pointer of wrap and runs cofoo. Note that wrap stays suspended until cofoo finishes completely. Whenever cofoo suspends or something is sent, cofoo is directly connected to the calling stack.
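A quick demonstration, assuming a no-argument variant of cofoo so wrap() can call it directly:

def cofoo():
    qux = yield 'inner'
    return qux

def wrap():
    yield 'before'
    yield from cofoo()
    yield 'after'

w = wrap()
print(next(w))        # 'before'
print(next(w))        # 'inner' - comes from cofoo through wrap
print(w.send('hi'))   # cofoo returns 'hi'; wrap resumes and yields 'after'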

1.4. Coroutines all the way down

As established, yield from allows connecting two scopes across another intermediate one. When applied recursively, that means the top of the stack can be connected to the bottom of the stack.

root -\
  :    \-> coro_a -yield-from-> coro_b --\
  :/ <-+------------------------yield ---/
  |    :
  :\ --+-- coro_a.send----------yield ---\
  :                             coro_b <-/

Note that root and coro_b do not know about each other. This makes coroutines much cleaner than callbacks: coroutines are still built on a 1:1 relation like subroutines. Coroutines suspend and resume their entire existing execution stack up until a regular call point.

Notably, root could have an arbitrary number of coroutines to resume. Yet, it can never resume more than one at the same time. Coroutines of the same root are concurrent but not parallel!

1.5. Python’s async and await

The explanation has so far explicitly used the yield and yield from vocabulary of generators – the underlying functionality is the same. The new Python 3.5 syntax async and await exists mainly for clarity.

def foo():  # subroutine?
     return None

def foo():  # coroutine?
     yield from foofoo()  # generator? coroutine?

async def foo():  # coroutine!
     await foofoo()  # coroutine!
     return None

The async for and async with statements are needed because you would break the yield from/await chain with the bare for and with statements.

2. Anatomy of a simple event loop

By itself, a coroutine has no concept of yielding control to another coroutine. It can only yield control to the caller at the bottom of a coroutine stack. This caller can then switch to another coroutine and run it.

This root node of several coroutines is commonly an event loop: on suspension, a coroutine yields an event on which it wants resume. In turn, the event loop is capable of efficiently waiting for these events to occur. This allows it to decide which coroutine to run next, or how to wait before resuming.

Such a design implies that there is a set of pre-defined events that the loop understands. Several coroutines await each other, until finally an event is awaited. This event can communicate directly with the event loop by yielding control.

loop -\
  :    \-> coroutine --await--> event --\
  :/ <-+----------------------- yield --/
  |    :
  |    :  # loop waits for event to happen
  |    :
  :\ --+-- send(reply) -------- yield --\
  :        coroutine <--yield-- event <-/

The key is that coroutine suspension allows the event loop and events to directly communicate. The intermediate coroutine stack does not require any knowledge about which loop is running it, nor how events work.

2.1.1. Events in time

The simplest event to handle is reaching a point in time. This is a fundamental block of threaded code as well: a thread repeatedly sleeps until a condition is true. However, a regular sleep blocks execution by itself – we want other coroutines to not be blocked. Instead, we want to tell the event loop when it should resume the current coroutine stack.

2.1.2. Defining an Event

An event is simply a value we can identify – be it via an enum, a type or other identity. We can define this with a simple class that stores our target time. In addition to storing the event information, we can allow awaiting an instance of the class directly.

class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until: float):
        self.until = until

    # used whenever someone ``await``s an instance of this Event
    def __await__(self):
        # yield this Event to the loop
        yield self
    
    def __repr__(self):
        return '%s(until=%.1f)' % (self.__class__.__name__, self.until)

This class only stores the event – it does not say how to actually handle it.

The only special feature is __await__ – it is what the await keyword looks for. Practically, it is an iterator, but one not available to the regular iteration machinery.

2.2.1. Awaiting an event

Now that we have an event, how do coroutines react to it? We should be able to express the equivalent of sleep by awaiting our event. To better see what is going on, we wait twice for half the time:

import time

async def asleep(duration: float):
    """await that ``duration`` seconds pass"""
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)

We can directly instantiate and run this coroutine. Similar to a generator, using coroutine.send runs the coroutine until it yields a result.

coroutine = asleep(100)
while True:
    print(coroutine.send(None))
    time.sleep(0.1)

This gives us two AsyncSleep events and then a StopIteration when the coroutine is done. Notice that the only delay is from time.sleep in the loop! Each AsyncSleep only stores an offset from the current time.

2.2.2. Event + Sleep

At this point, we have two separate mechanisms at our disposal:

  • AsyncSleep Events that can be yielded from inside a coroutine
  • time.sleep that can wait without impacting coroutines

Notably, these two are orthogonal: neither one affects or triggers the other. As a result, we can come up with our own strategy to sleep to meet the delay of an AsyncSleep.

2.3. A naive event loop

If we have several coroutines, each can tell us when it wants to be woken up. We can then wait until the first of them wants to be resumed, then for the one after, and so on. Notably, at each point we only care about which one is next.

This makes for a straightforward scheduling:

  1. sort coroutines by their desired wake up time
  2. pick the first that wants to wake up
  3. wait until this point in time
  4. run this coroutine
  5. repeat from 1.

A trivial implementation does not need any advanced concepts. A list allows sorting coroutines by their wake-up time. Waiting is a regular time.sleep. Running coroutines works just like before with coroutine.send.

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    # store wake-up-time and coroutines
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting:
        # 2. pick the first coroutine that wants to wake up
        until, coroutine = waiting.pop(0)
        # 3. wait until this point in time
        time.sleep(max(0.0, until - time.time()))
        # 4. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])

Of course, this has ample room for improvement. We can use a heap for the wait queue or a dispatch table for events. We could also fetch return values from the StopIteration and assign them to the coroutine. However, the fundamental principle remains the same.
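For instance, a minimal sketch of the heap-based wait queue, assuming the same AsyncSleep events as above (run_heap is an illustrative name):

import heapq, itertools, time

def run_heap(*coroutines):
    """Variant of ``run`` using a heap instead of re-sorting a list"""
    counter = itertools.count()  # tie-breaker: coroutines are not comparable
    waiting = [(0, next(counter), coroutine) for coroutine in coroutines]
    heapq.heapify(waiting)
    while waiting:
        until, _, coroutine = heapq.heappop(waiting)
        time.sleep(max(0.0, until - time.time()))
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        if isinstance(command, AsyncSleep):
            heapq.heappush(waiting, (command.until, next(counter), coroutine))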

2.4. Cooperative Waiting

The AsyncSleep event and run event loop are a fully working implementation of timed events.

async def sleepy(identifier: str = "coroutine", count=5):
    for i in range(count):
        print(identifier, 'step', i + 1, 'at %.2f' % time.time())
        await asleep(0.1)

run(*(sleepy("coroutine %d" % j) for j in range(5)))

This cooperatively switches between each of the five coroutines, suspending each for 0.1 seconds. Even though the event loop is synchronous, it still executes the work in 0.5 seconds instead of 2.5 seconds. Each coroutine holds state and acts independently.

3. I/O event loop

An event loop that supports sleep is suitable for polling. However, waiting for I/O on a file handle can be done more efficiently: the operating system implements I/O and thus knows which handles are ready. Ideally, an event loop should support an explicit “ready for I/O” event.

3.1. The select call

Python already has an interface to query the OS for ready I/O handles. When called with handles to read or write, it returns the handles ready to read or write:

readable, writeable, _ = select.select(rlist, wlist, xlist, timeout)

For example, we can open a file for writing and wait for it to be ready:

write_target = open('/tmp/foo', 'w')
readable, writeable, _ = select.select([], [write_target], [])

Once select returns, writeable contains our open file.

3.2. Basic I/O event

Similar to the AsyncSleep request, we need to define an event for I/O. With the underlying select logic, the event must refer to a readable object – say an open file. In addition, we store how much data to read.

class AsyncRead:
    def __init__(self, file, amount=1):
        self.file = file
        self.amount = amount
        self._buffer = ''

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self
            # we only get here if ``read`` should not block
            self._buffer += self.file.read(1)
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.file, self.amount, len(self._buffer)
        )

As with AsyncSleep we mostly just store the data required for the underlying system call. This time, __await__ is capable of being resumed multiple times – until our desired amount has been read. In addition, we return the I/O result instead of just resuming.

3.3. Augmenting an event loop with read I/O

The basis for our event loop is still the run defined previously. First, we need to track the read requests. This is no longer a sorted schedule; we only map read requests to coroutines.

# new
waiting_read = {}  # type: Dict[file, coroutine]

Since select.select takes a timeout parameter, we can use it in place of time.sleep.

# old
time.sleep(max(0.0, until - time.time()))
# new
readable, _, _ = select.select(list(waiting_read), [], [])

This gives us all readable files – if there are any, we run the corresponding coroutine. If there are none, we have waited long enough for our current coroutine to run.

# new - reschedule waiting coroutine, run readable coroutine
if readable:
    waiting.append((until, coroutine))
    waiting.sort()
    coroutine = waiting_read[readable[0]]

Finally, we have to actually listen for read requests.

# new
if isinstance(command, AsyncSleep):
    ...
elif isinstance(command, AsyncRead):
    ...

3.4. Putting it together

The above was a bit of a simplification. We need to do some switching to not starve sleeping coroutines if we can always read. We need to handle having nothing to read or nothing to wait for. However, the end result still fits into 30 LOC.

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    waiting_read = {}  # type: Dict[file, coroutine]
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting or waiting_read:
        # 2. wait until the next coroutine may run or read ...
        try:
            until, coroutine = waiting.pop(0)
        except IndexError:
            until, coroutine = float('inf'), None
            readable, _, _ = select.select(list(waiting_read), [], [])
        else:
            readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
        # ... and select the appropriate one
        if readable and time.time() < until:
            if until and coroutine:
                waiting.append((until, coroutine))
                waiting.sort()
            coroutine = waiting_read.pop(readable[0])
        # 3. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension ...
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
        # ... or register reads
        elif isinstance(command, AsyncRead):
            waiting_read[command.file] = coroutine

3.5. Cooperative I/O

The AsyncSleep, AsyncRead and run implementations are now fully functional to sleep and/or read. Same as for sleepy, we can define a helper to test reading:

async def ready(path, amount=1024*32):
    print('read', path, 'at', '%d' % time.time())
    with open(path, 'rb') as file:
        result = await AsyncRead(file, amount)
    print('done', path, 'at', '%d' % time.time())
    print('got', len(result), 'B')

run(sleepy('background', 5), ready('/dev/urandom'))

Running this, we can see that our I/O is interleaved with the waiting task:

id background round 1
read /dev/urandom at 1530721148
id background round 2
id background round 3
id background round 4
id background round 5
done /dev/urandom at 1530721148
got 1024 B

4. Non-Blocking I/O

While I/O on files gets the concept across, it is not really suitable for a library like asyncio: the select call always returns for files, and both open and read may block indefinitely. This blocks all coroutines of an event loop – which is bad. Libraries like aiofiles use threads and synchronization to fake non-blocking I/O and events on files.

However, sockets do allow for non-blocking I/O – and their inherent latency makes it much more critical. When used in an event loop, waiting for data and retrying can be wrapped without blocking anything.

4.1. Non-Blocking I/O event

Similar to our AsyncRead, we can define a suspend-and-read event for sockets. Instead of taking a file, we take a socket – which must be non-blocking. Also, our __await__ uses socket.recv instead of file.read.

class AsyncRecv:
    def __init__(self, connection, amount=1, read_buffer=1024):
        assert not connection.getblocking(), 'connection must be non-blocking for async recv'
        self.connection = connection
        self.amount = amount
        self.read_buffer = read_buffer
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self.connection.recv(self.read_buffer)
            except BlockingIOError:
                yield self
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.connection, self.amount, len(self._buffer)
        )

In contrast to AsyncRead, __await__ performs truly non-blocking I/O. When data is available, it always reads. When no data is available, it always suspends. That means the event loop is only blocked while we perform useful work.
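As a quick check, a hedged sketch driving AsyncRecv with the run loop from above; socket.socketpair gives us two connected sockets without a server:

import socket

async def recv_demo():
    a, b = socket.socketpair()   # two connected local sockets
    b.setblocking(False)         # AsyncRecv asserts non-blocking mode
    a.sendall(b'hello')
    print(await AsyncRecv(b, amount=5))

run(recv_demo())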

4.2. Un-Blocking the event loop

As far as the event loop is concerned, nothing changes much. The event to listen for is still the same as for files – a file descriptor marked ready by select.

# old
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
# new
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
elif isinstance(command, AsyncRecv):
    waiting_read[command.connection] = coroutine

At this point, it should be obvious that AsyncRead and AsyncRecv are the same kind of event. We could easily refactor them to be one event with an exchangeable I/O component. In effect, the event loop, coroutines and events cleanly separate a scheduler, arbitrary intermediate code and the actual I/O.

4.3. The ugly side of non-blocking I/O

In principle, what you should do at this point is replicate the logic of read as a recv for AsyncRecv. However, this is much uglier now – you have to handle early returns whenever a function would block inside the kernel and instead yields control to you. For example, opening a connection takes much more code than opening a file:

# file
file = open(path, 'rb')
# non-blocking socket
connection = socket.socket()
connection.setblocking(False)
# open without blocking - retry on failure
try:
    connection.connect((url, port))
except BlockingIOError:
    pass

Long story short, what remains is a few dozen lines of Exception handling. The events and event loop already work at this point.

id background round 1
read localhost:25000 at 1530783569
read /dev/urandom at 1530783569
done localhost:25000 at 1530783569 got 32768 B
id background round 2
id background round 3
id background round 4
done /dev/urandom at 1530783569 got 4096 B
id background round 5

Addendum

Example code at github


回答 2

从概念上讲,您对 coro 的脱糖是正确的,但略有不完整。

await 不会无条件地挂起,只有在遇到阻塞调用时才挂起。它如何知道某个调用会阻塞?这由被 await 的代码决定。例如,套接字读取的一个可等待实现可以脱糖为:

def read(sock, n):
    # sock must be in non-blocking mode
    try:
        return sock.recv(n)
    except EWOULDBLOCK:
        event_loop.add_reader(sock.fileno, current_task())
        return SUSPEND

在真实的 asyncio 中,等效的代码会修改一个 Future 的状态,而不是返回魔术值,但概念是相同的。适当地适配成类似生成器的对象后,以上代码就可以被 await 了。

在呼叫方,当协程包含:

data = await read(sock, 1024)

它大致脱糖为:

data = read(sock, 1024)
if data is SUSPEND:
    return SUSPEND
self.pos += 1
self.parts[self.pos](...)

熟悉生成器的人往往会用 yield from 来描述上述过程,它会自动完成挂起。

挂起链一直向上延续到事件循环,它注意到协程已挂起,将其从可运行集合中删除,然后继续执行可运行的协程(如果有)。如果没有协程可运行,则循环在 select() 中等待,直到某个协程感兴趣的文件描述符准备好进行 IO。(事件循环维护着文件描述符到协程的映射。)

在上面的示例中,一旦select()告知事件循环sock可读,它将重新添加coro到可运行集,因此将从暂停点继续执行。

换一种说法:

  1. 默认情况下,所有操作都在同一线程中发生。

  2. 事件循环负责安排协程,并在协程等待(通常会阻塞或超时的IO调用)准备就绪时将其唤醒。

为了深入了解协程驱动事件循环,我推荐Dave Beazley的演讲,他在现场观众面前演示了从头开始编写事件循环的过程。

Your coro desugaring is conceptually correct, but slightly incomplete.

await doesn’t suspend unconditionally, but only if it encounters a blocking call. How does it know that a call is blocking? This is decided by the code being awaited. For example, an awaitable implementation of socket read could be desugared to:

def read(sock, n):
    # sock must be in non-blocking mode
    try:
        return sock.recv(n)
    except EWOULDBLOCK:
        event_loop.add_reader(sock.fileno, current_task())
        return SUSPEND

In real asyncio the equivalent code modifies the state of a Future instead of returning magic values, but the concept is the same. When appropriately adapted to a generator-like object, the above code can be awaited.

On the caller side, when your coroutine contains:

data = await read(sock, 1024)

It desugars into something close to:

data = read(sock, 1024)
if data is SUSPEND:
    return SUSPEND
self.pos += 1
self.parts[self.pos](...)

People familiar with generators tend to describe the above in terms of yield from which does the suspension automatically.
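A hedged sketch of that generator view: the magic SUSPEND value is replaced by yielding until the socket is readable, and the real BlockingIOError stands in for the pseudocode's EWOULDBLOCK ('wait_read' is an illustrative event tag, not an asyncio API):

def read(sock, n):
    # sock must be in non-blocking mode
    while True:
        try:
            return sock.recv(n)
        except BlockingIOError:
            yield ('wait_read', sock)   # suspend; the loop resumes us later

def handler(sock):
    data = yield from read(sock, 1024)  # suspension propagates automatically
    return data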

The suspension chain continues all the way up to the event loop, which notices that the coroutine is suspended, removes it from the runnable set, and goes on to execute coroutines that are runnable, if any. If no coroutines are runnable, the loop waits in select() until a file descriptor a coroutine is interested in becomes ready for IO. (The event loop maintains a file-descriptor-to-coroutine mapping.)

In the above example, once select() tells the event loop that sock is readable, it will re-add coro to the runnable set, so it will be continued from the point of suspension.

In other words:

  1. Everything happens in the same thread by default.

  2. The event loop is responsible for scheduling the coroutines and waking them up when whatever they were waiting for (typically an IO call that would normally block, or a timeout) becomes ready.

For insight on coroutine-driving event loops, I recommend this talk by Dave Beazley, where he demonstrates coding an event loop from scratch in front of live audience.


回答 3

归结为 asyncio 要解决的两个主要挑战:

  • 如何在单个线程中执行多个I / O?
  • 如何实现协作式多任务处理?

关于第一点的答案已经存在了很长一段时间,被称为选择循环。在python中,它是在选择器模块中实现的。

第二个问题与协程的概念有关,即可以停止执行并在以后恢复的函数。在 Python 中,协程是使用生成器和 yield from 语句实现的。这就是隐藏在 async/await 语法后面的东西。

这个答案中有更多资源。


编辑:解决您对goroutines的评论:

在 asyncio 中,与 goroutine 最接近的等效物实际上不是协程,而是任务(请参阅文档中的区别)。在 Python 中,协程(或生成器)对事件循环或 I/O 的概念一无所知。它只是一个可以用 yield 停止执行、同时保持其当前状态的函数,因此可以在以后恢复。yield from 语法允许以透明的方式将它们链接起来。

现在,在 asyncio 的任务中,位于链最底部的协程最终总是产生一个 future。然后,这个 future 会向上冒泡到事件循环,并被整合进内部机制。当 future 被其他内部回调设置为完成时,事件循环可以通过把 future 发送回协程链来恢复任务。


编辑:解决您帖子中的一些问题:

在这种情况下,I / O实际如何发生?在单独的线程中?整个解释器是否已暂停并且I / O在解释器外部进行?

不,线程中什么也没有发生。I/O 始终由事件循环管理,主要是通过文件描述符。但是,这些文件描述符的注册通常被高级协程隐藏起来,替您完成了脏活。

I / O到底是什么意思?如果我的python过程称为C open()过程,然后它向内核发送了中断,放弃了对它的控制,那么Python解释器如何知道这一点并能够继续运行其他代码,而内核代码则执行实际的I / O,直到唤醒原来发送中断的Python过程?原则上,Python解释器如何知道这种情况?

I/O 指任何阻塞调用。在 asyncio 中,所有 I/O 操作都应经过事件循环,因为正如您所说,事件循环无法知道某段同步代码中正在执行阻塞调用。这意味着您不应该在协程的上下文中使用同步的 open。相反,请使用像 aiofiles 这样的专用库,它提供了 open 的异步版本。

It all boils down to the two main challenges that asyncio is addressing:

  • How to perform multiple I/O in a single thread?
  • How to implement cooperative multitasking?

The answer to the first point has been around for a long while and is called a select loop. In python, it is implemented in the selectors module.
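For illustration, a minimal polling loop built on selectors might look like this (the server setup is illustrative, not from the original answer):

import selectors
import socket

sel = selectors.DefaultSelector()

server = socket.socket()
server.bind(('localhost', 0))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, data='accept')

while True:  # runs forever; a real loop would have an exit condition
    for key, events in sel.select(timeout=1.0):
        if key.data == 'accept':
            conn, _ = key.fileobj.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ, data='read')
        else:
            print(key.fileobj.recv(1024))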

The second question is related to the concept of coroutine, i.e. functions that can stop their execution and be restored later on. In python, coroutines are implemented using generators and the yield from statement. That’s what is hiding behind the async/await syntax.

More resources in this answer.


EDIT: Addressing your comment about goroutines:

The closest equivalent to a goroutine in asyncio is actually not a coroutine but a task (see the difference in the documentation). In python, a coroutine (or a generator) knows nothing about the concepts of event loop or I/O. It simply is a function that can stop its execution using yield while keeping its current state, so it can be restored later on. The yield from syntax allows for chaining them in a transparent way.

Now, within an asyncio task, the coroutine at the very bottom of the chain always ends up yielding a future. This future then bubbles up to the event loop, and gets integrated into the inner machinery. When the future is set to done by some other inner callback, the event loop can restore the task by sending the future back into the coroutine chain.


EDIT: Addressing some of the questions in your post:

How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended and I/O happens outside the interpreter?

No, nothing happens in a thread. I/O is always managed by the event loop, mostly through file descriptors. However, the registration of those file descriptors is usually hidden by high-level coroutines, doing the dirty work for you.

What exactly is meant by I/O? If my python procedure called C open() procedure, and it in turn sent interrupt to kernel, relinquishing control to it, how does Python interpreter know about this and is able to continue running some other code, while kernel code does the actual I/O and until it wakes up the Python procedure which sent the interrupt originally? How can Python interpreter in principle, be aware of this happening?

An I/O is any blocking call. In asyncio, all the I/O operations should go through the event loop, because as you said, the event loop has no way to be aware that a blocking call is being performed in some synchronous code. That means you're not supposed to use a synchronous open within the context of a coroutine. Instead, use a dedicated library such as aiofiles which provides an asynchronous version of open.
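For example, a minimal sketch with aiofiles (assuming it is installed; the path is illustrative):

import asyncio
import aiofiles

async def main():
    async with aiofiles.open('/etc/hostname') as f:  # non-blocking open
        print(await f.read())                        # read happens off-loop

asyncio.run(main())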


virtualenvwrapper和Python 3

问题:virtualenvwrapper和Python 3

我在ubuntu lucid上安装了python 3.3.1并成功创建了virtualenv,如下所示

virtualenv envpy331 --python=/usr/local/bin/python3.3

这在我的主目录下创建了一个文件夹 envpy331。

我也安装了 virtualenvwrapper。但是文档中只支持 Python 2.4-2.7 版本。有人试过用它管理 python3 的 virtualenv 吗?如果有,您能告诉我怎么做吗?

I installed python 3.3.1 on ubuntu lucid and successfully created a virtualenv as below

virtualenv envpy331 --python=/usr/local/bin/python3.3

this created a folder envpy331 on my home dir.

I also have virtualenvwrapper installed. But in the docs only versions 2.4-2.7 of Python are supported. Has anyone tried to organize the python3 virtualenv? If so, can you tell me how?


回答 0

virtualenvwrapper 的最新版本是在 Python 3.2 下测试的。它很有可能也适用于 Python 3.3。

The latest version of virtualenvwrapper is tested under Python3.2. Chances are good it will work with Python3.3 too.


回答 1

如果您已经安装了python3以及virtualenvwrapper,那么在虚拟环境中使用python3的唯一操作就是使用以下命令创建环境:

which python3 #Output: /usr/bin/python3
mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

或者,(至少在使用brew的OSX上):

mkvirtualenv --python=`which python3` nameOfEnvironment

开始使用该环境后,您会看到只要键入 python,就会使用 python3。

If you already have python3 installed as well virtualenvwrapper the only thing you would need to do to use python3 with the virtual environment is creating an environment using:

which python3 #Output: /usr/bin/python3
mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

Or, (at least on OSX using brew):

mkvirtualenv --python=`which python3` nameOfEnvironment

Start using the environment and you’ll see that as soon as you type python you’ll start using python3


回答 2

您可以让 virtualenvwrapper 使用自定义的 Python 二进制文件,而不是运行 virtualenvwrapper 时所用的那个。为此,您需要使用 virtualenv 所读取的 VIRTUALENV_PYTHON 变量:

$ export VIRTUALENV_PYTHON=/usr/bin/python3
$ mkvirtualenv -a myproject myenv
Running virtualenv with interpreter /usr/bin/python3
New python executable in myenv/bin/python3
Also creating executable in myenv/bin/python
(myenv)$ python
Python 3.2.3 (default, Oct 19 2012, 19:53:16) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

You can make virtualenvwrapper use a custom Python binary instead of the one virtualenvwrapper is run with. To do that you need to use VIRTUALENV_PYTHON variable which is utilized by virtualenv:

$ export VIRTUALENV_PYTHON=/usr/bin/python3
$ mkvirtualenv -a myproject myenv
Running virtualenv with interpreter /usr/bin/python3
New python executable in myenv/bin/python3
Also creating executable in myenv/bin/python
(myenv)$ python
Python 3.2.3 (default, Oct 19 2012, 19:53:16) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

回答 3

virtualenvwrapper现在允许您指定不带路径的python可执行文件。

因此(至少在OSX上)mkvirtualenv --python=python3 nameOfEnvironment就足够了。

virtualenvwrapper now lets you specify the python executable without the path.

So (on OSX at least) mkvirtualenv --python=python3 nameOfEnvironment will suffice.


回答 4

在 Ubuntu 上,使用 mkvirtualenv -p python3 env_name 即可用 python3 加载 virtualenv。

在环境内部,用于python --version验证。

On Ubuntu, using mkvirtualenv -p python3 env_name loads the virtualenv with python3.

Inside the env, use python --version to verify.


回答 5

您可以将其添加到您的.bash_profile或类似文件中:

alias mkvirtualenv3='mkvirtualenv --python=`which python3`'

然后在要创建python 3环境时使用mkvirtualenv3代替mkvirtualenv

You can add this to your .bash_profile or similar:

alias mkvirtualenv3='mkvirtualenv --python=`which python3`'

Then use mkvirtualenv3 instead of mkvirtualenv when you want to create a python 3 environment.


回答 6

我发现运行

export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3

和

export VIRTUALENVWRAPPER_VIRTUALENV=/usr/bin/virtualenv-3.4

这两条命令,可以在 Ubuntu 的命令行中强制 mkvirtualenv 使用 python3 和 virtualenv-3.4。之后仍然需要执行

mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

来创建环境。这里假设您的 python3 位于 /usr/bin/python3,virtualenv-3.4 位于 /usr/local/bin/virtualenv-3.4。

I find that running

export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3

and

export VIRTUALENVWRAPPER_VIRTUALENV=/usr/bin/virtualenv-3.4

in the command line on Ubuntu forces mkvirtualenv to use python3 and virtualenv-3.4. One still has to do

mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

to create the environment. This is assuming that you have python3 in /usr/bin/python3 and virtualenv-3.4 in /usr/local/bin/virtualenv-3.4.


回答 7

virtualenvwrapper 的 bitbucket 问题跟踪器上的这篇文章可能很有趣。那里提到,virtualenvwrapper 的大多数功能都适用于 Python 3.3 中的 venv 虚拟环境。

This post on the bitbucket issue tracker of virtualenvwrapper may be of interest. It is mentioned there that most of virtualenvwrapper’s functions work with the venv virtual environments in Python 3.3.


回答 8

我像这样把 export VIRTUALENV_PYTHON=/usr/bin/python3 添加到我的 ~/.bashrc:

export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENV_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

然后运行 source .bashrc

之后您可以为每个新环境指定 python 版本:mkvirtualenv --python=python2 env_name

I added export VIRTUALENV_PYTHON=/usr/bin/python3 to my ~/.bashrc like this:

export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENV_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

then run source .bashrc

and you can specify the python version for each new env mkvirtualenv --python=python2 env_name


Django模型“未声明显式的app_label”

问题:Django模型“未声明显式的app_label”

我已经束手无策了。经过十多个小时(可能更多)的故障排除,我以为终于搞定了,但是后来我得到了:

Model class django.contrib.contenttypes.models.ContentType doesn't declare an explicit app_label 

网络上关于此问题的信息非常少,而且现有的解决方案都没有解决我的问题。任何建议将不胜感激。

我正在使用Python 3.4和Django 1.10。

在我的settings.py中:

INSTALLED_APPS = [
    'DeleteNote.apps.DeletenoteConfig',
    'LibrarySync.apps.LibrarysyncConfig',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]

我的apps.py文件如下所示:

from django.apps import AppConfig


class DeletenoteConfig(AppConfig):
    name = 'DeleteNote'

from django.apps import AppConfig


class LibrarysyncConfig(AppConfig):
    name = 'LibrarySync'

I’m at wit’s end. After a dozen hours of troubleshooting, probably more, I thought I was finally in business, but then I got:

Model class django.contrib.contenttypes.models.ContentType doesn't declare an explicit app_label 

There is SO LITTLE info on this on the web, and no solution out there has resolved my issue. Any advice would be tremendously appreciated.

I’m using Python 3.4 and Django 1.10.

From my settings.py:

INSTALLED_APPS = [
    'DeleteNote.apps.DeletenoteConfig',
    'LibrarySync.apps.LibrarysyncConfig',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]

And my apps.py files look like this:

from django.apps import AppConfig


class DeletenoteConfig(AppConfig):
    name = 'DeleteNote'

and

from django.apps import AppConfig


class LibrarysyncConfig(AppConfig):
    name = 'LibrarySync'

回答 0

您是不是忘了把应用程序的名称写进设置文件?myAppNameConfig 是 manage.py startapp myAppName 命令在 apps.py 中生成的默认类,其中 myAppName 是您的应用的名称。

settings.py

INSTALLED_APPS = [
'myAppName.apps.myAppNameConfig',
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
]

这样,设置文件就能知道您的应用叫什么。之后,您可以通过在 apps.py 文件中添加以下代码来更改它的显示名称:

myAppName / apps.py

class myAppNameConfig(AppConfig):
    name = 'myAppName'
    verbose_name = 'A Much Better Name'

Are you missing putting your application name into the settings file? The myAppNameConfig is the default class generated at apps.py by the manage.py startapp myAppName command, where myAppName is the name of your app.

settings.py

INSTALLED_APPS = [
'myAppName.apps.myAppNameConfig',
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
]

This way, the settings file finds out what you want to call your application. You can change how it looks later by adding the following code in the apps.py file:

myAppName/apps.py

class myAppNameConfig(AppConfig):
    name = 'myAppName'
    verbose_name = 'A Much Better Name'

回答 1

我遇到了相同的错误,而且不知道如何解决。我花了很多时间才注意到,我在与 django 的 manage.py 相同的目录下有一个 __init__.py 文件。

之前:

|-- myproject
  |-- __init__.py
  |-- manage.py
  |-- myproject
    |-- ...
  |-- app1
    |-- models.py
  |-- app2
    |-- models.py

后:

|-- myproject
  |-- manage.py
  |-- myproject
    |-- ...
  |-- app1
    |-- models.py
  |-- app2
    |-- models.py

收到这个“未声明显式 app_label”错误让人非常困惑,但删除这个 __init__.py 文件解决了我的问题。

I got the same error and I didn't know how to figure out the problem. It took me many hours to notice that I had an __init__.py in the same directory as django's manage.py.

Before:

|-- myproject
  |-- __init__.py
  |-- manage.py
  |-- myproject
    |-- ...
  |-- app1
    |-- models.py
  |-- app2
    |-- models.py

After:

|-- myproject
  |-- manage.py
  |-- myproject
    |-- ...
  |-- app1
    |-- models.py
  |-- app2
    |-- models.py

It is quite confusing that you get this "doesn't declare an explicit app_label" error. But deleting this __init__.py file solved my problem.


回答 2

使用PyCharm运行测试时,我遇到了完全相同的错误。我已经通过显式设置DJANGO_SETTINGS_MODULE环境变量来修复它。如果您使用的是PyCharm,只需点击编辑配置按钮,然后选择环境变量

将变量设置为your_project_name.settings,这应该可以解决问题。

发生此错误似乎是因为 PyCharm 用它自己的 manage.py 运行测试。

I had exactly the same error when running tests with PyCharm. I’ve fixed it by explicitly setting DJANGO_SETTINGS_MODULE environment variable. If you’re using PyCharm, just hit Edit Configurations button and choose Environment Variables.

Set the variable to your_project_name.settings and that should fix the thing.

It seems like this error occurs, because PyCharm runs tests with its own manage.py.
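For reference, the same fix can be applied in code before anything imports models; your_project_name is a placeholder for your actual settings package:

import os
import django

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "your_project_name.settings")
django.setup()  # must run before importing models outside manage.py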


回答 3

我在使用 ./manage.py shell 时遇到了这个错误,当时我不小心从项目根目录一级导入了模块

# don't do this
from project.someapp.someModule import something_using_a_model
# do this
from someapp.someModule import something_using_a_model

something_using_a_model()

I got this one when I used ./manage.py shell and then accidentally imported from the root project-level directory

# don't do this
from project.someapp.someModule import something_using_a_model
# do this
from someapp.someModule import something_using_a_model

something_using_a_model()

回答 4

作为使用 Python 3 的新手,我发现这可能是导入错误而不是 Django 错误

错误:

from someModule import someClass

对:

from .someModule import someClass

这种情况发生在几天前,但我真的无法复现它……我认为只有刚接触 Django 的人才可能遇到。以下是我记得的情况:

尝试在admin.py中注册模型:

from django.contrib import admin
from user import User
admin.site.register(User)

尝试运行服务器,错误看起来像这样

some lines...
File "/path/to/admin.py" ,line 6
tell you there is an import error
some lines...
Model class django.contrib.contenttypes.models.ContentType doesn't declare an explicit app_label

将 user 改为 .user,问题解决了

As a noob using Python 3, I found it might be an import error instead of a Django error.

wrong:

from someModule import someClass

right:

from .someModule import someClass

This happened a few days ago but I really can't reproduce it… I think only people new to Django may encounter this. Here's what I remember:

try to register a model in admin.py:

from django.contrib import admin
from user import User
admin.site.register(User)

try to run the server; the error looks like this

some lines...
File "/path/to/admin.py" ,line 6
tell you there is an import error
some lines...
Model class django.contrib.contenttypes.models.ContentType doesn't declare an explicit app_label

change user to .user ,problem solved


回答 5

我刚才有同样的问题。我通过在应用程序名称上添加命名空间来修复我的问题。希望有人觉得这有帮助。

apps.py

from django.apps import AppConfig    

class SalesClientConfig(AppConfig):
        name = 'portal.sales_client'
        verbose_name = 'Sales Client'

I had the same problem just now. I've fixed mine by adding a namespace to the app name. Hope someone finds this helpful.

apps.py

from django.apps import AppConfig    

class SalesClientConfig(AppConfig):
        name = 'portal.sales_client'
        verbose_name = 'Sales Client'

回答 6

我在测试中导入模型时遇到此错误,即鉴于此Django项目结构:

|-- myproject
    |-- manage.py
    |-- myproject
    |-- myapp
        |-- models.py  # defines model: MyModel
        |-- tests
            |-- test_models.py

test_models.pyMyModel以这种方式导入的文件中:

from models import MyModel

以下面这种方式导入后,问题就解决了:

from myapp.models import MyModel

希望这可以帮助!

PS:也许这有点晚了,但是我没有在其他答案中找到如何解决我代码中这个问题的方法,所以我想分享我的解决方案。

I got this error on importing models in tests, i.e. given this Django project structure:

|-- myproject
    |-- manage.py
    |-- myproject
    |-- myapp
        |-- models.py  # defines model: MyModel
        |-- tests
            |-- test_models.py

in file test_models.py I imported MyModel in this way:

from models import MyModel

The problem was fixed when it was imported this way:

from myapp.models import MyModel

Hope this helps!

PS: Maybe this is a bit late, but I did not find in other answers how to solve this problem in my code and I want to share my solution.


回答 7

在反复遇到这个问题并不断回到这个问题之后,我想分享一下我的问题所在。

@Xeberdee 说的都是正确的,请照着做,看看是否能解决问题;如果不能,以下是我的问题:

在我的apps.py中,这就是我拥有的:

class AlgoExplainedConfig(AppConfig):
    name = 'algo_explained'
    verbose_name = "Explain_Algo"
    ....

我所做的就是在我的应用名称之前添加了项目名称,如下所示:

class AlgoExplainedConfig(AppConfig):
    name = 'algorithms_explained.algo_explained'
    verbose_name = "Explain_Algo"

这样就解决了我的问题,之后我就可以运行 makemigrations 和 migrate 命令了!祝好运

After repeatedly running into this issue and coming back to this question, I thought I'd share what my problem was.

Everything that @Xeberdee said is correct, so follow that and see if it solves the issue; if not, this was my issue:

In my apps.py this is what I had:

class AlgoExplainedConfig(AppConfig):
    name = 'algo_explained'
    verbose_name = "Explain_Algo"
    ....

And all I did was I added the project name in front of my app name like this:

class AlgoExplainedConfig(AppConfig):
    name = 'algorithms_explained.algo_explained'
    verbose_name = "Explain_Algo"

That solved my problem, and I was able to run the makemigrations and migrate commands after that! Good luck


回答 8

我今天在尝试运行 Django 测试时遇到此错误,因为我在其中一个文件中使用了 from .models import * 这种速记语法。问题是我的文件结构如下:

    apps/
      myapp/
        models/
          __init__.py
          foo.py
          bar.py

models/__init__.py我使用速记语法导入模型时:

    from .foo import *
    from .bar import *

在我的应用程序中,我正在导入如下模型:

    from myapp.models import Foo, Bar

这导致运行 ./manage.py test 时出现 Django 的 model doesn't declare an explicit app_label 错误。

要解决此问题,我必须在 models/__init__.py 中从完整路径显式导入:

    from myapp.models.foo import *
    from myapp.models.bar import *

那解决了错误。

H / t https://medium.com/@michal.bock/fix-weird-exceptions-when-running-django-tests-f58def71b59a

I had this error today trying to run Django tests because I was using the shorthand from .models import * syntax in one of my files. The issue was that I had a file structure like so:

    apps/
      myapp/
        models/
          __init__.py
          foo.py
          bar.py

and in models/__init__.py I was importing my models using the shorthand syntax:

    from .foo import *
    from .bar import *

In my application I was importing models like so:

    from myapp.models import Foo, Bar

This caused the Django model doesn't declare an explicit app_label error when running ./manage.py test.

To fix the problem, I had to explicitly import from the full path in models/__init__.py:

    from myapp.models.foo import *
    from myapp.models.bar import *

That took care of the error.

H/t https://medium.com/@michal.bock/fix-weird-exceptions-when-running-django-tests-f58def71b59a


回答 9

In my case, this was happening because I used relative module paths in the project-level urls.py, in INSTALLED_APPS, and in apps.py instead of rooting them in the project root, i.e. absolute module paths throughout rather than relative module paths plus hacks.

No matter how much I messed with the paths in INSTALLED_APPS and in my app's apps.py, I couldn't get both runserver and pytest to work until all three of those were rooted in the project root.

Folder structure:

|-- manage.py
|-- config
    |-- settings.py
    |-- urls.py
|-- biz_portal
    |-- apps
        |-- portal
            |-- models.py
            |-- urls.py
            |-- views.py
            |-- apps.py

With the following, I could run manage.py runserver and gunicorn with WSGI and use the portal app's views without trouble, but pytest would error with ModuleNotFoundError: No module named 'apps', despite DJANGO_SETTINGS_MODULE being configured correctly.

config/settings.py:

INSTALLED_APPS = [
    ...
    "apps.portal.apps.PortalConfig",
]

biz_portal/apps/portal/apps.py:

class PortalConfig(AppConfig):
    name = 'apps.portal'

config/urls.py:

urlpatterns = [
    path('', include('apps.portal.urls')),
    ...
]

Changing the app reference in config/settings.py to biz_portal.apps.portal.apps.PortalConfig and PortalConfig.name to biz_portal.apps.portal allowed pytest to run (I don't have tests for portal views yet), but runserver would error with:

RuntimeError: Model class apps.portal.models.Business doesn’t declare an explicit app_label and isn’t in an application in INSTALLED_APPS

Finally, I grepped for apps.portal to see what was still using a relative path, and found that config/urls.py should also use biz_portal.apps.portal.urls.
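
Putting the three corrections together, a sketch of the configuration rooted at the project root, derived from the paths above:

# config/settings.py
INSTALLED_APPS = [
    # ...
    "biz_portal.apps.portal.apps.PortalConfig",
]

# biz_portal/apps/portal/apps.py
from django.apps import AppConfig

class PortalConfig(AppConfig):
    name = 'biz_portal.apps.portal'

# config/urls.py
from django.urls import include, path

urlpatterns = [
    path('', include('biz_portal.apps.portal.urls')),
    # ...
]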


回答 10

I ran into this error when I tried generating migrations for a single app that had existing malformed migrations due to a git merge, e.g.:

manage.py makemigrations myapp

When I deleted its migrations and then ran:

manage.py makemigrations

the error did not occur and the migrations generated successfully.


回答 11

I had a similar issue, but I was able to solve mine by explicitly specifying the app_label using a Meta class in my model class:

class Meta:
    app_label = 'name_of_my_app'
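
In context, a minimal sketch of a model carrying that Meta; the model name and field are placeholders:

from django.db import models

class MyModel(models.Model):
    title = models.CharField(max_length=100)

    class Meta:
        app_label = 'name_of_my_app'  # explicitly ties the model to the app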

回答 12

I got this error while trying to upgrade my Django Rest Framework app to DRF 3.6.3 and Django 1.11.1.

For anyone else in this situation, I found my solution in a GitHub issue, which was to unset the UNAUTHENTICATED_USER setting in the DRF settings:

# webapp/settings.py
...
REST_FRAMEWORK = {
    ...
    'UNAUTHENTICATED_USER': None
    ...
}

回答 13

I just ran into this issue and figured out what was going wrong. Since no previous answer described the issue as it happened to me, I thought I would post it for others:

  • the issue came from running python manage.py startapp myApp from my project root folder, then moving myApp to a child folder with mv myApp myFolderWithApps/.
  • I wrote myApp.models and ran python manage.py makemigrations. All went well.
  • then I did the same with another app that was importing models from myApp. Kaboom! I ran into this error while performing makemigrations. That was because I had to use myFolderWithApps.myApp to reference my app, but I had forgotten to update myApp/apps.py. So I corrected myApp/apps.py, settings/INSTALLED_APPS, and the import path in my second app.
  • but then the error kept happening: the reason was that I had migrations trying to import the models from myApp with the wrong path. I tried to correct the migration file, but I got to the point where it was easier to reset the DB and delete the migrations to start from scratch.

To make a long story short: the issue initially came from the wrong app name in myApp's apps.py, in settings, and in the import path of my second app. But correcting the paths in those three places was not enough, as the migrations had been created with imports referencing the wrong app name. Therefore, the same error kept happening while migrating (except this time it came from the migrations).

So… check your migrations, and good luck!


回答 14

I got a similar error while building an API in Django REST framework.

RuntimeError: Model class apps.core.models.University doesn't declare an explicit app_label and isn't in an application in INSTALLED_APPS.

luke_aus’s answer helped me by correcting my urls.py

from

from project.apps.views import SurgeryView

to

from apps.views import SurgeryView

回答 15

In my case I got this error when porting code from Django 1.11.11 to Django 2.2. I was defining a custom FileSystemStorage-derived class. In Django 1.11.11 I had the following line in models.py:

from django.core.files.storage import Storage, DefaultStorage

and later in the file I had the class definition:

class MyFileStorage(FileSystemStorage):

However, in Django 2.2 I needed to explicitly reference the FileSystemStorage class when importing:

from django.core.files.storage import Storage, DefaultStorage, FileSystemStorage

and voilà, the error disappears.
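
A minimal sketch of the resulting class definition, with the base class now imported explicitly:

from django.core.files.storage import FileSystemStorage

class MyFileStorage(FileSystemStorage):
    # FileSystemStorage is now a resolvable name at class-definition time
    pass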

Note that everyone is reporting the last part of the error message spat out by the Django server. However, if you scroll up, you will find the real reason in the middle of that jumble of errors.


回答 16

In my case I was able to find a fix, and looking at everyone else's code it may be the same issue: I simply had to add 'django.contrib.sites' to the list of installed apps in the settings.py file.

Hope this helps someone. This is my first contribution to the coding community.
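
A minimal sketch of that change; note that the SITE_ID line is a common companion setting for the sites framework, not part of the original answer:

# settings.py
INSTALLED_APPS = [
    # ...
    'django.contrib.sites',
]

SITE_ID = 1  # the sites framework usually expects this to be set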


回答 17

TL;DR: Adding a blank __init__.py fixed the issue for me.

I got this error in PyCharm and realised that my settings file was not being imported at all. There was no obvious error telling me this, but when I put some nonsense code into the settings.py, it didn’t cause an error.

I had settings.py inside a local_settings folder. However, I'd forgotten to include an __init__.py in the same folder to allow it to be imported. Once I'd added this, the error went away.


回答 18

If you have got all the config right, it might just be an import mess. Keep an eye on how you are importing the offending model.

The following won't work: from .models import Business. Use the full import path instead: from myapp.models import Business.


回答 19

If all else fails, and if you are seeing this error while trying to import in a PyCharm “Python console” (or “Django console”):

Try restarting the console.

This is pretty embarrassing, but it took me a while before I realized I had forgotten to do that.

Here’s what happened:

Added a fresh app, then added a minimal model, then tried to import the model in the Python/Django console (PyCharm pro 2019.2). This raised the doesn't declare an explicit app_label error, because I had not added the new app to INSTALLED_APPS. So, I added the app to INSTALLED_APPS, tried the import again, but still got the same error.

Came here, read all the other answers, but nothing seemed to fit.

Finally it hit me that I had not yet restarted the Python console after adding the new app to INSTALLED_APPS.

Note: failing to restart the PyCharm Python console, after adding a new object to a module, is also a great way to get a very confusing ImportError: Cannot import name ...


回答 20

O…M…G, I was getting this error too, and I spent almost two days on it; now I've finally managed to solve it. Honestly, the error had nothing to do with what the problem was. In my case it was a simple matter of syntax: I was trying to run a standalone Python module that used some Django models in a Django context, but the module itself wasn't a Django model. And I was declaring the class wrong.

instead of having

class Scrapper:
    name = ""
    main_link= ""
    ...

I was doing

class Scrapper(Website):
    name = ""
    main_link= ""
    ...

which is obviously wrong. The message is so misleading that I couldn't help but think it was some issue with configuration, or that I was just using Django in the wrong way, since I'm very new to it.

I'll share this here so that a newbie like me going through the same silliness can hopefully solve their issue.


回答 21

I received this error after I moved the SECRET_KEY to pull from an environment variable and forgot to set it when running the application. If you have something like this in your settings.py:

SECRET_KEY = os.getenv('SECRET_KEY')

then make sure you are actually setting the environment variable.
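
One way to make such a mistake fail fast (a sketch, not the author's code) is to read the variable so that a missing value raises immediately instead of silently becoming None:

# settings.py
import os

# os.environ[...] raises KeyError at startup if SECRET_KEY is unset,
# unlike os.getenv(), which quietly returns None
SECRET_KEY = os.environ['SECRET_KEY']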


回答 22

Most probably you have circular (mutually dependent) imports.

In my case I used a serializer class as a class attribute on my model, while the serializer class was itself importing this model: serializer_class = AccountSerializer

from ..api.serializers import AccountSerializer

class Account(AbstractBaseUser):
    serializer_class = AccountSerializer
    ...

And in the “serializers” file:

from ..models import Account

class AccountSerializer(serializers.ModelSerializer):
    class Meta:
        model = Account
        fields = (
            'id', 'email', 'date_created', 'date_modified',
            'firstname', 'lastname', 'password', 'confirm_password')
    ...
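
The answer doesn't show the fix, but one common way to break such a cycle (an assumption, not the author's code) is to defer the import until call time:

from django.contrib.auth.base_user import AbstractBaseUser

class Account(AbstractBaseUser):
    ...

    @staticmethod
    def get_serializer_class():
        # deferred import: runs only when called, after both modules
        # have finished loading, so the import cycle is broken
        from ..api.serializers import AccountSerializer
        return AccountSerializer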

回答 23

I got this error today and ended up here after googling. None of the existing answers seemed relevant to my situation. In my case the culprit was importing a model in the __init__.py file at the top level of an app; I had to move those imports into the functions that use the model.

Django seems to have some weird code that can fail like this in so many different scenarios!


回答 24

I also got this error today. The message referenced one specific app of mine in INSTALLED_APPS, but in fact it had nothing to do with that app. I was using a new virtual environment and had forgotten to install some libraries that I used in this project. After I installed the missing libraries, it worked.


回答 25

For PyCharm users: I had this error when using a project structure that wasn't "clean".

Was:

project_root_directory
└── src
    ├── chat
    │   ├── migrations
    │   └── templates
    ├── django_channels
    └── templates

Now:

project_root_directory
├── chat
│   ├── migrations
│   └── templates
│       └── chat
├── django_channels
└── templates

There are a lot of good solutions here, but I think that, first of all, you should clean up your project structure or tune PyCharm's Django settings before setting DJANGO_SETTINGS_MODULE variables and so on.

Hope it’ll help someone. Cheers.


回答 26

The issue is that:

  1. You have made modifications to your models file but have not yet applied them to the DB, while trying to run python manage.py runserver.

  2. Run python manage.py makemigrations

  3. Run python manage.py migrate

  4. Now run python manage.py runserver and all should be fine.