Tag Archives: python-3.x

Is there a ceiling equivalent of the // operator in Python?

Question: Is there a ceiling equivalent of the // operator in Python?

I found out about the // operator in Python which in Python 3 does division with floor.

Is there an operator which divides with ceil instead? (I know about the / operator which in Python 3 does floating point division.)


Answer 0

There is no operator which divides with ceil. You need to import math and use math.ceil.
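
For example (a quick illustration; in Python 3, math.ceil returns an int):

>>> import math
>>> math.ceil(7 / 3)
3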


Answer 1

You can just do upside-down floor division:

def ceildiv(a, b):
    return -(-a // b)

This works because Python’s division operator does floor division (unlike in C, where integer division truncates the fractional part).

This also works with Python’s big integers, because there’s no (lossy) floating-point conversion.

Here’s a demonstration:

>>> from __future__ import division   # a/b is float division
>>> from math import ceil
>>> b = 3
>>> for a in range(-7, 8):
...     print(["%d/%d" % (a, b), int(ceil(a / b)), -(-a // b)])
... 
['-7/3', -2, -2]
['-6/3', -2, -2]
['-5/3', -1, -1]
['-4/3', -1, -1]
['-3/3', -1, -1]
['-2/3', 0, 0]
['-1/3', 0, 0]
['0/3', 0, 0]
['1/3', 1, 1]
['2/3', 1, 1]
['3/3', 1, 1]
['4/3', 2, 2]
['5/3', 2, 2]
['6/3', 2, 2]
['7/3', 3, 3]

Answer 2

You could do (x + (d-1)) // d when dividing x by d, i.e. (x + 4) // 5.
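
For instance:

>>> x, d = 7, 5
>>> (x + (d - 1)) // d    # same as ceil(7 / 5)
2
>>> (10 + 4) // 5         # exact division still comes out right
2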


Answer 3

Solution 1: Convert floor to ceiling with negation

def ceiling_division(n, d):
    return -(n // -d)

Reminiscent of the Penn & Teller levitation trick, this “turns the world upside down (with negation), uses plain floor division (where the ceiling and floor have been swapped), and then turns the world right-side up (with negation again)”.

Solution 2: Let divmod() do the work

def ceiling_division(n, d):
    q, r = divmod(n, d)
    return q + bool(r)

The divmod() function gives (a // b, a % b) for integers (this may be less reliable with floats due to round-off error). The step with bool(r) adds one to the quotient whenever there is a non-zero remainder.
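
A quick illustration of the remainder doing the work:

>>> divmod(7, 3)
(2, 1)
>>> 2 + bool(1)    # non-zero remainder rounds the quotient up
3
>>> divmod(6, 3)
(2, 0)
>>> 2 + bool(0)    # exact division leaves the quotient unchanged
2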

Solution 3: Adjust the numerator before the division

def ceiling_division(n, d):
    return (n + d - 1) // d

Translate the numerator upwards so that floor division rounds down to the intended ceiling. Note, this only works for integers.

Solution 4: Convert to floats to use math.ceil()

def ceiling_division(n, d):
    return math.ceil(n / d)

The math.ceil() code is easy to understand, but it converts from ints to floats and back. This isn’t very fast and it may have rounding issues. Also, it relies on Python 3 semantics where “true division” produces a float and where the ceil() function returns an integer.
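
To make the rounding concern concrete, here is one case (my illustration, not from the original answer) where the float round-trip loses exactness while the integer-only solutions keep it:

>>> import math
>>> n, d = 2**53 + 1, 1
>>> math.ceil(n / d)    # n / d becomes a float and drops the last bit
9007199254740992
>>> -(n // -d)          # pure integer arithmetic stays exact
9007199254740993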


Answer 4

You can always just do it inline as well

((foo - 1) // bar) + 1

In Python 3, this is just shy of an order of magnitude faster than forcing the float division and calling ceil(), provided you care about the speed. Which you shouldn’t, unless you’ve proven through usage that you need to.

>>> timeit.timeit("((5 - 1) // 4) + 1", number = 100000000)
1.7249219375662506
>>> timeit.timeit("ceil(5/4)", setup="from math import ceil", number = 100000000)
12.096064013894647

Answer 5

Note that math.ceil is limited to 53 bits of precision. If you are working with large integers, you may not get exact results.

The gmpy2 library provides a c_div function which uses ceiling rounding.
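
For illustration, a minimal use (assuming gmpy2 is installed):

>>> import gmpy2
>>> gmpy2.c_div(7, 3)   # ceiling division: ceil(7/3) == 3
mpz(3)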

Disclaimer: I maintain gmpy2.


Answer 6

Simple solution: a // b + 1 (note that this overcounts by one when b divides a exactly; the other answers handle that case).


Type hints in namedtuple

Question: Type hints in namedtuple

Consider the following piece of code:

from collections import namedtuple
point = namedtuple("Point", ("x:int", "y:int"))

The code above is just a way to demonstrate what I am trying to achieve. I would like to make a namedtuple with type hints.

Do you know an elegant way to achieve the intended result?


Answer 0

The preferred syntax for a typed named tuple since 3.6 is:

from typing import NamedTuple

class Point(NamedTuple):
    x: int
    y: int = 1  # Set default value

Point(3)  # -> Point(x=3, y=1)

Edit: Starting with Python 3.7, consider using dataclasses (your IDE may not yet support them for static type checking):

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int = 1  # Set default value

Point(3)  # -> Point(x=3, y=1)

Answer 1

You can use typing.NamedTuple

From the docs

Typed version of namedtuple.

>>> import typing
>>> Point = typing.NamedTuple("Point", [('x', int), ('y', int)])

This is present only in Python 3.5 onwards.
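
Instances behave like ordinary namedtuples; a small usage sketch:

>>> p = Point(1, 2)
>>> p.x + p.y
3
>>> p
Point(x=1, y=2)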


How can I convert a .py to .exe for Python?

Question: How can I convert a .py to .exe for Python?

I’m trying to convert a fairly simple Python program to an executable and couldn’t find what I was looking for, so I have a few questions (I’m running Python 3.6):

The methods of doing this that I have found so far are as follows:

  1. downloading an old version of Python and using pyinstaller/py2exe
  2. setting up a virtual environment in Python 3.6 that will allow me to do 1.
  3. downloading a Python to C++ converter and using that.

Here is what I’ve tried/what problems I’ve run into.

  • I installed pyinstaller before its required download (pypi-something), so it did not work. Even after downloading the prerequisite file, pyinstaller still does not recognize it.
  • If I’m setting up a virtualenv in Python 2.7, do I actually need to have Python 2.7 installed?
  • Similarly, the only python to C++ converters I see work only up until Python 3.5 – do I need to download and use this version if attempting this?

Answer 0

Steps to convert .py to .exe in Python 3.6

  1. Install Python 3.6.
  2. Install cx_Freeze (open your command prompt and type pip install cx_Freeze).
  3. Install idna (open your command prompt and type pip install idna).
  4. Write a .py program named myfirstprog.py.
  5. Create a new python file named setup.py in the current directory of your script.
  6. In the setup.py file, copy the code below and save it.
  7. With shift pressed, right click on the same directory, so you are able to open a command prompt window.
  8. In the prompt, type python setup.py build.
  9. If your script is error-free, there will be no problem creating the application.
  10. Check the newly created build folder. It has another folder in it. Within that folder you can find your application. Run it. Make yourself happy.

See the original script in my blog.

setup.py:

from cx_Freeze import setup, Executable

base = None    

executables = [Executable("myfirstprog.py", base=base)]

packages = ["idna"]
options = {
    'build_exe': {    
        'packages':packages,
    },    
}

setup(
    name = "<any name>",
    options = options,
    version = "<any number>",
    description = '<any description>',
    executables = executables
)

EDIT:

  • be sure that instead of myfirstprog.py you put your own .py file name, as created in step 4;
  • you should include every package imported in your .py in the packages list (ex: packages = ["idna", "os", "sys"]);
  • any name, any number, any description in the setup.py file should not remain the same; change them accordingly (ex: name = "<first_ever>", version = "0.11", description = '');
  • the imported packages must be installed before you start step 8.

Answer 1

Python 3.6 is supported by PyInstaller.

Open a cmd window in your Python folder (open a command window and use cd or while holding shift, right click it on Windows Explorer and choose ‘Open command window here’). Then just enter

pip install pyinstaller

And that’s it.

The simplest way to use it is by entering on your command prompt

pyinstaller file_name.py

For more details on how to use it, take a look at this question.


Answer 2

There is an open source project called auto-py-to-exe on GitHub. Actually it also just uses PyInstaller internally, but since it has a simple GUI that controls PyInstaller it may be a comfortable alternative. It can also output a standalone file, in contrast to other solutions. They also provide a video showing how to set it up.

GUI:

Output:


Answer 3

I can’t tell you what’s best, but a tool I have used with success in the past was cx_Freeze. They recently updated (on Jan. 7, ’17) to version 5.0.1 and it supports Python 3.6.

Here’s the pypi https://pypi.python.org/pypi/cx_Freeze

The documentation shows that there is more than one way to do it, depending on your needs. http://cx-freeze.readthedocs.io/en/latest/overview.html

I have not tried it out yet, so I’m going to point to a post where the simple way of doing it was discussed. Some things may or may not have changed though.

How do I use cx_freeze?


Answer 4

I’ve been using Nuitka and PyInstaller with my package, PySimpleGUI.

Nuitka There were issues getting tkinter to compile with Nuitka. One of the project contributors developed a script that fixed the problem.

If you’re not using tkinter it may “just work” for you. If you are using tkinter say so and I’ll try to get the script and instructions published.

PyInstaller I’m running 3.6 and PyInstaller is working great! The command I use to create my exe file is:

pyinstaller -wF myfile.py

The -wF will create a single EXE file. Because all of my programs have a GUI and I do not want the command window to show, the -w option will hide the command window.

This is about as close as you can get to running what looks like a Winforms program that was written in Python.

[Update 20-Jul-2019]

There is a PySimpleGUI GUI-based solution that uses PyInstaller. It’s called pysimplegui-exemaker and can be pip installed.

pip install PySimpleGUI-exemaker

To run it after installing:

python -m pysimplegui-exemaker.pysimplegui-exemaker


Answer 5

Now you can convert it by using PyInstaller. It works with even Python 3.

Steps:

  1. Fire up your PC
  2. Open command prompt
  3. Enter command pip install pyinstaller
  4. When it is installed, use the command ‘cd’ to go to the working directory.
  5. Run the command pyinstaller <filename>

How to get the latest file in a folder using python

Question: How to get the latest file in a folder using python

I need to get the latest file of a folder using python. While using the code:

max(files, key = os.path.getctime)

I am getting the below error:

FileNotFoundError: [WinError 2] The system cannot find the file specified: 'a'


Answer 0

Whatever is assigned to the files variable is incorrect. Use the following code.

import glob
import os

list_of_files = glob.glob('/path/to/folder/*') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)

Answer 1

max(files, key = os.path.getctime)

is quite incomplete code. What is files? It probably is a list of file names, coming out of os.listdir().

But this list contains only the filename parts (a.k.a. “basenames”), because their path is common. In order to use it correctly, you have to combine each name with the path leading to it (which was used to obtain it).

Such as (untested):

def newest(path):
    files = os.listdir(path)
    paths = [os.path.join(path, basename) for basename in files]
    return max(paths, key=os.path.getctime)

Answer 2

I would suggest using glob.iglob() instead of glob.glob(), as it is more efficient.

glob.iglob() returns an iterator which yields the same values as glob() without actually storing them all simultaneously.

Which means glob.iglob() will be more efficient.

I mostly use the below code to find the latest file matching my pattern:

LatestFile = max(glob.iglob(fileNamePattern), key=os.path.getctime)


NOTE: There are variants of the max function. In the case of finding the latest file we will be using the below variant: max(iterable, *[, key, default])

which needs an iterable, so your first parameter should be iterable. In the case of finding the max of numbers, we can use the below variant: max(num1, num2, num3, *args[, key])
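
One caveat worth adding (my note, not part of the original answer): max() raises ValueError on an empty iterable, so if the folder might be empty, the default keyword of this variant avoids the crash:

LatestFile = max(glob.iglob(fileNamePattern), key=os.path.getctime, default=None)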


Answer 3

Try sorting the items by creation time. The example below sorts the files in a folder and gets the first element, which is the latest.

import glob
import os

files_path = os.path.join(folder, '*')
files = sorted(
    glob.iglob(files_path), key=os.path.getctime, reverse=True) 
print(files[0])

Answer 4

I lack the reputation to comment, but ctime from Marlon Abeykoons’ response did not give the correct result for me. Using mtime does the trick though (key=os.path.getmtime).

import glob
import os

list_of_files = glob.glob('/path/to/folder/*') # * means all if need specific format then *.csv
latest_file = max(list_of_files, key=os.path.getmtime)
print(latest_file)

I found two answers for that problem:

python os.path.getctime max does not return latest
Difference between python – getmtime() and getctime() in unix system


Answer 5

(Edited to improve answer)

First define a function get_latest_file

def get_latest_file(path, *paths):
    fullpath = os.path.join(path, *paths)
    ...
get_latest_file('example', 'files','randomtext011.*.txt')

You may also use a docstring!

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)

If you use Python 3, you can use iglob instead.

Complete code to return the name of latest file:

def get_latest_file(path, *paths):
    """Returns the name of the latest (most recent) file 
    of the joined path(s)"""
    fullpath = os.path.join(path, *paths)
    files = glob.glob(fullpath)  # You may use iglob in Python3
    if not files:                # I prefer using the negation
        return None                      # because it behaves like a shortcut
    latest_file = max(files, key=os.path.getctime)
    _, filename = os.path.split(latest_file)
    return filename

Answer 6

I tried to use the above suggestions and my program crashed; then I figured out that the file I was trying to identify was in use, and trying to use ‘os.path.getctime’ on it crashed. What finally worked for me was:

    files_before = glob.glob(os.path.join(my_path,'*'))
    **code where new file is created**
    new_file = set(files_before).symmetric_difference(set(glob.glob(os.path.join(my_path,'*'))))

This code gets the uncommon object between the two sets of file lists. It is not the most elegant, and if multiple files are created at the same time it will probably not be stable.


Answer 7

A much faster method on Windows (0.05s) is to call a bat script that does this:

get_latest.bat

@echo off
for /f %%i in ('dir \\directory\in\question /b/a-d/od/t:c') do set LAST=%%i
%LAST%

where \\directory\in\question is the directory you want to investigate.

get_latest.py

from subprocess import Popen, PIPE
p = Popen("get_latest.bat", shell=True, stdout=PIPE,)
stdout, stderr = p.communicate()
print(stdout, stderr)

If it finds a file, stdout is the path and stderr is None.

Use stdout.decode("utf-8").rstrip() to get the usable string representation of the file name.


Answer 8

I’ve been using this in Python 3, including pattern matching on the filename.

from pathlib import Path

def latest_file(path: Path, pattern: str = "*"):
    files = path.glob(pattern)
    return max(files, key=lambda x: x.stat().st_ctime)

How to print like printf in Python3?

Question: How to print like printf in Python3?

In Python 2 I used:

print "a=%d,b=%d" % (f(x,n),g(x,n))

I’ve tried:

print("a=%d,b=%d") % (f(x,n),g(x,n))

Answer 0

In Python2, print was a keyword which introduced a statement:

print "Hi"

In Python3, print is a function which may be invoked:

print ("Hi")

In both versions, % is an operator which requires a string on the left-hand side and a value or a tuple of values or a mapping object (like dict) on the right-hand side.
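
For example, the mapping form looks like this (a small illustration):

>>> print("a=%(a)d,b=%(b)d" % {"a": 1, "b": 2})
a=1,b=2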

So, your line ought to look like this:

print("a=%d,b=%d" % (f(x,n),g(x,n)))

Also, the recommendation for Python3 and newer is to use {}-style formatting instead of %-style formatting:

print('a={:d}, b={:d}'.format(f(x,n),g(x,n)))

Python 3.6 introduces yet another string-formatting paradigm: f-strings.

print(f'a={f(x,n):d}, b={g(x,n):d}')

Answer 1

The most recommended way is to use the format method. Read more about it here.

a, b = 1, 2

print("a={0},b={1}".format(a, b))

Answer 2

Simple printf() function from O’Reilly’s Python Cookbook.

import sys
def printf(format, *args):
    sys.stdout.write(format % args)

Example output:

i = 7
pi = 3.14159265359
printf("hi there, i=%d, pi=%.2f\n", i, pi)
# hi there, i=7, pi=3.14

Answer 3

Python 3.6 introduced f-strings for inline interpolation. What’s even nicer is it extended the syntax to also allow format specifiers with interpolation. Something I’ve been working on while I googled this (and came across this old question!):

print(f'{account:40s} ({ratio:3.2f}) -> AUD {splitAmount}')

PEP 498 has the details. And… it sorted my pet peeve with format specifiers in other langs — allows for specifiers that themselves can be expressions! Yay! See: Format Specifiers.


Answer 4

Simple Example:

print("foo %d, bar %d" % (1,2))


Answer 5

A simpler one.

def printf(format, *values):
    print(format % values )

Then:

printf("Hello, this is my name %s and my age %d", "Martin", 20)

Answer 6

Because your % is outside the print(...) parentheses, you’re trying to insert your variables into the result of your print call. print(...) returns None, so this won’t work, and there’s also the small matter of you already having printed your template by this time and time travel being prohibited by the laws of the universe we inhabit.

The whole thing you want to print, including the % and its operand, needs to be inside your print(...) call, so that the string can be built before it is printed.

print( "a=%d,b=%d" % (f(x,n), g(x,n)) )

I have added a few extra spaces to make it clearer (though they are not necessary and generally not considered good style).


Answer 7

In other words, printf is absent in python… I’m surprised! The best code is

def printf(format, *args):
    sys.stdout.write(format % args)

because this form allows you not to print \n; all the others don’t. That’s why print is a bad operator. Also, you need to write args in a special form. There are no disadvantages to the function above. It’s the standard, usual form of a printf function.


Answer 8

print("Name={}, balance={}".format(var_name, var_balance))

How do I install Python 3 on an AWS EC2 instance?

Question: How do I install Python 3 on an AWS EC2 instance?

I’m trying to install python 3.x on an AWS EC2 instance and:

sudo yum install python3

doesn’t work:

No package python3 available.

I’ve googled around and I can’t find anyone else who has this problem so I’m asking here. Do I have to manually download and install it?


Answer 0

If you do a

sudo yum list | grep python3

you will see that while they don’t have a “python3” package, they do have a “python34” package, or a more recent release, such as “python36”. Installing it is as easy as:

sudo yum install python34 python34-pip

Answer 1

Note: This may be obsolete for current versions of Amazon Linux 2 since late 2018 (see comments); you can now install it directly via yum install python3.

In Amazon Linux 2, there isn’t a python3[4-6] in the default yum repos; instead there’s the Amazon Extras Library.

sudo amazon-linux-extras install python3

If you want to set up isolated virtual environments with it, note that the yum install’d virtualenv tools don’t seem to work reliably:

virtualenv --python=python3 my_venv

Calling the venv module/tool is less finicky, and you could double check it’s what you want/expect with python3 --version beforehand.

python3 -m venv my_venv

Other things it can install (versions as of 18 Jan 18):

[ec2-user@x ~]$ amazon-linux-extras list
  0  ansible2   disabled  [ =2.4.2 ]
  1  emacs   disabled  [ =25.3 ]
  2  memcached1.5   disabled  [ =1.5.1 ]
  3  nginx1.12   disabled  [ =1.12.2 ]
  4  postgresql9.6   disabled  [ =9.6.6 ]
  5  python3=latest  enabled  [ =3.6.2 ]
  6  redis4.0   disabled  [ =4.0.5 ]
  7  R3.4   disabled  [ =3.4.3 ]
  8  rust1   disabled  [ =1.22.1 ]
  9  vim   disabled  [ =8.0 ]
 10  golang1.9   disabled  [ =1.9.2 ]
 11  ruby2.4   disabled  [ =2.4.2 ]
 12  nano   disabled  [ =2.9.1 ]
 13  php7.2   disabled  [ =7.2.0 ]
 14  lamp-mariadb10.2-php7.2   disabled  [ =10.2.10_7.2.0 ]

Answer 2

Here are the steps I used to manually install python3, for anyone else who wants to do it, as it’s not super straightforward. EDIT: It’s almost certainly easier to use the yum package manager (see other answers).

Note, you’ll probably want to do sudo yum groupinstall 'Development Tools' before doing this otherwise pip won’t install.

wget https://www.python.org/ftp/python/3.4.2/Python-3.4.2.tgz
tar zxvf Python-3.4.2.tgz
cd Python-3.4.2
sudo yum install gcc
./configure --prefix=/opt/python3
make
sudo yum install openssl-devel
sudo make install
sudo ln -s /opt/python3/bin/python3 /usr/bin/python3
python3  # should start the interpreter if it worked (quit() to exit)

Answer 3

EC2 (on the Amazon Linux AMI) currently supports python3.4 and python3.5.

sudo yum install python35
sudo yum install python35-pip

Answer 4

As of Amazon Linux version 2017.09 python 3.6 is now available:

sudo yum install python36 python36-virtualenv python36-pip

See the Release Notes for more info and other packages


Answer 5

Amazon Linux now supports python36.

python36-pip is not available, so you need to follow a different route.

sudo yum install python36 python36-devel python36-libs python36-tools

# If you like to have pip3.6:
curl -O https://bootstrap.pypa.io/get-pip.py
sudo python3 get-pip.py

Answer 6

As @NickT said, there’s no python3[4-6] in the default yum repos in Amazon Linux 2; as of today it uses 3.7, and looking at all the answers here we can say this will change over time.

I was looking for python3.6 on Amazon Linux 2, but amazon-linux-extras shows a lot of options and no python at all. In fact, you can try to find the version you know in the epel repo:

sudo amazon-linux-extras install epel

yum search python | grep "^python3..x8"

python34.x86_64 : Version 3 of the Python programming language aka Python 3000
python36.x86_64 : Interpreter of the Python programming language

Answer 7

Adding to all the answers already available for this question, I would like to add the steps I followed to install Python3 on an AWS EC2 instance running CentOS 7. You can find the entire details at this link.

https://aws-labs.com/install-python-3-centos-7-2/

First, we need to enable SCL. SCL is a community project that allows you to build, install, and use multiple versions of software on the same system, without affecting system default packages.

sudo yum install centos-release-scl

Now that we have SCL repository, we can install the python3

sudo yum install rh-python36

To access Python 3.6 you need to launch a new shell instance using the Software Collection scl tool:

scl enable rh-python36 bash

If you check the Python version now you’ll notice that Python 3.6 is the default version

python --version

It is important to point out that Python 3.6 is the default Python version only in this shell session. If you exit the session or open a new session from another terminal Python 2.7 will be the default Python version.

Now, Install the python development tools by typing:

sudo yum groupinstall 'Development Tools'

Now create a virtual environment so that the default python packages don’t get messed up.

mkdir ~/my_new_project
cd ~/my_new_project
python -m venv my_project_venv

To use this virtual environment,

source my_project_venv/bin/activate

Now, you have your virtual environment set up with python3.


Answer 8

On Debian derivatives such as Ubuntu, use apt. Check the apt repository for the versions of Python available to you. Then, run a command similar to the following, substituting the correct package name:

sudo apt-get install python3

On Red Hat and derivatives, use yum. Check the yum repository for the versions of Python available to you. Then, run a command similar to the following, substituting the correct package name:

sudo yum install python36

On SUSE and derivatives, use zypper. Check the repository for the versions of Python available to you. Then. run a command similar to the following, substituting the correct package name:

sudo zypper install python3

How to make Firefox headless programmatically in Selenium with python?

Question: How to make Firefox headless programmatically in Selenium with python?

I am running this code with python, selenium, and firefox but still get ‘head’ version of firefox:

binary = FirefoxBinary('C:\\Program Files (x86)\\Mozilla Firefox\\firefox.exe', log_file=sys.stdout)
binary.add_command_line_options('-headless')
self.driver = webdriver.Firefox(firefox_binary=binary)

I also tried some variations of binary:

binary = FirefoxBinary('C:\\Program Files\\Nightly\\firefox.exe', log_file=sys.stdout)
        binary.add_command_line_options("--headless")

Answer 0

To invoke the Firefox browser headlessly, you can set the headless property through the Options() class as follows:

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
driver = webdriver.Firefox(options=options, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("http://google.com/")
print ("Headless Firefox Initialized")
driver.quit()

There’s another way to accomplish headless mode. If you need to disable or enable headless mode in Firefox without changing the code, you can set the environment variable MOZ_HEADLESS to any value if you want Firefox to run headless, or not set it at all.

This is very useful when, for example, you are using continuous integration and you want to run the functional tests on the server but still be able to run the tests in normal mode on your PC.

$ MOZ_HEADLESS=1 python manage.py test # testing example in Django with headless Firefox

or

$ export MOZ_HEADLESS=1   # this way you only have to set it once
$ python manage.py test functional/tests/directory
$ unset MOZ_HEADLESS      # if you want to disable headless mode

Outro

How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?


Answer 1

The first answer doesn’t work anymore.

This worked for me:

from selenium.webdriver.firefox.options import Options as FirefoxOptions
from selenium import webdriver

options = FirefoxOptions()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
driver.get("http://google.com")

Answer 2

My answer:

set_headless(headless=True) is deprecated. 

https://seleniumhq.github.io/selenium/docs/api/py/webdriver_firefox/selenium.webdriver.firefox.options.html

options.headless = True

works for me


Answer 3

Just a note for people who may have found this later (and want the Java way of achieving this): FirefoxOptions is also capable of enabling headless mode:

FirefoxOptions firefoxOptions = new FirefoxOptions();
firefoxOptions.setHeadless(true);

Answer 4

I used the below code to set the driver type based on the need for headless / headed mode, for both Firefox and Chrome:

# Can pass browser type
# (Options / FFOption are the Chrome and Firefox options classes
# imported from selenium.webdriver)

if browser.lower() == 'chrome':
    driver = webdriver.Chrome(r'..\drivers\chromedriver')
elif browser.lower() == 'headless chrome':
    ch_Options = Options()
    ch_Options.add_argument('--headless')
    ch_Options.add_argument("--disable-gpu")
    driver = webdriver.Chrome(r'..\drivers\chromedriver', options=ch_Options)
elif browser.lower() == 'firefox':
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe')
elif browser.lower() == 'headless firefox':
    ff_option = FFOption()
    ff_option.add_argument('--headless')
    ff_option.add_argument("--disable-gpu")
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe', options=ff_option)
elif browser.lower() == 'ie':
    driver = webdriver.Ie(r'..\drivers\IEDriverServer')
else:
    raise Exception('Invalid Browser Type')

How does asyncio actually work?

Question: How does asyncio actually work?

This question is motivated by my another question: How to await in cdef?

There are tons of articles and blog posts on the web about asyncio, but they are all very superficial. I couldn’t find any information about how asyncio is actually implemented, and what makes I/O asynchronous. I was trying to read the source code, but it’s thousands of lines of not the highest grade C code, a lot of which deals with auxiliary objects, but most crucially, it is hard to connect between Python syntax and what C code it would translate into.

Asyncio’s own documentation is even less helpful. There’s no information there about how it works, only some guidelines about how to use it, which are also sometimes misleading / very poorly written.

I’m familiar with Go’s implementation of coroutines, and was kind of hoping that Python did the same thing. If that was the case, the code I came up with in the post linked above would have worked. Since it didn’t, I’m now trying to figure out why. My best guess so far is as follows, please correct me where I’m wrong:

  1. Procedure definitions of the form async def foo(): ... are actually interpreted as methods of a class inheriting coroutine.
  2. Perhaps, async def is actually split into multiple methods by await statements, where the object, on which these methods are called is able to keep track of the progress it made through the execution so far.
  3. If the above is true, then, essentially, execution of a coroutine boils down to calling methods of coroutine object by some global manager (loop?).
  4. The global manager is somehow (how?) aware of when I/O operations are performed by Python (only?) code and is able to choose one of the pending coroutine methods to execute after the current executing method relinquished control (hit on the await statement).

In other words, here’s my attempt at “desugaring” of some asyncio syntax into something more understandable:

async def coro(name):
    print('before', name)
    await asyncio.sleep()
    print('after', name)

asyncio.gather(coro('first'), coro('second'))

# translated from async def coro(name)
class Coro(coroutine):
    def before(self, name):
        print('before', name)

    def after(self, name):
        print('after', name)

    def __init__(self, name):
        self.name = name
        self.parts = self.before, self.after
        self.pos = 0

    def __call__(self):
        self.parts[self.pos](self.name)
        self.pos += 1

    def done(self):
        return self.pos == len(self.parts)


# translated from asyncio.gather()
class AsyncIOManager:

    def gather(*coros):
        while not every(c.done() for c in coros):
            coro = random.choice(coros)
            coro()

Should my guess prove correct: then I have a problem. How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended and I/O happens outside the interpreter? What exactly is meant by I/O? If my python procedure called C open() procedure, and it in turn sent interrupt to kernel, relinquishing control to it, how does Python interpreter know about this and is able to continue running some other code, while kernel code does the actual I/O and until it wakes up the Python procedure which sent the interrupt originally? How can Python interpreter in principle, be aware of this happening?


Answer 0

How does asyncio work?

Before answering this question we need to understand a few base terms, skip these if you already know any of them.

Generators

Generators are objects that allow us to suspend the execution of a python function. User-curated generators are implemented using the keyword yield. By creating a normal function containing the yield keyword, we turn that function into a generator:

>>> def test():
...     yield 1
...     yield 2
...
>>> gen = test()
>>> next(gen)
1
>>> next(gen)
2
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, calling next() on the generator causes the interpreter to load test’s frame, and return the yielded value. Calling next() again causes the frame to load again into the interpreter stack, and continues on to yield another value.

By the third time next() is called, our generator was finished, and StopIteration was thrown.

Communicating with a generator

A less-known feature of generators is the fact that you can communicate with them using two methods: send() and throw().

>>> def test():
...     val = yield 1
...     print(val)
...     yield 2
...     yield 3
...
>>> gen = test()
>>> next(gen)
1
>>> gen.send("abc")
abc
2
>>> gen.throw(Exception())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in test
Exception

Upon calling gen.send(), the value is passed as a return value from the yield keyword.

gen.throw() on the other hand, allows throwing Exceptions inside generators, with the exception raised at the same spot yield was called.

Returning values from generators

Returning a value from a generator results in the value being put inside the StopIteration exception. We can later on recover the value from the exception and use it for our needs.

>>> def test():
...     yield 1
...     return "abc"
...
>>> gen = test()
>>> next(gen)
1
>>> try:
...     next(gen)
... except StopIteration as exc:
...     print(exc.value)
...
abc

Behold, a new keyword: yield from

Python 3.4 came with the addition of a new keyword: yield from. What that keyword allows us to do is pass on any next(), send() and throw() into an inner-most nested generator. If the inner generator returns a value, it is also the return value of yield from:

>>> def inner():
...     inner_result = yield 2
...     print('inner', inner_result)
...     return 3
...
>>> def outer():
...     yield 1
...     val = yield from inner()
...     print('outer', val)
...     yield 4
...
>>> gen = outer()
>>> next(gen)
1
>>> next(gen) # Goes inside inner() automatically
2
>>> gen.send("abc")
inner abc
outer 3
4

I’ve written an article to further elaborate on this topic.

Putting it all together

Upon introducing the new keyword yield from in Python 3.4, we were now able to create generators inside generators that just like a tunnel, pass the data back and forth from the inner-most to the outer-most generators. This has spawned a new meaning for generators – coroutines.

Coroutines are functions that can be stopped and resumed while being run. In Python, they are defined using the async def keyword. Much like generators, they too use their own form of yield from, which is await. Before async and await were introduced in Python 3.5, we created coroutines in the exact same way generators were created (with yield from instead of await).

async def inner():
    return 1

async def outer():
    await inner()

Like every iterator or generator that implements the __iter__() method, coroutines implement __await__(), which allows them to continue every time await coro is called.
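
As a rough sketch (my own toy example, not asyncio’s internals), any object whose __await__() returns an iterator can be awaited; a bare yield suspends it once, much like a pending future:

import asyncio

class Ready:
    # A hand-rolled awaitable: __await__ must return an iterator.
    def __init__(self, value):
        self.value = value

    def __await__(self):
        yield                # suspend once; the event loop reschedules us
        return self.value    # becomes the result of `await Ready(...)`

async def main():
    print(await Ready(42))   # prints 42

asyncio.run(main())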

There’s a nice sequence diagram inside the Python docs that you should check out.

In asyncio, apart from coroutine functions, we have 2 important objects: tasks and futures.

Futures

Futures are objects that have the __await__() method implemented, and their job is to hold a certain state and result. The state can be one of the following:

  1. PENDING – future does not have any result or exception set.
  2. CANCELLED – future was cancelled using fut.cancel()
  3. FINISHED – future was finished, either by a result set using fut.set_result() or by an exception set using fut.set_exception()

The result, just as you have guessed, can be either a Python object that will be returned, or an exception that may be raised.

Another important feature of future objects is that they contain a method called add_done_callback(). This method allows functions to be called as soon as the future is done – whether it raised an exception or finished.
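A minimal sketch using asyncio's real Future API:

import asyncio

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()          # state: PENDING

    # called once the future is FINISHED (or CANCELLED)
    fut.add_done_callback(lambda f: print("done:", f.result()))

    loop.call_later(0.1, fut.set_result, 42)  # finish it shortly
    print(await fut)                    # suspends until set_result

asyncio.run(main())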

Tasks

Task objects are special futures, which wrap around coroutines, and communicate with the inner-most and outer-most coroutines. Every time a coroutine awaits a future, the future is passed all the way back to the task (just like in yield from), and the task receives it.

Next, the task binds itself to the future. It does so by calling add_done_callback() on the future. From now on, if the future is ever done, whether by being cancelled, passed an exception, or passed a Python object as a result, the task's callback will be called, and the task will rise back to life.

Asyncio

The final burning question we must answer is – how is the IO implemented?

Deep inside asyncio, we have an event loop. An event loop of tasks. The event loop’s job is to call tasks every time they are ready and coordinate all that effort into one single working machine.

The IO part of the event loop is built upon a single crucial function called select. Select is a blocking function, implemented by the operating system underneath, that allows waiting on sockets for incoming or outgoing data. Upon data being received it wakes up, and returns the sockets which received data, or the sockets which are ready for writing.

When you try to receive or send data over a socket through asyncio, what actually happens below is that the socket is first checked for any data that can be immediately read or sent. If its .send() buffer is full, or the .recv() buffer is empty, the socket is registered with the select function (by simply adding it to one of the lists, rlist for recv and wlist for send) and the appropriate function awaits a newly created future object, tied to that socket.

When all available tasks are waiting for futures, the event loop calls select and waits. When one of the sockets has incoming data, or its send buffer has drained, asyncio checks for the future object tied to that socket, and sets it to done.

Now all the magic happens. The future is set to done, the task that added itself before with add_done_callback() rises back to life, and calls .send() on the coroutine which resumes the inner-most coroutine (because of the await chain), and you read the newly received data from a nearby buffer it was spilled onto.

Method chain again, in case of recv():

  1. select.select waits.
  2. A ready socket, with data is returned.
  3. Data from the socket is moved into a buffer.
  4. future.set_result() is called.
  5. Task that added itself with add_done_callback() is now woken up.
  6. Task calls .send() on the coroutine which goes all the way into the inner-most coroutine and wakes it up.
  7. Data is being read from the buffer and returned to our humble user.

In summary, asyncio uses generator capabilities, that allow pausing and resuming functions. It uses yield from capabilities that allow passing data back and forth from the inner-most generator to the outer-most. It uses all of those in order to halt function execution while it’s waiting for IO to complete (by using the OS select function).

And the best of all? While one function is paused, another may run and interleave with the delicate fabric, which is asyncio.


回答 1

谈论async/awaitasyncio不是一回事。第一个是基本的低级构造(协程),而第二个是使用这些构造的库。相反,没有单一的最终答案。

下面是对 async/await 以及 asyncio 之类的库如何工作的一般说明。也就是说,在此之上可能还有其他技巧(确实有……),但是除非您自己构建它们,否则它们无关紧要。除非您已经懂得多到不需要问这样的问题,否则这些差异可以忽略不计。

1.简述协程与子例程

就像子例程(函数,过程,…)一样,协程(生成器,…)是调用堆栈和指令指针的抽象:有执行代码段的堆栈,每个执行段都是特定的指令。

def 与 async def 的区别只是为了清楚起见。实际的差别在于 return 与 yield。在此基础上,await 或 yield from 把这一差别从单个调用扩展到了整个堆栈。

1.1。子程序

子例程表示一个新的堆栈级别,用于保存局部变量,并且单次遍历其指令即可到达末尾。考虑这样的子例程:

def subfoo(bar):
     qux = 3
     return qux * bar

当您运行它时,这意味着

  1. 为 bar 和 qux 分配堆栈空间
  2. 递归执行第一个语句并跳转到下一个语句
  3. 一旦到达 return,将其值推入调用堆栈
  4. 清除堆栈(1.)和指令指针(2.)

值得注意的是,4. 表示子例程始终以相同的状态开始。函数本身专有的所有内容在完成后都会丢失。即使 return 后面还有指令,函数也无法恢复。

root -\
  :    \- subfoo --\
  :/--<---return --/
  |
  V

1.2。协程作为持久子例程

协程就像一个子例程,但是可以在不破坏其状态的情况下退出。考虑这样的协程:

 def cofoo(bar):
      qux = yield bar  # yield marks a break point
      return qux

当您运行它时,这意味着

  1. 为 bar 和 qux 分配堆栈空间
  2. 递归执行第一个语句并跳转到下一个语句
    1. 一旦到达 yield,将其值压入调用堆栈,但保存堆栈和指令指针
    2. 一旦再次进入 yield,恢复堆栈和指令指针,并将参数推入 qux
  3. 一旦到达 return,将其值推入调用堆栈
  4. 清除堆栈(1.)和指令指针(2.)

请注意新增的 2.1 和 2.2:协程可以在预定义的位置挂起并恢复。这类似于子例程在调用另一个子例程期间被暂停的方式。区别在于,活动的协程并不严格绑定到其调用堆栈;相反,挂起的协程是一个单独的、隔离的堆栈的一部分。

root -\
  :    \- cofoo --\
  :/--<+--yield --/
  |    :
  V    :

这意味着挂起的协程可以在堆栈之间自由存储或移动。任何有权访问协程的调用堆栈都可以决定恢复它。

1.3。遍历调用栈

到目前为止,我们的协程只能通过 yield 沿调用堆栈向下传递。子例程可以通过 return 和 () 沿调用堆栈向下和向上移动。为了完整起见,协程也需要一种沿调用堆栈向上的机制。考虑这样的协程:

def wrap():
    yield 'before'
    yield from cofoo()
    yield 'after'

当您运行它时,这意味着它仍然像子例程一样分配堆栈和指令指针。当它挂起时,仍然就像存储一个子例程。

然而,yield from 两者都做。它挂起 wrap 的堆栈和指令指针,并运行 cofoo。请注意,wrap 保持挂起状态,直到 cofoo 完全结束。每当 cofoo 挂起或被发送内容时,cofoo 都直接连接到调用堆栈。

1.4。协程一直向下

如前所述,yield from 允许跨越一个中间作用域把两个作用域连接起来。递归应用时,这意味着堆栈的顶部可以连接到堆栈的底部。

root -\
  :    \-> coro_a -yield-from-> coro_b --\
  :/ <-+------------------------yield ---/
  |    :
  :\ --+-- coro_a.send----------yield ---\
  :                             coro_b <-/

请注意,rootcoro_b不知道对方。这使得协程比回调更干净:协程仍然像子例程一样建立在1:1关系上。协程将暂停并恢复其整个现有执行堆栈,直到常规调用点为止。

值得注意的是,root可以恢复任意数量的协程。但是,它永远不能同时恢复多个。同一根的协程是并发的,但不是并行的!

1.5。Python的asyncawait

到目前为止,该解释已明确使用生成器的yieldyield from词汇-基本功能相同。新的Python3.5语法asyncawait主要是为了清楚起见。

def foo():  # subroutine?
     return None

def foo():  # coroutine?
     yield from foofoo()  # generator? coroutine?

async def foo():  # coroutine!
     await foofoo()  # coroutine!
     return None

需要 async for 和 async with 语句,是因为使用裸露的 for 和 with 语句会打断 yield from/await 链。

2.简单事件循环的剖析

就一个协程本身而言,它没有把控制权交给另一个协程的概念。它只能把控制权交还给协程堆栈底部的调用者。然后,这个调用者可以切换到另一个协程并运行它。

几个协程的根节点通常是一个事件循环:协程挂起时,会产生一个它希望在其上恢复的事件。反过来,事件循环能够高效地等待这些事件发生。这使它可以决定接下来运行哪个协程,或在恢复之前如何等待。

这种设计意味着循环能理解一组预定义的事件。几个协程相互 await,直到最终 await 一个事件。该事件可以通过 yield 出控制权直接与事件循环通信。

loop -\
  :    \-> coroutine --await--> event --\
  :/ <-+----------------------- yield --/
  |    :
  |    :  # loop waits for event to happen
  |    :
  :\ --+-- send(reply) -------- yield --\
  :        coroutine <--yield-- event <-/

关键是协程暂停允许事件循环和事件直接通信。中间协程堆栈不需要任何有关运行哪个循环或事件如何工作的知识。

2.1.1。及时事件

要处理的最简单事件是到达某个时间点。这也是线程代码的基本构件:线程反复 sleep,直到某个条件成立。但是,常规的 sleep 本身会阻塞执行,而我们希望其他协程不被阻塞。相反,我们想告诉事件循环何时应恢复当前协程堆栈。

2.1.2。定义事件

事件只是我们可以识别的值,无论是通过枚举、类型还是其他标识。我们可以用一个存储目标时间的简单类来定义它。除了存储事件信息之外,我们还可以允许直接 await 这个类的实例。

class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until: float):
        self.until = until

    # used whenever someone ``await``s an instance of this Event
    def __await__(self):
        # yield this Event to the loop
        yield self

    def __repr__(self):
        return '%s(until=%.1f)' % (self.__class__.__name__, self.until)

此类仅存储事件-它没有说明如何实际处理它。

唯一的特殊功能是 __await__,它是 await 关键字寻找的东西。实际上,它是一个迭代器,但不提供给常规迭代机制使用。

2.2.1。等待事件

现在我们有了一个事件,协程如何对它做出反应?我们应该能够通过 await 我们的事件来表达与 sleep 等价的操作。为了更清楚地看到发生了什么,我们把等待拆成两次,每次等待一半时间:

import time

async def asleep(duration: float):
    """await that ``duration`` seconds pass"""
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)

我们可以直接实例化并运行此协程。与生成器类似,使用 coroutine.send 运行协程,直到它 yield 出一个结果。

coroutine = asleep(100)
while True:
    print(coroutine.send(None))
    time.sleep(0.1)

这给了我们两个 AsyncSleep 事件,协程完成时则是一个 StopIteration。请注意,唯一的延迟来自循环中的 time.sleep!每个 AsyncSleep 仅存储相对当前时间的偏移量。

2.2.2。活动+睡眠

目前,我们有两种独立的机制可供使用:

  • AsyncSleep:可以从协程内部 yield 出来的事件
  • time.sleep:可以在不影响协程的情况下等待

值得注意的是,这两者是正交的:谁也不影响、不触发谁。因此,我们可以提出自己的 sleep 策略来满足 AsyncSleep 的延迟要求。

2.3。天真的事件循环

如果我们有几个协程,每个协程都可以告诉我们它想何时被唤醒。然后,我们可以等到其中第一个想要恢复的时间点,再等下一个,依此类推。值得注意的是,在每个时间点上我们只关心下一个是谁。

这样可以进行简单的调度:

  1. 按照所需的唤醒时间对协程进行排序
  2. 选择第一个想要唤醒的人
  3. 等到这个时间点
  4. 运行这个协程
  5. 从1开始重复。

一个简单的实现不需要任何高级概念。用一个 list 就能按唤醒时间对协程排序。等待就是常规的 time.sleep。运行协程的方式与之前一样,使用 coroutine.send。

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    # store wake-up-time and coroutines
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting:
        # 2. pick the first coroutine that wants to wake up
        until, coroutine = waiting.pop(0)
        # 3. wait until this point in time
        time.sleep(max(0.0, until - time.time()))
        # 4. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])

当然,这还有很大的改进空间。我们可以把等待队列换成堆,或者为事件建立调度表。我们还可以从 StopIteration 中获取返回值,并将其分配给协程。但是,基本原理保持不变。

2.4。合作等待

AsyncSleep 事件和 run 事件循环构成了定时事件的一个完整可用的实现。

async def sleepy(identifier: str = "coroutine", count=5):
    for i in range(count):
        print(identifier, 'step', i + 1, 'at %.2f' % time.time())
        await asleep(0.1)

run(*(sleepy("coroutine %d" % j) for j in range(5)))

这将在五个协程中的每个协程之间进行协作切换,每个协程暂停0.1秒。即使事件循环是同步的,它仍然可以在0.5秒而不是2.5秒内执行工作。每个协程保持状态并独立运行。

3. I / O事件循环

支持 sleep 的事件循环适用于轮询。但是,等待文件句柄上的 I/O 可以更高效地完成:操作系统实现了 I/O,因此知道哪些句柄已准备就绪。理想情况下,事件循环应支持显式的“准备好进行 I/O”事件。

3.1。该select呼叫

Python 已经有一个接口,可以向操作系统查询 I/O 句柄的就绪状态。用要读取或写入的句柄调用它时,它返回已准备好读取或写入的句柄:

readable, writeable, _ = select.select(rlist, wlist, xlist, timeout)

例如,我们可以 open 一个文件用于写入,并等待其准备就绪:

write_target = open('/tmp/foo', 'w')
readable, writeable, _ = select.select([], [write_target], [])

select返回后,writeable包含我们的打开文件。

3.2。基本I / O事件

AsyncSleep请求类似,我们需要为I / O定义一个事件。使用底层select逻辑,事件必须引用可读对象-例如open文件。另外,我们存储要读取的数据量。

class AsyncRead:
    def __init__(self, file, amount=1):
        self.file = file
        self.amount = amount
        self._buffer = ''

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self
            # we only get here if ``read`` should not block
            self._buffer += self.file.read(1)
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.file, self.amount, len(self._buffer)
        )

AsyncSleep我们一样,我们大多只是存储底层系统调用所需的数据。这次__await__可以恢复多次-直到我们的需求amount被阅读为止。另外,我们return的I / O结果不只是恢复。

3.3。使用读取的I / O增强事件循环

事件循环的基础仍然是先前定义的 run。首先,我们需要跟踪读取请求。这不再是一个排序的时间表;我们只是将读取请求映射到协程。

# new
waiting_read = {}  # type: Dict[file, coroutine]

由于select.select采用了超时参数,因此可以代替time.sleep

# old
time.sleep(max(0.0, until - time.time()))
# new
readable, _, _ = select.select(list(waiting_read), [], [])

这将为我们提供所有可读文件-如果有的话,我们将运行相应的协程。如果没有,我们已经等待了足够长的时间来运行当前的协程。

# new - reschedule waiting coroutine, run readable coroutine
if readable:
    waiting.append((until, coroutine))
    waiting.sort()
    coroutine = waiting_read[readable[0]]

最后,我们必须实际侦听读取请求。

# new
if isinstance(command, AsyncSleep):
    ...
elif isinstance(command, AsyncRead):
    ...

3.4。把它放在一起

上面的内容有些简化。如果总是有可读内容,我们需要做一些切换,以免饿死休眠的协程。我们还需要处理既没有可读内容也没有可等待内容的情况。但是,最终结果仍然只有 30 行左右。

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    waiting_read = {}  # type: Dict[file, coroutine]
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting or waiting_read:
        # 2. wait until the next coroutine may run or read ...
        try:
            until, coroutine = waiting.pop(0)
        except IndexError:
            until, coroutine = float('inf'), None
            readable, _, _ = select.select(list(waiting_read), [], [])
        else:
            readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
        # ... and select the appropriate one
        if readable and time.time() < until:
            if until and coroutine:
                waiting.append((until, coroutine))
                waiting.sort()
            coroutine = waiting_read.pop(readable[0])
        # 3. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension ...
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
        # ... or register reads
        elif isinstance(command, AsyncRead):
            waiting_read[command.file] = coroutine

3.5。协同I / O

AsyncSleepAsyncRead并且run实现已全功能的睡眠和/或读取。与相同sleepy,我们可以定义一个帮助程序来测试阅读:

async def ready(path, amount=1024*32):
    print('read', path, 'at', '%d' % time.time())
    with open(path, 'rb') as file:
        result = await AsyncRead(file, amount)
    print('done', path, 'at', '%d' % time.time())
    print('got', len(result), 'B')

run(sleepy('background', 5), ready('/dev/urandom'))

运行此命令,我们可以看到我们的I / O与等待的任务交错:

id background round 1
read /dev/urandom at 1530721148
id background round 2
id background round 3
id background round 4
id background round 5
done /dev/urandom at 1530721148
got 1024 B

4.非阻塞I / O

虽然文件上的 I/O 能把这个概念讲清楚,但它并不真正适合 asyncio 这样的库:select 调用对文件总是立即返回,而 open 和 read 都可能无限期地阻塞。这会阻塞事件循环的所有协程,这很糟糕。像 aiofiles 这样的库使用线程和同步来伪造文件上的非阻塞 I/O 和事件。

但是,套接字确实允许无阻塞的I / O-并且它们固有的延迟使其变得更加关键。在事件循环中使用时,可以包装等待数据和重试而不会阻塞任何内容。

4.1。非阻塞I / O事件

与我们的 AsyncRead 类似,我们可以为套接字定义一个挂起并读取的事件。我们不使用文件,而是使用套接字,并且该套接字必须是非阻塞的。另外,我们的 __await__ 使用 socket.recv 代替 file.read。

class AsyncRecv:
    def __init__(self, connection, amount=1, read_buffer=1024):
        assert not connection.getblocking(), 'connection must be non-blocking for async recv'
        self.connection = connection
        self.amount = amount
        self.read_buffer = read_buffer
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self.connection.recv(self.read_buffer)
            except BlockingIOError:
                yield self
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.connection, self.amount, len(self._buffer)
        )

与 AsyncRead 相比,__await__ 执行的是真正的非阻塞 I/O。当有数据时,它总是读取。当没有可用数据时,它总是挂起。这意味着事件循环仅在我们执行有用工作时才被占用。

4.2。解除阻塞事件循环

就事件循环而言,变化不大。要监听的事件仍然与文件的情况相同:由 select 标记为就绪的文件描述符。

# old
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
# new
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
elif isinstance(command, AsyncRecv):
    waiting_read[command.connection] = coroutine

在这一点上,AsyncRead 和 AsyncRecv 显然是同一种事件。我们可以轻松地将它们重构为一个带有可交换 I/O 组件的事件。实际上,事件循环、协程和事件把调度器、任意中间代码和实际 I/O 清晰地分开了。

4.3。非阻塞I / O的丑陋一面

原则上,你此时应该做的是把 read 的逻辑复制成 AsyncRecv 的 recv。但是,现在这丑陋得多:当函数会在内核内部阻塞、转而把控制权交还给你时,你必须处理这些提前返回。例如,打开连接就比打开文件的代码长得多:

# file
file = open(path, 'rb')
# non-blocking socket
connection = socket.socket()
connection.setblocking(False)
# open without blocking - retry on failure
try:
    connection.connect((url, port))
except BlockingIOError:
    pass

长话短说,剩下的就是几十行异常处理。此时事件和事件循环已经起作用。

id background round 1
read localhost:25000 at 1530783569
read /dev/urandom at 1530783569
done localhost:25000 at 1530783569 got 32768 B
id background round 2
id background round 3
id background round 4
done /dev/urandom at 1530783569 got 4096 B
id background round 5

附录

github上的示例代码

Talking about async/await and asyncio is not the same thing. The first is a fundamental, low-level construct (coroutines) while the latter is a library using these constructs. Conversely, there is no single ultimate answer.

The following is a general description of how async/await and asyncio-like libraries work. That is, there may be other tricks on top (there are…) but they are inconsequential unless you build them yourself. The difference should be negligible unless you already know enough to not have to ask such a question.

1. Coroutines versus subroutines in a nut shell

Just like subroutines (functions, procedures, …), coroutines (generators, …) are an abstraction of call stack and instruction pointer: there is a stack of executing code pieces, and each is at a specific instruction.

The distinction of def versus async def is merely for clarity. The actual difference is return versus yield. From this, await or yield from take the difference from individual calls to entire stacks.

1.1. Subroutines

A subroutine represents a new stack level to hold local variables, and a single traversal of its instructions to reach an end. Consider a subroutine like this:

def subfoo(bar):
     qux = 3
     return qux * bar

When you run it, that means

  1. allocate stack space for bar and qux
  2. recursively execute the first statement and jump to the next statement
  3. once at a return, push its value to the calling stack
  4. clear the stack (1.) and instruction pointer (2.)

Notably, 4. means that a subroutine always starts at the same state. Everything exclusive to the function itself is lost upon completion. A function cannot be resumed, even if there are instructions after return.

root -\
  :    \- subfoo --\
  :/--<---return --/
  |
  V

1.2. Coroutines as persistent subroutines

A coroutine is like a subroutine, but can exit without destroying its state. Consider a coroutine like this:

 def cofoo(bar):
      qux = yield bar  # yield marks a break point
      return qux

When you run it, that means

  1. allocate stack space for bar and qux
  2. recursively execute the first statement and jump to the next statement
    1. once at a yield, push its value to the calling stack but store the stack and instruction pointer
    2. once calling into yield, restore stack and instruction pointer and push arguments to qux
  3. once at a return, push its value to the calling stack
  4. clear the stack (1.) and instruction pointer (2.)

Note the addition of 2.1 and 2.2 – a coroutine can be suspended and resumed at predefined points. This is similar to how a subroutine is suspended during calling another subroutine. The difference is that the active coroutine is not strictly bound to its calling stack. Instead, a suspended coroutine is part of a separate, isolated stack.

root -\
  :    \- cofoo --\
  :/--<+--yield --/
  |    :
  V    :

This means that suspended coroutines can be freely stored or moved between stacks. Any call stack that has access to a coroutine can decide to resume it.
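As a minimal sketch, a suspended coroutine (here a plain generator) is just an object; any code holding a reference may resume it:

def cofoo(bar):
    qux = yield bar
    return qux

coro = cofoo(3)        # nothing runs yet
print(next(coro))      # runs to the yield, prints 3
stored = [coro]        # store the suspended coroutine anywhere

try:
    stored.pop().send(5)   # any holder may resume it
except StopIteration as e:
    print(e.value)         # 5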

1.3. Traversing the call stack

So far, our coroutine only goes down the call stack with yield. A subroutine can go down and up the call stack with return and (). For completeness, coroutines also need a mechanism to go up the call stack. Consider a coroutine like this:

def wrap():
    yield 'before'
    yield from cofoo()
    yield 'after'

When you run it, that means it still allocates the stack and instruction pointer like a subroutine. When it suspends, that still is like storing a subroutine.

However, yield from does both. It suspends stack and instruction pointer of wrap and runs cofoo. Note that wrap stays suspended until cofoo finishes completely. Whenever cofoo suspends or something is sent, cofoo is directly connected to the calling stack.
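A quick demonstration, assuming a no-argument variant of cofoo so wrap() can call it directly:

def cofoo():
    qux = yield 'inner'
    return qux

def wrap():
    yield 'before'
    yield from cofoo()
    yield 'after'

w = wrap()
print(next(w))        # 'before'
print(next(w))        # 'inner' - comes from cofoo through wrap
print(w.send('hi'))   # cofoo returns 'hi'; wrap resumes and yields 'after'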

1.4. Coroutines all the way down

As established, yield from allows connecting two scopes across another intermediate one. When applied recursively, that means the top of the stack can be connected to the bottom of the stack.

root -\
  :    \-> coro_a -yield-from-> coro_b --\
  :/ <-+------------------------yield ---/
  |    :
  :\ --+-- coro_a.send----------yield ---\
  :                             coro_b <-/

Note that root and coro_b do not know about each other. This makes coroutines much cleaner than callbacks: coroutines are still built on a 1:1 relation like subroutines. Coroutines suspend and resume their entire existing execution stack up until a regular call point.

Notably, root could have an arbitrary number of coroutines to resume. Yet, it can never resume more than one at the same time. Coroutines of the same root are concurrent but not parallel!

1.5. Python’s async and await

The explanation has so far explicitly used the yield and yield from vocabulary of generators – the underlying functionality is the same. The new Python 3.5 syntax async and await exists mainly for clarity.

def foo():  # subroutine?
     return None

def foo():  # coroutine?
     yield from foofoo()  # generator? coroutine?

async def foo():  # coroutine!
     await foofoo()  # coroutine!
     return None

The async for and async with statements are needed because you would break the yield from/await chain with the bare for and with statements.

2. Anatomy of a simple event loop

By itself, a coroutine has no concept of yielding control to another coroutine. It can only yield control to the caller at the bottom of a coroutine stack. This caller can then switch to another coroutine and run it.

This root node of several coroutines is commonly an event loop: on suspension, a coroutine yields an event on which it wants resume. In turn, the event loop is capable of efficiently waiting for these events to occur. This allows it to decide which coroutine to run next, or how to wait before resuming.

Such a design implies that there is a set of pre-defined events that the loop understands. Several coroutines await each other, until finally an event is awaited. This event can communicate directly with the event loop by yielding control.

loop -\
  :    \-> coroutine --await--> event --\
  :/ <-+----------------------- yield --/
  |    :
  |    :  # loop waits for event to happen
  |    :
  :\ --+-- send(reply) -------- yield --\
  :        coroutine <--yield-- event <-/

The key is that coroutine suspension allows the event loop and events to directly communicate. The intermediate coroutine stack does not require any knowledge about which loop is running it, nor how events work.

2.1.1. Events in time

The simplest event to handle is reaching a point in time. This is a fundamental block of threaded code as well: a thread repeatedly sleeps until a condition is true. However, a regular sleep blocks execution by itself – we want other coroutines to not be blocked. Instead, we want to tell the event loop when it should resume the current coroutine stack.

2.1.2. Defining an Event

An event is simply a value we can identify – be it via an enum, a type or other identity. We can define this with a simple class that stores our target time. In addition to storing the event information, we can allow awaiting an instance of the class directly.

class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until: float):
        self.until = until

    # used whenever someone ``await``s an instance of this Event
    def __await__(self):
        # yield this Event to the loop
        yield self
    
    def __repr__(self):
        return '%s(until=%.1f)' % (self.__class__.__name__, self.until)

This class only stores the event – it does not say how to actually handle it.

The only special feature is __await__ – it is what the await keyword looks for. Practically, it is an iterator, but one not available to the regular iteration machinery.

2.2.1. Awaiting an event

Now that we have an event, how do coroutines react to it? We should be able to express the equivalent of sleep by awaiting our event. To better see what is going on, we wait twice for half the time:

import time

async def asleep(duration: float):
    """await that ``duration`` seconds pass"""
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)

We can directly instantiate and run this coroutine. Similar to a generator, using coroutine.send runs the coroutine until it yields a result.

coroutine = asleep(100)
while True:
    print(coroutine.send(None))
    time.sleep(0.1)

This gives us two AsyncSleep events and then a StopIteration when the coroutine is done. Notice that the only delay is from time.sleep in the loop! Each AsyncSleep only stores an offset from the current time.

2.2.2. Event + Sleep

At this point, we have two separate mechanisms at our disposal:

  • AsyncSleep Events that can be yielded from inside a coroutine
  • time.sleep that can wait without impacting coroutines

Notably, these two are orthogonal: neither one affects or triggers the other. As a result, we can come up with our own strategy to sleep to meet the delay of an AsyncSleep.

2.3. A naive event loop

If we have several coroutines, each can tell us when it wants to be woken up. We can then wait until the first of them wants to be resumed, then for the one after, and so on. Notably, at each point we only care about which one is next.

This makes for a straightforward scheduling:

  1. sort coroutines by their desired wake up time
  2. pick the first that wants to wake up
  3. wait until this point in time
  4. run this coroutine
  5. repeat from 1.

A trivial implementation does not need any advanced concepts. A list allows sorting coroutines by their wake-up time. Waiting is a regular time.sleep. Running coroutines works just like before with coroutine.send.

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    # store wake-up-time and coroutines
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting:
        # 2. pick the first coroutine that wants to wake up
        until, coroutine = waiting.pop(0)
        # 3. wait until this point in time
        time.sleep(max(0.0, until - time.time()))
        # 4. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])

Of course, this has ample room for improvement. We can use a heap for the wait queue or a dispatch table for events. We could also fetch return values from the StopIteration and assign them to the coroutine. However, the fundamental principle remains the same.
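For instance, a minimal sketch of the heap-based wait queue, assuming the same AsyncSleep events as above (run_heap is an illustrative name):

import heapq, itertools, time

def run_heap(*coroutines):
    """Variant of ``run`` using a heap instead of re-sorting a list"""
    counter = itertools.count()  # tie-breaker: coroutines are not comparable
    waiting = [(0, next(counter), coroutine) for coroutine in coroutines]
    heapq.heapify(waiting)
    while waiting:
        until, _, coroutine = heapq.heappop(waiting)
        time.sleep(max(0.0, until - time.time()))
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        if isinstance(command, AsyncSleep):
            heapq.heappush(waiting, (command.until, next(counter), coroutine))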

2.4. Cooperative Waiting

The AsyncSleep event and run event loop are a fully working implementation of timed events.

async def sleepy(identifier: str = "coroutine", count=5):
    for i in range(count):
        print(identifier, 'step', i + 1, 'at %.2f' % time.time())
        await asleep(0.1)

run(*(sleepy("coroutine %d" % j) for j in range(5)))

This cooperatively switches between each of the five coroutines, suspending each for 0.1 seconds. Even though the event loop is synchronous, it still executes the work in 0.5 seconds instead of 2.5 seconds. Each coroutine holds state and acts independently.

3. I/O event loop

An event loop that supports sleep is suitable for polling. However, waiting for I/O on a file handle can be done more efficiently: the operating system implements I/O and thus knows which handles are ready. Ideally, an event loop should support an explicit “ready for I/O” event.

3.1. The select call

Python already has an interface to query the OS for ready I/O handles. When called with handles to read or write, it returns the handles ready to read or write:

readable, writeable, _ = select.select(rlist, wlist, xlist, timeout)

For example, we can open a file for writing and wait for it to be ready:

write_target = open('/tmp/foo', 'w')
readable, writeable, _ = select.select([], [write_target], [])

Once select returns, writeable contains our open file.

3.2. Basic I/O event

Similar to the AsyncSleep request, we need to define an event for I/O. With the underlying select logic, the event must refer to a readable object – say an open file. In addition, we store how much data to read.

class AsyncRead:
    def __init__(self, file, amount=1):
        self.file = file
        self.amount = amount
        self._buffer = ''

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self
            # we only get here if ``read`` should not block
            self._buffer += self.file.read(1)
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.file, self.amount, len(self._buffer)
        )

As with AsyncSleep we mostly just store the data required for the underlying system call. This time, __await__ is capable of being resumed multiple times – until our desired amount has been read. In addition, we return the I/O result instead of just resuming.

3.3. Augmenting an event loop with read I/O

The basis for our event loop is still the run defined previously. First, we need to track the read requests. This is no longer a sorted schedule; we only map read requests to coroutines.

# new
waiting_read = {}  # type: Dict[file, coroutine]

Since select.select takes a timeout parameter, we can use it in place of time.sleep.

# old
time.sleep(max(0.0, until - time.time()))
# new
readable, _, _ = select.select(list(waiting_read), [], [])

This gives us all readable files – if there are any, we run the corresponding coroutine. If there are none, we have waited long enough for our current coroutine to run.

# new - reschedule waiting coroutine, run readable coroutine
if readable:
    waiting.append((until, coroutine))
    waiting.sort()
    coroutine = waiting_read[readable[0]]

Finally, we have to actually listen for read requests.

# new
if isinstance(command, AsyncSleep):
    ...
elif isinstance(command, AsyncRead):
    ...

3.4. Putting it together

The above was a bit of a simplification. We need to do some switching to not starve sleeping coroutines if we can always read. We need to handle having nothing to read or nothing to wait for. However, the end result still fits into 30 LOC.

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    waiting_read = {}  # type: Dict[file, coroutine]
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting or waiting_read:
        # 2. wait until the next coroutine may run or read ...
        try:
            until, coroutine = waiting.pop(0)
        except IndexError:
            until, coroutine = float('inf'), None
            readable, _, _ = select.select(list(waiting_read), [], [])
        else:
            readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
        # ... and select the appropriate one
        if readable and time.time() < until:
            if until and coroutine:
                waiting.append((until, coroutine))
                waiting.sort()
            coroutine = waiting_read.pop(readable[0])
        # 3. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension ...
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
        # ... or register reads
        elif isinstance(command, AsyncRead):
            waiting_read[command.file] = coroutine

3.5. Cooperative I/O

The AsyncSleep, AsyncRead and run implementations are now fully functional to sleep and/or read. Same as for sleepy, we can define a helper to test reading:

async def ready(path, amount=1024*32):
    print('read', path, 'at', '%d' % time.time())
    with open(path, 'rb') as file:
        result = await AsyncRead(file, amount)
    print('done', path, 'at', '%d' % time.time())
    print('got', len(result), 'B')

run(sleepy('background', 5), ready('/dev/urandom'))

Running this, we can see that our I/O is interleaved with the waiting task:

id background round 1
read /dev/urandom at 1530721148
id background round 2
id background round 3
id background round 4
id background round 5
done /dev/urandom at 1530721148
got 1024 B

4. Non-Blocking I/O

While I/O on files gets the concept across, it is not really suitable for a library like asyncio: the select call always returns for files, and both open and read may block indefinitely. This blocks all coroutines of an event loop – which is bad. Libraries like aiofiles use threads and synchronization to fake non-blocking I/O and events on files.

However, sockets do allow for non-blocking I/O – and their inherent latency makes it much more critical. When used in an event loop, waiting for data and retrying can be wrapped without blocking anything.

4.1. Non-Blocking I/O event

Similar to our AsyncRead, we can define a suspend-and-read event for sockets. Instead of taking a file, we take a socket – which must be non-blocking. Also, our __await__ uses socket.recv instead of file.read.

class AsyncRecv:
    def __init__(self, connection, amount=1, read_buffer=1024):
        assert not connection.getblocking(), 'connection must be non-blocking for async recv'
        self.connection = connection
        self.amount = amount
        self.read_buffer = read_buffer
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self.connection.recv(self.read_buffer)
            except BlockingIOError:
                yield self
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.connection, self.amount, len(self._buffer)
        )

In contrast to AsyncRead, __await__ performs truly non-blocking I/O. When data is available, it always reads. When no data is available, it always suspends. That means the event loop is only blocked while we perform useful work.
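As a quick check, a hedged sketch driving AsyncRecv with the run loop from above; socket.socketpair gives us two connected sockets without a server:

import socket

async def recv_demo():
    a, b = socket.socketpair()   # two connected local sockets
    b.setblocking(False)         # AsyncRecv asserts non-blocking mode
    a.sendall(b'hello')
    print(await AsyncRecv(b, amount=5))

run(recv_demo())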

4.2. Un-Blocking the event loop

As far as the event loop is concerned, nothing changes much. The event to listen for is still the same as for files – a file descriptor marked ready by select.

# old
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
# new
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
elif isinstance(command, AsyncRecv):
    waiting_read[command.connection] = coroutine

At this point, it should be obvious that AsyncRead and AsyncRecv are the same kind of event. We could easily refactor them to be one event with an exchangeable I/O component. In effect, the event loop, coroutines and events cleanly separate a scheduler, arbitrary intermediate code and the actual I/O.

4.3. The ugly side of non-blocking I/O

In principle, what you should do at this point is replicate the logic of read as a recv for AsyncRecv. However, this is much uglier now – you have to handle early returns whenever a function would block inside the kernel and instead yields control to you. For example, opening a connection takes much more code than opening a file:

# file
file = open(path, 'rb')
# non-blocking socket
connection = socket.socket()
connection.setblocking(False)
# open without blocking - retry on failure
try:
    connection.connect((url, port))
except BlockingIOError:
    pass

Long story short, what remains is a few dozen lines of Exception handling. The events and event loop already work at this point.

id background round 1
read localhost:25000 at 1530783569
read /dev/urandom at 1530783569
done localhost:25000 at 1530783569 got 32768 B
id background round 2
id background round 3
id background round 4
done /dev/urandom at 1530783569 got 4096 B
id background round 5

Addendum

Example code at github


回答 2

从概念上讲,您对 coro 的脱糖是正确的,但略有不完整。

await 不会无条件地挂起,只有在遇到阻塞调用时才挂起。它如何知道某个调用会阻塞?这由被 await 的代码决定。例如,套接字读取的一个可等待实现可以脱糖为:

def read(sock, n):
    # sock must be in non-blocking mode
    try:
        return sock.recv(n)
    except EWOULDBLOCK:
        event_loop.add_reader(sock.fileno, current_task())
        return SUSPEND

在真实的 asyncio 中,等效的代码会修改一个 Future 的状态,而不是返回魔术值,但概念是相同的。适当地适配成类似生成器的对象后,以上代码就可以被 await 了。

在呼叫方,当协程包含:

data = await read(sock, 1024)

它大致脱糖为:

data = read(sock, 1024)
if data is SUSPEND:
    return SUSPEND
self.pos += 1
self.parts[self.pos](...)

熟悉生成器的人往往会用 yield from 来描述上述过程,它会自动完成挂起。

挂起链一直向上延续到事件循环,它注意到协程已挂起,将其从可运行集合中删除,然后继续执行可运行的协程(如果有)。如果没有协程可运行,则循环在 select() 中等待,直到某个协程感兴趣的文件描述符准备好进行 IO。(事件循环维护着文件描述符到协程的映射。)

在上面的示例中,一旦select()告知事件循环sock可读,它将重新添加coro到可运行集,因此将从暂停点继续执行。

换一种说法:

  1. 默认情况下,所有操作都在同一线程中发生。

  2. 事件循环负责安排协程,并在协程等待(通常会阻塞或超时的IO调用)准备就绪时将其唤醒。

为了深入了解协程驱动事件循环,我推荐Dave Beazley的演讲,他在现场观众面前演示了从头开始编写事件循环的过程。

Your coro desugaring is conceptually correct, but slightly incomplete.

await doesn’t suspend unconditionally, but only if it encounters a blocking call. How does it know that a call is blocking? This is decided by the code being awaited. For example, an awaitable implementation of socket read could be desugared to:

def read(sock, n):
    # sock must be in non-blocking mode
    try:
        return sock.recv(n)
    except EWOULDBLOCK:
        event_loop.add_reader(sock.fileno, current_task())
        return SUSPEND

In real asyncio the equivalent code modifies the state of a Future instead of returning magic values, but the concept is the same. When appropriately adapted to a generator-like object, the above code can be awaited.

On the caller side, when your coroutine contains:

data = await read(sock, 1024)

It desugars into something close to:

data = read(sock, 1024)
if data is SUSPEND:
    return SUSPEND
self.pos += 1
self.parts[self.pos](...)

People familiar with generators tend to describe the above in terms of yield from which does the suspension automatically.
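A hedged sketch of that generator view: the magic SUSPEND value is replaced by yielding until the socket is readable, and the real BlockingIOError stands in for the pseudocode's EWOULDBLOCK ('wait_read' is an illustrative event tag, not an asyncio API):

def read(sock, n):
    # sock must be in non-blocking mode
    while True:
        try:
            return sock.recv(n)
        except BlockingIOError:
            yield ('wait_read', sock)   # suspend; the loop resumes us later

def handler(sock):
    data = yield from read(sock, 1024)  # suspension propagates automatically
    return data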

The suspension chain continues all the way up to the event loop, which notices that the coroutine is suspended, removes it from the runnable set, and goes on to execute coroutines that are runnable, if any. If no coroutines are runnable, the loop waits in select() until a file descriptor a coroutine is interested in becomes ready for IO. (The event loop maintains a file-descriptor-to-coroutine mapping.)

In the above example, once select() tells the event loop that sock is readable, it will re-add coro to the runnable set, so it will be continued from the point of suspension.

In other words:

  1. Everything happens in the same thread by default.

  2. The event loop is responsible for scheduling the coroutines and waking them up when whatever they were waiting for (typically an IO call that would normally block, or a timeout) becomes ready.

For insight on coroutine-driving event loops, I recommend this talk by Dave Beazley, where he demonstrates coding an event loop from scratch in front of live audience.


回答 3

归结为 asyncio 要解决的两个主要挑战:

  • 如何在单个线程中执行多个I / O?
  • 如何实现协作式多任务处理?

关于第一点的答案已经存在了很长一段时间,被称为选择循环。在python中,它是在选择器模块中实现的。

第二个问题与协程的概念有关,即可以停止执行并在以后恢复的函数。在 Python 中,协程是使用生成器和 yield from 语句实现的。这就是隐藏在 async/await 语法后面的东西。

这个答案中有更多资源。


编辑:解决您对goroutines的评论:

在 asyncio 中,与 goroutine 最接近的等效物实际上不是协程,而是任务(请参阅文档中的区别)。在 Python 中,协程(或生成器)对事件循环或 I/O 的概念一无所知。它只是一个可以用 yield 停止执行、同时保持其当前状态的函数,因此可以在以后恢复。yield from 语法允许以透明的方式将它们链接起来。

现在,在 asyncio 的任务中,位于链最底部的协程最终总是产生一个 future。然后,这个 future 会向上冒泡到事件循环,并被整合进内部机制。当 future 被其他内部回调设置为完成时,事件循环可以通过把 future 发送回协程链来恢复任务。


编辑:解决您帖子中的一些问题:

在这种情况下,I / O实际如何发生?在单独的线程中?整个解释器是否已暂停并且I / O在解释器外部进行?

不,线程中什么也没有发生。I/O 始终由事件循环管理,主要是通过文件描述符。但是,这些文件描述符的注册通常被高级协程隐藏起来,替您完成了脏活。

I / O到底是什么意思?如果我的python过程称为C open()过程,然后它向内核发送了中断,放弃了对它的控制,那么Python解释器如何知道这一点并能够继续运行其他代码,而内核代码则执行实际的I / O,直到唤醒原来发送中断的Python过程?原则上,Python解释器如何知道这种情况?

I/O 指任何阻塞调用。在 asyncio 中,所有 I/O 操作都应经过事件循环,因为正如您所说,事件循环无法知道某段同步代码中正在执行阻塞调用。这意味着您不应该在协程的上下文中使用同步的 open。相反,请使用像 aiofiles 这样的专用库,它提供了 open 的异步版本。

It all boils down to the two main challenges that asyncio is addressing:

  • How to perform multiple I/O in a single thread?
  • How to implement cooperative multitasking?

The answer to the first point has been around for a long while and is called a select loop. In python, it is implemented in the selectors module.
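For illustration, a minimal polling loop built on selectors might look like this (the server setup is illustrative, not from the original answer):

import selectors
import socket

sel = selectors.DefaultSelector()

server = socket.socket()
server.bind(('localhost', 0))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, data='accept')

while True:  # runs forever; a real loop would have an exit condition
    for key, events in sel.select(timeout=1.0):
        if key.data == 'accept':
            conn, _ = key.fileobj.accept()
            conn.setblocking(False)
            sel.register(conn, selectors.EVENT_READ, data='read')
        else:
            print(key.fileobj.recv(1024))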

The second question is related to the concept of coroutine, i.e. functions that can stop their execution and be restored later on. In python, coroutines are implemented using generators and the yield from statement. That’s what is hiding behind the async/await syntax.

More resources in this answer.


EDIT: Addressing your comment about goroutines:

The closest equivalent to a goroutine in asyncio is actually not a coroutine but a task (see the difference in the documentation). In python, a coroutine (or a generator) knows nothing about the concepts of event loop or I/O. It simply is a function that can stop its execution using yield while keeping its current state, so it can be restored later on. The yield from syntax allows for chaining them in a transparent way.

Now, within an asyncio task, the coroutine at the very bottom of the chain always ends up yielding a future. This future then bubbles up to the event loop, and gets integrated into the inner machinery. When the future is set to done by some other inner callback, the event loop can restore the task by sending the future back into the coroutine chain.


EDIT: Addressing some of the questions in your post:

How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended and I/O happens outside the interpreter?

No, nothing happens in a thread. I/O is always managed by the event loop, mostly through file descriptors. However, the registration of those file descriptors is usually hidden by high-level coroutines, doing the dirty work for you.

What exactly is meant by I/O? If my python procedure called C open() procedure, and it in turn sent interrupt to kernel, relinquishing control to it, how does Python interpreter know about this and is able to continue running some other code, while kernel code does the actual I/O and until it wakes up the Python procedure which sent the interrupt originally? How can Python interpreter in principle, be aware of this happening?

An I/O is any blocking call. In asyncio, all the I/O operations should go through the event loop, because as you said, the event loop has no way to be aware that a blocking call is being performed in some synchronous code. That means you're not supposed to use a synchronous open within the context of a coroutine. Instead, use a dedicated library such as aiofiles which provides an asynchronous version of open.
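For example, a minimal sketch with aiofiles (assuming it is installed; the path is illustrative):

import asyncio
import aiofiles

async def main():
    async with aiofiles.open('/etc/hostname') as f:  # non-blocking open
        print(await f.read())                        # read happens off-loop

asyncio.run(main())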


virtualenvwrapper和Python 3

问题:virtualenvwrapper和Python 3

我在ubuntu lucid上安装了python 3.3.1并成功创建了virtualenv,如下所示

virtualenv envpy331 --python=/usr/local/bin/python3.3

这在我的主目录下创建了一个文件夹 envpy331。

我也安装了 virtualenvwrapper。但是文档中只支持 Python 2.4-2.7 版本。有人试过用它管理 python3 的 virtualenv 吗?如果有,您能告诉我怎么做吗?

I installed python 3.3.1 on ubuntu lucid and successfully created a virtualenv as below

virtualenv envpy331 --python=/usr/local/bin/python3.3

this created a folder envpy331 on my home dir.

I also have virtualenvwrapper installed. But in the docs only versions 2.4-2.7 of Python are supported. Has anyone tried to organize the python3 virtualenv? If so, can you tell me how?


回答 0

virtualenvwrapper 的最新版本是在 Python 3.2 下测试的。它很有可能也适用于 Python 3.3。

The latest version of virtualenvwrapper is tested under Python3.2. Chances are good it will work with Python3.3 too.


回答 1

如果您已经安装了python3以及virtualenvwrapper,那么在虚拟环境中使用python3的唯一操作就是使用以下命令创建环境:

which python3 #Output: /usr/bin/python3
mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

或者,(至少在使用brew的OSX上):

mkvirtualenv --python=`which python3` nameOfEnvironment

开始使用该环境后,您会看到只要键入 python,就会使用 python3。

If you already have python3 installed as well virtualenvwrapper the only thing you would need to do to use python3 with the virtual environment is creating an environment using:

which python3 #Output: /usr/bin/python3
mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

Or, (at least on OSX using brew):

mkvirtualenv --python=`which python3` nameOfEnvironment

Start using the environment and you’ll see that as soon as you type python you’ll start using python3


回答 2

您可以让 virtualenvwrapper 使用自定义的 Python 二进制文件,而不是运行 virtualenvwrapper 时所用的那个。为此,您需要使用 virtualenv 所读取的 VIRTUALENV_PYTHON 变量:

$ export VIRTUALENV_PYTHON=/usr/bin/python3
$ mkvirtualenv -a myproject myenv
Running virtualenv with interpreter /usr/bin/python3
New python executable in myenv/bin/python3
Also creating executable in myenv/bin/python
(myenv)$ python
Python 3.2.3 (default, Oct 19 2012, 19:53:16) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

You can make virtualenvwrapper use a custom Python binary instead of the one virtualenvwrapper is run with. To do that you need to use VIRTUALENV_PYTHON variable which is utilized by virtualenv:

$ export VIRTUALENV_PYTHON=/usr/bin/python3
$ mkvirtualenv -a myproject myenv
Running virtualenv with interpreter /usr/bin/python3
New python executable in myenv/bin/python3
Also creating executable in myenv/bin/python
(myenv)$ python
Python 3.2.3 (default, Oct 19 2012, 19:53:16) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

回答 3

virtualenvwrapper现在允许您指定不带路径的python可执行文件。

因此(至少在OSX上)mkvirtualenv --python=python3 nameOfEnvironment就足够了。

virtualenvwrapper now lets you specify the python executable without the path.

So (on OSX at least) mkvirtualenv --python=python3 nameOfEnvironment will suffice.


回答 4

在 Ubuntu 上,使用 mkvirtualenv -p python3 env_name 即可用 python3 加载 virtualenv。

在环境内部,用于python --version验证。

On Ubuntu, using mkvirtualenv -p python3 env_name loads the virtualenv with python3.

Inside the env, use python --version to verify.


回答 5

您可以将其添加到您的.bash_profile或类似文件中:

alias mkvirtualenv3='mkvirtualenv --python=`which python3`'

然后在要创建python 3环境时使用mkvirtualenv3代替mkvirtualenv

You can add this to your .bash_profile or similar:

alias mkvirtualenv3='mkvirtualenv --python=`which python3`'

Then use mkvirtualenv3 instead of mkvirtualenv when you want to create a python 3 environment.


回答 6

我发现运行

export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3

和

export VIRTUALENVWRAPPER_VIRTUALENV=/usr/bin/virtualenv-3.4

这两条命令,可以在 Ubuntu 的命令行中强制 mkvirtualenv 使用 python3 和 virtualenv-3.4。之后仍然需要执行

mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

来创建环境。这里假设您的 python3 位于 /usr/bin/python3,virtualenv-3.4 位于 /usr/local/bin/virtualenv-3.4。

I find that running

export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3

and

export VIRTUALENVWRAPPER_VIRTUALENV=/usr/bin/virtualenv-3.4

in the command line on Ubuntu forces mkvirtualenv to use python3 and virtualenv-3.4. One still has to do

mkvirtualenv --python=/usr/bin/python3 nameOfEnvironment

to create the environment. This is assuming that you have python3 in /usr/bin/python3 and virtualenv-3.4 in /usr/local/bin/virtualenv-3.4.


回答 7

virtualenvwrapper 的 bitbucket 问题跟踪器上的这篇文章可能很有趣。那里提到,virtualenvwrapper 的大多数功能都适用于 Python 3.3 中的 venv 虚拟环境。

This post on the bitbucket issue tracker of virtualenvwrapper may be of interest. It is mentioned there that most of virtualenvwrapper’s functions work with the venv virtual environments in Python 3.3.


回答 8

我像这样把 export VIRTUALENV_PYTHON=/usr/bin/python3 添加到我的 ~/.bashrc:

export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENV_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

然后运行 source .bashrc

之后您可以为每个新环境指定 python 版本:mkvirtualenv --python=python2 env_name

I added export VIRTUALENV_PYTHON=/usr/bin/python3 to my ~/.bashrc like this:

export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENV_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

then run source .bashrc

and you can specify the python version for each new env mkvirtualenv --python=python2 env_name


Django模型“未声明显式的app_label”

问题:Django模型“未声明显式的app_label”

我已经束手无策了。经过十多个小时(可能更多)的故障排除,我以为终于搞定了,但是后来我得到了:

Model class django.contrib.contenttypes.models.ContentType doesn't declare an explicit app_label 

网络上关于此问题的信息非常少,而且现有的解决方案都没有解决我的问题。任何建议将不胜感激。

我正在使用Python 3.4和Django 1.10。

在我的settings.py中:

INSTALLED_APPS = [
    'DeleteNote.apps.DeletenoteConfig',
    'LibrarySync.apps.LibrarysyncConfig',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]

我的apps.py文件如下所示:

from django.apps import AppConfig


class DeletenoteConfig(AppConfig):
    name = 'DeleteNote'

from django.apps import AppConfig


class LibrarysyncConfig(AppConfig):
    name = 'LibrarySync'

I’m at wit’s end. After a dozen hours of troubleshooting, probably more, I thought I was finally in business, but then I got:

Model class django.contrib.contenttypes.models.ContentType doesn't declare an explicit app_label 

There is SO LITTLE info on this on the web, and no solution out there has resolved my issue. Any advice would be tremendously appreciated.

I’m using Python 3.4 and Django 1.10.

From my settings.py:

INSTALLED_APPS = [
    'DeleteNote.apps.DeletenoteConfig',
    'LibrarySync.apps.LibrarysyncConfig',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]

And my apps.py files look like this:

from django.apps import AppConfig


class DeletenoteConfig(AppConfig):
    name = 'DeleteNote'

and

from django.apps import AppConfig


class LibrarysyncConfig(AppConfig):
    name = 'LibrarySync'

回答 0

您是不是忘了把应用程序的名称写进设置文件?myAppNameConfig 是 manage.py startapp myAppName 命令在 apps.py 中生成的默认类,其中 myAppName 是您的应用的名称。

settings.py

INSTALLED_APPS = [
'myAppName.apps.myAppNameConfig',
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
]

这样,设置文件就能知道您的应用叫什么。之后,您可以通过在 apps.py 文件中添加以下代码来更改它的显示名称:

myAppName / apps.py

class myAppNameConfig(AppConfig):
    name = 'myAppName'
    verbose_name = 'A Much Better Name'

Are you missing putting your application name into the settings file? The myAppNameConfig is the default class generated at apps.py by the manage.py startapp myAppName command, where myAppName is the name of your app.

settings.py

INSTALLED_APPS = [
'myAppName.apps.myAppNameConfig',
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
]

This way, the settings file finds out what you want to call your application. You can change how it looks later by adding the following code in the apps.py file:

myAppName/apps.py

class myAppNameConfig(AppConfig):
    name = 'myAppName'
    verbose_name = 'A Much Better Name'

回答 1

我遇到了相同的错误,而且不知道如何解决。我花了很多时间才注意到,我在与 django 的 manage.py 相同的目录下有一个 __init__.py 文件。

之前:

|-- myproject
  |-- __init__.py
  |-- manage.py
  |-- myproject
    |-- ...
  |-- app1
    |-- models.py
  |-- app2
    |-- models.py

后:

|-- myproject
  |-- manage.py
  |-- myproject
    |-- ...
  |-- app1
    |-- models.py
  |-- app2
    |-- models.py

收到这个“未声明显式 app_label”错误让人非常困惑,但删除这个 __init__.py 文件解决了我的问题。

I got the same error and I didn't know how to figure out the problem. It took me many hours to notice that I had an __init__.py in the same directory as django's manage.py.

Before:

|-- myproject
  |-- __init__.py
  |-- manage.py
  |-- myproject
    |-- ...
  |-- app1
    |-- models.py
  |-- app2
    |-- models.py

After:

|-- myproject
  |-- manage.py
  |-- myproject
    |-- ...
  |-- app1
    |-- models.py
  |-- app2
    |-- models.py

It is quite confusing that you get this "doesn't declare an explicit app_label" error. But deleting this __init__.py file solved my problem.


回答 2

使用PyCharm运行测试时,我遇到了完全相同的错误。我已经通过显式设置DJANGO_SETTINGS_MODULE环境变量来修复它。如果您使用的是PyCharm,只需点击编辑配置按钮,然后选择环境变量

将变量设置为your_project_name.settings,这应该可以解决问题。

发生此错误似乎是因为 PyCharm 用它自己的 manage.py 运行测试。

I had exactly the same error when running tests with PyCharm. I’ve fixed it by explicitly setting DJANGO_SETTINGS_MODULE environment variable. If you’re using PyCharm, just hit Edit Configurations button and choose Environment Variables.

Set the variable to your_project_name.settings and that should fix the thing.

It seems like this error occurs, because PyCharm runs tests with its own manage.py.
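For reference, the same fix can be applied in code before anything imports models; your_project_name is a placeholder for your actual settings package:

import os
import django

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "your_project_name.settings")
django.setup()  # must run before importing models outside manage.py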


回答 3

我在使用 ./manage.py shell 时遇到了这个错误,当时我不小心从项目根目录一级导入了模块

# don't do this
from project.someapp.someModule import something_using_a_model
# do this
from someapp.someModule import something_using_a_model

something_using_a_model()

I got this one when I used ./manage.py shell and then accidentally imported from the root project-level directory

# don't do this
from project.someapp.someModule import something_using_a_model
# do this
from someapp.someModule import something_using_a_model

something_using_a_model()

回答 4

作为使用 Python 3 的新手,我发现这可能是导入错误而不是 Django 错误

错误:

from someModule import someClass

对:

from .someModule import someClass

这种情况发生在几天前,但我真的无法复现它……我认为只有刚接触 Django 的人才可能遇到。以下是我记得的情况:

尝试在admin.py中注册模型:

from django.contrib import admin
from user import User
admin.site.register(User)

尝试运行服务器,错误看起来像这样

some lines...
File "/path/to/admin.py" ,line 6
tell you there is an import error
some lines...
Model class django.contrib.contenttypes.models.ContentType doesn't declare an explicit app_label

将 user 改为 .user,问题解决了

As a noob using Python 3, I found it might be an import error instead of a Django error.

wrong:

from someModule import someClass

right:

from .someModule import someClass

This happened a few days ago but I really can't reproduce it… I think only people new to Django may encounter this. Here's what I remember:

try to register a model in admin.py:

from django.contrib import admin
from user import User
admin.site.register(User)

try to run the server; the error looks like this

some lines...
File "/path/to/admin.py" ,line 6
tell you there is an import error
some lines...
Model class django.contrib.contenttypes.models.ContentType doesn't declare an explicit app_label

change user to .user ,problem solved


回答 5

我刚才有同样的问题。我通过在应用程序名称上添加命名空间来修复我的问题。希望有人觉得这有帮助。

apps.py

from django.apps import AppConfig    

class SalesClientConfig(AppConfig):
        name = 'portal.sales_client'
        verbose_name = 'Sales Client'

I had the same problem just now. I've fixed mine by adding a namespace to the app name. Hope someone finds this helpful.

apps.py

from django.apps import AppConfig    

class SalesClientConfig(AppConfig):
        name = 'portal.sales_client'
        verbose_name = 'Sales Client'

回答 6

我在测试中导入模型时遇到此错误,即鉴于此Django项目结构:

|-- myproject
    |-- manage.py
    |-- myproject
    |-- myapp
        |-- models.py  # defines model: MyModel
        |-- tests
            |-- test_models.py

test_models.pyMyModel以这种方式导入的文件中:

from models import MyModel

以下面这种方式导入后,问题就解决了:

from myapp.models import MyModel

希望这可以帮助!

PS:也许这有点晚了,但是我没有在其他答案中找到如何解决我代码中这个问题的方法,所以我想分享我的解决方案。

I got this error on importing models in tests, i.e. given this Django project structure:

|-- myproject
    |-- manage.py
    |-- myproject
    |-- myapp
        |-- models.py  # defines model: MyModel
        |-- tests
            |-- test_models.py

in file test_models.py I imported MyModel in this way:

from models import MyModel

The problem was fixed when it was imported this way:

from myapp.models import MyModel

Hope this helps!

PS: Maybe this is a bit late, but I did not find in other answers how to solve this problem in my code and I want to share my solution.


回答 7

在反复遇到这个问题并不断回到这个问题之后,我想分享一下我的问题所在。

@Xeberdee 说的都是正确的,请照着做,看看是否能解决问题;如果不能,以下是我的问题:

在我的apps.py中,这就是我拥有的:

class AlgoExplainedConfig(AppConfig):
    name = 'algo_explained'
    verbose_name = "Explain_Algo"
    ....

我所做的就是在我的应用名称之前添加了项目名称,如下所示:

class AlgoExplainedConfig(AppConfig):
    name = 'algorithms_explained.algo_explained'
    verbose_name = "Explain_Algo"

这样就解决了我的问题,之后我就可以运行 makemigrations 和 migrate 命令了!祝好运

After repeatedly running into this issue and coming back to this question, I thought I'd share what my problem was.

Everything that @Xeberdee said is correct, so follow that and see if it solves the issue; if not, this was my issue:

In my apps.py this is what I had:

class AlgoExplainedConfig(AppConfig):
    name = 'algo_explained'
    verbose_name = "Explain_Algo"
    ....

And all I did was I added the project name in front of my app name like this:

class AlgoExplainedConfig(AppConfig):
    name = 'algorithms_explained.algo_explained'
    verbose_name = "Explain_Algo"

That solved my problem, and I was able to run the makemigrations and migrate commands after that! Good luck


回答 8

我今天在尝试运行 Django 测试时遇到此错误,因为我在其中一个文件中使用了 from .models import * 这种速记语法。问题是我的文件结构如下:

    apps/
      myapp/
        models/
          __init__.py
          foo.py
          bar.py

models/__init__.py我使用速记语法导入模型时:

    from .foo import *
    from .bar import *

在我的应用程序中,我正在导入如下模型:

    from myapp.models import Foo, Bar

这导致运行 ./manage.py test 时出现 Django 的 model doesn't declare an explicit app_label 错误。

要解决此问题,我必须在 models/__init__.py 中从完整路径显式导入:

    from myapp.models.foo import *
    from myapp.models.bar import *

那解决了错误。

H / t https://medium.com/@michal.bock/fix-weird-exceptions-when-running-django-tests-f58def71b59a

I had this error today trying to run Django tests because I was using the shorthand from .models import * syntax in one of my files. The issue was that I had a file structure like so:

    apps/
      myapp/
        models/
          __init__.py
          foo.py
          bar.py

and in models/__init__.py I was importing my models using the shorthand syntax:

    from .foo import *
    from .bar import *

In my application I was importing models like so:

    from myapp.models import Foo, Bar

This caused the Django model doesn't declare an explicit app_label error when running ./manage.py test.

To fix the problem, I had to explicitly import from the full path in models/__init__.py:

    from myapp.models.foo import *
    from myapp.models.bar import *

That took care of the error.

H/t https://medium.com/@michal.bock/fix-weird-exceptions-when-running-django-tests-f58def71b59a


回答 9

In my case, this was happening because I used relative module paths in the project-level urls.py, in INSTALLED_APPS, and in apps.py instead of rooting them in the project root, i.e. absolute module paths throughout rather than relative module paths plus hacks.

No matter how much I messed with the paths in INSTALLED_APPS and in my app's apps.py, I couldn't get both runserver and pytest to work until all three of those were rooted in the project root.

Folder structure:

|-- manage.py
|-- config
    |-- settings.py
    |-- urls.py
|-- biz_portal
    |-- apps
        |-- portal
            |-- models.py
            |-- urls.py
            |-- views.py
            |-- apps.py

With the following, I could run manage.py runserver and gunicorn with WSGI and use the portal app's views without trouble, but pytest would error with ModuleNotFoundError: No module named 'apps', despite DJANGO_SETTINGS_MODULE being configured correctly.

config/settings.py:

INSTALLED_APPS = [
    ...
    "apps.portal.apps.PortalConfig",
]

biz_portal/apps/portal/apps.py:

class PortalConfig(AppConfig):
    name = 'apps.portal'

config/urls.py:

urlpatterns = [
    path('', include('apps.portal.urls')),
    ...
]

Changing the app reference in config/settings.py to biz_portal.apps.portal.apps.PortalConfig and PortalConfig.name to biz_portal.apps.portal allowed pytest to run (I don't have tests for portal views yet), but runserver would error with:

RuntimeError: Model class apps.portal.models.Business doesn’t declare an explicit app_label and isn’t in an application in INSTALLED_APPS

Finally, I grepped for apps.portal to see what was still using a relative path, and found that config/urls.py should also use biz_portal.apps.portal.urls.
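
Putting the three corrections together, a sketch of the configuration rooted at the project root, derived from the paths above:

# config/settings.py
INSTALLED_APPS = [
    # ...
    "biz_portal.apps.portal.apps.PortalConfig",
]

# biz_portal/apps/portal/apps.py
from django.apps import AppConfig

class PortalConfig(AppConfig):
    name = 'biz_portal.apps.portal'

# config/urls.py
from django.urls import include, path

urlpatterns = [
    path('', include('biz_portal.apps.portal.urls')),
    # ...
]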


回答 10

I ran into this error when I tried generating migrations for a single app that had existing malformed migrations due to a git merge, e.g.:

manage.py makemigrations myapp

When I deleted its migrations and then ran:

manage.py makemigrations

the error did not occur and the migrations generated successfully.


回答 11

I had a similar issue, but I was able to solve mine by explicitly specifying the app_label using a Meta class in my model class:

class Meta:
    app_label = 'name_of_my_app'
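
In context, a minimal sketch of a model carrying that Meta; the model name and field are placeholders:

from django.db import models

class MyModel(models.Model):
    title = models.CharField(max_length=100)

    class Meta:
        app_label = 'name_of_my_app'  # explicitly ties the model to the app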

回答 12

I got this error while trying to upgrade my Django Rest Framework app to DRF 3.6.3 and Django 1.11.1.

For anyone else in this situation, I found my solution in a GitHub issue, which was to unset the UNAUTHENTICATED_USER setting in the DRF settings:

# webapp/settings.py
...
REST_FRAMEWORK = {
    ...
    'UNAUTHENTICATED_USER': None
    ...
}

回答 13

I just ran into this issue and figured out what was going wrong. Since no previous answer described the issue as it happened to me, I thought I would post it for others:

  • the issue came from running python manage.py startapp myApp from my project root folder, then moving myApp to a child folder with mv myApp myFolderWithApps/.
  • I wrote myApp.models and ran python manage.py makemigrations. All went well.
  • then I did the same with another app that was importing models from myApp. Kaboom! I ran into this error while performing makemigrations. That was because I had to use myFolderWithApps.myApp to reference my app, but I had forgotten to update myApp/apps.py. So I corrected myApp/apps.py, settings/INSTALLED_APPS, and the import path in my second app.
  • but then the error kept happening: the reason was that I had migrations trying to import the models from myApp with the wrong path. I tried to correct the migration file, but I got to the point where it was easier to reset the DB and delete the migrations to start from scratch.

To make a long story short: the issue initially came from the wrong app name in myApp's apps.py, in settings, and in the import path of my second app. But correcting the paths in those three places was not enough, as the migrations had been created with imports referencing the wrong app name. Therefore, the same error kept happening while migrating (except this time it came from the migrations).

So… check your migrations, and good luck!


回答 14

I got a similar error while building an API in Django REST framework.

RuntimeError: Model class apps.core.models.University doesn't declare an explicit app_label and isn't in an application in INSTALLED_APPS.

luke_aus’s answer helped me by correcting my urls.py

from

from project.apps.views import SurgeryView

to

from apps.views import SurgeryView

回答 15

In my case I got this error when porting code from Django 1.11.11 to Django 2.2. I was defining a custom FileSystemStorage-derived class. In Django 1.11.11 I had the following line in models.py:

from django.core.files.storage import Storage, DefaultStorage

and later in the file I had the class definition:

class MyFileStorage(FileSystemStorage):

However, in Django 2.2 I needed to explicitly reference the FileSystemStorage class when importing:

from django.core.files.storage import Storage, DefaultStorage, FileSystemStorage

and voilà, the error disappears.
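
A minimal sketch of the resulting class definition, with the base class now imported explicitly:

from django.core.files.storage import FileSystemStorage

class MyFileStorage(FileSystemStorage):
    # FileSystemStorage is now a resolvable name at class-definition time
    pass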

Note that everyone is reporting the last part of the error message spat out by the Django server. However, if you scroll up, you will find the real reason in the middle of that jumble of errors.


回答 16

In my case I was able to find a fix, and looking at everyone else's code it may be the same issue: I simply had to add 'django.contrib.sites' to the list of installed apps in the settings.py file.

Hope this helps someone. This is my first contribution to the coding community.
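
A minimal sketch of that change; note that the SITE_ID line is a common companion setting for the sites framework, not part of the original answer:

# settings.py
INSTALLED_APPS = [
    # ...
    'django.contrib.sites',
]

SITE_ID = 1  # the sites framework usually expects this to be set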


回答 17

TL;DR: Adding a blank __init__.py fixed the issue for me.

I got this error in PyCharm and realised that my settings file was not being imported at all. There was no obvious error telling me this, but when I put some nonsense code into the settings.py, it didn’t cause an error.

I had settings.py inside a local_settings folder. However, I'd forgotten to include an __init__.py in the same folder to allow it to be imported. Once I'd added this, the error went away.


回答 18

If you have got all the config right, it might just be an import mess. Keep an eye on how you are importing the offending model.

The following won't work: from .models import Business. Use the full import path instead: from myapp.models import Business.


回答 19

If all else fails, and if you are seeing this error while trying to import in a PyCharm “Python console” (or “Django console”):

Try restarting the console.

This is pretty embarrassing, but it took me a while before I realized I had forgotten to do that.

Here’s what happened:

Added a fresh app, then added a minimal model, then tried to import the model in the Python/Django console (PyCharm pro 2019.2). This raised the doesn't declare an explicit app_label error, because I had not added the new app to INSTALLED_APPS. So, I added the app to INSTALLED_APPS, tried the import again, but still got the same error.

Came here, read all the other answers, but nothing seemed to fit.

Finally it hit me that I had not yet restarted the Python console after adding the new app to INSTALLED_APPS.

Note: failing to restart the PyCharm Python console, after adding a new object to a module, is also a great way to get a very confusing ImportError: Cannot import name ...


回答 20

O…M…G, I was getting this error too, and I spent almost two days on it; now I've finally managed to solve it. Honestly, the error had nothing to do with what the problem was. In my case it was a simple matter of syntax: I was trying to run a standalone Python module that used some Django models in a Django context, but the module itself wasn't a Django model. And I was declaring the class wrong.

instead of having

class Scrapper:
    name = ""
    main_link= ""
    ...

I was doing

class Scrapper(Website):
    name = ""
    main_link= ""
    ...

which is obviously wrong. The message is so misleading that I couldn't help but think it was some issue with configuration, or that I was just using Django in the wrong way, since I'm very new to it.

I'll share this here so that a newbie like me going through the same silliness can hopefully solve their issue.


回答 21

I received this error after I moved the SECRET_KEY to pull from an environment variable and forgot to set it when running the application. If you have something like this in your settings.py:

SECRET_KEY = os.getenv('SECRET_KEY')

then make sure you are actually setting the environment variable.
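
One way to make such a mistake fail fast (a sketch, not the author's code) is to read the variable so that a missing value raises immediately instead of silently becoming None:

# settings.py
import os

# os.environ[...] raises KeyError at startup if SECRET_KEY is unset,
# unlike os.getenv(), which quietly returns None
SECRET_KEY = os.environ['SECRET_KEY']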


回答 22

Most probably you have circular (mutually dependent) imports.

In my case I used a serializer class as a class attribute on my model, while the serializer class was itself importing this model: serializer_class = AccountSerializer

from ..api.serializers import AccountSerializer

class Account(AbstractBaseUser):
    serializer_class = AccountSerializer
    ...

And in the “serializers” file:

from ..models import Account

class AccountSerializer(serializers.ModelSerializer):
    class Meta:
        model = Account
        fields = (
            'id', 'email', 'date_created', 'date_modified',
            'firstname', 'lastname', 'password', 'confirm_password')
    ...
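
The answer doesn't show the fix, but one common way to break such a cycle (an assumption, not the author's code) is to defer the import until call time:

from django.contrib.auth.base_user import AbstractBaseUser

class Account(AbstractBaseUser):
    ...

    @staticmethod
    def get_serializer_class():
        # deferred import: runs only when called, after both modules
        # have finished loading, so the import cycle is broken
        from ..api.serializers import AccountSerializer
        return AccountSerializer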

回答 23

I got this error today and ended up here after googling. None of the existing answers seemed relevant to my situation. In my case the culprit was importing a model in the __init__.py file at the top level of an app; I had to move those imports into the functions that use the model.

Django seems to have some weird code that can fail like this in so many different scenarios!


回答 24

I also got this error today. The message referenced one specific app of mine in INSTALLED_APPS, but in fact it had nothing to do with that app. I was using a new virtual environment and had forgotten to install some libraries that I used in this project. After I installed the missing libraries, it worked.


回答 25

For PyCharm users: I had this error when using a project structure that wasn't "clean".

Was:

project_root_directory
└── src
    ├── chat
    │   ├── migrations
    │   └── templates
    ├── django_channels
    └── templates

Now:

project_root_directory
├── chat
│   ├── migrations
│   └── templates
│       └── chat
├── django_channels
└── templates

There are a lot of good solutions here, but I think that, first of all, you should clean up your project structure or tune PyCharm's Django settings before setting DJANGO_SETTINGS_MODULE variables and so on.

Hope it’ll help someone. Cheers.


回答 26

The issue is that:

  1. You have made modifications to your models file but have not yet applied them to the DB, while trying to run python manage.py runserver.

  2. Run python manage.py makemigrations

  3. Run python manage.py migrate

  4. Now run python manage.py runserver and all should be fine.