
Contents

1. Collections: List, Dictionary, Set, Tuple, Range, Enumerate, Iterator, Generator
2. Types: Type, String, Regular_Exp, Format, Numbers, Combinatorics, Datetime
3. Syntax: Args, Inline, Closure, Decorator, Class, Duck_Type, Enum, Exception
4. System: Exit, Print, Input, Command_Line_Arguments, Open, Path, OS_Commands
5. Data: JSON, Pickle, CSV, SQLite, Bytes, Struct, Array, Memory_View, Deque
6. Advanced: Threading, Operator, Introspection, Metaprogramming, Eval, Coroutines
7. Libraries: Progress_Bar, Plot, Table, Curses, Logging, Scraping, Web, Profile,
   NumPy, Image, Audio, Games, Data

Main

if __name__ == '__main__':     # Runs main() if file wasn't imported.
    main()

List

<list> = <list>[from_inclusive : to_exclusive : ±step_size]
<list>.append(<el>)            # Or: <list> += [<el>]
<list>.extend(<collection>)    # Or: <list> += <collection>
<list>.sort()
<list>.reverse()
<list> = sorted(<collection>)
<iter> = reversed(<list>)
sum_of_elements = sum(<collection>)
elementwise_sum = [sum(pair) for pair in zip(list_a, list_b)]
sorted_by_second = sorted(<collection>, key=lambda el: el[1])
sorted_by_both = sorted(<collection>, key=lambda el: (el[1], el[0]))
flatter_list = list(itertools.chain.from_iterable(<list>))
product_of_elems = functools.reduce(lambda out, el: out * el, <collection>)
list_of_chars = list(<str>)
  • Module operator provides functions itemgetter() and mul() that offer the same functionality as the lambda expressions above.
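The equivalence mentioned in the bullet above can be sketched as follows. The sample data in `pairs` is made up for illustration; only `itemgetter()`, `mul()` and `functools.reduce()` come from the standard library.

```python
import functools
from operator import itemgetter, mul

pairs = [(2, 'b'), (1, 'c'), (3, 'a')]                  # Hypothetical sample data.

# Same result as: sorted(pairs, key=lambda el: el[1])
sorted_by_second = sorted(pairs, key=itemgetter(1))

# Same result as: sorted(pairs, key=lambda el: (el[1], el[0]))
sorted_by_both = sorted(pairs, key=itemgetter(1, 0))

# Same result as: functools.reduce(lambda out, el: out * el, [1, 2, 3, 4])
product_of_elems = functools.reduce(mul, [1, 2, 3, 4])
```

Passing several indices to itemgetter() makes it return a tuple, which is why it can stand in for the two-key lambda.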
<list>.insert(<int>, <el>)     # Inserts item at index and moves the rest to the right.
<el> = <list>.pop([<int>])     # Returns and removes item at index or from the end.
<int> = <list>.count(<el>)     # Returns number of occurrences. Also works on strings.
<int> = <list>.index(<el>)     # Returns index of the first occurrence or raises ValueError.
<list>.remove(<el>)            # Removes first occurrence of the item or raises ValueError.
<list>.clear()                 # Removes all items. Also works on dictionary and set.

Dictionary

<view> = <dict>.keys()                          # Coll. of keys that reflects changes.
<view> = <dict>.values()                        # Coll. of values that reflects changes.
<view> = <dict>.items()                         # Coll. of key-value tuples that reflects chgs.
value  = <dict>.get(key, default=None)          # Returns default if key is missing.
value  = <dict>.setdefault(key, default=None)   # Returns and writes default if key is missing.
<dict> = collections.defaultdict(<type>)        # Creates a dict with default value of type.
<dict> = collections.defaultdict(lambda: 1)     # Creates a dict with default value 1.
<dict> = dict(<collection>)                     # Creates a dict from coll. of key-value pairs.
<dict> = dict(zip(keys, values))                # Creates a dict from two collections.
<dict> = dict.fromkeys(keys [, value])          # Creates a dict from collection of keys.
<dict>.update(<dict>)                           # Adds items. Replaces ones with matching keys.
value = <dict>.pop(key)                         # Removes item or raises KeyError.
{k for k, v in <dict>.items() if v == value}    # Returns set of keys that point to the value.
{k: v for k, v in <dict>.items() if k in keys}  # Returns a dictionary, filtered by keys.

Counter

>>> from collections import Counter
>>> colors = ['blue', 'blue', 'blue', 'red', 'red']
>>> counter = Counter(colors)
>>> counter['yellow'] += 1
>>> counter
Counter({'blue': 3, 'red': 2, 'yellow': 1})
>>> counter.most_common()[0]
('blue', 3)

Set

<set> = set()
<set>.add(<el>)                                 # Or: <set> |= {<el>}
<set>.update(<collection> [, ...])              # Or: <set> |= <set>
<set>  = <set>.union(<coll.>)                   # Or: <set> | <set>
<set>  = <set>.intersection(<coll.>)            # Or: <set> & <set>
<set>  = <set>.difference(<coll.>)              # Or: <set> - <set>
<set>  = <set>.symmetric_difference(<coll.>)    # Or: <set> ^ <set>
<bool> = <set>.issubset(<coll.>)                # Or: <set> <= <set>
<bool> = <set>.issuperset(<coll.>)              # Or: <set> >= <set>
<el> = <set>.pop()                              # Raises KeyError if empty.
<set>.remove(<el>)                              # Raises KeyError if missing.
<set>.discard(<el>)                             # Doesn't raise an error.

Frozen Set

  • Is immutable and hashable.
  • That means it can be used as a key in a dictionary or as an element in a set.
<frozenset> = frozenset(<collection>)

Tuple

Tuple is an immutable and hashable list.

<tuple> = ()
<tuple> = (<el>,)                           # Or: <el>,
<tuple> = (<el_1>, <el_2> [, ...])          # Or: <el_1>, <el_2> [, ...]

Named Tuple

Tuple's subclass with named elements.

>>> from collections import namedtuple
>>> Point = namedtuple('Point', 'x y')
>>> p = Point(1, y=2)
>>> p
Point(x=1, y=2)
>>> p[0]
1
>>> p.x
1
>>> getattr(p, 'y')
2
>>> p._fields  # Or: Point._fields
('x', 'y')

Range

<range> = range(to_exclusive)
<range> = range(from_inclusive, to_exclusive)
<range> = range(from_inclusive, to_exclusive, ±step_size)
from_inclusive = <range>.start
to_exclusive   = <range>.stop

Enumerate

for i, el in enumerate(<collection> [, i_start]):
    ...

Iterator

<iter> = iter(<collection>)                 # `iter(<iter>)` returns unmodified iterator.
<iter> = iter(<function>, to_exclusive)     # A sequence of return values until 'to_exclusive'.
<el>   = next(<iter> [, default])           # Raises StopIteration or returns 'default' on end.
<list> = list(<iter>)                       # Returns a list of iterator's remaining elements.

Itertools

from itertools import count, repeat, cycle, chain, islice
<iter> = count(start=0, step=1)             # Returns updated value endlessly. Accepts floats.
<iter> = repeat(<el> [, times])             # Returns element endlessly or 'times' times.
<iter> = cycle(<collection>)                # Repeats the sequence endlessly.
<iter> = chain(<coll_1>, <coll_2> [, ...])  # Empties collections in order.
<iter> = chain.from_iterable(<collection>)  # Empties collections inside a collection in order.
<iter> = islice(<coll>, to_exclusive)       # Only returns first 'to_exclusive' elements.
<iter> = islice(<coll>, from_inclusive, …)  # `to_exclusive, step_size`.

Generator

  • Any function that contains a yield statement returns a generator.
  • Generators and iterators are interchangeable.
def count(start, step):
    while True:
        yield start
        start += step
>>> counter = count(10, 2)
>>> next(counter), next(counter), next(counter)
(10, 12, 14)
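Because a generator is an iterator, it can presumably be passed anywhere an iterator is expected. The sketch below feeds the endless count() generator from above into itertools.islice(); the variable names are illustrative only.

```python
from itertools import islice

def count(start, step):
    while True:
        yield start
        start += step

generator = count(10, 2)
same_object = iter(generator) is generator    # iter() returns a generator unmodified.
first_three = list(islice(generator, 3))      # Takes three items, then stops.
next_value = next(generator)                  # The generator continues where islice left off.
```

Calling list() directly on an endless generator would never return, which is why islice() is used to bound it.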

Type

  • Everything is an object.
  • Every object has a type.
  • Type and class are synonymous.
<type> = type(<el>)                          # Or: <el>.__class__
<bool> = isinstance(<el>, <type>)            # Or: issubclass(type(<el>), <type>)
>>> type('a'), 'a'.__class__, str
(<class 'str'>, <class 'str'>, <class 'str'>)

Some types do not have built-in names, so they must be imported:

from types import FunctionType, MethodType, LambdaType, GeneratorType

Abstract Base Classes

Each abstract base class specifies a set of virtual subclasses. These classes are then recognized by isinstance() and issubclass() as subclasses of the ABC, although they are really not. An ABC can also manually decide whether or not a specific class is its virtual subclass, usually based on which methods the class has implemented. For instance, Iterable ABC looks for method iter(), while Collection ABC looks for methods iter(), contains() and len().

>>> from collections.abc import Sequence, Collection, Iterable
>>> isinstance([1, 2, 3], Iterable)
True
+------------------+------------+------------+------------+
|                  |  Sequence  | Collection |  Iterable  |
+------------------+------------+------------+------------+
| list, range, str |    yes     |    yes     |    yes     |
| dict, set        |            |    yes     |    yes     |
| iter             |            |            |    yes     |
+------------------+------------+------------+------------+
>>> from numbers import Integral, Rational, Real, Complex, Number
>>> isinstance(123, Number)
True
+--------------------+----------+----------+----------+----------+----------+
|                    | Integral | Rational |   Real   | Complex  |  Number  |
+--------------------+----------+----------+----------+----------+----------+
| int                |   yes    |   yes    |   yes    |   yes    |   yes    |
| fractions.Fraction |          |   yes    |   yes    |   yes    |   yes    |
| float              |          |          |   yes    |   yes    |   yes    |
| complex            |          |          |          |   yes    |   yes    |
| decimal.Decimal    |          |          |          |          |   yes    |
+--------------------+----------+----------+----------+----------+----------+
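As a small sketch of the method-based check described in the Abstract Base Classes section: a class that only defines iter() should be recognized as an Iterable, but not as a Collection. The class `Countdown` is hypothetical and inherits from nothing but object.

```python
from collections.abc import Iterable, Collection

class Countdown:                          # Hypothetical class, not registered with any ABC.
    def __init__(self, start):
        self.start = start
    def __iter__(self):
        return iter(range(self.start, 0, -1))

is_iterable = isinstance(Countdown(3), Iterable)       # Has iter().
is_collection = isinstance(Countdown(3), Collection)   # Lacks contains() and len().
is_subclass = issubclass(Countdown, Iterable)
```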

String

<str>  = <str>.strip()                       # Strips all whitespace characters from both ends.
<str>  = <str>.strip('<chars>')              # Strips all passed characters from both ends.
<list> = <str>.split()                       # Splits on one or more whitespace characters.
<list> = <str>.split(sep=None, maxsplit=-1)  # Splits on 'sep' str at most 'maxsplit' times.
<list> = <str>.splitlines(keepends=False)    # Splits on [\n\r\f\v\x1c\x1d\x1e\x85] and '\r\n'.
<str>  = <str>.join(<coll_of_strings>)       # Joins elements using string as a separator.
<bool> = <sub_str> in <str>                  # Checks if string contains a substring.
<bool> = <str>.startswith(<sub_str>)         # Pass tuple of strings for multiple options.
<bool> = <str>.endswith(<sub_str>)           # Pass tuple of strings for multiple options.
<int>  = <str>.find(<sub_str>)               # Returns start index of the first match or -1.
<int>  = <str>.index(<sub_str>)              # Same but raises ValueError if missing.
<str>  = <str>.replace(old, new [, count])   # Replaces 'old' with 'new' at most 'count' times.
<str>  = <str>.translate(<table>)            # Use `str.maketrans(<dict>)` to generate table.
<str>  = chr(<int>)                          # Converts int to Unicode char.
<int>  = ord(<str>)                          # Converts Unicode char to int.
  • Also: 'lstrip()' and 'rstrip()'.
  • Also: 'lower()', 'upper()', 'capitalize()' and 'title()'.

Property Methods

+---------------+----------+----------+----------+----------+----------+
|               | [ !#$%…] | [a-zA-Z] |  [¼½¾]   |  [²³¹]   |  [0-9]   |
+---------------+----------+----------+----------+----------+----------+
| isprintable() |   yes    |   yes    |   yes    |   yes    |   yes    |
| isalnum()     |          |   yes    |   yes    |   yes    |   yes    |
| isnumeric()   |          |          |   yes    |   yes    |   yes    |
| isdigit()     |          |          |          |   yes    |   yes    |
| isdecimal()   |          |          |          |          |   yes    |
+---------------+----------+----------+----------+----------+----------+
  • Also: 'isspace()' checks for '[ \t\n\r\f\v…]'.

Regex

import re
<str>   = re.sub(<regex>, new, text, count=0)  # Substitutes all occurrences with 'new'.
<list>  = re.findall(<regex>, text)            # Returns all occurrences as strings.
<list>  = re.split(<regex>, text, maxsplit=0)  # Use brackets in regex to include the matches.
<Match> = re.search(<regex>, text)             # Searches for first occurrence of the pattern.
<Match> = re.match(<regex>, text)              # Searches only at the beginning of the text.
<iter>  = re.finditer(<regex>, text)           # Returns all occurrences as match objects.
  • Search() and match() return None if they can't find a match.
  • Argument 'flags=re.IGNORECASE' can be used with all functions.
  • Argument 'flags=re.MULTILINE' makes '^' and '$' match the start/end of each line.
  • Argument 'flags=re.DOTALL' makes dot also accept the '\n'.
  • Use r'\1' or '\\1' for backreference.
  • Add '?' after an operator to make it non-greedy.
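The flags and operators listed above can be tried out with a short sketch like the one below. The sample strings are made up; the calls themselves are standard `re` functions.

```python
import re

text = 'Hello hello\nWORLD world'                    # Made-up sample text.

ignore_case = re.findall(r'hello', text, flags=re.IGNORECASE)   # Matches both spellings.
line_starts = re.findall(r'^\w+', text, flags=re.MULTILINE)     # First word of each line.
dot_all = re.search(r'o.W', text, flags=re.DOTALL)              # Dot also accepts '\n'.
no_match = re.search(r'\d', text)                               # None, no digits in text.

doubled = re.sub(r'(\w)\1', r'<\1\1>', 'bookkeeper')            # Backreference to group 1.
greedy = re.findall(r'<.+>', '<a><b>')                          # One long match.
non_greedy = re.findall(r'<.+?>', '<a><b>')                     # Shortest possible matches.
```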

Match Object

<str>   = <Match>.group()                      # Returns the whole match. Also group(0).
<str>   = <Match>.group(1)                     # Returns part in the first bracket.
<tuple> = <Match>.groups()                     # Returns all bracketed parts.
<int>   = <Match>.start()                      # Returns start index of the match.
<int>   = <Match>.end()                        # Returns exclusive end index of the match.

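Special Sequences

  • By default, decimal characters, alphanumerics and whitespaces from all alphabets are matched unless 'flags=re.ASCII' argument is used.
  • As shown below, it restricts special sequence matches to the first 128 characters and prevents '\s' from accepting '[\x1c-\x1f]'.
  • Use a capital letter for negation.
'\d' == '[0-9]'                                # Matches decimal characters.
'\w' == '[a-zA-Z0-9_]'                         # Matches alphanumerics and underscore.
'\s' == '[ \t\n\r\f\v]'                        # Matches whitespaces.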

Format

<str> = f'{<el_1>}, {<el_2>}'
<str> = '{}, {}'.format(<el_1>, <el_2>)

Attributes

>>> from collections import namedtuple
>>> Person = namedtuple('Person', 'name height')
>>> person = Person('Jean-Luc', 187)
>>> f'{person.height}'
'187'
>>> '{p.height}'.format(p=person)
'187'

General Options

{<el>:<10}                                     # '<el>      '
{<el>:^10}                                     # '   <el>   '
{<el>:>10}                                     # '      <el>'
{<el>:.<10}                                    # '<el>......'
{<el>:0}                                       # '<el>'
  • Use '{<el>:{<str/int/float>}[...]}' to set options dynamically.
  • Adding '!r' before the colon converts object to string by calling its repr() method.
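A minimal sketch of setting format options dynamically, assuming the width, precision and fill character are only known at runtime (the variable names are illustrative):

```python
width, precision, fill = 10, 3, '.'        # Hypothetical values chosen at runtime.
value = 1.23456

dynamic_float = f'{value:{width}.{precision}f}'   # Same as f'{value:10.3f}'.
dynamic_fill = f'{"abc":{fill}<{width}}'          # Same as f'{"abc":.<10}'.
with_repr = f'{"abcde"!r:{width}}'                # repr() is applied before padding.
```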

Strings

{'abcde'!r:10}                                 # "'abcde'   "
{'abcde':10.3}                                 # 'abc       '
{'abcde':.3}                                   # 'abc'

Numbers

{ 123456:10,}                                  # '   123,456'
{ 123456:10_}                                  # '   123_456'
{ 123456:+10}                                  # '   +123456'
{-123456:=10}                                  # '-   123456'
{ 123456: }                                    # ' 123456'
{-123456: }                                    # '-123456'

Floats

{1.23456:10.3}                                 # '      1.23'
{1.23456:10.3f}                                # '     1.235'
{1.23456:10.3e}                                # ' 1.235e+00'
{1.23456:10.3%}                                # '  123.456%'

Comparison of presentation types:

+--------------+----------------+----------------+----------------+----------------+
|              |    {<float>}   |   {<float>:f}  |   {<float>:e}  |   {<float>:%}  |
+--------------+----------------+----------------+----------------+----------------+
|  0.000056789 |   '5.6789e-05' |    '0.000057'  | '5.678900e-05' |    '0.005679%' |
|  0.00056789  |   '0.00056789' |    '0.000568'  | '5.678900e-04' |    '0.056789%' |
|  0.0056789   |   '0.0056789'  |    '0.005679'  | '5.678900e-03' |    '0.567890%' |
|  0.056789    |   '0.056789'   |    '0.056789'  | '5.678900e-02' |    '5.678900%' |
|  0.56789     |   '0.56789'    |    '0.567890'  | '5.678900e-01' |   '56.789000%' |
|  5.6789      |   '5.6789'     |    '5.678900'  | '5.678900e+00' |  '567.890000%' |
| 56.789       |  '56.789'      |   '56.789000'  | '5.678900e+01' | '5678.900000%' |
+--------------+----------------+----------------+----------------+----------------+
+--------------+----------------+----------------+----------------+----------------+
|              |  {<float>:.2}  |  {<float>:.2f} |  {<float>:.2e} |  {<float>:.2%} |
+--------------+----------------+----------------+----------------+----------------+
|  0.000056789 |    '5.7e-05'   |      '0.00'    |   '5.68e-05'   |      '0.01%'   |
|  0.00056789  |    '0.00057'   |      '0.00'    |   '5.68e-04'   |      '0.06%'   |
|  0.0056789   |    '0.0057'    |      '0.01'    |   '5.68e-03'   |      '0.57%'   |
|  0.056789    |    '0.057'     |      '0.06'    |   '5.68e-02'   |      '5.68%'   |
|  0.56789     |    '0.57'      |      '0.57'    |   '5.68e-01'   |     '56.79%'   |
|  5.6789      |    '5.7'       |      '5.68'    |   '5.68e+00'   |    '567.89%'   |
| 56.789       |    '5.7e+01'   |     '56.79'    |   '5.68e+01'   |   '5678.90%'   |
+--------------+----------------+----------------+----------------+----------------+
  • When both rounding up and rounding down are possible, the one that returns result with even last digit is chosen. That makes '{6.5:.0f}' a '6' and '{7.5:.0f}' an '8'.
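The round-half-to-even rule can be checked with values that are exactly representable as floats. This is only a sketch; ties on values that are not exactly representable in binary may round differently.

```python
half_to_six = f'{6.5:.0f}'               # Tie between 6 and 7, even digit wins.
half_to_eight = f'{7.5:.0f}'             # Tie between 7 and 8, even digit wins.
builtin_round = round(2.5), round(3.5)   # The built-in round() follows the same rule.
```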

Ints

{90:c}                                   # 'Z'
{90:b}                                   # '1011010'
{90:X}                                   # '5A'

Numbers

Types

<int>      = int(<float/str/bool>)       # Or: math.trunc(<float>)
<float>    = float(<int/str/bool>)       # Or: <real>e±<int>
<complex>  = complex(real=0, imag=0)     # Or: <real> ± <real>j
<Fraction> = fractions.Fraction(0, 1)    # Or: Fraction(numerator=0, denominator=1)
<Decimal>  = decimal.Decimal(<str/int>)  # Or: Decimal((sign, digits, exponent))
  • 'int(<str>)' and 'float(<str>)' raise ValueError on malformed strings.
  • Decimal numbers can be represented exactly, unlike floats where '1.1 + 2.2 != 3.3'.
  • Precision of decimal operations is set with: 'decimal.getcontext().prec = <int>'.
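The difference between floats and decimals, and the effect of the precision setting, might be illustrated like this. Note that `getcontext().prec` changes the current thread's context, so the value 4 used here is just an example.

```python
from decimal import Decimal, getcontext

float_equal = 1.1 + 2.2 == 3.3                                     # Floats are approximate.
decimal_equal = Decimal('1.1') + Decimal('2.2') == Decimal('3.3')  # Decimals are exact.

getcontext().prec = 4                    # Results are rounded to 4 significant digits.
third = Decimal(1) / Decimal(3)

try:
    int('12a')                           # Malformed string.
    malformed_ok = True
except ValueError:
    malformed_ok = False
```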

Basic Functions

<num> = pow(<num>, <num>)                # Or: <num> ** <num>
<num> = abs(<num>)                       # <float> = abs(<complex>)
<num> = round(<num> [, ±ndigits])        # `round(126, -1) == 130`

Math

from math import e, pi, inf, nan, isinf, isnan
from math import sin, cos, tan, asin, acos, atan, degrees, radians
from math import log, log10, log2

Statistics

from statistics import mean, median, variance, stdev, pvariance, pstdev

Random

from random import random, randint, choice, shuffle, gauss, seed
<float> = random()                       # A float inside [0, 1).
<int>   = randint(from_inc, to_inc)      # An int inside [from_inc, to_inc].
<el>    = choice(<list>)                 # Keeps the list intact.

Bin, Hex

<int> = ±0b<bin>                         # Or: ±0x<hex>
<int> = int('±<bin>', 2)                 # Or: int('±<hex>', 16)
<int> = int('±0b<bin>', 0)               # Or: int('±0x<hex>', 0)
<str> = bin(<int>)                       # Returns '[-]0b<bin>'.

Bitwise Operators

<int> = <int> & <int>                    # And
<int> = <int> | <int>                    # Or
<int> = <int> ^ <int>                    # Xor (0 if both bits equal)
<int> = <int> << n_bits                  # Left shift (>> for right)
<int> = ~<int>                           # Not (also: -<int> - 1)

Combinatorics

  • Every function returns an iterator.
  • If you want to print the iterator, you need to pass it to the list() function first!
from itertools import product, combinations, combinations_with_replacement, permutations
>>> product([0, 1], repeat=3)
[(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), ..., (1, 1, 1)]
>>> product('abc', 'abc')                    #   a  b  c
[('a', 'a'), ('a', 'b'), ('a', 'c'),         # a x  x  x
 ('b', 'a'), ('b', 'b'), ('b', 'c'),         # b x  x  x
 ('c', 'a'), ('c', 'b'), ('c', 'c')]         # c x  x  x
>>> combinations('abc', 2)                   #   a  b  c
[('a', 'b'), ('a', 'c'),                     # a .  x  x
 ('b', 'c')]                                 # b .  .  x
>>> combinations_with_replacement('abc', 2)  #   a  b  c
[('a', 'a'), ('a', 'b'), ('a', 'c'),         # a x  x  x
 ('b', 'b'), ('b', 'c'),                     # b .  x  x
 ('c', 'c')]                                 # c .  .  x
>>> permutations('abc', 2)                   #   a  b  c
[('a', 'b'), ('a', 'c'),                     # a .  x  x
 ('b', 'a'), ('b', 'c'),                     # b x  .  x
 ('c', 'a'), ('c', 'b')]                     # c x  x  .

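Datetime

  • Module 'datetime' provides 'date' <D>, 'time' <T>, 'datetime' <DT> and 'timedelta' <TD> classes. All are immutable and hashable.
  • Time and datetime objects can be 'aware' <a>, meaning they have defined timezone, or 'naive' <n>, meaning they don't.
  • If object is naive, it is presumed to be in the system's timezone.
from datetime import date, time, datetime, timedelta
from dateutil.tz import UTC, tzlocal, gettz, datetime_exists, resolve_imaginary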

Constructors

<D>  = date(year, month, day)
<T>  = time(hour=0, minute=0, second=0, microsecond=0, tzinfo=None, fold=0)
<DT> = datetime(year, month, day, hour=0, minute=0, second=0, ...)
<TD> = timedelta(days=0, seconds=0, microseconds=0, milliseconds=0,
                 minutes=0, hours=0, weeks=0)
  • Use '<D/DT>.weekday()' to get the day of the week (Mon == 0).
  • 'fold=1' means the second pass in case of time jumping back for one hour.
  • '<DTa> = resolve_imaginary(<DTa>)' fixes DTs that fall into the missing hour.

Now

<D/DTn>  = D/DT.today()                     # Current local date or naive datetime.
<DTn>    = DT.utcnow()                      # Naive datetime from current UTC time.
<DTa>    = DT.now(<tzinfo>)                 # Aware datetime from current tz time.
  • To extract time use '<DTn>.time()', '<DTa>.time()' or '<DTa>.timetz()'.

Timezone

<tzinfo> = UTC                              # UTC timezone. London without DST.
<tzinfo> = tzlocal()                        # Local timezone. Also gettz().
<tzinfo> = gettz('<Continent>/<City>')      # 'Continent/City_Name' timezone or None.
<DTa>    = <DT>.astimezone(<tzinfo>)        # Datetime, converted to the passed timezone.
<Ta/DTa> = <T/DT>.replace(tzinfo=<tzinfo>)  # Unconverted object with a new timezone.

Encode

<D/T/DT> = D/T/DT.fromisoformat('<iso>')    # Object from ISO string. Raises ValueError.
<DT>     = DT.strptime(<str>, '<format>')   # Datetime from str, according to format.
<D/DTn>  = D/DT.fromordinal(<int>)          # D/DTn from days since the Gregorian NYE 1.
<DTn>    = DT.fromtimestamp(<real>)         # Local time DTn from seconds since the Epoch.
<DTa>    = DT.fromtimestamp(<real>, <tz.>)  # Aware datetime from seconds since the Epoch.
  • ISO strings come in following forms: 'YYYY-MM-DD', 'HH:MM:SS.ffffff[±<offset>]', or both separated by an arbitrary character. Offset is formatted as: 'HH:MM'.
  • Epoch on Unix systems is: '1970-01-01 00:00 UTC' or, equivalently, '1970-01-01 01:00 CET'.
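The ISO form and the Epoch described above can be sketched with the standard library alone. As an assumption for this example, `datetime.timezone.utc` is used in place of the `dateutil` UTC object imported earlier, so that the snippet has no third-party dependency.

```python
from datetime import datetime, timedelta, timezone

# Date and time separated by an arbitrary character (a space here), plus an offset.
dt = datetime.fromisoformat('2015-05-14 23:39:00+02:00')    # Aware datetime.
offset = dt.utcoffset()

epoch = datetime.fromtimestamp(0, tz=timezone.utc)          # Zero seconds since the Epoch.
seconds = dt.timestamp()                                    # Seconds since the Epoch.
roundtrip = datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Aware datetimes compare by the instant they represent, so `roundtrip` equals `dt` even though the two carry different offsets.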

Decode

<str>    = <D/T/DT>.isoformat(sep='T')      # Also timespec='auto/hours/minutes/seconds'.
<str>    = <D/T/DT>.strftime('<format>')    # Custom string representation.
<int>    = <D/DT>.toordinal()               # Days since Gregorian NYE 1, ignoring time and tz.
<float>  = <DTn>.timestamp()                # Seconds since the Epoch, from DTn in local tz.
<float>  = <DTa>.timestamp()                # Seconds since the Epoch, from DTa.

Format

>>> from datetime import datetime
>>> dt = datetime.strptime('2015-05-14 23:39:00.00 +0200', '%Y-%m-%d %H:%M:%S.%f %z')
>>> dt.strftime("%A, %dth of %B '%y, %I:%M%p %Z")
"Thursday, 14th of May '15, 11:39PM UTC+02:00"
  • When parsing, '%z' also accepts '±HH:MM'.
  • For abbreviated weekday and month use '%a' and '%b'.

Arithmetics

<D/DT>   = <D/DT>   ± <TD>                  # Returned datetime can fall into missing hour.
<TD>     = <D/DTn>  - <D/DTn>               # Returns the difference, ignoring time jumps.
<TD>     = <DTa>    - <DTa>                 # Ignores time jumps if they share tzinfo object.
<TD>     = <DT_UTC> - <DT_UTC>              # Convert DTs to UTC to get the actual delta.

Arguments

Inside Function Call

<function>(<positional_args>)                  # f(0, 0)
<function>(<keyword_args>)                     # f(x=0, y=0)
<function>(<positional_args>, <keyword_args>)  # f(0, y=0)

Inside Function Definition

def f(<nondefault_args>):                      # def f(x, y):
def f(<default_args>):                         # def f(x=0, y=0):
def f(<nondefault_args>, <default_args>):      # def f(x, y=0):

Splat Operator

Inside Function Call

Splat expands a collection into positional arguments, while splatty-splat expands a dictionary into keyword arguments.

args = (1, 2)
kwargs = {'x': 3, 'y': 4, 'z': 5}
func(*args, **kwargs)

Is the same as:

func(1, 2, x=3, y=4, z=5)

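Inside Function Definition

Splat combines zero or more positional arguments into a tuple, while splatty-splat combines zero or more keyword arguments into a dictionary.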

def add(*a):
    return sum(a)
>>> add(1, 2, 3)
6

Legal argument combinations:

def f(x, y, z):                # f(x=1, y=2, z=3) | f(1, y=2, z=3) | f(1, 2, z=3) | f(1, 2, 3)
def f(*, x, y, z):             # f(x=1, y=2, z=3)
def f(x, *, y, z):             # f(x=1, y=2, z=3) | f(1, y=2, z=3)
def f(x, y, *, z):             # f(x=1, y=2, z=3) | f(1, y=2, z=3) | f(1, 2, z=3)
def f(*args):                  # f(1, 2, 3)
def f(x, *args):               # f(1, 2, 3)
def f(*args, z):               # f(1, 2, z=3)
def f(x, *args, z):            # f(1, 2, z=3)
def f(**kwargs):               # f(x=1, y=2, z=3)
def f(x, **kwargs):            # f(x=1, y=2, z=3) | f(1, y=2, z=3)
def f(*, x, **kwargs):         # f(x=1, y=2, z=3)
def f(*args, **kwargs):        # f(x=1, y=2, z=3) | f(1, y=2, z=3) | f(1, 2, z=3) | f(1, 2, 3)
def f(x, *args, **kwargs):     # f(x=1, y=2, z=3) | f(1, y=2, z=3) | f(1, 2, z=3) | f(1, 2, 3)
def f(*args, y, **kwargs):     # f(x=1, y=2, z=3) | f(1, y=2, z=3)
def f(x, *args, z, **kwargs):  # f(x=1, y=2, z=3) | f(1, y=2, z=3) | f(1, 2, z=3)
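To see how one of the signatures above distributes its arguments, the last combination can be run directly. The function body, which simply returns what it received, is an addition for illustration.

```python
def f(x, *args, z, **kwargs):
    return x, args, z, kwargs

all_keywords = f(x=1, y=2, z=3)     # 'y' is collected by **kwargs.
one_positional = f(1, y=2, z=3)
two_positional = f(1, 2, z=3)       # The extra positional 2 is collected by *args.

try:
    f(1, 2, 3)                      # 'z' follows *args, so it is keyword-only.
    positional_z_allowed = True
except TypeError:
    positional_z_allowed = False
```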

Other Uses

<list> = [*<collection> [, ...]]
<set> = {*<collection> [, ...]}
<tup.> = (*<collection>, [...])
<dict> = {**<dict> [, ...]}
head, *body, tail = <collection>

Inline

Lambda

<func> = lambda: <return_value>
<func> = lambda <arg_1>, <arg_2>: <return_value>

Comprehensions

<list> = [i+1 for i in range(10)]                         # [1, 2, ..., 10]
<set>  = {i for i in range(10) if i > 5}                  # {6, 7, 8, 9}
<iter> = (i+5 for i in range(10))                         # (5, 6, ..., 14)
<dict> = {i: i*2 for i in range(10)}                      # {0: 0, 1: 2, ..., 9: 18}
>>> [l+r for l in 'abc' for r in 'abc']
['aa', 'ab', 'ac', ..., 'cc']

Map, Filter, Reduce

<iter> = map(lambda x: x + 1, range(10))                  # (1, 2, ..., 10)
<iter> = filter(lambda x: x > 5, range(10))               # (6, 7, 8, 9)
<obj>  = reduce(lambda out, x: out + x, range(10))        # 45
  • Reduce must be imported from the functools module.

Any, All

<bool> = any(<collection>)                                # False if empty.
<bool> = all(el[1] for el in <collection>)                # True if empty.

Conditional Expression

<obj> = <exp_if_true> if <condition> else <exp_if_false>
>>> [a if a else 'zero' for a in (0, 1, 2, 3)]
['zero', 1, 2, 3]

Named Tuple, Enum, Dataclass

from collections import namedtuple
Point = namedtuple('Point', 'x y')
point = Point(0, 0)
from enum import Enum
Direction = Enum('Direction', 'n e s w')
direction = Direction.n
from dataclasses import make_dataclass
Creature = make_dataclass('Creature', ['loc', 'dir'])
creature = Creature(Point(0, 0), Direction.n)

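Closure

We have a closure in Python when:

  • A nested function references a value of its enclosing function and then
  • the enclosing function returns the nested function.
def get_multiplier(a):
    def out(b):
        return a * b
    return out
>>> multiply_by_3 = get_multiplier(3)
>>> multiply_by_3(10)
30
  • If multiple nested functions within enclosing function reference the same value, that value gets shared.
  • To dynamically access function's first free variable use '<function>.__closure__[0].cell_contents'.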
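The sharing of a value between nested functions can be sketched as follows. The function `get_counter_pair()` is hypothetical, and it relies on the `nonlocal` statement to rebind the enclosed variable.

```python
def get_counter_pair():
    count = 0
    def increment():
        nonlocal count              # Rebinds the variable of the enclosing function.
        count += 1
    def read():
        return count                # Sees the same variable as increment().
    return increment, read

increment, read = get_counter_pair()
increment()
increment()
shared_value = read()
cell_value = read.__closure__[0].cell_contents   # The only free variable of read().
```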

Partial

from functools import partial
<function> = partial(<function> [, <arg_1>, <arg_2>, ...])
>>> import operator as op
>>> multiply_by_3 = partial(op.mul, 3)
>>> multiply_by_3(10)
30
  • Partial is also useful in cases when function needs to be passed as an argument, because it enables us to set its arguments beforehand.
  • A few examples being: 'defaultdict(<function>)', 'iter(<function>, to_exclusive)' and dataclass's 'field(default_factory=<function>)'.

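Non-Local

If variable is being assigned to anywhere in the scope, it is regarded as a local variable, unless it is declared as a 'global' or a 'nonlocal'.

def get_counter():
    i = 0
    def out():
        nonlocal i
        i += 1
        return i
    return out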
>>> counter = get_counter()
>>> counter(), counter(), counter()
(1, 2, 3)

Decorator

A decorator takes a function, adds some functionality and returns it.

@decorator_name
def function_that_gets_passed_to_decorator():
    ...

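Debugger Example

Decorator that prints function's name every time it gets called.

from functools import wraps

def debug(func):
    @wraps(func)
    def out(*args, **kwargs):
        print(func.__name__)
        return func(*args, **kwargs)
    return out

@debug
def add(x, y):
    return x + y
  • Wraps is a helper decorator that copies the metadata of the passed function (func) to the function it is wrapping (out).
  • Without it 'add.__name__' would return 'out'.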
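The effect of wraps can be checked by comparing the decorator with a variant that omits it. The names `debug_plain` and `add_plain` are hypothetical and exist only for this comparison.

```python
from functools import wraps

def debug_plain(func):                  # Hypothetical variant without @wraps.
    def out(*args, **kwargs):
        print(func.__name__)
        return func(*args, **kwargs)
    return out

def debug(func):
    @wraps(func)                        # Copies func's metadata onto out.
    def out(*args, **kwargs):
        print(func.__name__)
        return func(*args, **kwargs)
    return out

@debug_plain
def add_plain(x, y):
    return x + y

@debug
def add(x, y):
    return x + y

names = add_plain.__name__, add.__name__
```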

LRU Cache

Decorator that caches function's return values. All function's arguments must be hashable.

from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n-2) + fib(n-1)
  • CPython interpreter limits recursion depth to 1000 by default. To increase it use 'sys.setrecursionlimit(<depth>)'.

Parametrized Decorator

A decorator that accepts arguments and returns a normal decorator that accepts a function.

from functools import wraps

def debug(print_result=False):
    def decorator(func):
        @wraps(func)
        def out(*args, **kwargs):
            result = func(*args, **kwargs)
            print(func.__name__, result if print_result else '')
            return result
        return out
    return decorator

@debug(print_result=True)
def add(x, y):
    return x + y

Class

class <name>:
    def __init__(self, a):
        self.a = a
    def __repr__(self):
        class_name = self.__class__.__name__
        return f'{class_name}({self.a!r})'
    def __str__(self):
        return str(self.a)

    @classmethod
    def get_class_name(cls):
        return cls.__name__
  • Return value of repr() should be unambiguous and of str() readable.
  • If only repr() is defined, it will also be used for str().
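A short sketch of the fallback from str() to repr(), using two hypothetical classes: `Temperature` defines only repr(), while its subclass `Label` adds str().

```python
class Temperature:                      # Hypothetical class that defines only repr().
    def __init__(self, degrees):
        self.degrees = degrees
    def __repr__(self):
        return f'Temperature({self.degrees!r})'

class Label(Temperature):               # Adds a readable str() on top of it.
    def __str__(self):
        return f'{self.degrees} degrees'

only_repr = str(Temperature(21))        # Falls back to repr().
both = str(Label(21)), repr(Label(21))
in_list = str([Label(21)])              # Containers use repr() of their elements.
```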

Str() use cases:

print(<el>)
print(f'{<el>}')
raise Exception(<el>)
loguru.logger.debug(<el>)
csv.writer(<file>).writerow([<el>])

Repr() use cases:

print([<el>])
print(f'{<el>!r}')
>>> <el>
loguru.logger.exception()
Z = dataclasses.make_dataclass('Z', ['a']); print(Z(<el>))

Constructor Overloading

class <name>:
    def __init__(self, a=None):
        self.a = a

Inheritance

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age  = age

class Employee(Person):
    def __init__(self, name, age, staff_num):
        super().__init__(name, age)
        self.staff_num = staff_num

Multiple Inheritance

class A: pass
class B: pass
class C(A, B): pass

MRO determines the order in which parent classes are traversed when searching for a method:

>>> C.mro()
[<class 'C'>, <class 'A'>, <class 'B'>, <class 'object'>]

Property

Pythonic way of implementing getters and setters.

class MyClass:
    @property
    def a(self):
        return self._a

    @a.setter
    def a(self, value):
        self._a = value
>>> el = MyClass()
>>> el.a = 123
>>> el.a
123

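Dataclass

Decorator that automatically generates init(), repr() and eq() special methods.

from dataclasses import dataclass, field

@dataclass(order=False, frozen=False)
class <class_name>:
    <attr_name_1>: <type>
    <attr_name_2>: <type> = <default_value>
    <attr_name_3>: list/dict/set = field(default_factory=list/dict/set)
  • Objects can be made sortable with 'order=True' and immutable with 'frozen=True'.
  • For object to be hashable, all attributes must be hashable and frozen must be True.
  • Function field() is needed because '<attr_name>: list = []' would make a list that is shared among all instances (the dataclass decorator rejects such a mutable default with a ValueError).
  • Default_factory can be any callable.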
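The options above can be combined in a small sketch. Both classes, `Version` and `Basket`, are hypothetical and only serve to exercise 'order', 'frozen' and 'default_factory'.

```python
from dataclasses import dataclass, field

@dataclass(order=True, frozen=True)
class Version:                          # Sortable, immutable and therefore hashable.
    major: int
    minor: int = 0

@dataclass
class Basket:
    items: list = field(default_factory=list)   # A fresh list for every instance.

in_order = sorted([Version(1, 2), Version(1), Version(0, 9)])
as_key = {Version(1): 'first release'}          # Works because Version is hashable.

a, b = Basket(), Basket()
a.items.append('apple')                         # Does not affect b.items.
```

Instances are ordered as if they were tuples of their fields, which is why Version(1) sorts before Version(1, 2).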

Inline:

from dataclasses import make_dataclass
<class> = make_dataclass('<class_name>', <coll_of_attribute_names>)
<class> = make_dataclass('<class_name>', <coll_of_tuples>)
<tuple> = ('<attr_name>', <type> [, <default_value>])

Slots

Mechanism that restricts objects to attributes listed in 'slots' and significantly reduces their memory footprint.

class MyClassWithSlots:
    __slots__ = ['a']
    def __init__(self):
        self.a = 1

Copy

from copy import copy, deepcopy
<object> = copy(<object>)
<object> = deepcopy(<object>)

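Duck Types

A duck type is an implicit type that prescribes a set of special methods. Any object that has those methods defined is considered a member of that duck type.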

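Comparable

  • If eq() method is not overridden, it returns 'id(self) == id(other)', which is the same as 'self is other'.
  • That means all objects compare not equal by default.
  • Only the left side object has eq() method called, unless it returns NotImplemented, in which case the right object is consulted.
class MyComparable:
    def __init__(self, a):
        self.a = a
    def __eq__(self, other):
        if isinstance(other, type(self)):
            return self.a == other.a
        return NotImplemented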

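Hashable

  • Hashable object needs both hash() and eq() methods and its hash value should never change.
  • Hashable objects that compare equal must have the same hash value, meaning default hash() that returns 'id(self)' will not do.
  • That is why Python automatically makes classes unhashable if you only implement eq().
class MyHashable:
    def __init__(self, a):
        self._a = a
    @property
    def a(self):
        return self._a
    def __eq__(self, other):
        if isinstance(other, type(self)):
            return self.a == other.a
        return NotImplemented
    def __hash__(self):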
        return hash(self.a)
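The last bullet of the Hashable section can be verified with a sketch like the one below. The classes `OnlyEq` and `EqAndHash` are hypothetical names chosen for this example.

```python
class OnlyEq:                           # Defines eq() but not hash().
    def __init__(self, a):
        self.a = a
    def __eq__(self, other):
        if isinstance(other, type(self)):
            return self.a == other.a
        return NotImplemented

class EqAndHash(OnlyEq):                # Adds a hash() that agrees with eq().
    def __hash__(self):
        return hash(self.a)

try:
    hash(OnlyEq(1))
    only_eq_is_hashable = True
except TypeError:                       # Python sets __hash__ to None for OnlyEq.
    only_eq_is_hashable = False

unique_count = len({EqAndHash(1), EqAndHash(1), EqAndHash(2)})
```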

Sortable

  • With total_ordering decorator, you only need to provide eq() and one of lt(), gt(), le() or ge() special methods.
from functools import total_ordering

@total_ordering
class MySortable:
    def __init__(self, a):
        self.a = a
    def __eq__(self, other):
        if isinstance(other, type(self)):
            return self.a == other.a
        return NotImplemented
    def __lt__(self, other):
        if isinstance(other, type(self)):
            return self.a < other.a
        return NotImplemented

Iterator

  • Any object that has methods next() and iter() is an iterator.
  • Next() should return next item or raise StopIteration.
  • Iter() should return 'self'.
class Counter:
    def __init__(self):
        self.i = 0
    def __next__(self):
        self.i += 1
        return self.i
    def __iter__(self):
        return self
>>> counter = Counter()
>>> next(counter), next(counter), next(counter)
(1, 2, 3)

Python has many different iterator objects:

Callable

  • All functions and classes have a call() method, hence are callable.
  • When this cheatsheet uses '<function>' as an argument, it actually means '<callable>'.
class Counter:
    def __init__(self):
        self.i = 0
    def __call__(self):
        self.i += 1
        return self.i
>>> counter = Counter()
>>> counter(), counter(), counter()
(1, 2, 3)

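Context Manager

  • Enter() should lock the resources and optionally return an object.
  • Exit() should release the resources.
  • Any exception that happens inside the with block is passed to the exit() method.
  • If it wishes to suppress the exception it must return a true value.
class MyOpen:
    def __init__(self, filename):
        self.filename = filename
    def __enter__(self):
        self.file = open(self.filename)
        return self.file
    def __exit__(self, exc_type, exception, traceback):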
        self.file.close()
>>> with open('test.txt', 'w') as file:
...     file.write('Hello World!')
>>> with MyOpen('test.txt') as file:
...     print(file.read())
Hello World!
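The MyOpen example does not suppress exceptions. A sketch of a context manager whose exit() does return a true value is shown below; the class `Suppress` is hypothetical and behaves roughly like the standard `contextlib.suppress`, with the addition of remembering the caught exception.

```python
class Suppress:                         # Hypothetical context manager.
    def __init__(self, *exc_types):
        self.exc_types = exc_types
        self.caught = None
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exception, traceback):
        if exc_type is not None and issubclass(exc_type, self.exc_types):
            self.caught = exception
            return True                 # A true value suppresses the exception.
        return False                    # Anything else propagates.

with Suppress(ZeroDivisionError) as ctx:
    1 / 0                               # Raised inside the with block, passed to exit().

caught_name = type(ctx.caught).__name__
```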

Iterable Duck Types

Iterable

  • Only required method is iter(). It should return an iterator of object's items.
  • Contains() automatically works on any object that has iter() defined.
class MyIterable:
    def __init__(self, a):
        self.a = a
    def __iter__(self):
        return iter(self.a)
    def __contains__(self, el):
        return el in self.a
>>> obj = MyIterable([1, 2, 3])
>>> [el for el in obj]
[1, 2, 3]
>>> 1 in obj
True

Collection

  • Only required methods are iter() and len().
  • This cheatsheet actually means '<iterable>' when it uses '<collection>'.
  • I chose not to use the name 'iterable' because it sounds scarier and more vague than 'collection'.
class MyCollection:
    def __init__(self, a):
        self.a = a
    def __iter__(self):
        return iter(self.a)
    def __contains__(self, el):
        return el in self.a
    def __len__(self):
        return len(self.a)

Sequence

  • Only required methods are len() and getitem().
  • Getitem() should return an item at index or raise IndexError.
  • Iter() and contains() automatically work on any object that has getitem() defined.
  • Reversed() automatically works on any object that has len() and getitem() defined.
class MySequence:
    def __init__(self, a):
        self.a = a
    def __iter__(self):
        return iter(self.a)
    def __contains__(self, el):
        return el in self.a
    def __len__(self):
        return len(self.a)
    def __getitem__(self, i):
        return self.a[i]
    def __reversed__(self):
        return reversed(self.a)

ABC Sequence

  • It's a richer interface than the basic sequence.
  • Extending it generates iter(), contains(), reversed(), index() and count().
  • Unlike 'abc.Iterable' and 'abc.Collection', it is not a duck type. That is why 'issubclass(MySequence, abc.Sequence)' would return False even if MySequence had all the methods defined.
from collections import abc

class MyAbcSequence(abc.Sequence):
    def __init__(self, a):
        self.a = a
    def __len__(self):
        return len(self.a)
    def __getitem__(self, i):
        return self.a[i]

所需和自动可用的特殊方法表:

+------------+------------+------------+------------+--------------+
|            |  Iterable  | Collection |  Sequence  | abc.Sequence |
+------------+------------+------------+------------+--------------+
| iter()     |    REQ     |    REQ     |    Yes     |     Yes      |
| contains() |    Yes     |    Yes     |    Yes     |     Yes      |
| len()      |            |    REQ     |    REQ     |     REQ      |
| getitem()  |            |            |    REQ     |     REQ      |
| reversed() |            |            |    Yes     |     Yes      |
| index()    |            |            |            |     Yes      |
| count()    |            |            |            |     Yes      |
+------------+------------+------------+------------+--------------+
  • Other ABCs that generate missing methods are: MutableSequence, Set, MutableSet, Mapping and MutableMapping.
  • Names of their required methods are stored in '<abc>.__abstractmethods__'.
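
A short sketch of the difference between duck typing and extending abc.Sequence. The class names Duck and Real are hypothetical:

```python
from collections import abc

class Duck:
    """Defines every sequence method by hand, but does not extend abc.Sequence."""
    def __init__(self, a): self.a = a
    def __iter__(self): return iter(self.a)
    def __contains__(self, el): return el in self.a
    def __len__(self): return len(self.a)
    def __getitem__(self, i): return self.a[i]
    def __reversed__(self): return reversed(self.a)

class Real(abc.Sequence):
    """Defines only the two required methods; the rest is generated."""
    def __init__(self, a): self.a = a
    def __len__(self): return len(self.a)
    def __getitem__(self, i): return self.a[i]

duck_is_collection = issubclass(Duck, abc.Collection)   # Duck typed check passes.
duck_is_sequence = issubclass(Duck, abc.Sequence)       # Not duck typed, so it fails.

real = Real([1, 2, 2])
first_two = real.index(2)                                # Generated by abc.Sequence.
n_twos = real.count(2)                                   # Generated by abc.Sequence.
required = sorted(abc.Sequence.__abstractmethods__)      # Names of required methods.
```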

Enum

from enum import Enum, auto
class <enum_name>(Enum):
    <member_name_1> = <value_1>
    <member_name_2> = <value_2_a>, <value_2_b>
    <member_name_3> = auto()
  • If there are no numeric values before auto(), it returns 1.
  • Otherwise it returns an increment of the last numeric value.
<member> = <enum>.<member_name>                 # Returns a member.
<member> = <enum>['<member_name>']              # Returns a member or raises KeyError.
<member> = <enum>(<value>)                      # Returns a member or raises ValueError.
<str>    = <member>.name                        # Returns member's name.
<obj>    = <member>.value                       # Returns member's value.
list_of_members = list(<enum>)
member_names = [a.name for a in <enum>]
member_values = [a.value for a in <enum>]
random_member = random.choice(list(<enum>))
def get_next_member(member):
    members = list(member.__class__)
    index = (members.index(member) + 1) % len(members)
    return members[index]
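
A concrete instance of the auto() rule and the three lookup forms. The enum Color and its values are hypothetical:

```python
from enum import Enum, auto

class Color(Enum):
    RED = auto()        # No numeric value precedes it, so it becomes 1.
    GREEN = 10
    BLUE = auto()       # Increment of the preceding numeric value: 11.

by_attribute = Color.RED
by_name = Color['GREEN']            # Raises KeyError for an unknown name.
by_value = Color(11)                # Raises ValueError for an unknown value.

try:
    Color(99)
    lookup_failed = False
except ValueError:
    lookup_failed = True

member_names = [a.name for a in Color]
```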

Inline

Cutlery = Enum('Cutlery', 'fork knife spoon')
Cutlery = Enum('Cutlery', ['fork', 'knife', 'spoon'])
Cutlery = Enum('Cutlery', {'fork': 1, 'knife': 2, 'spoon': 3})

User-defined functions cannot be values, so they must be wrapped:

from functools import partial
LogicOp = Enum('LogicOp', {'AND': partial(lambda l, r: l and r),
                           'OR' : partial(lambda l, r: l or r)})
  • Another solution in this particular case is to use functions and_() and or_() from the module operator.

Exceptions

Basic Example

try:
    <code>
except <exception>:
    <code>

Complex Example

try:
    <code_1>
except <exception_a>:
    <code_2_a>
except <exception_b>:
    <code_2_b>
else:
    <code_2_c>
finally:
    <code_3>
  • Code inside the 'else' block will only be executed if the 'try' block had no exceptions.
  • Code inside the 'finally' block will always be executed.
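
The order in which the blocks run can be recorded with a small helper. This is only an illustration; the function trace() is hypothetical:

```python
def trace(divisor):
    """Returns the names of the blocks that were executed, in order."""
    steps = []
    try:
        steps.append('try')
        10 / divisor
    except ZeroDivisionError:
        steps.append('except')
    else:
        steps.append('else')             # Runs only if 'try' raised nothing.
    finally:
        steps.append('finally')          # Runs in both cases.
    return steps

without_error = trace(2)
with_error = trace(0)
```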

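Catching Exceptions

except <exception>:
except <exception> as <name>:
except (<exception>, [...]):
except (<exception>, [...]) as <name>:
  • Also catches subclasses of the exception.
  • Use 'traceback.print_exc()' to print the error message to stderr.
  • Use 'print(<name>)' to print just the cause of the exception (its arguments).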

Raising Exceptions

raise <exception>
raise <exception>()
raise <exception>(<el> [, ...])

Re-raising caught exception:

except <exception> as <name>:
    ...
    raise

Exception Object

arguments = <name>.args
exc_type  = <name>.__class__
filename  = <name>.__traceback__.tb_frame.f_code.co_filename
func_name = <name>.__traceback__.tb_frame.f_code.co_name
line      = linecache.getline(filename, <name>.__traceback__.tb_lineno)
error_msg = ''.join(traceback.format_exception(exc_type, <name>, <name>.__traceback__))
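
A sketch of reading these attributes from a caught exception. The function fail() and its message are hypothetical. The values are copied inside the except block because the bound name is deleted when the block ends:

```python
import traceback

def fail():
    raise ValueError('bad value', 42)

try:
    fail()
except ValueError as exc:
    arguments = exc.args                 # Everything that was passed to ValueError().
    exc_type = exc.__class__
    error_msg = ''.join(traceback.format_exception(exc_type, exc, exc.__traceback__))
```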

Built-in Exceptions

BaseException
 +-- SystemExit                   # Raised by the sys.exit() function.
 +-- KeyboardInterrupt            # Raised when the user hits the interrupt key (ctrl-c).
 +-- Exception                    # User-defined exceptions should be derived from this class.
      +-- ArithmeticError         # Base class for arithmetic errors.
      |    +-- ZeroDivisionError  # Raised when dividing by zero.
      +-- AttributeError          # Raised when an attribute is missing.
      +-- EOFError                # Raised by input() when it hits end-of-file condition.
      +-- LookupError             # Raised when a look-up on a collection fails.
      |    +-- IndexError         # Raised when a sequence index is out of range.
      |    +-- KeyError           # Raised when a dictionary key or set element is not found.
      +-- NameError               # Raised when a variable name is not found.
      +-- OSError                 # Errors such as “file not found” or “disk full” (see Open).
      |    +-- FileNotFoundError  # When a file or directory is requested but doesn't exist.
      +-- RuntimeError            # Raised by errors that don't fall into other categories.
      |    +-- RecursionError     # Raised when the maximum recursion depth is exceeded.
      +-- StopIteration           # Raised by next() when run on an empty iterator.
      +-- TypeError               # Raised when an argument is of wrong type.
      +-- ValueError              # When an argument is of right type but inappropriate value.
           +-- UnicodeError       # Raised when encoding/decoding strings to/from bytes fails.

Collections and their exceptions:

+-----------+------------+------------+------------+
|           |    List    |    Set     |    Dict    |
+-----------+------------+------------+------------+
| getitem() | IndexError |            |  KeyError  |
| pop()     | IndexError |  KeyError  |  KeyError  |
| remove()  | ValueError |  KeyError  |            |
| index()   | ValueError |            |            |
+-----------+------------+------------+------------+

Useful built-in exceptions:

raise TypeError('Argument is of wrong type!')
raise ValueError('Argument is of right type but inappropriate value!')
raise RuntimeError('None of above!')

User-defined Exceptions

class MyError(Exception):
    pass

class MyInputError(MyError):
    pass

Exit

Exits the interpreter by raising SystemExit exception.

import sys
sys.exit()                        # Exits with exit code 0 (success).
sys.exit(<el>)                    # Prints to stderr and exits with 1.
sys.exit(<int>)                   # Exits with passed exit code.
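
Because sys.exit() works by raising SystemExit, the exit can be intercepted like any other exception. A minimal sketch:

```python
import sys

try:
    sys.exit(3)
except SystemExit as exc:
    exit_code = exc.code                 # The argument that was passed to sys.exit().
```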

Print

print(<el_1>, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
  • Use 'file=sys.stderr' for messages about errors.
  • Use 'flush=True' to forcibly flush the stream.

Pretty Print

from pprint import pprint
pprint(<collection>, width=80, depth=None, compact=False, sort_dicts=True)
  • Levels deeper than 'depth' get replaced by '...'.

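Input

Reads a line from user input or pipe if present.

<str> = input(prompt=None)
  • Trailing newline gets stripped.
  • Prompt string is printed to the standard output before reading input.
  • Raises EOFError when user hits EOF (ctrl-d/z) or input stream gets exhausted.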

Command Line Arguments

import sys
scripts_path = sys.argv[0]
arguments    = sys.argv[1:]

Argument Parser

from argparse import ArgumentParser, FileType
p = ArgumentParser(description=<str>)
p.add_argument('-<short_name>', '--<name>', action='store_true')  # Flag
p.add_argument('-<short_name>', '--<name>', type=<type>)          # Option
p.add_argument('<name>', type=<type>, nargs=1)                    # First argument
p.add_argument('<name>', type=<type>, nargs='+')                  # Remaining arguments
p.add_argument('<name>', type=<type>, nargs='*')                  # Optional arguments
args  = p.parse_args()                                            # Exits on error.
value = args.<name>
  • Use 'help=<str>' to set argument description.
  • Use 'default=<el>' to set the default value.
  • Use 'type=FileType(<mode>)' for files.
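
A filled-in version of the template above. The option names and the argument list are hypothetical, and the list is passed to parse_args() directly so the sketch does not depend on sys.argv:

```python
from argparse import ArgumentParser

p = ArgumentParser(description='Hypothetical file processor.')
p.add_argument('-v', '--verbose', action='store_true', help='Print extra output.')
p.add_argument('-n', '--count', type=int, default=1, help='Number of passes.')
p.add_argument('paths', type=str, nargs='+', help='Files to process.')

# Equivalent to running: script.py -v --count 3 a.txt b.txt
args = p.parse_args(['-v', '--count', '3', 'a.txt', 'b.txt'])
defaults = p.parse_args(['c.txt'])       # Flag and option fall back to defaults.
```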

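Open

Opens the file and returns a corresponding file object.

<file> = open(<path>, mode='r', encoding=None, newline=None)
  • 'encoding=None' means that the default encoding is used, which is platform dependent. Best practice is to use 'encoding="utf-8"' whenever possible.
  • 'newline=None' means all different end of line combinations are converted to '\n' on read, while on write all '\n' characters are converted to system's default line separator.
  • 'newline=""' means no conversions take place, but input is still broken into chunks by readline() and readlines() on either '\n', '\r' or '\r\n'.

Modes

  • 'r' - Read (default).
  • 'w' - Write (truncate).
  • 'x' - Write or fail if the file already exists.
  • 'a' - Append.
  • 'w+' - Read and write (truncate).
  • 'r+' - Read and write from the start.
  • 'a+' - Read and write from the end.
  • 't' - Text mode (default).
  • 'b' - Binary mode.

Exceptions

  • 'FileNotFoundError' can be raised when reading with 'r' or 'r+'.
  • 'FileExistsError' can be raised when writing with 'x'.
  • 'IsADirectoryError' and 'PermissionError' can be raised by any.
  • 'OSError' is the parent class of all listed exceptions.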
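
The listed exceptions can be provoked in a temporary directory. This is a sketch; the file name demo.txt is hypothetical. Opening a directory is caught as plain OSError because the exact subclass differs between platforms:

```python
import os, tempfile

outcomes = []
with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'demo.txt')

    try:
        open(filename, 'r')                          # File does not exist yet.
    except FileNotFoundError:
        outcomes.append('FileNotFoundError')

    with open(filename, 'x', encoding='utf-8') as file:
        file.write('first')                          # 'x' succeeds on a new file.

    try:
        open(filename, 'x')                          # Second attempt must fail.
    except FileExistsError:
        outcomes.append('FileExistsError')

    try:
        open(tmp_dir)                                # A directory is not a file.
    except OSError:
        outcomes.append('OSError')
```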

File Object

<file>.seek(0)                      # Moves to the start of the file.
<file>.seek(offset)                 # Moves 'offset' chars/bytes from the start.
<file>.seek(0, 2)                   # Moves to the end of the file.
<bin_file>.seek(±offset, <anchor>)  # Anchor: 0 start, 1 current position, 2 end.
<str/bytes> = <file>.read(size=-1)  # Reads 'size' chars/bytes or until EOF.
<str/bytes> = <file>.readline()     # Returns a line or empty string/bytes on EOF.
<list>      = <file>.readlines()    # Returns a list of remaining lines.
<str/bytes> = next(<file>)          # Returns a line using buffer. Do not mix.
<file>.write(<str/bytes>)           # Writes a string or bytes object.
<file>.writelines(<collection>)     # Writes a coll. of strings or bytes objects.
<file>.flush()                      # Flushes write buffer.
  • Methods do not add or strip trailing newlines, even writelines().

Read Text from File

def read_file(filename):
    with open(filename, encoding='utf-8') as file:
        return file.readlines()

Write Text to File

def write_to_file(filename, text):
    with open(filename, 'w', encoding='utf-8') as file:
        file.write(text)

Path

from os import getcwd, path, listdir
from glob import glob
<str>  = getcwd()                   # Returns the current working directory.
<str>  = path.join(<path>, ...)     # Joins two or more pathname components.
<str>  = path.abspath(<path>)       # Returns absolute path.
<str>  = path.basename(<path>)      # Returns final component of the path.
<str>  = path.dirname(<path>)       # Returns path without the final component.
<tup.> = path.splitext(<path>)      # Splits on last period of the final component.
<list> = listdir(path='.')          # Returns filenames located at path.
<list> = glob('<pattern>')          # Returns paths matching the wildcard pattern.
<bool> = path.exists(<path>)        # Or: <Path>.exists()
<bool> = path.isfile(<path>)        # Or: <DirEntry/Path>.is_file()
<bool> = path.isdir(<path>)         # Or: <DirEntry/Path>.is_dir()

DirEntry

Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type information.

from os import scandir
<iter> = scandir(path='.')          # Returns DirEntry objects located at path.
<str>  = <DirEntry>.path            # Returns whole path as a string.
<str>  = <DirEntry>.name            # Returns final component as a string.
<file> = open(<DirEntry>)           # Opens the file and returns file object.

Path Object

from pathlib import Path
<Path> = Path(<path> [, ...])       # Accepts strings, Paths and DirEntry objects.
<Path> = <path> / <path> [/ ...]    # One of the paths must be a Path object.
<Path> = Path()                     # Returns relative cwd. Also Path('.').
<Path> = Path.cwd()                 # Returns absolute cwd. Also Path().resolve().
<Path> = Path.home()                # Returns user's home directory.
<Path> = Path(__file__).resolve()   # Returns script's path if cwd wasn't changed.
<Path> = <Path>.parent              # Returns Path without final component.
<str>  = <Path>.name                # Returns final component as a string.
<str>  = <Path>.stem                # Returns final component without extension.
<str>  = <Path>.suffix              # Returns final component's extension.
<tup.> = <Path>.parts               # Returns all components as strings.
<iter> = <Path>.iterdir()           # Returns dir contents as Path objects.
<iter> = <Path>.glob('<pattern>')   # Returns Paths matching the wildcard pattern.
<str>  = str(<Path>)                # Returns path as a string.
<file> = open(<Path>)               # Opens the file and returns file object.
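
The path attributes applied to one concrete path. This sketch uses PurePosixPath instead of Path (an assumption made here so the results are identical on every platform); the path itself is hypothetical and is never touched on disk:

```python
from pathlib import PurePosixPath

p = PurePosixPath('/home/user') / 'docs' / 'report.tar.gz'

parent = p.parent                        # Path without the final component.
name = p.name                            # Final component.
stem = p.stem                            # Final component without the last extension.
suffix = p.suffix                        # Only the last extension.
parts = p.parts                          # All components as strings.
as_string = str(p)
```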

OS Commands

Files and Directories

  • Paths can be either strings, Paths or DirEntry objects.
  • Functions report OS related errors by raising either OSError or one of its subclasses.
import os, shutil
os.chdir(<path>)                    # Changes the current working directory.
os.mkdir(<path>, mode=0o777)        # Creates a directory. Mode is in octal.
os.makedirs(<path>, mode=0o777)     # Creates all directories in the path.
shutil.copy(from, to)               # Copies the file. 'to' can exist or be a dir.
shutil.copytree(from, to)           # Copies the directory. 'to' must not exist.
os.rename(from, to)                 # Renames/moves the file or directory.
os.replace(from, to)                # Same, but overwrites 'to' if it exists.
os.remove(<path>)                   # Deletes the file.
os.rmdir(<path>)                    # Deletes the empty directory.
shutil.rmtree(<path>)               # Deletes the directory.

Shell Commands

import os
<str> = os.popen('<shell_command>').read()

Sends '1 + 1' to the basic calculator and captures its output:

>>> from subprocess import run
>>> run('bc', input='1 + 1\n', capture_output=True, encoding='utf-8')
CompletedProcess(args='bc', returncode=0, stdout='2\n', stderr='')

Sends test.in to the basic calculator running in standard mode and saves its output to test.out:

>>> from shlex import split
>>> os.popen('echo 1 + 1 > test.in')
>>> run(split('bc -s'), stdin=open('test.in'), stdout=open('test.out', 'w'))
CompletedProcess(args=['bc', '-s'], returncode=0)
>>> open('test.out').read()
'2\n'

JSON

Text file format for storing collections of strings and numbers.

import json
<str>    = json.dumps(<object>, ensure_ascii=True, indent=None)
<object> = json.loads(<str>)

Read Object from JSON File

def read_json_file(filename):
    with open(filename, encoding='utf-8') as file:
        return json.load(file)

Write Object to JSON File

def write_to_json_file(filename, an_object):
    with open(filename, 'w', encoding='utf-8') as file:
        json.dump(an_object, file, ensure_ascii=False, indent=2)

Pickle

Binary file format for storing objects.

import pickle
<bytes>  = pickle.dumps(<object>)
<object> = pickle.loads(<bytes>)

Read Object from File

def read_pickle_file(filename):
    with open(filename, 'rb') as file:
        return pickle.load(file)

Write Object to File

def write_to_pickle_file(filename, an_object):
    with open(filename, 'wb') as file:
        pickle.dump(an_object, file)

CSV

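Text file format for storing spreadsheets.

import csv

Read

<reader> = csv.reader(<file>)       # Also: `dialect='excel', delimiter=','`.
<list>   = next(<reader>)           # Returns next row as a list of strings.
<list>   = list(<reader>)           # Returns list of remaining rows.
  • File must be opened with 'newline=""' argument, or newlines embedded inside quoted fields will not be interpreted correctly!

Write

<writer> = csv.writer(<file>)       # Also: `dialect='excel', delimiter=','`.
<writer>.writerow(<collection>)     # Encodes objects using `str(<el>)`.
<writer>.writerows(<coll_of_coll>)  # Appends multiple rows.
  • File must be opened with 'newline=""' argument, or '\r' will be added in front of every '\n' on platforms that use '\r\n' line endings!

Parameters

  • 'dialect' - Master parameter that sets the default values.
  • 'delimiter' - A one-character string used to separate fields.
  • 'quotechar' - Character for quoting fields that contain special characters.
  • 'doublequote' - Whether quotechars inside fields get doubled or escaped.
  • 'skipinitialspace' - Whether whitespace after delimiter gets stripped.
  • 'lineterminator' - Specifies how writer terminates rows.
  • 'quoting' - Controls the amount of quoting: 0 - as necessary, 1 - all.
  • 'escapechar' - Character for escaping 'quotechar' if 'doublequote' is False.

Dialects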

+------------------+--------------+--------------+--------------+
|                  |     excel    |   excel-tab  |     unix     |
+------------------+--------------+--------------+--------------+
| delimiter        |       ','    |      '\t'    |       ','    |
| quotechar        |       '"'    |       '"'    |       '"'    |
| doublequote      |      True    |      True    |      True    |
| skipinitialspace |     False    |     False    |     False    |
| lineterminator   |    '\r\n'    |    '\r\n'    |      '\n'    |
| quoting          |         0    |         0    |         1    |
| escapechar       |      None    |      None    |      None    |
+------------------+--------------+--------------+--------------+

Read Rows from CSV File

def read_csv_file(filename):
    with open(filename, encoding='utf-8', newline='') as file:
        return list(csv.reader(file))

Write Rows to CSV File

def write_to_csv_file(filename, rows):
    with open(filename, 'w', encoding='utf-8', newline='') as file:
        writer = csv.writer(file)
        writer.writerows(rows)
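
A round trip that exercises the 'newline=""' rule with a field containing an embedded newline. This is a sketch; the rows and the file name are hypothetical and the file lives in a temporary directory:

```python
import csv, os, tempfile

rows = [['name', 'note'], ['Ann', 'line one\nline two']]

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, 'test.csv')
    with open(filename, 'w', encoding='utf-8', newline='') as file:
        csv.writer(file).writerows(rows)             # The multi-line field gets quoted.
    with open(filename, encoding='utf-8', newline='') as file:
        restored = list(csv.reader(file))            # Quoted newline stays inside its field.
```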

SQLite

Server-less database engine that stores each database into a separate file.

Connect

Opens a connection to the database file. Creates a new file if path doesn't exist.

import sqlite3
<conn> = sqlite3.connect(<path>)                # Also ':memory:'.
<conn>.close()                                  # Closes the connection.

Read

Returned values can be of type str, int, float, bytes or None.

<cursor> = <conn>.execute('<query>')            # Can raise a subclass of sqlite3.Error.
<tuple>  = <cursor>.fetchone()                  # Returns next row. Also next(<cursor>).
<list>   = <cursor>.fetchall()                  # Returns remaining rows. Also list(<cursor>).

Write

<conn>.execute('<query>')                       # Can raise a subclass of sqlite3.Error.
<conn>.commit()                                 # Saves all changes since the last commit.
<conn>.rollback()                               # Discards all changes since the last commit.

Or:

with <conn>:                                    # Exits the block with commit() or rollback(),
    <conn>.execute('<query>')                   # depending on whether an exception occurred.

Placeholders

  • Passed values can be of type str, int, float, bytes, None, bool, datetime.date or datetime.datetime.
  • Bools will be stored and returned as ints and dates as ISO formatted strings.
<conn>.execute('<query>', <list/tuple>)         # Replaces '?'s in query with values.
<conn>.execute('<query>', <dict/namedtuple>)    # Replaces ':<key>'s with values.
<conn>.executemany('<query>', <coll_of_above>)  # Runs execute() multiple times.

Example

In this example values are not actually saved because 'conn.commit()' is omitted!

>>> conn = sqlite3.connect('test.db')
>>> conn.execute('CREATE TABLE person (person_id INTEGER PRIMARY KEY, name, height)')
>>> conn.execute('INSERT INTO person VALUES (NULL, ?, ?)', ('Jean-Luc', 187)).lastrowid
1
>>> conn.execute('SELECT * FROM person').fetchall()
[(1, 'Jean-Luc', 187)]
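
The 'with <conn>' form can be seen committing one insert and rolling back another. This is a sketch on an in-memory database; the second row and the simulated failure are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE person (person_id INTEGER PRIMARY KEY, name, height)')

with conn:                                           # Block ends normally: commit().
    conn.execute('INSERT INTO person VALUES (NULL, ?, ?)', ('Jean-Luc', 187))

try:
    with conn:                                       # Block raises: rollback().
        conn.execute('INSERT INTO person VALUES (NULL, :name, :height)',
                     {'name': 'Worf', 'height': 190})
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass

rows = conn.execute('SELECT * FROM person').fetchall()
conn.close()
```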

MySQL

Has a very similar interface, with differences listed below.

# $ pip3 install mysql-connector
from mysql import connector
<conn>   = connector.connect(host=<str>, …)     # `user=<str>, password=<str>, database=<str>`.
<cursor> = <conn>.cursor()                      # Only cursor has execute method.
<cursor>.execute('<query>')                     # Can raise a subclass of connector.Error.
<cursor>.execute('<query>', <list/tuple>)       # Replaces '%s's in query with values.
<cursor>.execute('<query>', <dict/namedtuple>)  # Replaces '%(<key>)s's with values.

Bytes

Bytes object is an immutable sequence of single bytes. Mutable version is called bytearray.

<bytes> = b'<str>'                       # Only accepts ASCII characters and \x00-\xff.
<int>   = <bytes>[<index>]               # Returns int in range from 0 to 255.
<bytes> = <bytes>[<slice>]               # Returns bytes even if it has only one element.
<bytes> = <bytes>.join(<coll_of_bytes>)  # Joins elements using bytes as a separator.

Encode

<bytes> = bytes(<coll_of_ints>)          # Ints must be in range from 0 to 255.
<bytes> = bytes(<str>, 'utf-8')          # Or: <str>.encode('utf-8')
<bytes> = <int>.to_bytes(n_bytes, …)     # `byteorder='big/little', signed=False`.
<bytes> = bytes.fromhex('<hex>')         # Hex pairs can be separated by spaces.

Decode

<list>  = list(<bytes>)                  # Returns ints in range from 0 to 255.
<str>   = str(<bytes>, 'utf-8')          # Or: <bytes>.decode('utf-8')
<int>   = int.from_bytes(<bytes>, …)     # `byteorder='big/little', signed=False`.
'<hex>' = <bytes>.hex()                  # Returns a string of hexadecimal pairs.
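
A few of the conversions above with concrete values (all inputs are hypothetical):

```python
encoded = 'héllo'.encode('utf-8')                    # 'é' takes two bytes in UTF-8.
first = encoded[0]                                   # Indexing returns an int.
sliced = encoded[:1]                                 # Slicing returns bytes.
decoded = encoded.decode('utf-8')

as_bytes = (1024).to_bytes(2, byteorder='big')
as_int = int.from_bytes(as_bytes, byteorder='big')

from_hex = bytes.fromhex('de ad be ef')              # Spaces between pairs are allowed.
to_hex = from_hex.hex()
joined = b'-'.join([b'a', b'b', b'c'])
```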

Read Bytes from File

def read_bytes(filename):
    with open(filename, 'rb') as file:
        return file.read()

Write Bytes to File

def write_bytes(filename, bytes_obj):
    with open(filename, 'wb') as file:
        file.write(bytes_obj)

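Struct

  • Module that performs conversions between a sequence of numbers and a bytes object.
  • System's type sizes and byte order are used by default.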
from struct import pack, unpack, iter_unpack
<bytes> = pack('<format>', <num_1> [, <num_2>, ...])
<tuple> = unpack('<format>', <bytes>)
<tuples> = iter_unpack('<format>', <bytes>)

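Example

>>> pack('>hhl', 1, 2, 3)
b'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('>hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)

Format

For standard type sizes start format string with:

  • '=' - System's byte order (usually little-endian).
  • '<' - Little-endian.
  • '>' - Big-endian (also '!').

Integer types. Use a capital letter for unsigned type. Minimum and standard sizes are in brackets:

  • 'x' - Pad byte.
  • 'b' - Char (1/1).
  • 'h' - Short (2/2).
  • 'i' - Int (2/4).
  • 'l' - Long (4/4).
  • 'q' - Long long (8/8).

Floating point types:

  • 'f' - Float (4/4).
  • 'd' - Double (8/8).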
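
The standard sizes listed above can be confirmed with calcsize(), a function of the struct module that is not imported in the listing above. The sketch also packs the same numbers in both byte orders:

```python
from struct import calcsize, pack, unpack

# A '<' prefix selects standard sizes, so the results do not depend on the system.
standard_sizes = {code: calcsize('<' + code) for code in 'bhilqfd'}

little = pack('<hl', 1, 2)               # Least significant byte first.
big = pack('>hl', 1, 2)                  # Most significant byte first.
restored = unpack('<hl', little)
```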

Array

List that can only hold numbers of a predefined type. Available types and their minimum sizes in bytes are listed above. Sizes and byte order are always determined by the system.

from array import array
<array> = array('<typecode>', <collection>)    # Array from collection of numbers.
<array> = array('<typecode>', <bytes>)         # Array from bytes object.
<array> = array('<typecode>', <array>)         # Treats array as a sequence of numbers.
<bytes> = bytes(<array>)                       # Or: <array>.tobytes()
<file>.write(<array>)                          # Writes array to the binary file.

Memory View

  • A sequence object that points to the memory of another object.
  • Each element can reference a single or multiple consecutive bytes, depending on format.
  • Order and number of elements can be changed with slicing.
  • Casting only works between char and other types and uses system's sizes and byte order.
<mview> = memoryview(<bytes/bytearray/array>)  # Immutable if bytes, else mutable.
<real>  = <mview>[<index>]                     # Returns an int or a float.
<mview> = <mview>[<slice>]                     # Mview with rearranged elements.
<mview> = <mview>.cast('<typecode>')           # Casts memoryview to the new format.
<mview>.release()                              # Releases the object's memory buffer.

Decode

<bytes> = bytes(<mview>)                       # Creates a new bytes object.
<bytes> = <bytes>.join(<coll_of_mviews>)       # Joins mviews using bytes object as sep.
<array> = array('<typecode>', <mview>)         # Treats mview as a sequence of numbers.
<file>.write(<mview>)                          # Writes mview to the binary file.
<list>  = list(<mview>)                        # Returns list of ints or floats.
<str>   = str(<mview>, 'utf-8')                # Treats mview as a bytes object.
<int>   = int.from_bytes(<mview>, …)           # `byteorder='big/little', signed=False`.
'<hex>' = <mview>.hex()                        # Treats mview as a bytes object.
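
A sketch showing that a memoryview shares memory with its source instead of copying it. The buffer contents are hypothetical, and the cast assumes the usual 2-byte size of the 'H' typecode:

```python
buffer = bytearray(b'\x01\x00\x02\x00')
mview = memoryview(buffer)               # Mutable, because bytearray is mutable.

mview[0] = 255                           # Writes through to the bytearray.
every_other = bytes(mview[::2])          # Slicing changes the number of elements.

shorts = mview.cast('H')                 # Each element now spans two bytes.
n_shorts = len(shorts)                   # Assumes 'H' is 2 bytes on this system.
```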

Deque

A thread-safe list with efficient appends and pops from either side. Pronounced "deck".

from collections import deque
<deque> = deque(<collection>, maxlen=None)
<deque>.appendleft(<el>)                       # Opposite element is dropped if full.
<deque>.extendleft(<collection>)               # Collection gets reversed.
<el> = <deque>.popleft()                       # Raises IndexError if empty.
<deque>.rotate(n=1)                            # Rotates elements to the right.
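
The effect of 'maxlen' and of the left-side methods, traced on a small deque (values are hypothetical):

```python
from collections import deque

d = deque([1, 2, 3], maxlen=3)

d.appendleft(0)                          # Deque is full, so 3 is dropped from the right.
after_appendleft = list(d)

d.rotate(1)                              # Last element moves to the front.
after_rotate = list(d)

d.extendleft([8, 9])                     # Items are pushed one by one: 8 first, then 9.
after_extendleft = list(d)

first = d.popleft()
```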

Threading

  • CPython interpreter can only run a single thread at a time.
  • That is why using multiple threads won't result in a faster execution, unless at least one of the threads contains an I/O operation.
from threading import Thread, RLock, Semaphore, Event, Barrier
from concurrent.futures import ThreadPoolExecutor

Thread

<Thread> = Thread(target=<function>)           # Use `args=<collection>` to set the arguments.
<Thread>.start()                               # Starts the thread.
<bool> = <Thread>.is_alive()                   # Checks if the thread is still running.
<Thread>.join()                                # Waits for the thread to finish.
  • Use 'kwargs=<dict>' to pass keyword arguments to the function.
  • Use 'daemon=True', or the program will not be able to exit while the thread is alive.

Lock

<lock> = RLock()                               # Lock that can only be released by the owner.
<lock>.acquire()                               # Waits for the lock to be available.
<lock>.release()                               # Makes the lock available again.

Or:

with <lock>:                                   # Enters the block by calling acquire(),
    ...                                        # and exits it with release().

Semaphore, Event, Barrier

<Semaphore> = Semaphore(value=1)               # Lock that can be acquired by 'value' threads.
<Event>     = Event()                          # Method wait() blocks until set() is called.
<Barrier>   = Barrier(n_times)                 # Wait() blocks until it's called n_times.

Thread Pool Executor

Object that manages thread execution.

<Exec> = ThreadPoolExecutor(max_workers=None)  # Or: `with ThreadPoolExecutor() as <name>: …`
<Exec>.shutdown(wait=True)                     # Blocks until all threads finish executing.
<iter> = <Exec>.map(<func>, <args_1>, ...)     # A multithreaded and non-lazy map().
<Futr> = <Exec>.submit(<func>, <arg_1>, ...)   # Starts a thread and returns its Future object.
<bool> = <Futr>.done()                         # Checks if the thread has finished executing.
<obj>  = <Futr>.result()                       # Waits for thread to finish and returns result.
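
A small run of the executor that uses both map() and submit(). The worker function slow_square() is hypothetical and sleeps briefly to stand in for an I/O operation:

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep

def slow_square(x):
    sleep(0.01)                          # Stands in for an I/O operation.
    return x * x

with ThreadPoolExecutor(max_workers=4) as executor:
    squares = list(executor.map(slow_square, range(5)))   # Results keep input order.
    future = executor.submit(slow_square, 7)
    result = future.result()             # Blocks until the call has finished.
    finished = future.done()
```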

Queue

A thread-safe FIFO queue. For LIFO queue use LifoQueue.

from queue import Queue
<Queue> = Queue(maxsize=0)
<Queue>.put(<el>)                              # Blocks until queue stops being full.
<Queue>.put_nowait(<el>)                       # Raises queue.Full exception if full.
<el> = <Queue>.get()                           # Blocks until queue stops being empty.
<el> = <Queue>.get_nowait()                    # Raises queue.Empty exception if empty.
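
The non-blocking variants can be tried on a queue that holds a single item (values are hypothetical):

```python
from queue import Queue, Full, Empty

q = Queue(maxsize=1)
q.put('a')                               # Queue is now full.

try:
    q.put_nowait('b')
    was_full = False
except Full:
    was_full = True

item = q.get()                           # Queue is now empty.

try:
    q.get_nowait()
    was_empty = False
except Empty:
    was_empty = True
```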

Operator

Module of functions that provide the functionality of operators.

from operator import add, sub, mul, truediv, floordiv, mod, pow, neg, abs
from operator import eq, ne, lt, le, gt, ge
from operator import and_, or_, xor, not_
from operator import itemgetter, attrgetter, methodcaller
import operator as op
elementwise_sum  = map(op.add, list_a, list_b)
sorted_by_second = sorted(<collection>, key=op.itemgetter(1))
sorted_by_both = sorted(<collection>, key=op.itemgetter(1, 0))
product_of_elems = functools.reduce(op.mul, <collection>)
union_of_sets = functools.reduce(op.or_, <coll_of_sets>)
LogicOp = enum.Enum('LogicOp', {'AND': op.and_, 'OR': op.or_})
last_el = op.methodcaller('pop')(<list>)

Introspection

Inspecting code at runtime.

Variables

<list> = dir()                             # Names of local variables (incl. functions).
<dict> = vars()                            # Dict of local variables. Also locals().
<dict> = globals()                         # Dict of global variables.

Attributes

<list> = dir(<object>)                     # Names of object's attributes (incl. methods).
<dict> = vars(<object>)                    # Dict of writable attributes. Also <obj>.__dict__.
<bool> = hasattr(<object>, '<attr_name>')  # Checks if getattr() raises an AttributeError.
value  = getattr(<object>, '<attr_name>')  # Raises AttributeError if attribute is missing.
setattr(<object>, '<attr_name>', value)    # Only works on objects with __dict__ attribute.
delattr(<object>, '<attr_name>')           # Equivalent to `del <object>.<attr_name>`.

Parameters

from inspect import signature
<Sig>  = signature(<function>)             # Function's Signature object.
<dict> = <Sig>.parameters                  # Dict of function's Parameter objects.
<str>  = <Param>.name                      # Parameter's name.
<memb> = <Param>.kind                      # Member of ParameterKind enum.
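
A sketch that lists the kind of every parameter of a function. The function f() is hypothetical and uses the positional-only '/' marker, which needs Python 3.8 or newer:

```python
from inspect import signature

def f(a, /, b, *args, c=1, **kwargs):
    pass

params = signature(f).parameters                     # Ordered mapping of Parameter objects.
names = list(params)
kinds = {name: param.kind.name for name, param in params.items()}
default_of_c = params['c'].default
```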

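Metaprogramming

Code that generates code.

Type

Type is the root class. If only passed an object it returns its type (class). Otherwise it creates a new class.

<class> = type('<class_name>', <parents_tuple>, <attributes_dict>)
>>> Z = type('Z', (), {'a': 'abcde', 'b': 12345})
>>> z = Z()

Meta Class

A class that creates classes.

def my_meta_class(name, parents, attrs):
    attrs['a'] = 'abcde'
    return type(name, parents, attrs)

Or:

class MyMetaClass(type):
    def __new__(cls, name, parents, attrs):
        attrs['a'] = 'abcde'
        return type.__new__(cls, name, parents, attrs)
  • New() is a class method that gets called before init(). If it returns an instance of its class, then that instance gets passed to init() as a 'self' argument.
  • It receives the same arguments as init(), except for the first one that specifies the desired type of the returned instance (MyMetaClass in our case).
  • Like in our case, new() can also be called directly, usually from a new() method of a child class (def __new__(cls): return super().__new__(cls)).
  • The only difference between the examples above is that my_meta_class() returns a class of type type, while MyMetaClass() returns a class of type MyMetaClass.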
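
The difference named in the last point can be checked directly. The sketch repeats both metaclass definitions so that it runs on its own; the class names A and B are hypothetical:

```python
def my_meta_class(name, parents, attrs):
    attrs['a'] = 'abcde'
    return type(name, parents, attrs)

class MyMetaClass(type):
    def __new__(cls, name, parents, attrs):
        attrs['a'] = 'abcde'
        return type.__new__(cls, name, parents, attrs)

class A(metaclass=my_meta_class):        # Created by the function.
    pass

class B(metaclass=MyMetaClass):          # Created by the class.
    pass

type_of_a = type(A)                      # Plain type.
type_of_b = type(B)                      # MyMetaClass.
```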

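Metaclass Attribute

Right before a class is created it checks if it has the 'metaclass' attribute defined. If not, it recursively checks if any of its parents has it defined and eventually comes to type().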

class MyClass(metaclass=MyMetaClass):
    b = 12345
>>> MyClass.a, MyClass.b
('abcde', 12345)

Type Diagram

type(MyClass)     == MyMetaClass  # MyClass is an instance of MyMetaClass.
type(MyMetaClass) == type         # MyMetaClass is an instance of type.
+-------------+-------------+
|   Classes   | Metaclasses |
+-------------+-------------+
|   MyClass --> MyMetaClass |
|             |     v       |
|    object -----> type <+  |
|             |     ^ +--+  |
|     str ----------+       |
+-------------+-------------+

Inheritance Diagram

MyClass.__base__     == object  # MyClass is a subclass of object.
MyMetaClass.__base__ == type    # MyMetaClass is a subclass of type.
+-------------+-------------+
|   Classes   | Metaclasses |
+-------------+-------------+
|   MyClass   | MyMetaClass |
|      v      |     v       |
|    object <----- type     |
|      ^      |             |
|     str     |             |
+-------------+-------------+

Eval

>>> from ast import literal_eval
>>> literal_eval('[1, 2, 3]')
[1, 2, 3]
>>> literal_eval('1 + 2')
ValueError: malformed node or string
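
literal_eval() accepts nested literals but rejects anything that would have to be executed. A minimal sketch with hypothetical input strings:

```python
from ast import literal_eval

parsed = literal_eval("{'a': [1, 2.5, None], 'b': (True, 'x')}")

try:
    literal_eval('__import__("os")')     # A function call is not a literal.
    rejected = False
except ValueError:
    rejected = True
```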

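Coroutines

  • Coroutines have a lot in common with threads, but unlike threads, they only give up control when they call another coroutine and they don't use as much memory.
  • Coroutine definition starts with 'async' and its call with 'await'.
  • 'asyncio.run(<coroutine>)' is the main entry point for asynchronous programs.
  • Functions wait(), gather() and as_completed() can be used when multiple coroutines need to be started at the same time.
  • Asyncio module also provides its own Queue, Event, Lock and Semaphore classes.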
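
A minimal sketch of starting two coroutines at once with gather(). The coroutine fetch() and its delays are hypothetical:

```python
import asyncio

async def fetch(id_, delay):
    await asyncio.sleep(delay)           # Gives up control while waiting.
    return id_

async def main_coroutine():
    # Both coroutines run concurrently. Results follow argument order,
    # not completion order.
    return await asyncio.gather(fetch('a', 0.02), fetch('b', 0.01))

results = asyncio.run(main_coroutine())
```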

Runs a terminal game where you control an asterisk that must avoid numbers:

import asyncio, collections, curses, enum, random

P = collections.namedtuple('P', 'x y')         # Position
D = enum.Enum('D', 'n e s w')                  # Direction

def main(screen):
    curses.curs_set(0)                         # Makes cursor invisible.
    screen.nodelay(True)                       # Makes getch() non-blocking.
    asyncio.run(main_coroutine(screen))        # Starts running asyncio code.

async def main_coroutine(screen):
    state = {'*': P(0, 0), **{id_: P(30, 10) for id_ in range(10)}}
    moves = asyncio.Queue()
    coros = (*(random_controller(id_, moves) for id_ in range(10)),
             human_controller(screen, moves),
             model(moves, state, *screen.getmaxyx()),
             view(state, screen))
    tasks = [asyncio.create_task(coro) for coro in coros]  # wait() rejects bare coroutines since 3.11.
    await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)

async def random_controller(id_, moves):
    while True:
        d = random.choice(list(D))
        moves.put_nowait((id_, d))
        await asyncio.sleep(random.random() / 2)

async def human_controller(screen, moves):
    while True:
        ch = screen.getch()
        key_mappings = {259: D.n, 261: D.e, 258: D.s, 260: D.w}
        if ch in key_mappings:
            moves.put_nowait(('*', key_mappings[ch]))
        await asyncio.sleep(0.01)  

async def model(moves, state, height, width):
    while state['*'] not in {p for id_, p in state.items() if id_ != '*'}:
        id_, d = await moves.get()
        p = state[id_]
        deltas = {D.n: P(0, -1), D.e: P(1, 0), D.s: P(0, 1), D.w: P(-1, 0)}
        new_p = P(p.x + deltas[d].x, p.y + deltas[d].y)
        if 0 <= new_p.x < width-1 and 0 <= new_p.y < height:
            state[id_] = new_p

async def view(state, screen):
    while True:
        screen.clear()
        for id_, p in state.items():
            screen.addstr(p.y, p.x, str(id_))
        await asyncio.sleep(0.01)  

if __name__ == '__main__':
    curses.wrapper(main)

Libraries

Progress Bar

# $ pip3 install tqdm
>>> from tqdm import tqdm
>>> from time import sleep
>>> for el in tqdm([1, 2, 3], desc='Processing'):
...     sleep(1)
Processing: 100%|████████████████████| 3/3 [00:03<00:00,  1.00s/it]

Plot

# $ pip3 install matplotlib
import matplotlib.pyplot as plt
plt.plot(<x_data>, <y_data> [, label=<str>])   # Or: plt.plot(<y_data>)
plt.legend()                                   # Adds a legend.
plt.savefig(<path>)                            # Saves the figure.
plt.show()                                     # Displays the figure.
plt.clf()                                      # Clears the figure.

Table

Prints a CSV file as an ASCII table:

# $ pip3 install tabulate
import csv, tabulate
with open('test.csv', encoding='utf-8', newline='') as file:
    rows = csv.reader(file)
    header = [a.title() for a in next(rows)]
    table = tabulate.tabulate(rows, header)
    print(table)

Curses

Runs a basic file explorer in the terminal:

from curses import wrapper, ascii, A_REVERSE, KEY_UP, KEY_DOWN, KEY_LEFT, KEY_RIGHT, KEY_ENTER
from os import listdir, path, chdir

def main(screen):
    ch, first, selected, paths = 0, 0, 0, listdir()
    while ch != ascii.ESC:
        height, _ = screen.getmaxyx()
        screen.clear()
        for y, a_path in enumerate(paths[first : first+height]):
            screen.addstr(y, 0, a_path, A_REVERSE * (selected == first + y))
        ch = screen.getch()
        selected += (ch == KEY_DOWN) - (ch == KEY_UP)
        selected = max(0, min(len(paths)-1, selected))
        first += (first <= selected - height) - (first > selected)
        if ch in [KEY_LEFT, KEY_RIGHT, KEY_ENTER, 10, 13]:
            new_dir = '..' if ch == KEY_LEFT else paths[selected]
            if path.isdir(new_dir):
                chdir(new_dir)
                first, selected, paths = 0, 0, listdir()

if __name__ == '__main__':
    wrapper(main)

Logging

# $ pip3 install loguru
from loguru import logger
logger.add('debug_{time}.log', colorize=True)  # Connects a log file.
logger.add('error_{time}.log', level='ERROR')  # Another file for errors or higher.
logger.<level>('A logging message.')
  • Levels: 'debug', 'info', 'success', 'warning', 'error', 'critical'.

Exceptions

Exception description, stack trace and values of variables are appended automatically.

try:
    ...
except <exception>:
    logger.exception('An error happened.')

Rotation

Argument that sets a condition when a new log file is created.

rotation=<int>|<datetime.timedelta>|<datetime.time>|<str>
  • '<int>' - Max file size in bytes.
  • '<timedelta>' - Max age of a file.
  • '<time>' - Time of day.
  • '<str>' - Any of above as a string: '100 MB', '1 month', 'monday at 12:00'.

Retention

Sets a condition which old log files get deleted.

retention=<int>|<datetime.timedelta>|<str>
  • '<int>' - Max number of files.
  • '<timedelta>' - Max age of a file.
  • '<str>' - Max age as a string: '1 week, 3 days', '2 months'.

Scraping

Scrapes Python's URL, version number and logo from its Wikipedia page:

# $ pip3 install requests beautifulsoup4
import requests, bs4, sys

WIKI_URL = 'https://en.wikipedia.org/wiki/Python_(programming_language)'
try:
    html       = requests.get(WIKI_URL).text
    document   = bs4.BeautifulSoup(html, 'html.parser')
    table      = document.find('table', class_='infobox vevent')
    python_url = table.find('th', text='Website').next_sibling.a['href']
    version    = table.find('th', text='Stable release').next_sibling.strings.__next__()
    logo_url   = table.find('img')['src']
    logo       = requests.get(f'https:{logo_url}').content
    with open('test.png', 'wb') as file:
        file.write(logo)
    print(python_url, version)
except requests.exceptions.ConnectionError:
    print("You've got problems with connection.", file=sys.stderr)

Web

# $ pip3 install bottle
from bottle import run, route, static_file, template, post, request, response
import json

run(host='localhost', port=8080)        # Runs locally.
run(host='0.0.0.0', port=80)            # Runs globally.

Static Request

@route('/img/<image>')
def send_image(image):
    return static_file(image, 'img_dir/', mimetype='image/png')

Dynamic Request

@route('/<sport>')
def send_page(sport):
    return template('<h1>{{title}}</h1>', title=sport)

REST Request

@post('/<sport>/odds')
def odds_handler(sport):
    team = request.forms.get('team')
    home_odds, away_odds = 2.44, 3.29
    response.headers['Content-Type'] = 'application/json'
    response.headers['Cache-Control'] = 'no-cache'
    return json.dumps([team, home_odds, away_odds])

Test:

# $ pip3 install requests
>>> import threading, requests
>>> threading.Thread(target=run, daemon=True).start()
>>> url = 'http://localhost:8080/football/odds'
>>> data = {'team': 'arsenal f.c.'}
>>> response = requests.post(url, data=data)
>>> response.json()
['arsenal f.c.', 2.44, 3.29]

Profiling

Stopwatch

from time import time
start_time = time()                     # Seconds since the Epoch.
...
duration = time() - start_time

High performance:

from time import perf_counter
start_time = perf_counter()             # Seconds since the restart.
...
duration = perf_counter() - start_time

Timing a Snippet

>>> from timeit import timeit
>>> timeit("''.join(str(i) for i in range(100))",
...        number=10000, globals=globals(), setup='pass')
0.34986
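
Besides a string, timeit() also accepts a callable, which makes it easy to compare two variants of the same snippet. The sketch below is an illustration only; the two function names are hypothetical and the measured times depend on the machine:

```python
from timeit import timeit

def join_generator():
    return ''.join(str(i) for i in range(100))

def join_map():
    return ''.join(map(str, range(100)))

# 'number' is how many times the callable gets invoked; the result is the
# total time in seconds, so divide by 'number' to get the time per call.
for func in (join_generator, join_map):
    total = timeit(func, number=10000)
    print(f'{func.__name__}: {total / 10000 * 1e6:.2f} µs per call')
```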

Profiling by Line

# $ pip3 install line_profiler memory_profiler
@profile
def main():
    a = [*range(10000)]
    b = {*range(10000)}
main()
$ kernprof -lv test.py
Line #   Hits     Time  Per Hit   % Time  Line Contents
=======================================================
     1                                    @profile
     2                                    def main():
     3      1    955.0    955.0     43.7      a = [*range(10000)]
     4      1   1231.0   1231.0     56.3      b = {*range(10000)}
$ python3 -m memory_profiler test.py
Line #         Mem usage      Increment   Line Contents
=======================================================
     1        37.668 MiB     37.668 MiB   @profile
     2                                    def main():
     3        38.012 MiB      0.344 MiB       a = [*range(10000)]
     4        38.477 MiB      0.465 MiB       b = {*range(10000)}

Call Graph

Generates a PNG image of a call graph with highlighted bottlenecks:

# $ pip3 install pycallgraph2
from pycallgraph2 import output, PyCallGraph
from datetime import datetime
filename = f'profile-{datetime.now():%Y%m%d%H%M%S}.png'
drawer = output.GraphvizOutput(output_file=filename)
with PyCallGraph(drawer):
    <code_to_be_profiled>

NumPy

Array manipulation mini-language. It can run up to one hundred times faster than the equivalent Python code. An even faster alternative that runs on a GPU is called CuPy.

# $ pip3 install numpy
import numpy as np

<array> = np.array(<list>)
<array> = np.arange(from_inclusive, to_exclusive, ±step_size)
<array> = np.ones(<shape>)
<array> = np.random.randint(from_inclusive, to_exclusive, <shape>)

<array>.shape = <shape>
<view>  = <array>.reshape(<shape>)
<view>  = np.broadcast_to(<array>, <shape>)

<array> = <array>.sum(axis)
indexes = <array>.argmin(axis)
  • Shape is a tuple of dimension sizes.
  • Axis is the index of the dimension that gets collapsed. The leftmost dimension has index 0.

Indexing

<el>       = <2d_array>[row_index, column_index]
<1d_view>  = <2d_array>[row_index]
<1d_view>  = <2d_array>[:, column_index]

<1d_array> = <2d_array>[row_indexes, column_indexes]
<2d_array> = <2d_array>[row_indexes]
<2d_array> = <2d_array>[:, column_indexes]

<2d_bools> = <2d_array> ><== <el>
<1d_array> = <2d_array>[<2d_bools>]

Broadcasting

Broadcasting is a set of rules by which NumPy functions operate on arrays of different sizes and/or dimensions.

left  = [[0.1], [0.6], [0.8]]        # Shape: (3, 1)
right = [ 0.1 ,  0.6 ,  0.8 ]        # Shape: (3)

1. If array shapes differ in length, left-pad the shorter shape with ones:

left  = [[0.1], [0.6], [0.8]]        # Shape: (3, 1)
right = [[0.1 ,  0.6 ,  0.8]]        # Shape: (1, 3) <- !

2. If any dimensions differ in size, expand the ones that have size 1 by duplicating their elements:

left  = [[0.1, 0.1, 0.1], [0.6, 0.6, 0.6], [0.8, 0.8, 0.8]]  # Shape: (3, 3) <- !
right = [[0.1, 0.6, 0.8], [0.1, 0.6, 0.8], [0.1, 0.6, 0.8]]  # Shape: (3, 3) <- !

3. If neither non-matching dimension has size 1, raise an error.
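
The three rules can be expressed as a small function that works on shape tuples alone. This is a plain-Python sketch for illustration, not NumPy's own implementation; the name broadcast_shape() is hypothetical:

```python
from itertools import zip_longest

def broadcast_shape(left, right):
    """Returns the shape that two array shapes broadcast to, or raises ValueError."""
    out = []
    # Rule 1: walking both shapes from the right with a fill value of 1
    # is equivalent to left-padding the shorter shape with ones.
    for a, b in zip_longest(reversed(left), reversed(right), fillvalue=1):
        if a == b or b == 1:
            out.append(a)            # Rule 2: a dimension of size 1 gets expanded.
        elif a == 1:
            out.append(b)
        else:                        # Rule 3: sizes differ and neither is 1.
            raise ValueError(f'Shapes {left} and {right} are not compatible.')
    return tuple(reversed(out))

print(broadcast_shape((3, 1), (3,)))     # (3, 3)
```

Shapes such as (2, 3) and (4,) fail rule 3, because the trailing dimensions 3 and 4 differ and neither of them is 1.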

Example

For each point, returns the index of its nearest point ([0.1, 0.6, 0.8] => [1, 2, 1]):

>>> points = np.array([0.1, 0.6, 0.8])
 [ 0.1,  0.6,  0.8]
>>> wrapped_points = points.reshape(3, 1)
[[ 0.1],
 [ 0.6],
 [ 0.8]]
>>> distances = wrapped_points - points
[[ 0. , -0.5, -0.7],
 [ 0.5,  0. , -0.2],
 [ 0.7,  0.2,  0. ]]
>>> distances = np.abs(distances)
[[ 0. ,  0.5,  0.7],
 [ 0.5,  0. ,  0.2],
 [ 0.7,  0.2,  0. ]]
>>> i = np.arange(3)
[0, 1, 2]
>>> distances[i, i] = np.inf
[[ inf,  0.5,  0.7],
 [ 0.5,  inf,  0.2],
 [ 0.7,  0.2,  inf]]
>>> distances.argmin(1)
[1, 2, 1]

Image

# $ pip3 install pillow
from PIL import Image

<Image> = Image.new('<mode>', (width, height))  # Also: `color=<int/tuple/str>`.
<Image> = Image.open(<path>)                    # Identifies format based on file contents.
<Image> = <Image>.convert('<mode>')             # Converts image to the new mode.
<Image>.save(<path>)                            # Selects format based on the path extension.
<Image>.show()                                  # Opens image in default preview app.

<int/tuple> = <Image>.getpixel((x, y))          # Returns a pixel.
<Image>.putpixel((x, y), <int/tuple>)           # Writes a pixel to the image.
<ImagingCore> = <Image>.getdata()               # Returns a sequence of pixels.
<Image>.putdata(<list/ImagingCore>)             # Writes a sequence of pixels.
<Image>.paste(<Image>, (x, y))                  # Writes an image to the image.

<2d_array> = np.array(<Image_L>)                # Creates NumPy array from greyscale image.
<3d_array> = np.array(<Image_RGB>)              # Creates NumPy array from color image.
<Image>    = Image.fromarray(<array>)           # Creates image from NumPy array of floats.

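Modes

  • '1' - 1-bit pixels, black and white, stored with one pixel per byte.
  • 'L' - 8-bit pixels, greyscale.
  • 'RGB' - 3x8-bit pixels, true color.
  • 'RGBA' - 4x8-bit pixels, true color with transparency mask.
  • 'HSV' - 3x8-bit pixels, Hue, Saturation, Value color space.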

Examples

Creates a PNG image of a rainbow gradient:

WIDTH, HEIGHT = 100, 100
size = WIDTH * HEIGHT
hues = (255 * i/size for i in range(size))
img = Image.new('HSV', (WIDTH, HEIGHT))
img.putdata([(int(h), 255, 255) for h in hues])
img.convert('RGB').save('test.png')

Adds noise to a PNG image:

from random import randint
add_noise = lambda value: max(0, min(255, value + randint(-20, 20)))
img = Image.open('test.png').convert('HSV')
img.putdata([(add_noise(h), s, v) for h, s, v in img.getdata()])
img.convert('RGB').save('test.png')

Image Draw

from PIL import ImageDraw
<ImageDraw> = ImageDraw.Draw(<Image>)
<ImageDraw>.point((x, y), fill=None)
<ImageDraw>.line((x1, y1, x2, y2 [, ...]), fill=None, width=0, joint=None) 
<ImageDraw>.arc((x1, y1, x2, y2), from_deg, to_deg, fill=None, width=0)
<ImageDraw>.rectangle((x1, y1, x2, y2), fill=None, outline=None, width=0)
<ImageDraw>.polygon((x1, y1, x2, y2 [, ...]), fill=None, outline=None)
<ImageDraw>.ellipse((x1, y1, x2, y2), fill=None, outline=None, width=0)
  • Use 'fill=<color>' to set the primary color.
  • Use 'outline=<color>' to set the secondary color.
  • Color can be specified as an int, a tuple, a '#rrggbb[aa]' string or a color name.

Animation

Creates a GIF of a bouncing ball:

# $ pip3 install imageio
from PIL import Image, ImageDraw
import imageio
WIDTH, R = 126, 10
frames = []
for velocity in range(1, 16):
    y = sum(range(velocity))
    frame = Image.new('L', (WIDTH, WIDTH))
    draw = ImageDraw.Draw(frame)
    draw.ellipse((WIDTH/2-R, y, WIDTH/2+R, y+R*2), fill='white')
    frames.append(frame)
frames += reversed(frames[1:-1])
imageio.mimsave('test.gif', frames, duration=0.03)

Audio

import wave

<Wave_read>  = wave.open('<path>', 'rb')        # Opens the WAV file.
framerate    = <Wave_read>.getframerate()       # Number of frames per second.
nchannels    = <Wave_read>.getnchannels()       # Number of samples per frame.
sampwidth    = <Wave_read>.getsampwidth()       # Sample size in bytes.
nframes      = <Wave_read>.getnframes()         # Number of frames.
<params>     = <Wave_read>.getparams()          # Immutable collection of above.
<bytes>      = <Wave_read>.readframes(nframes)  # Returns next 'nframes' frames.

<Wave_write> = wave.open('<path>', 'wb')        # Truncates existing file.
<Wave_write>.setframerate(<int>)                # 44100 for CD, 48000 for video.
<Wave_write>.setnchannels(<int>)                # 1 for mono, 2 for stereo.
<Wave_write>.setsampwidth(<int>)                # 2 for CD quality sound.
<Wave_write>.setparams(<params>)                # Sets all parameters.
<Wave_write>.writeframes(<bytes>)               # Appends frames to the file.
  • Bytes object contains a sequence of frames, each consisting of one or more samples.
  • In a stereo signal, the first sample of a frame belongs to the left channel.
  • Each sample consists of one or more bytes that, when converted to an integer, indicate the displacement of a speaker membrane at a given moment.
  • If sample width is one, then the integer should be encoded unsigned.
  • For all other sizes, the integer should be encoded signed with little-endian byte order.

Sample Values

+-----------+-------------+------+-------------+
| sampwidth |     min     | zero |     max     |
+-----------+-------------+------+-------------+
|     1     |           0 |  128 |         255 |
|     2     |      -32768 |    0 |       32767 |
|     3     |    -8388608 |    0 |     8388607 |
|     4     | -2147483648 |    0 |  2147483647 |
+-----------+-------------+------+-------------+
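
The values in the table follow directly from the encoding rules: one-byte samples are unsigned and centered at 128, wider samples are signed and centered at 0. A minimal sketch that derives them, using the hypothetical helper names sample_range() and encode_sample():

```python
def sample_range(sampwidth):
    """Returns (min, zero, max) integer values for a sample width given in bytes."""
    if sampwidth == 1:                       # One-byte samples are unsigned.
        return 0, 128, 255
    half = 2 ** (sampwidth * 8 - 1)          # Wider samples are signed.
    return -half, 0, half - 1

def encode_sample(an_int, sampwidth):
    """Encodes an integer sample as little-endian bytes."""
    return an_int.to_bytes(sampwidth, 'little', signed=sampwidth != 1)

for width in range(1, 5):
    print(width, sample_range(width))
print(encode_sample(-32768, 2))              # b'\x00\x80'
```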

Read Float Samples from WAV File

def read_wav_file(filename):
    def get_int(bytes_obj):
        an_int = int.from_bytes(bytes_obj, 'little', signed=sampwidth!=1)
        return an_int - 128 * (sampwidth == 1)
    with wave.open(filename, 'rb') as file:
        sampwidth = file.getsampwidth()
        frames = file.readframes(-1)
    bytes_samples = (frames[i : i+sampwidth] for i in range(0, len(frames), sampwidth))
    return [get_int(b) / pow(2, sampwidth * 8 - 1) for b in bytes_samples]

Write Float Samples to WAV File

def write_to_wav_file(filename, float_samples, nchannels=1, sampwidth=2, framerate=44100):
    def get_bytes(a_float):
        a_float = max(-1, min(1 - 2e-16, a_float))
        a_float += sampwidth == 1
        a_float *= pow(2, sampwidth * 8 - 1)
        return int(a_float).to_bytes(sampwidth, 'little', signed=sampwidth!=1) 
    with wave.open(filename, 'wb') as file:
        file.setnchannels(nchannels)
        file.setsampwidth(sampwidth)
        file.setframerate(framerate)
        file.writeframes(b''.join(get_bytes(f) for f in float_samples))

Examples

Saves a sine wave to a mono WAV file:

from math import pi, sin
samples_f = (sin(i * 2 * pi * 440 / 44100) for i in range(100000))
write_to_wav_file('test.wav', samples_f)

Adds noise to a mono WAV file:

from random import random
add_noise = lambda value: value + (random() - 0.5) * 0.03
samples_f = (add_noise(f) for f in read_wav_file('test.wav'))
write_to_wav_file('test.wav', samples_f)

Plays a WAV file:

# $ pip3 install simpleaudio
from simpleaudio import play_buffer
with wave.open('test.wav', 'rb') as file:
    p = file.getparams()
    frames = file.readframes(-1)
    play_buffer(frames, p.nchannels, p.sampwidth, p.framerate)

Text to Speech

# $ pip3 install pyttsx3
import pyttsx3
engine = pyttsx3.init()
engine.say('Sally sells seashells by the seashore.')
engine.runAndWait()

Synthesizer

Plays Popcorn by Gershon Kingsley:

# $ pip3 install simpleaudio
import math, struct, simpleaudio
from itertools import repeat, chain
F  = 44100
P1 = '71♩,69♪,,71♩,66♪,,62♩,66♪,,59♩,,,'
P2 = '71♩,73♪,,74♩,73♪,,74♪,,71♪,,73♩,71♪,,73♪,,69♪,,71♩,69♪,,71♪,,67♪,,71♩,,,'
get_pause   = lambda seconds: repeat(0, int(seconds * F))
sin_f = lambda i, hz: math.sin(i * 2 * math.pi * hz / F)
get_wave = lambda hz, seconds: (sin_f(i, hz) for i in range(int(seconds * F)))
get_hz = lambda key: 8.176 * 2 ** (int(key) / 12)
parse_note = lambda note: (get_hz(note[:2]), 1/4 if '♩' in note else 1/8)
get_samples = lambda note: get_wave(*parse_note(note)) if note else get_pause(1/8)
samples_f = chain.from_iterable(get_samples(n) for n in f'{P1}{P1}{P2}'.split(','))
samples_b = b''.join(struct.pack('<h', int(f * 30000)) for f in samples_f)
simpleaudio.play_buffer(samples_b, 1, 2, F)
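
The key numbers in the note strings appear to follow MIDI note numbering, where key 69 is the 440 Hz concert pitch and every 12 keys double the frequency. The sketch below rewrites the get_hz and parse_note lambdas as plain functions so that the mapping can be checked without any audio hardware:

```python
def get_hz(key):
    """Converts a key number to a frequency in hertz."""
    return 8.176 * 2 ** (int(key) / 12)      # Key 0 is about 8.176 Hz.

def parse_note(note):
    """Splits a note such as '71♩' into (frequency, duration in seconds)."""
    return get_hz(note[:2]), 1/4 if '♩' in note else 1/8

print(round(get_hz(69)))                     # 440
print(round(get_hz(81)))                     # 880, one octave higher.
print(parse_note('71♩')[1], parse_note('69♪')[1])   # 0.25 0.125
```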

Pygame

Basic Example

# $ pip3 install pygame
import pygame as pg
pg.init()
screen = pg.display.set_mode((500, 500))
rect = pg.Rect(240, 240, 20, 20)
while all(event.type != pg.QUIT for event in pg.event.get()):
    deltas = {pg.K_UP: (0, -1), pg.K_RIGHT: (1, 0), pg.K_DOWN: (0, 1), pg.K_LEFT: (-1, 0)}
    for key_code, is_pressed in enumerate(pg.key.get_pressed()):
        rect = rect.move(deltas[key_code]) if key_code in deltas and is_pressed else rect
    screen.fill((0, 0, 0))
    pg.draw.rect(screen, (255, 255, 255), rect)
    pg.display.flip()

Rectangle

Object for storing rectangular coordinates.

<Rect> = pg.Rect(x, y, width, height)           # Floats get truncated into ints.
<int>  = <Rect>.x/y/centerx/centery/…           # Top, right, bottom, left. Allows assignments.
<tup.> = <Rect>.topleft/center/…                # Topright, bottomright, bottomleft.
<Rect> = <Rect>.move((x, y))                    # Use move_ip() to move in place.

<bool> = <Rect>.collidepoint((x, y))            # Checks if rectangle contains a point.
<bool> = <Rect>.colliderect(<Rect>)             # Checks if two rectangles overlap.
<int>  = <Rect>.collidelist(<list_of_Rect>)     # Returns index of first colliding Rect or -1.
<list> = <Rect>.collidelistall(<list_of_Rect>)  # Returns indexes of all colliding Rects.

Surface

Object for representing images.

<Surf> = pg.display.set_mode((width, height))   # Returns display surface.
<Surf> = pg.Surface((width, height), …)         # New RGB surface. Add `pg.SRCALPHA` for RGBA.
<Surf> = pg.image.load('<path>')                # Loads the image. Format depends on source.
<Surf> = <Surf>.subsurface(<Rect>)              # Returns a subsurface.

<Surf>.fill(color)                              # Tuple, Color('#rrggbb[aa]') or Color(<name>).
<Surf>.set_at((x, y), color)                    # Updates pixel.
<Surf>.blit(<Surf>, (x, y))                     # Draws passed surface to the surface.

from pygame.transform import scale, ...
<Surf> = scale(<Surf>, (width, height))         # Returns scaled surface.
<Surf> = rotate(<Surf>, degrees)                # Returns rotated and scaled surface.
<Surf> = flip(<Surf>, x_bool, y_bool)           # Returns flipped surface.

from pygame.draw import line, ...
line(<Surf>, color, (x1, y1), (x2, y2), width)  # Draws a line to the surface.
arc(<Surf>, color, <Rect>, from_rad, to_rad)    # Also: ellipse(<Surf>, color, <Rect>)
rect(<Surf>, color, <Rect>)                     # Also: polygon(<Surf>, color, points)

Font

<Font> = pg.font.SysFont('<name>', size)        # Loads the system font or default if missing.
<Font> = pg.font.Font('<path>', size)           # Loads the TTF file. Pass None for default.
<Surf> = <Font>.render(text, antialias, color)  # Background color can be specified at the end.

Sound

<Sound> = pg.mixer.Sound('<path>')              # Loads the WAV file.
<Sound>.play()                                  # Starts playing the sound.

Basic Mario Brothers Example

import collections, dataclasses, enum, io, itertools as it, pygame as pg, urllib.request
from random import randint

P = collections.namedtuple('P', 'x y')          # Position
D = enum.Enum('D', 'n e s w')                   # Direction
SIZE, MAX_SPEED = 50, P(5, 10)                  # Screen size, Speed limit

def main():
    def get_screen():
        pg.init()
        return pg.display.set_mode(2 * [SIZE*16])
    def get_images():
        url = 'https://gto76.github.io/python-cheatsheet/web/mario_bros.png'
        img = pg.image.load(io.BytesIO(urllib.request.urlopen(url).read()))
        return [img.subsurface(get_rect(x, 0)) for x in range(img.get_width() // 16)]
    def get_mario():
        Mario = dataclasses.make_dataclass('Mario', 'rect spd facing_left frame_cycle'.split())
        return Mario(get_rect(1, 1), P(0, 0), False, it.cycle(range(3)))
    def get_tiles():
        positions = [p for p in it.product(range(SIZE), repeat=2) if {*p} & {0, SIZE-1}] + \
            [(randint(1, SIZE-2), randint(2, SIZE-2)) for _ in range(SIZE**2 // 10)]
        return [get_rect(*p) for p in positions]
    def get_rect(x, y):
        return pg.Rect(x*16, y*16, 16, 16)
    run(get_screen(), get_images(), get_mario(), get_tiles())

def run(screen, images, mario, tiles):
    clock = pg.time.Clock()
    while all(event.type != pg.QUIT for event in pg.event.get()):
        keys = {pg.K_UP: D.n, pg.K_RIGHT: D.e, pg.K_DOWN: D.s, pg.K_LEFT: D.w}
        pressed = {keys.get(i) for i, on in enumerate(pg.key.get_pressed()) if on}
        update_speed(mario, tiles, pressed)
        update_position(mario, tiles)
        draw(screen, images, mario, tiles, pressed)
        clock.tick(28)

def update_speed(mario, tiles, pressed):
    x, y = mario.spd
    x += 2 * ((D.e in pressed) - (D.w in pressed))
    x -= x // abs(x) if x else 0
    y += 1 if D.s not in get_boundaries(mario.rect, tiles) else (D.n in pressed) * -10
    mario.spd = P(*[max(-limit, min(limit, s)) for limit, s in zip(MAX_SPEED, P(x, y))])

def update_position(mario, tiles):
    x, y = mario.rect.topleft
    n_steps = max(abs(s) for s in mario.spd)
    for _ in range(n_steps):
        mario.spd = stop_on_collision(mario.spd, get_boundaries(mario.rect, tiles))
        x, y = x + mario.spd.x/n_steps, y + mario.spd.y/n_steps
        mario.rect.topleft = x, y

def get_boundaries(rect, tiles):
    deltas = {D.n: P(0, -1), D.e: P(1, 0), D.s: P(0, 1), D.w: P(-1, 0)}
    return {d for d, delta in deltas.items() if rect.move(delta).collidelist(tiles) != -1}

def stop_on_collision(spd, bounds):
    return P(x=0 if (D.w in bounds and spd.x < 0) or (D.e in bounds and spd.x > 0) else spd.x,
             y=0 if (D.n in bounds and spd.y < 0) or (D.s in bounds and spd.y > 0) else spd.y)

def draw(screen, images, mario, tiles, pressed):
    def get_frame_index():
        if D.s not in get_boundaries(mario.rect, tiles):
            return 4
        return next(mario.frame_cycle) if {D.w, D.e} & pressed else 6
    screen.fill((85, 168, 255))
    mario.facing_left = (D.w in pressed) if {D.w, D.e} & pressed else mario.facing_left
    screen.blit(images[get_frame_index() + mario.facing_left * 9], mario.rect)
    for rect in tiles:
        screen.blit(images[18 if {*rect.topleft} & {0, (SIZE-1)*16} else 19], rect)
    pg.display.flip()

if __name__ == '__main__':
    main()

Pandas

# $ pip3 install pandas
import pandas as pd
from pandas import Series, DataFrame

Series

Ordered dictionary with a name.

>>> Series([1, 2], index=['x', 'y'], name='a')
x    1
y    2
Name: a, dtype: int64

<Sr> = Series(<list>)                         # Assigns RangeIndex starting at 0.
<Sr> = Series(<dict>)                         # Takes dictionary's keys for index.
<Sr> = Series(<dict/Series>, index=<list>)    # Only keeps items with keys specified in index.

<el> = <Sr>.loc[key]                          # Or: <Sr>.iloc[index]
<Sr> = <Sr>.loc[keys]                         # Or: <Sr>.iloc[indexes]
<Sr> = <Sr>.loc[from_key : to_key_inclusive]  # Or: <Sr>.iloc[from_i : to_i_exclusive]

<el> = <Sr>[key/index]                        # Or: <Sr>.key
<Sr> = <Sr>[keys/indexes]                     # Or: <Sr>[<key_range/range>]
<Sr> = <Sr>[bools]                            # Or: <Sr>.i/loc[bools]

<Sr> = <Sr> ><== <el/Sr>                      # Returns a Series of bools.
<Sr> = <Sr> +-*/ <el/Sr>                      # Items with non-matching keys get value NaN.

<Sr> = <Sr>.append(<Sr>)                      # Or: pd.concat(<coll_of_Sr>)
<Sr> = <Sr>.combine_first(<Sr>)               # Adds items that are not yet present.
<Sr>.update(<Sr>)                             # Updates items that are already present.

Aggregate, Transform, Map:

<el> = <Sr>.sum/max/mean/idxmax/all()         # Or: <Sr>.aggregate(<agg_func>)
<Sr> = <Sr>.rank/diff/cumsum/ffill/interpl()  # Or: <Sr>.agg/transform(<trans_func>)
<Sr> = <Sr>.fillna(<el>)                      # Or: <Sr>.apply/agg/transform/map(<map_func>)
  • To find out whether the passed function accepts an element or the whole Series, 'aggregate()' and 'transform()' first pass it a single value. If that raises an error, they pass it the whole Series.

>>> sr = Series([1, 2], index=['x', 'y'])
x    1
y    2
+-------------+-------------+-------------+---------------+
|             |    'sum'    |   ['sum']   | {'s': 'sum'}  |
+-------------+-------------+-------------+---------------+
| sr.apply(…) |      3      |    sum  3   |     s  3      |
| sr.agg(…)   |             |             |               |
+-------------+-------------+-------------+---------------+
+-------------+-------------+-------------+---------------+
|             |    'rank'   |   ['rank']  | {'r': 'rank'} |
+-------------+-------------+-------------+---------------+
| sr.apply(…) |             |      rank   |               |
| sr.agg(…)   |     x  1    |   x     1   |    r  x  1    |
| sr.trans(…) |     y  2    |   y     2   |       y  2    |
+-------------+-------------+-------------+---------------+
  • Last result has a hierarchical index. Use '<Sr>[key_1, key_2]' to get its values.

DataFrame

Table with labeled rows and columns.

>>> DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
   x  y
a  1  2
b  3  4

<DF>    = DataFrame(<list_of_rows>)           # Rows can be either lists, dicts or series.
<DF>    = DataFrame(<dict_of_columns>)        # Columns can be either lists, dicts or series.

<el>    = <DF>.loc[row_key, column_key]       # Or: <DF>.iloc[row_index, column_index]
<Sr/DF> = <DF>.loc[row_key/s]                 # Or: <DF>.iloc[row_index/es]
<Sr/DF> = <DF>.loc[:, column_key/s]           # Or: <DF>.iloc[:, column_index/es]
<DF>    = <DF>.loc[row_bools, column_bools]   # Or: <DF>.iloc[row_bools, column_bools]

<Sr/DF> = <DF>[column_key/s]                  # Or: <DF>.column_key
<DF>    = <DF>[row_bools]                     # Keeps rows as specified by bools.
<DF>    = <DF>[<DF_of_bools>]                 # Assigns NaN to False values.

<DF>    = <DF> ><== <el/Sr/DF>                # Returns DF of bools. Sr is treated as a row.
<DF>    = <DF> +-*/ <el/Sr/DF>                # Items with non-matching keys get value NaN.

<DF>    = <DF>.set_index(column_key)          # Replaces row keys with values from a column.
<DF>    = <DF>.reset_index()                  # Moves row keys to a column named index.
<DF>    = <DF>.filter('<regex>', axis=1)      # Only keeps columns whose key matches the regex.
<DF>    = <DF>.melt(id_vars=column_key/s)     # Converts DataFrame from wide to long format.

Merge, Join, Concat:

>>> l = DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
   x  y
a  1  2
b  3  4
>>> r = DataFrame([[4, 5], [6, 7]], index=['b', 'c'], columns=['y', 'z'])
   y  z
b  4  5
c  6  7
+------------------------+---------------+------------+------------+--------------------------+
|                        |    'outer'    |   'inner'  |   'left'   |       Description        |
+------------------------+---------------+------------+------------+--------------------------+
| l.merge(r, on='y',     |    x   y   z  | x   y   z  | x   y   z  | Joins/merges on column.  |
|            how=…)      | 0  1   2   .  | 3   4   5  | 1   2   .  | Also accepts left_on and |
|                        | 1  3   4   5  |            | 3   4   5  | right_on parameters.     |
|                        | 2  .   6   7  |            |            | Uses 'inner' by default. |
+------------------------+---------------+------------+------------+--------------------------+
| l.join(r, lsuffix='l', |    x yl yr  z |            | x yl yr  z | Joins/merges on row keys.|
|           rsuffix='r', | a  1  2  .  . | x yl yr  z | 1  2  .  . | Uses 'left' by default.  |
|           how=…)       | b  3  4  4  5 | 3  4  4  5 | 3  4  4  5 | If r is a series, it is  |
|                        | c  .  .  6  7 |            |            | treated as a column.     |
+------------------------+---------------+------------+------------+--------------------------+
| pd.concat([l, r],      |    x   y   z  |     y      |            | Adds rows at the bottom. |
|           axis=0,      | a  1   2   .  |     2      |            | Uses 'outer' by default. |
|           join=…)      | b  3   4   .  |     4      |            | A series is treated as a |
|                        | b  .   4   5  |     4      |            | column. Use l.append(r)  |
|                        | c  .   6   7  |     6      |            | to add a row instead.    |
+------------------------+---------------+------------+------------+--------------------------+
| pd.concat([l, r],      |    x  y  y  z |            |            | Adds columns at the      |
|           axis=1,      | a  1  2  .  . | x  y  y  z |            | right end. Uses 'outer'  |
|           join=…)      | b  3  4  4  5 | 3  4  4  5 |            | by default. A series is  |
|                        | c  .  .  6  7 |            |            | treated as a column.     |
+------------------------+---------------+------------+------------+--------------------------+
| l.combine_first(r)     |    x   y   z  |            |            | Adds missing rows and    |
|                        | a  1   2   .  |            |            | columns. Also updates    |
|                        | b  3   4   5  |            |            | items that contain NaN.  |
|                        | c  .   6   7  |            |            | R must be a DataFrame.   |
+------------------------+---------------+------------+------------+--------------------------+

Aggregate, Transform, Map:

<Sr> = <DF>.sum/max/mean/idxmax/all()         # Or: <DF>.apply/agg/transform(<agg_func>)
<DF> = <DF>.rank/diff/cumsum/ffill/interpl()  # Or: <DF>.apply/agg/transform(<trans_func>)
<DF> = <DF>.fillna(<el>)                      # Or: <DF>.applymap(<map_func>)
  • All operations operate on columns by default. Use the 'axis=1' parameter to process the rows instead.

>>> df = DataFrame([[1, 2], [3, 4]], index=['a', 'b'], columns=['x', 'y'])
   x  y
a  1  2
b  3  4
+-------------+-------------+-------------+---------------+
|             |    'sum'    |   ['sum']   | {'x': 'sum'}  |
+-------------+-------------+-------------+---------------+
| df.apply(…) |             |       x  y  |               |
| df.agg(…)   |     x  4    |  sum  4  6  |     x  4      |
|             |     y  6    |             |               |
+-------------+-------------+-------------+---------------+
+-------------+-------------+-------------+---------------+
|             |    'rank'   |   ['rank']  | {'x': 'rank'} |
+-------------+-------------+-------------+---------------+
| df.apply(…) |      x  y   |      x    y |        x      |
| df.agg(…)   |   a  1  1   |   rank rank |     a  1      |
| df.trans(…) |   b  2  2   | a    1    1 |     b  2      |
|             |             | b    2    2 |               |
+-------------+-------------+-------------+---------------+
  • Use '<DF>[col_key_1, col_key_2][row_key]' to get the fifth result's values.

Encode, Decode:

<DF> = pd.read_json/html('<str/path/url>')
<DF> = pd.read_csv/pickle/excel('<path/url>')
<DF> = pd.read_sql('<table_name/query>', <connection>)
<DF> = pd.read_clipboard()
<dict> = <DF>.to_dict(['d/l/s/sp/r/i'])
<str> = <DF>.to_json/html/csv/markdown/latex([<path>])
<DF>.to_pickle/excel(<path>)
<DF>.to_sql('<table_name>', <connection>)

GroupBy

Object that groups together rows of a dataframe based on the value of the passed column.

>>> df = DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 6]], index=list('abc'), columns=list('xyz'))
>>> df.groupby('z').get_group(3)
   x  y
a  1  2
>>> df.groupby('z').get_group(6)
   x  y
b  4  5
c  7  8

<GB> = <DF>.groupby(column_key/s)             # DF is split into groups based on passed column.
<DF> = <GB>.get_group(group_key/s)            # Selects a group by value of grouping column.

Aggregate, Transform, Map:

<DF> = <GB>.sum/max/mean/idxmax/all()         # Or: <GB>.apply/agg(<agg_func>)
<DF> = <GB>.rank/diff/cumsum/ffill()          # Or: <GB>.aggregate(<trans_func>)
<DF> = <GB>.fillna(<el>)                      # Or: <GB>.transform(<map_func>)

>>> gb = df.groupby('z')
      x  y  z
3: a  1  2  3
6: b  4  5  6
   c  7  8  6
+-------------+-------------+-------------+-------------+---------------+
|             |    'sum'    |    'rank'   |   ['rank']  | {'x': 'rank'} |
+-------------+-------------+-------------+-------------+---------------+
| gb.agg(…)   |      x   y  |      x  y   |      x    y |        x      |
|             |  z          |   a  1  1   |   rank rank |     a  1      |
|             |  3   1   2  |   b  1  1   | a    1    1 |     b  1      |
|             |  6  11  13  |   c  2  2   | b    1    1 |     c  2      |
|             |             |             | c    2    2 |               |
+-------------+-------------+-------------+-------------+---------------+
| gb.trans(…) |      x   y  |      x  y   |             |               |
|             |  a   1   2  |   a  1  1   |             |               |
|             |  b  11  13  |   b  1  1   |             |               |
|             |  c  11  13  |   c  1  1   |             |               |
+-------------+-------------+-------------+-------------+---------------+

Rolling

Object for rolling window calculations.

<R_Sr/R_DF/R_GB> = <Sr/DF/GB>.rolling(window_size)  # Also: `min_periods=None, center=False`.
<R_Sr/R_DF>      = <R_DF/R_GB>[column_key/s]        # Or: <R>.column_key
<Sr/DF/DF>       = <R_Sr/R_DF/R_GB>.sum/max/mean()  # Or: <R>.apply/agg(<agg_func/str>)
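
To illustrate what a trailing rolling-window calculation computes, here is a plain-Python sketch of a rolling mean. It is not how Pandas implements it; the function name rolling_mean() is hypothetical, and None stands in for the NaN that Pandas returns for windows with too few values:

```python
from collections import deque

def rolling_mean(values, window_size, min_periods=None):
    """Returns the mean of the last 'window_size' values for every position."""
    min_periods = window_size if min_periods is None else min_periods
    window, out = deque(maxlen=window_size), []
    for value in values:
        window.append(value)                 # The oldest value drops out automatically.
        enough = len(window) >= min_periods
        out.append(sum(window) / len(window) if enough else None)
    return out

print(rolling_mean([1, 2, 3, 4], 2))                  # [None, 1.5, 2.5, 3.5]
print(rolling_mean([1, 2, 3, 4], 2, min_periods=1))   # [1.0, 1.5, 2.5, 3.5]
```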

Plotly

# $ pip3 install plotly kaleido
from plotly.express import line
<Figure> = line(<DF>, x=<col_name>, y=<col_name>)        # Or: line(x=<list>, y=<list>)
<Figure>.update_layout(margin=dict(t=0, r=0, b=0, l=0))  # Or: paper_bgcolor='rgba(0, 0, 0, 0)'
<Figure>.write_html/json/image('<path>')                 # Also: <Figure>.show()

Covid deaths by continent:

covid = pd.read_csv('https://covid.ourworldindata.org/data/owid-covid-data.csv', 
                    usecols=['iso_code', 'date', 'total_deaths', 'population'])
continents = pd.read_csv('https://datahub.io/JohnSnowLabs/country-and-continent-codes-' + \
                         'list/r/country-and-continent-codes-list-csv.csv',
                         usecols=['Three_Letter_Country_Code', 'Continent_Name'])
df = pd.merge(covid, continents, left_on='iso_code', right_on='Three_Letter_Country_Code')
df = df.groupby(['Continent_Name', 'date']).sum().reset_index()
df['Total Deaths per Million'] = df.total_deaths * 1e6 / df.population
df = df[('2020-03-14' < df.date) & (df.date < '2020-11-25')]
df = df.rename({'date': 'Date', 'Continent_Name': 'Continent'}, axis='columns')
line(df, x='Date', y='Total Deaths per Million', color='Continent').show()

Confirmed covid cases, Dow Jones, Gold, and Bitcoin price:

import pandas as pd
import plotly.graph_objects as go

def main():
    display_data(wrangle_data(*scrape_data()))

def scrape_data():
    def scrape_covid():
        url = 'https://covid.ourworldindata.org/data/owid-covid-data.csv'
        df = pd.read_csv(url, usecols=['location', 'date', 'total_cases'])
        return df[df.location == 'World'].set_index('date').total_cases
    def scrape_yahoo(slug):
        url = f'https://query1.finance.yahoo.com/v7/finance/download/{slug}' + \
              '?period1=1579651200&period2=1608850800&interval=1d&events=history'
        df = pd.read_csv(url, usecols=['Date', 'Close'])
        return df.set_index('Date').Close
    return scrape_covid(), scrape_yahoo('BTC-USD'), scrape_yahoo('GC=F'), scrape_yahoo('^DJI')

def wrangle_data(covid, bitcoin, gold, dow):
    df = pd.concat([bitcoin, gold, dow], axis=1)
    df = df.sort_index().interpolate()
    df = df.rolling(10, min_periods=1, center=True).mean()
    df = df.loc['2020-02-23':'2020-11-25']
    df = (df / df.iloc[0]) * 100
    return pd.concat([covid, df], axis=1, join='inner')

def display_data(df):
    df.columns = ['Total Cases', 'Bitcoin', 'Gold', 'Dow Jones']
    figure = go.Figure()
    for col_name in df:
        yaxis = 'y1' if col_name == 'Total Cases' else 'y2'
        trace = go.Scatter(x=df.index, y=df[col_name], name=col_name, yaxis=yaxis)
        figure.add_trace(trace)
    figure.update_layout(
        yaxis1=dict(title='Total Cases', rangemode='tozero'),
        yaxis2=dict(title='%', rangemode='tozero', overlaying='y', side='right'),
        legend=dict(x=1.1)
    ).show()

if __name__ == '__main__':
    main()

PySimpleGUI

# $ pip3 install PySimpleGUI
import PySimpleGUI as sg
layout = [[sg.Text("What's your name?")], [sg.Input()], [sg.Button('Ok')]]
window = sg.Window('Window Title', layout)
event, values = window.read()
print(f'Hello {values[0]}!' if event == 'Ok' else '')

Appendix

Cython

Library that compiles Python code into C (or, optionally, C++).

# $ pip3 install cython
import pyximport; pyximport.install()
import <cython_script>
<cython_script>.main()

Definitions:

  • All 'cdef' definitions are optional, but they contribute to the speed-up.
  • Script needs to be saved with a 'pyx' extension.

cdef <type> <var_name> = <el>
cdef <type>[n_elements] <var_name> = [<el_1>, <el_2>, ...]
cdef <type/void> <func_name>(<type> <arg_name_1>, ...):
cdef class <class_name>:
    cdef public <type> <attr_name>
    def __init__(self, <type> <arg_name>):
        self.<attr_name> = <arg_name>
cdef enum <enum_name>: <member_name_1>, <member_name_2>, ...

PyInstaller

$ pip3 install pyinstaller
$ pyinstaller script.py                        # Compiles into './dist/script' directory.
$ pyinstaller script.py --onefile              # Compiles into './dist/script' console app.
$ pyinstaller script.py --windowed             # Compiles into './dist/script' windowed app.
$ pyinstaller script.py --add-data '<path>:.' # Adds file to the root of the executable.
  • Paths of the added files need to be updated to 'os.path.join(sys._MEIPASS, <path>)'.
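
A common way to apply this rule is a small helper that resolves bundled files both inside a PyInstaller app and during normal development. The sketch below assumes that falling back to the current working directory is acceptable when the script is not bundled; the name resource_path() and the file name are hypothetical:

```python
import os, sys

def resource_path(relative_path):
    """Returns an absolute path to a file, whether or not the app is bundled."""
    # PyInstaller sets sys._MEIPASS inside a bundled app. Otherwise fall back
    # to the current working directory (an assumption of this sketch).
    base_dir = getattr(sys, '_MEIPASS', os.path.abspath('.'))
    return os.path.join(base_dir, relative_path)

print(resource_path('data.txt'))             # 'data.txt' is a hypothetical file name.
```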

Basic Script Template

#!/usr/bin/env python3
#
# Usage: .py
#

from sys import argv, exit
from collections import defaultdict, namedtuple
from dataclasses import make_dataclass
from enum import Enum
import functools as ft, itertools as it, operator as op, re


def main():
    pass


###
##  UTIL
#

def read_file(filename):
    with open(filename, encoding='utf-8') as file:
        return file.readlines()


if __name__ == '__main__':
    main()

Index

  • Only available in the PDF.
  • Ctrl+F / ⌘F is usually sufficient.
  • Searching for '#<title>' on the webpage will limit the search to the titles.