Tag Archives: Python

How can I plot separate Pandas DataFrames as subplots?

Question: How can I plot separate Pandas DataFrames as subplots?

I have a few Pandas DataFrames sharing the same value scale, but having different columns and indices. When invoking df.plot(), I get separate plot images. What I really want is to have them all in the same plot as subplots, but I'm unfortunately failing to come up with a solution and would highly appreciate some help.


Answer 0

You can manually create the subplots with matplotlib, and then plot the dataframes on a specific subplot using the ax keyword. For example for 4 subplots (2×2):

import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=2, ncols=2)

df1.plot(ax=axes[0,0])
df2.plot(ax=axes[0,1])
...

Here axes is an array which holds the different subplot axes, and you can access one just by indexing axes.
If you want a shared x-axis, then you can provide sharex=True to plt.subplots.
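
For instance, a minimal sketch of the same 2×2 grid with a shared x-axis (df1 and df2 stand for your existing DataFrames):

import matplotlib.pyplot as plt

fig, axes = plt.subplots(nrows=2, ncols=2, sharex=True)  # all subplots share one x-axis
df1.plot(ax=axes[0, 0])
df2.plot(ax=axes[0, 1])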


Answer 1

You can see examples in the documentation demonstrating joris' answer. Also from the documentation, you can set subplots=True and layout=(rows, columns) within the pandas plot function:

df.plot(subplots=True, layout=(1,2))

You could also use fig.add_subplot(), which takes subplot grid parameters such as 221, 222, 223, 224, etc., as described in the post here. Nice examples of plotting pandas DataFrames, including subplots, can be seen in this IPython notebook.
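
For illustration, a short sketch combining fig.add_subplot() with the ax keyword (df1 and df2 are assumed to be existing DataFrames):

import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(221)  # 2 rows, 2 columns, first cell
df1.plot(ax=ax1)
ax2 = fig.add_subplot(222)  # second cell
df2.plot(ax=ax2)
plt.show()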


Answer 2

You can use the familiar Matplotlib style calling a figure and subplot, but you simply need to specify the current axis using plt.gca(). An example:

plt.figure(1)
plt.subplot(2,2,1)
df.A.plot() #no need to specify for first axis
plt.subplot(2,2,2)
df.B.plot(ax=plt.gca())
plt.subplot(2,2,3)
df.C.plot(ax=plt.gca())

etc…


Answer 3

You can plot multiple pandas DataFrames as multiple subplots using matplotlib with a simple trick: make a list of all the DataFrames, then use a for loop to plot them on the subplots.

Working code:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# dataframe sample data
df1 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df2 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df3 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df4 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df5 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
df6 = pd.DataFrame(np.random.rand(10,2)*100, columns=['A', 'B'])
#define number of rows and columns for subplots
nrow=3
ncol=2
# make a list of all dataframes 
df_list = [df1 ,df2, df3, df4, df5, df6]
fig, axes = plt.subplots(nrow, ncol)
# plot counter
count=0
for r in range(nrow):
    for c in range(ncol):
        df_list[count].plot(ax=axes[r,c])
        count += 1  # move on to the next dataframe

Using this code you can plot subplots in any configuration. You just need to define the number of rows nrow and the number of columns ncol. You also need to make the list of data frames df_list that you want to plot.


Answer 4

You can use this:

fig = plt.figure()
ax = fig.add_subplot(221)
plt.plot(x,y)

ax = fig.add_subplot(222)
plt.plot(x,z)
...

plt.show()

Answer 5

You may not need to use Pandas at all. Here’s a matplotlib plot of cat frequencies:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2*np.pi, 400)
y = np.sin(x**2)

f, axes = plt.subplots(2, 1)
for ax in axes:          # iterate over the subplot axes directly
    ax.plot(x, y)
    ax.set_title('cats')
plt.tight_layout()

Answer 6

Building on @joris response above, if you have already established a reference to the subplot, you can use the reference as well. For example,

ax1 = plt.subplot2grid((50,100), (0, 0), colspan=20, rowspan=10)
...

df.plot.barh(ax=ax1, stacked=True)

Answer 7

How to create multiple plots from a dictionary of dataframes with long (tidy) data

  • Assumptions

    • There is a dictionary of multiple dataframes of tidy data
      • Created by reading in from files
      • Created by separating a single dataframe into multiple dataframes
    • The categories, cat, may be overlapping, but all dataframes may not contain all values of cat
    • hue='cat'
  • Because dataframes are being iterated through, there's no guarantee that colors will be mapped the same for each plot

    • A custom color map needs to be created from the unique 'cat' values for all the dataframes
    • Since the colors will be the same, place one legend to the side of the plots, instead of a legend in every plot

Imports and synthetic data

import pandas as pd
import numpy as np  # used for random data
import random  # used for random data
import matplotlib.pyplot as plt
from matplotlib.patches import Patch  # for custom legend
import seaborn as sns
import math  # math.ceil is used to determine the correct number of subplot rows


# synthetic data
df_dict = dict()
for i in range(1, 7):
    np.random.seed(i)
    random.seed(i)
    data_length = 100
    data = {'cat': [random.choice(['A', 'B', 'C']) for _ in range(data_length)],
            'x': np.random.rand(data_length),
            'y': np.random.rand(data_length)}
    df_dict[i] = pd.DataFrame(data)


# display(df_dict[1].head())

  cat         x         y
0   A  0.417022  0.326645
1   C  0.720324  0.527058
2   A  0.000114  0.885942
3   B  0.302333  0.357270
4   A  0.146756  0.908535

Create color mappings and plot

# create color mapping based on all unique values of cat
unique_cat = {cat for v in df_dict.values() for cat in v.cat.unique()}  # get unique cats
colors = sns.color_palette('husl', n_colors=len(unique_cat))  # get a number of colors
cmap = dict(zip(unique_cat, colors))  # zip values to colors

# iterate through dictionary and plot
col_nums = 3  # how many plots per row
row_nums = math.ceil(len(df_dict) / col_nums)  # how many rows of plots
plt.figure(figsize=(10, 5))  # change the figure size as needed
for i, (k, v) in enumerate(df_dict.items(), 1):
    plt.subplot(row_nums, col_nums, i)  # create subplots
    p = sns.scatterplot(data=v, x='x', y='y', hue='cat', palette=cmap)
    p.legend_.remove()  # remove the individual plot legends
    plt.title(f'DataFrame: {k}')

plt.tight_layout()
# create legend from cmap
patches = [Patch(color=v, label=k) for k, v in cmap.items()]
# place legend outside of plot; change the right bbox value to move the legend up or down
plt.legend(handles=patches, bbox_to_anchor=(1.06, 1.2), loc='center left', borderaxespad=0)
plt.show()


How can I run TensorFlow on the CPU?

Question: How can I run TensorFlow on the CPU?

I have installed the GPU version of tensorflow on an Ubuntu 14.04.

I am on a GPU server where tensorflow can access the available GPUs.

I want to run tensorflow on the CPUs.

Normally I can use env CUDA_VISIBLE_DEVICES=0 to run on GPU no. 0.

How can I pick between the CPUs instead?

I am not interested in rewriting my code with with tf.device("/cpu:0"):


Answer 0

You can apply device_count parameter per tf.Session:

config = tf.ConfigProto(
        device_count = {'GPU': 0}
    )
sess = tf.Session(config=config)

See also protobuf config file:

tensorflow/core/framework/config.proto


Answer 1

You can also set the environment variable to

CUDA_VISIBLE_DEVICES=""

without having to modify the source code.
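
For example, a sketch of launching a script this way from the shell (train.py is a placeholder name):

CUDA_VISIBLE_DEVICES="" python train.py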


Answer 2

If the above answers don’t work, try:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

Answer 3

For me, only setting CUDA_VISIBLE_DEVICES to precisely -1 works:

Works:

import os
import tensorflow as tf

os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

if tf.test.gpu_device_name():
    print('GPU found')
else:
    print("No GPU found")

# No GPU found

Does not work:

import os
import tensorflow as tf

os.environ['CUDA_VISIBLE_DEVICES'] = ''    

if tf.test.gpu_device_name():
    print('GPU found')
else:
    print("No GPU found")

# GPU found

Answer 4

Just use the code below.

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

Answer 5

On some systems one has to specify:

import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]=""  # or even "-1"

BEFORE importing tensorflow.
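
A minimal sketch of that ordering (the variables must be set before the tensorflow import for them to take effect):

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = ""  # hide all GPUs

import tensorflow as tf  # imported only after the environment is configured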


Answer 6

You could use tf.config.set_visible_devices. One possible function that allows you to set if and which GPUs to use is:

import tensorflow as tf

def set_gpu(gpu_ids_list):
    gpus = tf.config.list_physical_devices('GPU')
    if gpus:
        try:
            gpus_used = [gpus[i] for i in gpu_ids_list]
            tf.config.set_visible_devices(gpus_used, 'GPU')
            logical_gpus = tf.config.experimental.list_logical_devices('GPU')
            print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")
        except RuntimeError as e:
            # Visible devices must be set before GPUs have been initialized
            print(e)

Suppose you are on a system with 4 GPUs and you want to use only two GPUs, the one with id = 0 and the one with id = 2, then the first command of your code, immediately after importing the libraries, would be:

set_gpu([0, 2])

In your case, to use only the CPU, you can invoke the function with an empty list:

set_gpu([])

For completeness, if you want to avoid that the runtime initialization will allocate all memory on the device, you can use tf.config.experimental.set_memory_growth. Finally, the function to manage which devices to use, occupying the GPUs memory dynamically, becomes:

import tensorflow as tf

def set_gpu(gpu_ids_list):
    gpus = tf.config.list_physical_devices('GPU')
    if gpus:
        try:
            gpus_used = [gpus[i] for i in gpu_ids_list]
            tf.config.set_visible_devices(gpus_used, 'GPU')
            for gpu in gpus_used:
                tf.config.experimental.set_memory_growth(gpu, True)
            logical_gpus = tf.config.experimental.list_logical_devices('GPU')
            print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")
        except RuntimeError as e:
            # Visible devices must be set before GPUs have been initialized
            print(e)

Answer 7

Another possible solution on installation level would be to look for the CPU only variant: https://www.tensorflow.org/install/pip#package-location

In my case, this gives right now:

pip3 install https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow_cpu-2.2.0-cp38-cp38-win_amd64.whl

Just select the correct version. Bonus points for using a venv, as explained e.g. in this answer.


Django: How do I manage development and production settings?

Question: Django: How do I manage development and production settings?

I have been developing a basic app. Now, at the deployment stage, it has become clear that I need both local settings and production settings.

It would be great to know the following:

  • How best to deal with development and production settings.
  • How to keep apps such as django-debug-toolbar only in a development environment.
  • Any other tips and best practices for development and deployment settings.

Answer 0

The DJANGO_SETTINGS_MODULE environment variable controls which settings file Django will load.

You therefore create separate configuration files for your respective environments (note that they can of course both import * from a separate, “shared settings” file), and use DJANGO_SETTINGS_MODULE to control which one to use.

Here’s how:

As noted in the Django documentation:

The value of DJANGO_SETTINGS_MODULE should be in Python path syntax, e.g. mysite.settings. Note that the settings module should be on the Python import search path.

So, let’s assume you created myapp/production_settings.py and myapp/test_settings.py in your source repository.

In that case, you’d respectively set DJANGO_SETTINGS_MODULE=myapp.production_settings to use the former and DJANGO_SETTINGS_MODULE=myapp.test_settings to use the latter.
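
As an illustration of the shared-settings pattern mentioned above, myapp/production_settings.py might look roughly like this (the shared_settings module name and the concrete values are assumptions):

# myapp/production_settings.py
from myapp.shared_settings import *  # settings common to all environments

DEBUG = False
ALLOWED_HOSTS = ['www.example.com']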


From here on out, the problem boils down to setting the DJANGO_SETTINGS_MODULE environment variable.

Setting DJANGO_SETTINGS_MODULE using a script or a shell

You can then use a bootstrap script or a process manager to load the correct settings (by setting the environment), or just run it from your shell before starting Django: export DJANGO_SETTINGS_MODULE=myapp.production_settings.

Note that you can run this export at any time from a shell — it does not need to live in your .bashrc or anything.

Setting DJANGO_SETTINGS_MODULE using a Process Manager

If you’re not fond of writing a bootstrap script that sets the environment (and there are very good reasons to feel that way!), I would recommend using a process manager:


Finally, note that you can take advantage of the PYTHONPATH variable to store the settings in a completely different location (e.g. on a production server, storing them in /etc/). This allows for separating configuration from application files. You may or may not want that, it depends on how your app is structured.


Answer 1

By default use production settings, but create a file called settings_dev.py in the same folder as your settings.py file. Add overrides there, such as DEBUG=True.

On the computer that will be used for development, add this to your ~/.bashrc file:

export DJANGO_DEVELOPMENT=true

At the bottom of your settings.py file, add the following.

# Override production variables if DJANGO_DEVELOPMENT env variable is set
if os.environ.get('DJANGO_DEVELOPMENT'):
    from settings_dev import *  # or specific overrides

(Note that importing * should generally be avoided in Python)
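
For illustration, a minimal settings_dev.py could contain overrides such as these (the exact values are assumptions):

# settings_dev.py
DEBUG = True
ALLOWED_HOSTS = ['localhost', '127.0.0.1']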

By default the production servers will not override anything. Done!

Compared to the other answers, this one is simpler because it doesn’t require updating PYTHONPATH, or setting DJANGO_SETTINGS_MODULE which only allows you to work on one django project at a time.


Answer 2

I usually have one settings file per environment, and a shared settings file:

/myproject/
  settings.production.py
  settings.development.py
  shared_settings.py

Each of my environment files has:

try:
    from shared_settings import *
except ImportError:
    pass

This allows me to override shared settings if necessary (by adding the modifications below that stanza).

I then select which settings file to use by symlinking it to settings.py:

ln -s settings.development.py settings.py

Answer 3

This is how I do it in 6 easy steps:

  1. Create a folder inside your project directory and name it settings.

    Project structure:

    myproject/
           myapp1/
           myapp2/              
           myproject/
                  settings/
    
  2. Create four python files inside of the settings directory namely __init__.py, base.py, dev.py and prod.py

    Settings files:

    settings/
         __init__.py
         base.py
         prod.py
         dev.py 
    
  3. Open __init__.py and fill it with the following content:

    __init__.py:

    from .base import *
    # you need to set "myproject = 'prod'" as an environment variable
    # in your OS (on which your website is hosted)
    if os.environ['myproject'] == 'prod':
       from .prod import *
    else:
       from .dev import *
    
  4. Open base.py and fill it with all the common settings (that will be used in both production and development), for example:

    base.py:

    import os
    ...
    INSTALLED_APPS = [...]
    MIDDLEWARE = [...]
    TEMPLATES = [{...}]
    ...
    STATIC_URL = '/static/'
    STATIC_ROOT = os.path.join(BASE_DIR, 'staticfiles')
    MEDIA_ROOT = os.path.join(BASE_DIR, '/path/')
    MEDIA_URL = '/path/'
    
  5. Open dev.py and include that stuff which is development specific for example:

    dev.py:

    DEBUG = True
    ALLOWED_HOSTS = ['localhost']
    ...
    
  6. Open prod.py and include that stuff which is production specific for example:

    prod.py:

    DEBUG = False
    ALLOWED_HOSTS = ['www.example.com']
    LOGGING = [...]
    ...
    

Answer 4

Create multiple settings*.py files, extrapolating the variables that need to change per environment. Then at the end of your master settings.py file:

try:
  from settings_dev import *
except ImportError:
  pass

You keep the separate settings_* files for each stage.

At the top of your settings_dev.py file, add this:

import sys
globals().update(vars(sys.modules['settings']))

To import variables that you need to modify.

This wiki entry has more ideas on how to split your settings.


Answer 5

I use the awesome django-configurations, and all the settings are stored in my settings.py:

from configurations import Configuration

class Base(Configuration):
    # all the base settings here...
    BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ...

class Develop(Base):
    # development settings here...
    DEBUG = True 
    ...

class Production(Base):
    # production settings here...
    DEBUG = False

To configure the Django project I just followed the docs.
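
Selecting the configuration class is then done via environment variables, roughly like this (mysite is a placeholder project name; check the django-configurations docs for the exact wiring they require in manage.py and wsgi.py):

export DJANGO_SETTINGS_MODULE=mysite.settings
export DJANGO_CONFIGURATION=Production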


Answer 6

Here is the approach we use :

  • a settings module to split settings into multiple files for readability ;
  • a .env.json file to store credentials and parameters that we want excluded from our git repository, or that are environment specific ;
  • an env.py file to read the .env.json file

Considering the following structure :

...
.env.json           # the file containing all specific credentials and parameters
.gitignore          # the .gitignore file to exclude `.env.json`
project_name/       # project dir (the one which django-admin.py creates)
  accounts/         # project's apps
    __init__.py
    ...
  ...
  env.py            # the file to load credentials
  settings/
    __init__.py     # main settings file
    database.py     # database conf
    storage.py      # storage conf
    ...
venv                # virtualenv
...

With .env.json like :

{
    "debug": false,
    "allowed_hosts": ["mydomain.com"],
    "django_secret_key": "my_very_long_secret_key",
    "db_password": "my_db_password",
    "db_name": "my_db_name",
    "db_user": "my_db_user",
    "db_host": "my_db_host",
}

And project_name/env.py :

import json
import os


def get_credentials():
    env_file_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    with open(os.path.join(env_file_dir, '.env.json'), 'r') as f:
        creds = json.loads(f.read())
    return creds


credentials = get_credentials()

We can have the following settings:

# project_name/settings/__init__.py
from project_name.env import credentials
from project_name.settings.database import *
from project_name.settings.storage import *
...

SECRET_KEY = credentials.get('django_secret_key')

DEBUG = credentials.get('debug')

ALLOWED_HOSTS = credentials.get('allowed_hosts', [])

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',

    ...
]

if DEBUG:
    INSTALLED_APPS += ['debug_toolbar']

...

# project_name/settings/database.py
from project_name.env import credentials

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': credentials.get('db_name', ''),
        'USER': credentials.get('db_user', ''),
        'HOST': credentials.get('db_host', ''),
        'PASSWORD': credentials.get('db_password', ''),
        'PORT': '5432',
    }
}

The benefits of this solution are:

  • user-specific credentials and configurations for local development without modifying the git repository;
  • environment-specific configuration: you can have, for example, three different environments with three different .env.json files, like dev, staging and production;
  • credentials are not in the repository

I hope this helps, just let me know if you see any caveats with this solution.


Answer 7

I use the following file structure:

project/
   ...
   settings/
   settings/common.py
   settings/local.py
   settings/prod.py
   settings/__init__.py -> local.py

So __init__.py is a link (ln on Unix or mklink on Windows) to local.py, or it can point to prod.py. That way the configuration stays in the project.settings module, clean and organized, and if you want to use a particular config you can set the environment variable DJANGO_SETTINGS_MODULE to project.settings.prod when you need to run a command for the production environment.

In the files prod.py and local.py:

from .shared import *

DATABASE = {
    ...
}

and the shared.py file holds the global settings without any environment-specific configs.


Answer 8

Building off cs01's answer:

If you're having problems with the environment variable, set its value to a string (e.g. I did DJANGO_DEVELOPMENT="true").

I also changed cs01’s file workflow as follows:

#settings.py
import os
if os.environ.get('DJANGO_DEVELOPMENT') is not None:
    from settings_dev import * 
else:
    from settings_production import *
#settings_dev.py
development settings go here
#settings_production.py
production settings go here

This way, Django doesn’t have to read through the entirety of a settings file before running the appropriate settings file. This solution comes in handy if your production file needs stuff that’s only on your production server.

Note: in Python 3, the imported settings modules need to be prefixed with a . (e.g. from .settings_dev import *)


Answer 9

If you want to keep 1 settings file, and your development operating system is different than your production operating system, you can put this at the bottom of your settings.py:

from sys import platform
if platform == "linux" or platform == "linux2":
    # linux
    # some special setting here for when I'm on my prod server
elif platform == "darwin":
    # OS X
    # some special setting here for when I'm developing on my mac
elif platform == "win32":
    # Windows...
    # some special setting here for when I'm developing on my pc

Read more: How do I check the operating system in Python?


Answer 10

This seems to have been answered already; however, a method which I use, combined with version control, is the following:

Set up an env.py file in the same directory as settings.py on my local development environment, and also add it to .gitignore:

env.py:

#!usr/bin/python

DJANGO_ENV = True
ALLOWED_HOSTS = ['127.0.0.1', 'dev.mywebsite.com']

.gitignore:

mywebsite/env.py

settings.py:

if os.path.exists(os.getcwd() + '/env.py'):
    #env.py is excluded using the .gitignore file - when moving to production we can automatically set debug mode to off:
    from env import *
else:
    DJANGO_ENV = False

DEBUG = DJANGO_ENV

I just find this works and is far more elegant: with env.py it is easy to see our local environment variables, and we can handle all of this without multiple settings.py files or the like. This method allows all sorts of local environment variables to be used that we wouldn't want set on our production server. By utilising .gitignore via version control we also keep everything seamlessly integrated.


Answer 11

Use settings.py for production. In the same directory create settings_dev.py for overrides.

# settings_dev.py

from .settings import * 

DEBUG = True  # dev-only override of the production value

On a dev machine run your Django app with:

DJANGO_SETTINGS_MODULE=<your_app_name>.settings_dev python3 manage.py runserver

On a prod machine run as if you just had settings.py and nothing else.

ADVANTAGES

  1. settings.py (used for production) is completely agnostic to the fact that any other environments even exist.
  2. To see the difference between prod and dev you just look into a single location – settings_dev.py. No need to gather configurations scattered across settings_prod.py, settings_dev.py and settings_shared.py.
  3. If someone adds a setting to your prod config after troubleshooting a production issue you can rest assured that it will appear in your dev config as well (unless explicitly overridden). Thus the divergence between different config files will be minimized.

Answer 12

For the problem of settings files, I choose a copying approach:

Project
   |---__init__.py   [ write code to copy setting file from subdir to current dir]
   |---settings.py  (do not commit this file to git)
   |---setting1_dir
   |         |--  settings.py
   |---setting2_dir
   |         |--  settings.py

When you run Django, __init__.py will be run. At that point, settings.py in setting1_dir will replace settings.py in Project.
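
A minimal sketch of what that __init__.py copy step could look like (the directory names follow the layout above; picking setting1_dir as the default is a hypothetical choice):

# Project/__init__.py
import os
import shutil

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
ACTIVE = 'setting1_dir'  # switch to 'setting2_dir' (or read an env variable) as needed

# copy the selected settings file over the top-level settings.py
shutil.copyfile(
    os.path.join(BASE_DIR, ACTIVE, 'settings.py'),
    os.path.join(BASE_DIR, 'settings.py'),
)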

How to choose different env?

  • modify __init__.py directly.
  • make a bash file to modify __init__.py.
  • modify env in linux, and then let __init__.py read this variable.

Why use this approach?

Because I don't like having so many files in the same directory; too many files will confuse other partners, and it is not very IDE-friendly (the IDE cannot tell which settings file is actually in use).

If you do not want to see all these details, you can divide the project into two parts:

  1. Make a small tool, like Spring Initializr, just for setting up your project (it does things like copying the settings file).
  2. Your project code

Answer 13

I'm using different app.yaml files to change configuration between environments in Google Cloud App Engine.

You can use this command in your terminal to create a proxy connection:

./cloud_sql_proxy -instances=<INSTANCE_CONNECTION_NAME>=tcp:1433

https://cloud.google.com/sql/docs/sqlserver/connect-admin-proxy#macos-64-bit

File: app.yaml

# [START django_app]
service: development
runtime: python37

env_variables:
  DJANGO_DB_HOST: '/cloudsql/myproject:myregion:myinstance'
  DJANGO_DEBUG: True

handlers:
# This configures Google App Engine to serve the files in the app's static
# directory.
- url: /static
  static_dir: static/

# This handler routes all requests not caught above to your main app. It is
# required when static routes are defined, but can be omitted (along with
# the entire handlers section) when there are no static files defined.
- url: /.*
  script: auto
# [END django_app]
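
The Django settings can then read those values back from the environment, for example (a sketch; note that env_variables arrive as strings, so the debug flag needs an explicit comparison):

import os

DEBUG = os.environ.get('DJANGO_DEBUG', '') == 'True'
DB_HOST = os.environ.get('DJANGO_DB_HOST', '')  # e.g. used in the DATABASES setting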

Answer 14

This is my solution, with different environments for dev, test and prod:

import socket

[...]

DEV_PC = 'PC059'
host_name = socket.gethostname()

if host_name == DEV_PC:
   #do something
   pass
elif [...]

`elif` in list comprehension conditionals

Question: `elif` in list comprehension conditionals

Can we use elif in list comprehension?

Example :

l = [1, 2, 3, 4, 5]

for values in l:
    if values==1:
        print 'yes'
    elif values==2:
        print 'no'
    else:
        print 'idle'

Can we include the elif in our list comprehension, in a similar fashion to the code above?

For example, an answer like:

['yes', 'no', 'idle', 'idle', 'idle']

Up until now, I have only used if and else in list comprehension.


Answer 0

Python’s conditional expressions were designed exactly for this sort of use-case:

>>> l = [1, 2, 3, 4, 5]
>>> ['yes' if v == 1 else 'no' if v == 2 else 'idle' for v in l]
['yes', 'no', 'idle', 'idle', 'idle']

Answer 1

>>> d = {1: 'yes', 2: 'no'}
>>> [d.get(x, 'idle') for x in l]
['yes', 'no', 'idle', 'idle', 'idle']

Answer 2

You can, sort of.

Note that when you use syntax like:

['yes' if v == 1 else 'no' for v in l]

You are using the ternary form of the if/else operator (if you’re familiar with languages like C, this is like the ?: construct: (v == 1 ? 'yes' : 'no')).

The ternary form of the if/else operator doesn’t have an ‘elif’ built in, but you can simulate it in the ‘else’ condition:

['yes' if v == 1 else 'no' if v == 2 else 'idle' for v in l]

This is like saying:

for v in l:
    if v == 1 :
        print 'yes'
    else:
        if v == 2:
            print 'no'
        else:
            print 'idle'

So there’s no direct ‘elif’ construct like you asked about, but it can be simulated with nested if/else statements.


Answer 3

Maybe you want this (the boolean comparisons evaluate to 0 or 1, so 2*(n==1)+(n==2) selects an index into the ['idle','no','yes'] list):

l = [1, 2, 3, 4, 5] 

print ([['idle','no','yes'][2*(n==1)+(n==2)] for n in l])

Answer 4

You can use a list comprehension if you are going to create another list from the original.

>>> l = [1, 2, 3, 4, 5]
>>> result_map = {1: 'yes', 2: 'no'}
>>> [result_map[x] if x in result_map else 'idle' for x in l]
['yes', 'no', 'idle', 'idle', 'idle']

Answer 5

Another easy way is to use conditional list comprehension like this:

l=[1,2,3,4,5]
print [[["no","yes"][v==1],"idle"][v!=1 and v!=2] for v in l]

gives you the correct answer:

['yes', 'no', 'idle', 'idle', 'idle']


How can I make the x-axis and y-axis scales equal in Python matplotlib?

Question: How can I make the x-axis and y-axis scales equal in Python matplotlib?

I wish to draw lines on a square graph.

The scales of x-axis and y-axis should be the same.

e.g. x ranges from 0 to 10 and it is 10cm on the screen. y has to also range from 0 to 10 and has to be also 10 cm.

The square shape has to be maintained, even if I mess around with the window size.

Currently, my graph scales together with the window size.

How may I achieve this?

UPDATE:

I tried the following, but it did not work.

plt.xlim(-3, 3)
plt.ylim(-3, 3)
plt.axis('equal')

Answer 0

You need to dig a bit deeper into the api to do this:

from matplotlib import pyplot as plt
plt.plot(range(5))
plt.xlim(-3, 3)
plt.ylim(-3, 3)
plt.gca().set_aspect('equal', adjustable='box')
plt.draw()

doc for set_aspect


Answer 1

plt.axis('scaled')

works well for me.


Answer 2

Try something like:

import pylab as p
p.plot(x,y)
p.axis('equal')
p.show()

Answer 3

See the documentation on plt.axis(). This:

plt.axis('equal')

doesn’t work because it changes the limits of the axis to make circles appear circular. What you want is:

plt.axis('square')

This creates a square plot with equal axes.


How can I avoid reinstalling packages when building a Docker image for a Python project?

Question: How can I avoid reinstalling packages when building a Docker image for a Python project?

My Dockerfile is something like

FROM my/base

ADD . /srv
RUN pip install -r requirements.txt
RUN python setup.py install

ENTRYPOINT ["run_server"]

Every time I build a new image, dependencies have to be reinstalled, which could be very slow in my region.

One way I think of to cache packages that have been installed is to override the my/base image with newer images like this:

docker build -t new_image_1 .
docker tag new_image_1 my/base

So next time I build with this Dockerfile, my/base already has some packages installed.

But this solution has two problems:

  1. It is not always possible to override a base image
  2. The base image grow bigger and bigger as newer images are layered on it

So what better solution could I use to solve this problem?

EDIT##:

Some information about the docker on my machine:

☁  test  docker version
Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070
☁  test  docker info
Containers: 0
Images: 56
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Dirs: 56
Execution Driver: native-0.2
Kernel Version: 3.13.0-29-generic
WARNING: No swap limit support

Answer 0

Try to build a Dockerfile which looks something like this:

FROM my/base

WORKDIR /srv
ADD ./requirements.txt /srv/requirements.txt
RUN pip install -r requirements.txt
ADD . /srv
RUN python setup.py install

ENTRYPOINT ["run_server"]

Docker will use the cache during pip install as long as you do not make any changes to requirements.txt, irrespective of whether other code files at . were changed or not. Here's an example.


Here’s a simple Hello, World! program:

$ tree
.
├── Dockerfile
├── requirements.txt
└── run.py   

0 directories, 3 file

# Dockerfile

FROM dockerfile/python
WORKDIR /srv
ADD ./requirements.txt /srv/requirements.txt
RUN pip install -r requirements.txt
ADD . /srv
CMD python /srv/run.py

# requirements.txt
pytest==2.3.4

# run.py
print("Hello, World")

The output of docker build:

Step 1 : WORKDIR /srv
---> Running in 22d725d22e10
---> 55768a00fd94
Removing intermediate container 22d725d22e10
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> 968a7c3a4483
Removing intermediate container 5f4e01f290fd
Step 3 : RUN pip install -r requirements.txt
---> Running in 08188205e92b
Downloading/unpacking pytest==2.3.4 (from -r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/pytest/setup.py) egg_info for package pytest
....
Cleaning up...
---> bf5c154b87c9
Removing intermediate container 08188205e92b
Step 4 : ADD . /srv
---> 3002a3a67e72
Removing intermediate container 83defd1851d0
Step 5 : CMD python /srv/run.py
---> Running in 11e69b887341
---> 5c0e7e3726d6
Removing intermediate container 11e69b887341
Successfully built 5c0e7e3726d6

Let’s modify run.py:

# run.py
print("Hello, Python")

Try to build again, below is the output:

Sending build context to Docker daemon  5.12 kB
Sending build context to Docker daemon 
Step 0 : FROM dockerfile/python
---> f86d6993fc7b
Step 1 : WORKDIR /srv
---> Using cache
---> 55768a00fd94
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> Using cache
---> 968a7c3a4483
Step 3 : RUN pip install -r requirements.txt
---> Using cache
---> bf5c154b87c9
Step 4 : ADD . /srv
---> 9cc7508034d6
Removing intermediate container 0d7cf71eb05e
Step 5 : CMD python /srv/run.py
---> Running in f25c21135010
---> 4ffab7bc66c7
Removing intermediate container f25c21135010
Successfully built 4ffab7bc66c7

As you can see above, this time docker uses cache during the build. Now, let’s update requirements.txt:

# requirements.txt

pytest==2.3.4
ipython

Below is the output of docker build:

Sending build context to Docker daemon  5.12 kB
Sending build context to Docker daemon 
Step 0 : FROM dockerfile/python
---> f86d6993fc7b
Step 1 : WORKDIR /srv
---> Using cache
---> 55768a00fd94
Step 2 : ADD ./requirements.txt /srv/requirements.txt
---> b6c19f0643b5
Removing intermediate container a4d9cb37dff0
Step 3 : RUN pip install -r requirements.txt
---> Running in 4b7a85a64c33
Downloading/unpacking pytest==2.3.4 (from -r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/pytest/setup.py) egg_info for package pytest

Downloading/unpacking ipython (from -r requirements.txt (line 2))
Downloading/unpacking py>=1.4.12 (from pytest==2.3.4->-r requirements.txt (line 1))
  Running setup.py (path:/tmp/pip_build_root/py/setup.py) egg_info for package py

Installing collected packages: pytest, ipython, py
  Running setup.py install for pytest

Installing py.test script to /usr/local/bin
Installing py.test-2.7 script to /usr/local/bin
  Running setup.py install for py

Successfully installed pytest ipython py
Cleaning up...
---> 23a1af3df8ed
Removing intermediate container 4b7a85a64c33
Step 4 : ADD . /srv
---> d8ae270eca35
Removing intermediate container 7f003ebc3179
Step 5 : CMD python /srv/run.py
---> Running in 510359cf9e12
---> e42fc9121a77
Removing intermediate container 510359cf9e12
Successfully built e42fc9121a77

Notice how docker didn’t use cache during pip install. If it doesn’t work, check your docker version.

Client version: 1.1.2
Client API version: 1.13
Go version (client): go1.2.1
Git commit (client): d84a070
Server version: 1.1.2
Server API version: 1.13
Go version (server): go1.2.1
Git commit (server): d84a070

回答 1

为了尽量减少网络活动,您可以将 pip 指向主机上的一个缓存目录。

运行 docker 容器时,将主机的 pip 缓存目录绑定挂载到容器的 pip 缓存目录。docker run 命令应如下所示:

docker run -v $HOME/.cache/pip-docker/:/root/.cache/pip image_1

然后在您的 Dockerfile 中,将依赖的安装作为 ENTRYPOINT 语句(或 CMD 语句)的一部分,而不是作为 RUN 命令执行。这很重要,因为(如评论中所指出)该挂载在镜像构建期间(即执行 RUN 语句时)不可用。Dockerfile 应如下所示:

FROM my/base

ADD . /srv

ENTRYPOINT ["sh", "-c", "pip install -r requirements.txt && python setup.py install && run_server"]

To minimise the network activity, you could point pip to a cache directory on your host machine.

Run your docker container with your host’s pip cache directory bind mounted into your container’s pip cache directory. The docker run command should look like this:

docker run -v $HOME/.cache/pip-docker/:/root/.cache/pip image_1

Then in your Dockerfile, install your requirements as part of the ENTRYPOINT statement (or the CMD statement) instead of as a RUN command. This is important because (as pointed out in the comments) the mount is not available during image building (when RUN statements are executed). The Dockerfile should look like this:

FROM my/base

ADD . /srv

ENTRYPOINT ["sh", "-c", "pip install -r requirements.txt && python setup.py install && run_server"]

回答 2

我知道这个问题已经有一些流行的答案。但是,有一种更新的方式可以为程序包管理器缓存文件。我认为,当BuildKit变得更加标准时,这可能是一个很好的答案。

从 Docker 18.09 开始提供对 BuildKit 的实验性支持。BuildKit 为 Dockerfile 增加了一些新功能,包括将外部卷挂载到 RUN 步骤中的实验性支持。这使我们可以为诸如 $HOME/.cache/pip/ 之类的目录创建缓存。

我们将使用以下requirements.txt文件作为示例:

Click==7.0
Django==2.2.3
django-appconf==1.0.3
django-compressor==2.3
django-debug-toolbar==2.0
django-filter==2.2.0
django-reversion==3.0.4
django-rq==2.1.0
pytz==2019.1
rcssmin==1.0.6
redis==3.3.4
rjsmin==1.1.0
rq==1.1.0
six==1.12.0
sqlparse==0.3.0

一个典型的 Python 示例 Dockerfile 可能如下所示:

FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
COPY . /usr/src/app

使用 DOCKER_BUILDKIT 环境变量启用 BuildKit 后,我们可以在大约 65 秒内构建未缓存的 pip 步骤:

$ export DOCKER_BUILDKIT=1
$ docker build -t test .
[+] Building 65.6s (10/10) FINISHED                                                                                                                                             
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => [internal] load build context                                                                                                                                          0.6s
 => => transferring context: 899.99kB                                                                                                                                      0.6s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.5s
 => [3/4] RUN pip install -r requirements.txt                                                                                                                             61.3s
 => [4/4] COPY . /usr/src/app                                                                                                                                              1.3s
 => exporting to image                                                                                                                                                     1.2s
 => => exporting layers                                                                                                                                                    1.2s
 => => writing image sha256:d66a2720e81530029bf1c2cb98fb3aee0cffc2f4ea2aa2a0760a30fb718d7f83                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

现在,让我们添加实验性标头,并修改 RUN 步骤以缓存 Python 包:

# syntax=docker/dockerfile:experimental

FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
COPY . /usr/src/app

现在再进行一次构建。它应该花费差不多的时间,但这一次 Python 软件包会被缓存到新的缓存挂载中:

$ docker build -t pythontest .
[+] Building 60.3s (14/14) FINISHED                                                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => resolve image config for docker.io/docker/dockerfile:experimental                                                                                                      0.5s
 => CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3                                 0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => [internal] load build context                                                                                                                                          0.7s
 => => transferring context: 899.99kB                                                                                                                                      0.6s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.6s
 => [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt                                                                                  53.3s
 => [4/4] COPY . /usr/src/app                                                                                                                                              2.6s
 => exporting to image                                                                                                                                                     1.2s
 => => exporting layers                                                                                                                                                    1.2s
 => => writing image sha256:0b035548712c1c9e1c80d4a86169c5c1f9e94437e124ea09e90aea82f45c2afc                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

大约 60 秒,与我们的第一次构建差不多。

对 requirements.txt 做一点小改动(例如在两个软件包之间加一个空行)以强制缓存失效,然后再次运行:

$ docker build -t pythontest .
[+] Building 15.9s (14/14) FINISHED                                                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => resolve image config for docker.io/docker/dockerfile:experimental                                                                                                      1.1s
 => CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3                                 0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [internal] load build context                                                                                                                                          0.7s
 => => transferring context: 899.99kB                                                                                                                                      0.7s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.6s
 => [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt                                                                                   8.8s
 => [4/4] COPY . /usr/src/app                                                                                                                                              2.1s
 => exporting to image                                                                                                                                                     1.1s
 => => exporting layers                                                                                                                                                    1.1s
 => => writing image sha256:fc84cd45482a70e8de48bfd6489e5421532c2dd02aaa3e1e49a290a3dfb9df7c                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

仅约16秒!

之所以获得这样的加速,是因为不再需要重新下载所有 Python 软件包。它们由包管理器(这里是 pip)缓存,并保存在缓存卷挂载中。该卷挂载提供给 RUN 步骤,因此 pip 可以重用我们已经下载过的软件包。这发生在任何 Docker 层缓存之外。

requirements.txt 越大,收益应该越明显。

注意事项:

  • 这是实验性的 Dockerfile 语法,应当如此对待。目前您可能还不想用它来构建生产环境的镜像。
  • BuildKit 目前在 Docker Compose 或其他直接使用 Docker API 的工具下不起作用。不过从 1.25.0 开始,Docker Compose 已支持此功能。请参阅“如何使用 docker-compose 启用 BuildKit?”
  • 目前没有任何直接的接口可用于管理缓存。当您执行 docker system prune -a 时,缓存会被清除。

希望这些功能能够进入 Docker 的正式构建流程,并且 BuildKit 会成为默认设置。如果/当这发生时,我会尝试更新此答案。

I understand this question has some popular answers already. But there is a newer way to cache files for package managers. I think it could be a good answer in the future when BuildKit becomes more standard.

As of Docker 18.09 there is experimental support for BuildKit. BuildKit adds support for some new features in the Dockerfile including experimental support for mounting external volumes into RUN steps. This allows us to create caches for things like $HOME/.cache/pip/.

We’ll use the following requirements.txt file as an example:

Click==7.0
Django==2.2.3
django-appconf==1.0.3
django-compressor==2.3
django-debug-toolbar==2.0
django-filter==2.2.0
django-reversion==3.0.4
django-rq==2.1.0
pytz==2019.1
rcssmin==1.0.6
redis==3.3.4
rjsmin==1.1.0
rq==1.1.0
six==1.12.0
sqlparse==0.3.0

A typical example Python Dockerfile might look like:

FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN pip install -r requirements.txt
COPY . /usr/src/app

With BuildKit enabled using the DOCKER_BUILDKIT environment variable we can build the uncached pip step in about 65 seconds:

$ export DOCKER_BUILDKIT=1
$ docker build -t test .
[+] Building 65.6s (10/10) FINISHED                                                                                                                                             
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => [internal] load build context                                                                                                                                          0.6s
 => => transferring context: 899.99kB                                                                                                                                      0.6s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.5s
 => [3/4] RUN pip install -r requirements.txt                                                                                                                             61.3s
 => [4/4] COPY . /usr/src/app                                                                                                                                              1.3s
 => exporting to image                                                                                                                                                     1.2s
 => => exporting layers                                                                                                                                                    1.2s
 => => writing image sha256:d66a2720e81530029bf1c2cb98fb3aee0cffc2f4ea2aa2a0760a30fb718d7f83                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

Now, let us add the experimental header and modify the RUN step to cache the Python packages:

# syntax=docker/dockerfile:experimental

FROM python:3.7
WORKDIR /usr/src/app
COPY requirements.txt /usr/src/app/
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
COPY . /usr/src/app

Go ahead and do another build now. It should take the same amount of time. But this time it is caching the Python packages in our new cache mount:

$ docker build -t pythontest .
[+] Building 60.3s (14/14) FINISHED                                                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => resolve image config for docker.io/docker/dockerfile:experimental                                                                                                      0.5s
 => CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3                                 0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => [internal] load build context                                                                                                                                          0.7s
 => => transferring context: 899.99kB                                                                                                                                      0.6s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.6s
 => [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt                                                                                  53.3s
 => [4/4] COPY . /usr/src/app                                                                                                                                              2.6s
 => exporting to image                                                                                                                                                     1.2s
 => => exporting layers                                                                                                                                                    1.2s
 => => writing image sha256:0b035548712c1c9e1c80d4a86169c5c1f9e94437e124ea09e90aea82f45c2afc                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

About 60 seconds. Similar to our first build.

Make a small change to the requirements.txt (such as adding a new line between two packages) to force a cache invalidation and run again:

$ docker build -t pythontest .
[+] Building 15.9s (14/14) FINISHED                                                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                                            0.0s
 => resolve image config for docker.io/docker/dockerfile:experimental                                                                                                      1.1s
 => CACHED docker-image://docker.io/docker/dockerfile:experimental@sha256:9022e911101f01b2854c7a4b2c77f524b998891941da55208e71c0335e6e82c3                                 0.0s
 => [internal] load build definition from Dockerfile                                                                                                                       0.0s
 => => transferring dockerfile: 120B                                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                                          0.0s
 => [internal] load metadata for docker.io/library/python:3.7                                                                                                              0.5s
 => CACHED [1/4] FROM docker.io/library/python:3.7@sha256:6eaf19442c358afc24834a6b17a3728a45c129de7703d8583392a138ecbdb092                                                 0.0s
 => CACHED [internal] helper image for file operations                                                                                                                     0.0s
 => [internal] load build context                                                                                                                                          0.7s
 => => transferring context: 899.99kB                                                                                                                                      0.7s
 => [2/4] COPY requirements.txt /usr/src/app/                                                                                                                              0.6s
 => [3/4] RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt                                                                                   8.8s
 => [4/4] COPY . /usr/src/app                                                                                                                                              2.1s
 => exporting to image                                                                                                                                                     1.1s
 => => exporting layers                                                                                                                                                    1.1s
 => => writing image sha256:fc84cd45482a70e8de48bfd6489e5421532c2dd02aaa3e1e49a290a3dfb9df7c                                                                               0.0s
 => => naming to docker.io/library/test                                                                                                                                    0.0s

Only about 16 seconds!

We are getting this speedup because we are no longer downloading all the Python packages. They were cached by the package manager (pip in this case) and stored in a cache volume mount. The volume mount is provided to the run step so that pip can reuse our already downloaded packages. This happens outside any Docker layer caching.

The gains should be much better with a larger requirements.txt.

Notes:

  • This is experimental Dockerfile syntax and should be treated as such. You may not want to build with this in production at the moment.
  • The BuildKit stuff doesn’t work under Docker Compose or other tools that directly use the Docker API at the moment. There is now support for this in Docker Compose as of 1.25.0. See How do you enable BuildKit with docker-compose?
  • There isn’t any direct interface for managing the cache at the moment. It is purged when you do a docker system prune -a.

Hopefully, these features will make it into Docker for building and BuildKit will become the default. If / when that happens I will try to update this answer.


回答 3

我发现更好的方法是将Python site-packages目录添加为卷。

services:
    web:
        build: .
        command: python manage.py runserver 0.0.0.0:8000
        volumes:
            - .:/code
            -  /usr/local/lib/python2.7/site-packages/

这样,我只需用 pip install 安装新的库,而不必进行完全重建。

编辑:请忽略此答案,上面 jkukul 的答案对我有效。我原本的意图是缓存 site-packages 文件夹,那看起来应该更像这样:

volumes:
   - .:/code
   - ./cached-packages:/usr/local/lib/python2.7/site-packages/

不过,缓存下载文件夹要干净得多。它也会缓存 wheel 包,因此能正确地完成任务。

I found that a better way is to just add the Python site-packages directory as a volume.

services:
    web:
        build: .
        command: python manage.py runserver 0.0.0.0:8000
        volumes:
            - .:/code
            -  /usr/local/lib/python2.7/site-packages/

This way I can just pip install new libraries without having to do a full rebuild.

EDIT: Disregard this answer, jkukul’s answer above worked for me. My intent was to cache the site-packages folder. That would have looked more like:

volumes:
   - .:/code
   - ./cached-packages:/usr/local/lib/python2.7/site-packages/

Caching the download folder is a lot cleaner though. That also caches the wheels, so it properly achieves the task.


类型提示指定类型的列表

问题:类型提示指定类型的列表

使用 Python 3 的函数注解,是否可以指定同类列表(或其他集合)中所包含元素的类型,以便在 PyCharm 和其他 IDE 中进行类型提示?

一个int列表的伪python代码示例:

def my_func(l:list<int>):
    pass

我知道有可能使用Docstring …

def my_func(l):
    """
    :type l: list[int]
    """
    pass

……但如果可能的话,我更喜欢注解风格。

Using Python 3’s function annotations, is it possible to specify the type of items contained within a homogeneous list (or other collection) for the purpose of type hinting in PyCharm and other IDEs?

A pseudo-python code example for a list of int:

def my_func(l:list<int>):
    pass

I know it’s possible using Docstring…

def my_func(l):
    """
    :type l: list[int]
    """
    pass

… but I prefer the annotation style if it’s possible.


回答 0

回答我自己的问题;TLDR 的答案是:否 是(原先为“否”,现已更新为“是”)。

更新2

2015 年 9 月,Python 3.5 发布,带来了对 Type Hints 的支持,并包含一个新的 typing 模块。这允许指定集合中所包含元素的类型。截至 2015 年 11 月,JetBrains PyCharm 5.0 完全支持 Python 3.5,包括类型提示,如下文所示。

更新1

截至2015年5月,PEP0484(类型提示)已被正式接受。草案的实现也可以在github的ambv / typehinting下找到

原始答案

截至 2014 年 8 月,我已确认无法使用 Python 3 的类型注解在集合中指定元素类型(例如:字符串列表)。

诸如reStructuredText或Sphinx之类的格式化文档字符串的使用是可行的替代方法,并且受各种IDE支持。

看来 Guido 也在本着 mypy 的精神考虑扩展类型注解的想法:http://mail.python.org/pipermail/python-ideas/2014-August/028618.html

Answering my own question; the TLDR answer is No Yes (originally No, now Yes).

Update 2

In September 2015, Python 3.5 was released with support for Type Hints and includes a new typing module. This allows for the specification of types contained within collections. As of November 2015, JetBrains PyCharm 5.0 fully supports Python 3.5 to include Type Hints as illustrated below.

Update 1

As of May 2015, PEP0484 (Type Hints) has been formally accepted. The draft implementation is also available at github under ambv/typehinting.

Original Answer

As of Aug 2014, I have confirmed that it is not possible to use Python 3 type annotations to specify types within collections (ex: a list of strings).

The use of formatted docstrings such as reStructuredText or Sphinx are viable alternatives and supported by various IDEs.

It also appears that Guido is mulling over the idea of extending type annotations in the spirit of mypy: http://mail.python.org/pipermail/python-ideas/2014-August/028618.html


回答 1

现在 Python 3.5 已正式发布,其中包含支持 Type Hints 的模块 typing,以及用于泛型容器的相关“类型” List。

换句话说,现在您可以执行以下操作:

from typing import List

def my_func(l: List[int]):
    pass

Now that Python 3.5 is officially out, there is the Type Hints supporting module – typing and the relevant List “type” for the generic containers.

In other words, now you can do:

from typing import List

def my_func(l: List[int]):
    pass
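
As a side note, these annotations are only hints: Python itself does not enforce them at runtime, so a mismatched call still runs, and it is up to the IDE or a type checker to flag it. A minimal sketch:

from typing import List

def first_item(l: List[int]) -> int:
    return l[0]

print(first_item([1, 2, 3]))    # 1
print(first_item(["a", "b"]))   # 'a' -- runs fine at runtime; annotations are not enforced,
                                # but an IDE or type checker would flag this argument type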

回答 2

自 PEP 484 起已添加类型注释(type comments)

from . import Monitor
from typing import List, Set, Tuple, Dict


active_monitors = [] # type: List[Monitor]
# or
active_monitors: List[Monitor] = []

# bonus
active_monitors: Set[Monitor] = set()
monitor_pair: Tuple[Monitor, Monitor] = (Monitor(), Monitor())
monitor_dict: Dict[str, Monitor] = {'codename': Monitor()}

# nested
monitor_pair_list: List[Dict[str, Monitor]] = [{'codename': Monitor()}]

目前,这在使用 Python 3.6.4 的 PyCharm 上对我是可行的。

Pycharm中的示例图片

Type comments have been added since PEP 484

from . import Monitor
from typing import List, Set, Tuple, Dict


active_monitors = [] # type: List[Monitor]
# or
active_monitors: List[Monitor] = []

# bonus
active_monitors: Set[Monitor] = set()
monitor_pair: Tuple[Monitor, Monitor] = (Monitor(), Monitor())
monitor_dict: Dict[str, Monitor] = {'codename': Monitor()}

# nested
monitor_pair_list: List[Dict[str, Monitor]] = [{'codename': Monitor()}]

This is currently working for me on PyCharm with Python 3.6.4

Example Picture in Pycharm


回答 3

在BDFL的支持下,现在几乎可以确定python(可能是3.5)将通过函数注释为类型提示提供标准化的语法。

https://www.python.org/dev/peps/pep-0484/

正如PEP中所引用的那样,有一个名为mypy的实验型类型检查器(有点像pylint,但对于类型而言)已经使用此标准,并且不需要任何新语法。

http://mypy-lang.org/

With support from the BDFL, it’s almost certain now that python (probably 3.5) will provide a standardized syntax for type hints via function annotations.

https://www.python.org/dev/peps/pep-0484/

As referenced in the PEP, there is an experimental type-checker (kind of like pylint, but for types) called mypy that already uses this standard, and doesn’t require any new syntax.

http://mypy-lang.org/
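
As a rough illustration, an annotated function saved to a file (example.py below is just a placeholder name) can be checked with mypy example.py; mypy reports type mismatches without executing the code, and the exact error wording varies between versions:

# example.py -- hypothetical file, checked with:  mypy example.py
from typing import List

def my_func(l: List[int]) -> int:
    return sum(l)

my_func([1, 2, 3])     # accepted by mypy
my_func(["a", "b"])    # mypy reports an incompatible argument type here
                       # (roughly: List[str] where List[int] was expected)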


回答 4

从 Python 3.9 开始,内置类型在类型注解方面是泛型的(请参阅 PEP 585)。这允许直接指定元素的类型:

def my_func(l: list[int]):
    pass

各种工具可能早于 Python 3.9 就支持此语法。如果不在运行时检查注解,则可以通过引号或 __future__.annotations 使该语法有效:

# quoted
def my_func(l: 'list[int]'):
    pass
# postponed evaluation of annotation
from __future__ import annotations

def my_func(l: list[int]):
    pass

As of Python 3.9, builtin types are generic with respect to type annotations (see PEP 585). This allows to directly specify the type of elements:

def my_func(l: list[int]):
    pass

Various tools may support this syntax earlier than Python 3.9. When annotations are not inspected at runtime, the syntax is valid using quoting or __future__.annotations.

# quoted
def my_func(l: 'list[int]'):
    pass
# postponed evaluation of annotation
from __future__ import annotations

def my_func(l: list[int]):
    pass
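
A small follow-up sketch, assuming Python 3.7+: with postponed evaluation enabled, the annotation is simply stored as a string until something evaluates it, which is why older interpreters accept list[int] here:

from __future__ import annotations  # postponed evaluation of annotations (Python 3.7+)

def my_func(l: list[int]):
    pass

# The annotation is kept as a plain string, so this works even on interpreters
# where list[int] is not subscriptable at runtime.
print(my_func.__annotations__)   # {'l': 'list[int]'}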

如何使用Python / Django执行HTML解码/编码?

问题:如何使用Python / Django执行HTML解码/编码?

我有一个HTML编码的字符串:

'''&lt;img class=&quot;size-medium wp-image-113&quot;\
 style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot;\
 src=&quot;http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg&quot;\
 alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;'''

我想将其更改为:

<img class="size-medium wp-image-113" style="margin-left: 15px;" 
  title="su1" src="http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg" 
  alt="" width="300" height="194" /> 

我希望将其注册为HTML,以便浏览器将其呈现为图像,而不是显示为文本。

字符串之所以这样存储,是因为我正在使用一个名为 BeautifulSoup 的网络抓取工具,它会“扫描”网页并从中获取某些内容,然后以这种格式返回字符串。

我已经找到了在 C# 中执行此操作的方法,但没有找到 Python 的。有人可以帮帮我吗?

相关问题

I have a string that is HTML encoded:

'''&lt;img class=&quot;size-medium wp-image-113&quot;\
 style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot;\
 src=&quot;http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg&quot;\
 alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;'''

I want to change that to:

<img class="size-medium wp-image-113" style="margin-left: 15px;" 
  title="su1" src="http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg" 
  alt="" width="300" height="194" /> 

I want this to register as HTML so that it is rendered as an image by the browser instead of being displayed as text.

The string is stored like that because I am using a web-scraping tool called BeautifulSoup, it “scans” a web-page and gets certain content from it, then returns the string in that format.

I’ve found how to do this in C# but not in Python. Can someone help me out?

Related


回答 0

考虑到 Django 的用例,对此有两个答案。以下是它的 django.utils.html.escape 函数,以供参考:

def escape(html):
    """Returns the given HTML with ampersands, quotes and carets encoded."""
    return mark_safe(force_unicode(html).replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;').replace('"', '&quot;').replace("'", '&#39;'))

要反转这一操作,Jake 的答案中描述的 Cheetah 函数应该可行,但它缺少单引号。下面这个版本包含更新后的元组,并且替换顺序相反,以避免对称性问题:

def html_decode(s):
    """
    Returns the ASCII decoded version of the given HTML string. This does
    NOT remove normal HTML tags like <p>.
    """
    htmlCodes = (
            ("'", '&#39;'),
            ('"', '&quot;'),
            ('>', '&gt;'),
            ('<', '&lt;'),
            ('&', '&amp;')
        )
    for code in htmlCodes:
        s = s.replace(code[1], code[0])
    return s

unescaped = html_decode(my_string)

但是,这不是通用的解决方案,它仅适用于用 django.utils.html.escape 编码的字符串。更一般地说,坚持使用标准库是个好主意:

# Python 2.x:
import HTMLParser
html_parser = HTMLParser.HTMLParser()
unescaped = html_parser.unescape(my_string)

# Python 3.x:
import html.parser
html_parser = html.parser.HTMLParser()
unescaped = html_parser.unescape(my_string)

# >= Python 3.5:
from html import unescape
unescaped = unescape(my_string)

一点建议:将未转义的 HTML 存储在数据库中可能更有意义。如果可行,值得研究一下如何直接从 BeautifulSoup 获得未转义的结果,从而完全避免这一过程。

对于Django,转义仅在模板渲染期间发生;因此,为了防止转义,您只需告诉模板引擎不要转义您的字符串即可。为此,请在模板中使用以下选项之一:

{{ context_var|safe }}
{% autoescape off %}
    {{ context_var }}
{% endautoescape %}

Given the Django use case, there are two answers to this. Here is its django.utils.html.escape function, for reference:

def escape(html):
    """Returns the given HTML with ampersands, quotes and carets encoded."""
    return mark_safe(force_unicode(html).replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;').replace('"', '&quot;').replace("'", '&#39;'))

To reverse this, the Cheetah function described in Jake’s answer should work, but is missing the single-quote. This version includes an updated tuple, with the order of replacement reversed to avoid symmetric problems:

def html_decode(s):
    """
    Returns the ASCII decoded version of the given HTML string. This does
    NOT remove normal HTML tags like <p>.
    """
    htmlCodes = (
            ("'", '&#39;'),
            ('"', '&quot;'),
            ('>', '&gt;'),
            ('<', '&lt;'),
            ('&', '&amp;')
        )
    for code in htmlCodes:
        s = s.replace(code[1], code[0])
    return s

unescaped = html_decode(my_string)

This, however, is not a general solution; it is only appropriate for strings encoded with django.utils.html.escape. More generally, it is a good idea to stick with the standard library:

# Python 2.x:
import HTMLParser
html_parser = HTMLParser.HTMLParser()
unescaped = html_parser.unescape(my_string)

# Python 3.x:
import html.parser
html_parser = html.parser.HTMLParser()
unescaped = html_parser.unescape(my_string)

# >= Python 3.5:
from html import unescape
unescaped = unescape(my_string)

As a suggestion: it may make more sense to store the HTML unescaped in your database. It’d be worth looking into getting unescaped results back from BeautifulSoup if possible, and avoiding this process altogether.

With Django, escaping only occurs during template rendering; so to prevent escaping you just tell the templating engine not to escape your string. To do that, use one of these options in your template:

{{ context_var|safe }}
{% autoescape off %}
    {{ context_var }}
{% endautoescape %}

回答 1

使用标准库:

  • HTML转义

    try:
        from html import escape  # python 3.x
    except ImportError:
        from cgi import escape  # python 2.x
    
    print(escape("<"))
    
  • HTML 反转义

    try:
        from html import unescape  # python 3.4+
    except ImportError:
        try:
            from html.parser import HTMLParser  # python 3.x (<3.4)
        except ImportError:
            from HTMLParser import HTMLParser  # python 2.x
        unescape = HTMLParser().unescape
    
    print(unescape("&gt;"))
    

With the standard library:

  • HTML Escape

    try:
        from html import escape  # python 3.x
    except ImportError:
        from cgi import escape  # python 2.x
    
    print(escape("<"))
    
  • HTML Unescape

    try:
        from html import unescape  # python 3.4+
    except ImportError:
        try:
            from html.parser import HTMLParser  # python 3.x (<3.4)
        except ImportError:
            from HTMLParser import HTMLParser  # python 2.x
        unescape = HTMLParser().unescape
    
    print(unescape("&gt;"))
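
A quick round-trip sketch, assuming Python 3.4+ (where the names above resolve to html.escape and html.unescape):

from html import escape, unescape

original = '<img src="x.jpg" alt="a & b">'
encoded = escape(original)
print(encoded)                        # &lt;img src=&quot;x.jpg&quot; alt=&quot;a &amp; b&quot;&gt;
assert unescape(encoded) == original  # the two functions round-trip this input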
    

回答 2

对于html编码,标准库中有cgi.escape

>> help(cgi.escape)
cgi.escape = escape(s, quote=None)
    Replace special characters "&", "<" and ">" to HTML-safe sequences.
    If the optional flag quote is true, the quotation mark character (")
    is also translated.

对于html解码,我使用以下代码:

import re
from htmlentitydefs import name2codepoint
# for some reason, python 2.5.2 doesn't have this one (apostrophe)
name2codepoint['#39'] = 39

def unescape(s):
    "unescape HTML code refs; c.f. http://wiki.python.org/moin/EscapingHtml"
    return re.sub('&(%s);' % '|'.join(name2codepoint),
              lambda m: unichr(name2codepoint[m.group(1)]), s)

对于更复杂的事情,我使用BeautifulSoup。

For html encoding, there’s cgi.escape from the standard library:

>> help(cgi.escape)
cgi.escape = escape(s, quote=None)
    Replace special characters "&", "<" and ">" to HTML-safe sequences.
    If the optional flag quote is true, the quotation mark character (")
    is also translated.

For html decoding, I use the following:

import re
from htmlentitydefs import name2codepoint
# for some reason, python 2.5.2 doesn't have this one (apostrophe)
name2codepoint['#39'] = 39

def unescape(s):
    "unescape HTML code refs; c.f. http://wiki.python.org/moin/EscapingHtml"
    return re.sub('&(%s);' % '|'.join(name2codepoint),
              lambda m: unichr(name2codepoint[m.group(1)]), s)

For anything more complicated, I use BeautifulSoup.


回答 3

如果编码字符集受到相对限制,请使用daniel的解决方案。否则,请使用众多HTML解析库之一。

我喜欢BeautifulSoup,因为它可以处理格式错误的XML / HTML:

http://www.crummy.com/software/BeautifulSoup/

对于您的问题,他们的文档中有一个示例

from BeautifulSoup import BeautifulStoneSoup
BeautifulStoneSoup("Sacr&eacute; bl&#101;u!", 
                   convertEntities=BeautifulStoneSoup.HTML_ENTITIES).contents[0]
# u'Sacr\xe9 bleu!'

Use daniel’s solution if the set of encoded characters is relatively restricted. Otherwise, use one of the numerous HTML-parsing libraries.

I like BeautifulSoup because it can handle malformed XML/HTML :

http://www.crummy.com/software/BeautifulSoup/

for your question, there’s an example in their documentation

from BeautifulSoup import BeautifulStoneSoup
BeautifulStoneSoup("Sacr&eacute; bl&#101;u!", 
                   convertEntities=BeautifulStoneSoup.HTML_ENTITIES).contents[0]
# u'Sacr\xe9 bleu!'
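
As a side note, the snippet above uses the old BeautifulSoup 3 API; with the newer beautifulsoup4 package the entity conversion happens automatically, so a rough modern equivalent (a sketch, not the original answer’s code) would be:

# pip install beautifulsoup4
from bs4 import BeautifulSoup

soup = BeautifulSoup("Sacr&eacute; bl&#101;u!", "html.parser")
print(soup.get_text())   # Sacré bleu!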

回答 4

在Python 3.4+中:

import html

html.unescape(your_string)

In Python 3.4+:

import html

html.unescape(your_string)

回答 5

请参阅 Python Wiki 上此页面的底部,至少有两种可以“反转义” html 的方案。

See the bottom of this page at the Python wiki; there are at least 2 options to “unescape” html.


回答 6

丹尼尔的评论作为答案:

“转义仅发生在 Django 模板渲染期间。因此不需要反转义,您只需告诉模板引擎不要转义即可:使用 {{ context_var|safe }} 或 {% autoescape off %}{{ context_var }}{% endautoescape %}”

Daniel’s comment as an answer:

“escaping only occurs in Django during template rendering. Therefore, there’s no need for an unescape – you just tell the templating engine not to escape. either {{ context_var|safe }} or {% autoescape off %}{{ context_var }}{% endautoescape %}”


回答 7

我在以下位置找到了一个很好的函数:http://snippets.dzone.com/posts/show/4569

def decodeHtmlentities(string):
    import re
    entity_re = re.compile("&(#?)(\d{1,5}|\w{1,8});")

    def substitute_entity(match):
        from htmlentitydefs import name2codepoint as n2cp
        ent = match.group(2)
        if match.group(1) == "#":
            return unichr(int(ent))
        else:
            cp = n2cp.get(ent)

            if cp:
                return unichr(cp)
            else:
                return match.group()

    return entity_re.subn(substitute_entity, string)[0]

I found a fine function at: http://snippets.dzone.com/posts/show/4569

def decodeHtmlentities(string):
    import re
    entity_re = re.compile("&(#?)(\d{1,5}|\w{1,8});")

    def substitute_entity(match):
        from htmlentitydefs import name2codepoint as n2cp
        ent = match.group(2)
        if match.group(1) == "#":
            return unichr(int(ent))
        else:
            cp = n2cp.get(ent)

            if cp:
                return unichr(cp)
            else:
                return match.group()

    return entity_re.subn(substitute_entity, string)[0]

回答 8

如果有人在寻找通过django模板执行此操作的简单方法,则可以始终使用以下过滤器:

<html>
{{ node.description|safe }}
</html>

我有一些来自供应商的数据,我发布的所有内容在渲染后的页面上都把 html 标签原样显示出来,就像在查看源代码一样。上面的代码极大地帮助了我。希望这也能帮到其他人。

干杯!!

If anyone is looking for a simple way to do this via the django templates, you can always use filters like this:

<html>
{{ node.description|safe }}
</html>

I had some data coming from a vendor and everything I posted had html tags actually written on the rendered page as if you were looking at the source. The above code helped me greatly. Hope this helps others.

Cheers!!


回答 9

即使这是一个非常老的问题,这种方法也可能有效。

Django 1.5.5

In [1]: from django.utils.text import unescape_entities
In [2]: unescape_entities('&lt;img class=&quot;size-medium wp-image-113&quot; style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot; src=&quot;http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg&quot; alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;')
Out[2]: u'<img class="size-medium wp-image-113" style="margin-left: 15px;" title="su1" src="http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg" alt="" width="300" height="194" />'

Even though this is a really old question, this may work.

Django 1.5.5

In [1]: from django.utils.text import unescape_entities
In [2]: unescape_entities('&lt;img class=&quot;size-medium wp-image-113&quot; style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot; src=&quot;http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg&quot; alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;')
Out[2]: u'<img class="size-medium wp-image-113" style="margin-left: 15px;" title="su1" src="http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg" alt="" width="300" height="194" />'

回答 10

我在 Cheetah 的源代码中找到了这个(见这里)

htmlCodes = [
    ['&', '&amp;'],
    ['<', '&lt;'],
    ['>', '&gt;'],
    ['"', '&quot;'],
]
htmlCodesReversed = htmlCodes[:]
htmlCodesReversed.reverse()
def htmlDecode(s, codes=htmlCodesReversed):
    """ Returns the ASCII decoded version of the given HTML string. This does
        NOT remove normal HTML tags like <p>. It is the inverse of htmlEncode()."""
    for code in codes:
        s = s.replace(code[1], code[0])
    return s

不确定他们为什么要反转列表,我认为这与他们的编码方式有关,因此对你而言可能不需要反转。另外,如果我是你,我会把 htmlCodes 改成元组列表,而不是列表的列表……不过这只是会进我自己的库而已 :)

我也注意到您的标题还要求编码,所以这里是 Cheetah 的编码函数。

def htmlEncode(s, codes=htmlCodes):
    """ Returns the HTML encoded version of the given string. This is useful to
        display a plain ASCII text string on a web page."""
    for code in codes:
        s = s.replace(code[0], code[1])
    return s

I found this in the Cheetah source code (here)

htmlCodes = [
    ['&', '&amp;'],
    ['<', '&lt;'],
    ['>', '&gt;'],
    ['"', '&quot;'],
]
htmlCodesReversed = htmlCodes[:]
htmlCodesReversed.reverse()
def htmlDecode(s, codes=htmlCodesReversed):
    """ Returns the ASCII decoded version of the given HTML string. This does
        NOT remove normal HTML tags like <p>. It is the inverse of htmlEncode()."""
    for code in codes:
        s = s.replace(code[1], code[0])
    return s

not sure why they reverse the list, I think it has to do with the way they encode, so with you it may not need to be reversed. Also if I were you I would change htmlCodes to be a list of tuples rather than a list of lists… this is going in my library though :)

I noticed your title asked for encode too, so here is Cheetah’s encode function.

def htmlEncode(s, codes=htmlCodes):
    """ Returns the HTML encoded version of the given string. This is useful to
        display a plain ASCII text string on a web page."""
    for code in codes:
        s = s.replace(code[0], code[1])
    return s

回答 11

您也可以使用django.utils.html.escape

from django.utils.html import escape

something_nice = escape(request.POST['something_naughty'])

You can also use django.utils.html.escape

from django.utils.html import escape

something_nice = escape(request.POST['something_naughty'])

回答 12

以下是一个使用 htmlentitydefs 模块的 python 函数。它并不完美:我手头的 htmlentitydefs 版本不完整,而且它假设所有实体都解码为单个码点,这对于像 &NotEqualTilde; 这样的实体是错误的:

http://www.w3.org/TR/html5/named-character-references.html

NotEqualTilde;     U+02242 U+00338    ≂̸

尽管有这些警告,但这里是代码。

def decodeHtmlText(html):
    """
    Given a string of HTML that would parse to a single text node,
    return the text value of that node.
    """
    # Fast path for common case.
    if html.find("&") < 0: return html
    return re.sub(
        '&(?:#(?:x([0-9A-Fa-f]+)|([0-9]+))|([a-zA-Z0-9]+));',
        _decode_html_entity,
        html)

def _decode_html_entity(match):
    """
    Regex replacer that expects hex digits in group 1, or
    decimal digits in group 2, or a named entity in group 3.
    """
    hex_digits = match.group(1)  # '&#10;' -> unichr(10)
    if hex_digits: return unichr(int(hex_digits, 16))
    decimal_digits = match.group(2)  # '&#x10;' -> unichr(0x10)
    if decimal_digits: return unichr(int(decimal_digits, 10))
    name = match.group(3)  # name is 'lt' when '&lt;' was matched.
    if name:
        decoding = (htmlentitydefs.name2codepoint.get(name)
            # Treat &GT; like &gt;.
            # This is wrong for &Gt; and &Lt; which HTML5 adopted from MathML.
            # If htmlentitydefs included mappings for those entities,
            # then this code will magically work.
            or htmlentitydefs.name2codepoint.get(name.lower()))
        if decoding is not None: return unichr(decoding)
    return match.group(0)  # Treat "&noSuchEntity;" as "&noSuchEntity;"

Below is a python function that uses module htmlentitydefs. It is not perfect. The version of htmlentitydefs that I have is incomplete and it assumes that all entities decode to one codepoint which is wrong for entities like &NotEqualTilde;:

http://www.w3.org/TR/html5/named-character-references.html

NotEqualTilde;     U+02242 U+00338    ≂̸

With those caveats though, here’s the code.

def decodeHtmlText(html):
    """
    Given a string of HTML that would parse to a single text node,
    return the text value of that node.
    """
    # Fast path for common case.
    if html.find("&") < 0: return html
    return re.sub(
        '&(?:#(?:x([0-9A-Fa-f]+)|([0-9]+))|([a-zA-Z0-9]+));',
        _decode_html_entity,
        html)

def _decode_html_entity(match):
    """
    Regex replacer that expects hex digits in group 1, or
    decimal digits in group 2, or a named entity in group 3.
    """
    hex_digits = match.group(1)  # '&#10;' -> unichr(10)
    if hex_digits: return unichr(int(hex_digits, 16))
    decimal_digits = match.group(2)  # '&#x10;' -> unichr(0x10)
    if decimal_digits: return unichr(int(decimal_digits, 10))
    name = match.group(3)  # name is 'lt' when '&lt;' was matched.
    if name:
        decoding = (htmlentitydefs.name2codepoint.get(name)
            # Treat &GT; like &gt;.
            # This is wrong for &Gt; and &Lt; which HTML5 adopted from MathML.
            # If htmlentitydefs included mappings for those entities,
            # then this code will magically work.
            or htmlentitydefs.name2codepoint.get(name.lower()))
        if decoding is not None: return unichr(decoding)
    return match.group(0)  # Treat "&noSuchEntity;" as "&noSuchEntity;"

回答 13

这是解决此问题的最简单方法-

{% autoescape on %}
   {{ body }}
{% endautoescape %}

从此页面

This is the easiest solution for this problem –

{% autoescape on %}
   {{ body }}
{% endautoescape %}

From this page.


回答 14

在为这个问题寻找 Django 和 Python 中最简单的解决方案时,我发现可以使用它们内置的函数来转义/反转义 html 代码。

示例:我将您的 html 代码保存在 scraped_html 和 clean_html 中:

scraped_html = (
    '&lt;img class=&quot;size-medium wp-image-113&quot; '
    'style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot; '
    'src=&quot;http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg&quot; '
    'alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;'
)
clean_html = (
    '<img class="size-medium wp-image-113" style="margin-left: 15px;" '
    'title="su1" src="http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg" '
    'alt="" width="300" height="194" />'
)

Django

您需要 Django >= 1.0

反转义(unescape)

要对抓取到的 HTML 代码进行反转义,可以使用 django.utils.text.unescape_entities,它会:

将所有命名和数字字符引用转换为相应的unicode字符。

>>> from django.utils.text import unescape_entities
>>> clean_html == unescape_entities(scraped_html)
True

转义(escape)

要转义干净的 html 代码,可以使用 django.utils.html.escape,它会:

返回给定的文本,其中的与号、引号和尖括号均已编码,以便在 HTML 中使用。

>>> from django.utils.html import escape
>>> scraped_html == escape(clean_html)
True

Python

您需要 Python >= 3.4

反转义(unescape)

要对抓取到的 html 代码进行反转义,可以使用 html.unescape,它会:

将字符串 s 中所有命名和数字字符引用(例如 &gt;、&#62;、&x3e;)转换为对应的 Unicode 字符。

>>> from html import unescape
>>> clean_html == unescape(scraped_html)
True

转义(escape)

要转义干净的 html 代码,可以使用 html.escape,它会:

将字符串 s 中的字符 &、< 和 > 转换为 HTML 安全序列。

>>> from html import escape
>>> scraped_html == escape(clean_html)
True

Searching for the simplest solution to this question in Django and Python, I found you can use their builtin functions to escape/unescape html code.

Example

I saved your html code in scraped_html and clean_html:

scraped_html = (
    '&lt;img class=&quot;size-medium wp-image-113&quot; '
    'style=&quot;margin-left: 15px;&quot; title=&quot;su1&quot; '
    'src=&quot;http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg&quot; '
    'alt=&quot;&quot; width=&quot;300&quot; height=&quot;194&quot; /&gt;'
)
clean_html = (
    '<img class="size-medium wp-image-113" style="margin-left: 15px;" '
    'title="su1" src="http://blah.org/wp-content/uploads/2008/10/su1-300x194.jpg" '
    'alt="" width="300" height="194" />'
)

Django

You need Django >= 1.0

unescape

To unescape your scraped html code you can use django.utils.text.unescape_entities which:

Convert all named and numeric character references to the corresponding unicode characters.

>>> from django.utils.text import unescape_entities
>>> clean_html == unescape_entities(scraped_html)
True

escape

To escape your clean html code you can use django.utils.html.escape which:

Returns the given text with ampersands, quotes and angle brackets encoded for use in HTML.

>>> from django.utils.html import escape
>>> scraped_html == escape(clean_html)
True

Python

You need Python >= 3.4

unescape

To unescape your scraped html code you can use html.unescape which:

Convert all named and numeric character references (e.g. &gt;, &#62;, &x3e;) in the string s to the corresponding unicode characters.

>>> from html import unescape
>>> clean_html == unescape(scraped_html)
True

escape

To escape your clean html code you can use html.escape which:

Convert the characters &, < and > in string s to HTML-safe sequences.

>>> from html import escape
>>> scraped_html == escape(clean_html)
True

为什么在Python中“ if not someobj:”优于“ if someobj == None:”?

问题:为什么在Python中“ if not someobj:”优于“ if someobj == None:”?

我看过几个这样的代码示例:

if not someobj:
    #do something

但我想知道为什么不这样做:

if someobj == None:
    #do something

有什么区别吗?其中一种写法比另一种有优势吗?

I’ve seen several examples of code like this:

if not someobj:
    #do something

But I’m wondering why not do this instead:

if someobj == None:
    #do something

Is there any difference? Does one have an advantage over the other?


回答 0

在第一个测试中,Python 会尝试将对象转换为 bool 值(如果它还不是的话)。粗略地说,我们是在问这个对象:你有意义吗?这通过以下算法完成:

  1. 如果对象具有 __nonzero__ 特殊方法(像内置数字类型 int 和 float 那样),则调用此方法。它必须返回一个 bool 值(随后直接使用),或者返回一个 int 值(等于零时视为 False)。

  2. 否则,如果对象具有 __len__ 特殊方法(像内置容器类型 list、dict、set、tuple 等那样),则调用这个方法;如果容器为空(长度为零),则视为 False。

  3. 否则,该对象被视为 True,除非它是 None,在这种情况下被视为 False。

在第二个测试中,将对象与 None 进行相等性比较。在这里,我们是在问对象:“你等于这个另外的值吗?”这通过以下算法完成:

  1. 如果对象具有 __eq__ 方法,则调用它,然后将返回值转换为 bool 值,并用于确定 if 的结果。

  2. 否则,如果对象具有 __cmp__ 方法,则调用它。此函数必须返回一个指示两个对象顺序的 int(self < other 时为 -1,self == other 时为 0,self > other 时为 +1)。

  3. 否则,比较对象的同一性(即它们是否引用同一个对象,可以用 is 运算符测试)。

还可以使用 is 运算符进行另一种测试。这相当于在问对象:“你就是这个特定的对象吗?”

通常,我建议对非数字值使用第一个测试,如果要比较具有相同性质(两个字符串,两个数字,…)的对象,并且仅在以下情况下检查身份,请使用该测试是否相等使用哨兵值(例如,None未针对成员字段初始化,或使用getattr__getitem__方法时未初始化)。

总而言之,我们有:

>>> class A(object):
...    def __repr__(self):
...        return 'A()'
...    def __nonzero__(self):
...        return False

>>> class B(object):
...    def __repr__(self):
...        return 'B()'
...    def __len__(self):
...        return 0

>>> class C(object):
...    def __repr__(self):
...        return 'C()'
...    def __cmp__(self, other):
...        return 0

>>> class D(object):
...    def __repr__(self):
...        return 'D()'
...    def __eq__(self, other):
...        return True

>>> for obj in ['', (), [], {}, 0, 0., A(), B(), C(), D(), None]:
...     print '%4s: bool(obj) -> %5s, obj == None -> %5s, obj is None -> %5s' % \
...         (repr(obj), bool(obj), obj == None, obj is None)
  '': bool(obj) -> False, obj == None -> False, obj is None -> False
  (): bool(obj) -> False, obj == None -> False, obj is None -> False
  []: bool(obj) -> False, obj == None -> False, obj is None -> False
  {}: bool(obj) -> False, obj == None -> False, obj is None -> False
   0: bool(obj) -> False, obj == None -> False, obj is None -> False
 0.0: bool(obj) -> False, obj == None -> False, obj is None -> False
 A(): bool(obj) -> False, obj == None -> False, obj is None -> False
 B(): bool(obj) -> False, obj == None -> False, obj is None -> False
 C(): bool(obj) ->  True, obj == None ->  True, obj is None -> False
 D(): bool(obj) ->  True, obj == None ->  True, obj is None -> False
None: bool(obj) -> False, obj == None ->  True, obj is None ->  True

In the first test, Python tries to convert the object to a bool value if it is not already one. Roughly, we are asking the object: are you meaningful or not? This is done using the following algorithm:

  1. If the object has a __nonzero__ special method (as do numeric built-ins, int and float), it calls this method. It must either return a bool value which is then directly used, or an int value that is considered False if equal to zero.

  2. Otherwise, if the object has a __len__ special method (as do container built-ins, list, dict, set, tuple, …), it calls this method, considering a container False if it is empty (length is zero).

  3. Otherwise, the object is considered True unless it is None in which case, it is considered False.

In the second test, the object is compared for equality to None. Here, we are asking the object, “Are you equal to this other value?” This is done using the following algorithm:

  1. If the object has a __eq__ method, it is called, and the return value is then converted to a bool value and used to determine the outcome of the if.

  2. Otherwise, if the object has a __cmp__ method, it is called. This function must return an int indicating the order of the two objects (-1 if self < other, 0 if self == other, +1 if self > other).

  3. Otherwise, the objects are compared for identity (i.e. they are references to the same object, as can be tested by the is operator).

There is another test possible using the is operator. We would be asking the object, “Are you this particular object?”

Generally, I would recommend using the first test with non-numerical values, using the test for equality when you want to compare objects of the same nature (two strings, two numbers, …), and checking for identity only when using sentinel values (None meaning not initialized for a member field, for example, or when using the getattr or the __getitem__ methods).

To summarize, we have :

>>> class A(object):
...    def __repr__(self):
...        return 'A()'
...    def __nonzero__(self):
...        return False

>>> class B(object):
...    def __repr__(self):
...        return 'B()'
...    def __len__(self):
...        return 0

>>> class C(object):
...    def __repr__(self):
...        return 'C()'
...    def __cmp__(self, other):
...        return 0

>>> class D(object):
...    def __repr__(self):
...        return 'D()'
...    def __eq__(self, other):
...        return True

>>> for obj in ['', (), [], {}, 0, 0., A(), B(), C(), D(), None]:
...     print '%4s: bool(obj) -> %5s, obj == None -> %5s, obj is None -> %5s' % \
...         (repr(obj), bool(obj), obj == None, obj is None)
  '': bool(obj) -> False, obj == None -> False, obj is None -> False
  (): bool(obj) -> False, obj == None -> False, obj is None -> False
  []: bool(obj) -> False, obj == None -> False, obj is None -> False
  {}: bool(obj) -> False, obj == None -> False, obj is None -> False
   0: bool(obj) -> False, obj == None -> False, obj is None -> False
 0.0: bool(obj) -> False, obj == None -> False, obj is None -> False
 A(): bool(obj) -> False, obj == None -> False, obj is None -> False
 B(): bool(obj) -> False, obj == None -> False, obj is None -> False
 C(): bool(obj) ->  True, obj == None ->  True, obj is None -> False
 D(): bool(obj) ->  True, obj == None ->  True, obj is None -> False
None: bool(obj) -> False, obj == None ->  True, obj is None ->  True
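
Note that __nonzero__ is the Python 2 name of the truth hook; in Python 3 it was renamed __bool__ (and __cmp__ was removed). A minimal sketch, using a hypothetical Empty class, of the same three tests under Python 3:

class Empty:
    """Hypothetical container-like object; falsy because __bool__ returns False."""
    def __bool__(self):       # Python 3 spelling of __nonzero__
        return False

e = Empty()
print(bool(e))      # False -> "if not e:" would take the branch
print(e == None)    # False -> no __eq__ defined, so equality falls back to identity
print(e is None)    # False -> e is a real object, just a falsy one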

回答 1

这些实际上都不是好的做法。曾几何时,随意把 None 和 False 当作差不多的东西还说得过去。但是从 Python 2.2 开始,这已经不是最好的策略了。

首先,当您做 if x 或 if not x 这类测试时,Python 必须把 x 隐式转换为布尔值。bool 函数的规则列出了一大批会被判为 False 的东西;其余一切都为 True。如果 x 的值本来就不是真正的布尔值,这种隐式转换就不是表达意图最清晰的方式。

在 Python 2.2 之前甚至没有 bool 函数,所以就更不清晰了。

其次,您实际上不应该用 == None 来做测试,而应该使用 is None 和 is not None。

请参阅PEP 8,Python代码样式指南

- Comparisons to singletons like None should always be done with
  'is' or 'is not', never the equality operators.

  Also, beware of writing "if x" when you really mean "if x is not None"
  -- e.g. when testing whether a variable or argument that defaults to
  None was set to some other value.  The other value might have a type
  (such as a container) that could be false in a boolean context!

有多少个单例?五个:None、True、False、NotImplemented 和 Ellipsis。由于您几乎不会用到 NotImplemented 或 Ellipsis,而且您也永远不会写 if x is True(因为直接写 if x 清楚得多),所以您实际上只会对 None 做这种测试。

These are actually both poor practices. Once upon a time, it was considered OK to casually treat None and False as similar. However, since Python 2.2 this is not the best policy.

First, when you do an if x or if not x kind of test, Python has to implicitly convert x to boolean. The rules for the bool function describe a raft of things which are False; everything else is True. If the value of x wasn’t properly boolean to begin with, this implicit conversion isn’t really the clearest way to say things.

Before Python 2.2, there was no bool function, so it was even less clear.

Second, you shouldn’t really test with == None. You should use is None and is not None.

See PEP 8, Style Guide for Python Code.

- Comparisons to singletons like None should always be done with
  'is' or 'is not', never the equality operators.

  Also, beware of writing "if x" when you really mean "if x is not None"
  -- e.g. when testing whether a variable or argument that defaults to
  None was set to some other value.  The other value might have a type
  (such as a container) that could be false in a boolean context!

How many singletons are there? Five: None, True, False, NotImplemented and Ellipsis. Since you’re really unlikely to use NotImplemented or Ellipsis, and you would never say if x is True (because simply if x is a lot clearer), you’ll only ever test None.
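
A small sketch (with a hypothetical AlwaysEqual class) of why PEP 8 insists on is for singletons: an object with a permissive __eq__ can make == None report True even though the object is not None, while the identity test cannot be fooled:

class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with anything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
print(obj == None)   # True: __eq__ lied, so the equality test is misleading
print(obj is None)   # False: identity cannot be faked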


回答 2

因为 None 并不是唯一会被判为假(False)的东西。

if not False:
    print "False is false."
if not 0:
    print "0 is false."
if not []:
    print "An empty list is false."
if not ():
    print "An empty tuple is false."
if not {}:
    print "An empty dict is false."
if not "":
    print "An empty string is false."

False、0、()、[]、{} 和 “” 各自都与 None 不同,所以你的这两段代码并不等价。

此外,请考虑以下事项:

>>> False == 0
True
>>> False == ()
False

if object: 不是相等性检查。0、()、[]、None、{} 等彼此各不相同,但它们都会被求值为 False。

这是短路表达式背后的“魔术”:

foo = bar and spam or eggs

简写为:

if bar:
    foo = spam
else:
    foo = eggs

尽管您确实应该写:

foo = spam if bar else eggs

Because None is not the only thing that is considered false.

if not False:
    print "False is false."
if not 0:
    print "0 is false."
if not []:
    print "An empty list is false."
if not ():
    print "An empty tuple is false."
if not {}:
    print "An empty dict is false."
if not "":
    print "An empty string is false."

False, 0, (), [], {} and "" are all different from None, so your two code snippets are not equivalent.

Moreover, consider the following:

>>> False == 0
True
>>> False == ()
False

if object: is not an equality check. 0, (), [], None, {}, etc. are all different from each other, but they all evaluate to False.

This is the “magic” behind short circuiting expressions like:

foo = bar and spam or eggs

which is shorthand for:

if bar:
    foo = spam
else:
    foo = eggs

although you really should write:

foo = spam if bar else eggs
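
A short sketch with hypothetical values of why the and/or idiom is fragile: when spam itself is falsy it silently falls through to eggs, whereas the conditional expression picks the intended branch:

bar, spam, eggs = True, "", "fallback"   # hypothetical values; note that spam is falsy

old_style = bar and spam or eggs         # "" is falsy, so this wrongly yields "fallback"
new_style = spam if bar else eggs        # correctly yields "" because bar is true

print(repr(old_style), repr(new_style))  # 'fallback' ''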

回答 3

PEP 8 - Python 代码风格指南建议:如果你要测试某个值是否为 None,应使用 is 或 is not。

- Comparisons to singletons like None should always be done with
  'is' or 'is not', never the equality operators.

另一方面,如果你要测试的不仅仅是"是否为 None",则应使用布尔测试。

PEP 8 — Style Guide for Python Code recommends to use is or is not if you are testing for None-ness

- Comparisons to singletons like None should always be done with
  'is' or 'is not', never the equality operators.

On the other hand if you are testing for more than None-ness, you should use the boolean operator.


回答 4

如果你问

if not spam:
    print "Sorry. No SPAM."

则 spam 的 __nonzero__ 方法会被调用。Python 手册中写道:

__nonzero__(self):调用它以实现真值测试以及内置操作 bool();应返回 False 或 True,或者与之等价的整数 0 或 1。如果未定义此方法,则调用 __len__()(如果已定义,见下文)。如果一个类既未定义 __len__() 也未定义 __nonzero__(),则其所有实例均被视为 true。

如果你问

if spam == None:
    print "Sorry. No SPAM here either."

则会以 None 作为参数调用 spam 的 __eq__ 方法。

有关自定义可能性的更多信息,请参见https://docs.python.org/reference/datamodel.html#basic-customization上的Python文档。

If you ask

if not spam:
    print "Sorry. No SPAM."

the __nonzero__ method of spam gets called. From the Python manual:

__nonzero__(self) Called to implement truth value testing, and the built-in operation bool(); should return False or True, or their integer equivalents 0 or 1. When this method is not defined, __len__() is called, if it is defined (see below). If a class defines neither __len__() nor __nonzero__(), all its instances are considered true.

If you ask

if spam == None:
    print "Sorry. No SPAM here either."

the __eq__ method of spam gets called with the argument None.

For more information of the customization possibilities have a look at the Python documenation at https://docs.python.org/reference/datamodel.html#basic-customization
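
A minimal sketch (Python 3 syntax, with a hypothetical Spam class that logs its hooks) showing that the two tests really do go through different special methods:

class Spam:
    """Hypothetical class that reports which hook each test triggers."""
    def __bool__(self):                  # Python 2 would use __nonzero__ here
        print("truth test -> __bool__")
        return False
    def __eq__(self, other):
        print("equality test -> __eq__")
        return NotImplemented            # defer to the default comparison

spam = Spam()
if not spam:                             # triggers __bool__
    print("Sorry. No SPAM.")
if spam == None:                         # triggers __eq__, then falls back to identity (False)
    print("Sorry. No SPAM here either.")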


回答 5

这两种比较的目的不同。前者检查某个值的布尔真假,后者检查它是否就是 None 这个对象(同一性)。

These two comparisons serve different purposes. The former checks for boolean value of something, the second checks for identity with None value.
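
For example, an empty list is falsy but is still not None, which is exactly the gap between the two checks:

items = []               # hypothetical: an empty list
print(not items)         # True: the boolean test treats [] as falsy
print(items is None)     # False: the identity test knows [] is not None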


回答 6

首先,第一种写法更短,看起来也更好。正如其他回答所说,选择哪种写法还取决于你真正想用这个比较做什么。

For one the first example is shorter and looks nicer. As per the other posts what you choose also depends on what you really want to do with the comparison.


回答 7

答案是“取决于”。

在这种情况下,如果我认为 0、""、[] 和 False(并非完整列表)在当前上下文中与 None 等效,就使用第一种写法。

The answer is “it depends”.

I use the first example if I consider 0, “”, [] and False (list not exhaustive) to be equivalent to None in this context.


回答 8

就个人而言,我选择了一种跨语言的一致做法:只有当 var 被声明为布尔值时(或在 C 语言中按布尔语义定义,因为 C 没有专门的布尔类型),我才写 if (var)(或等价形式)。我甚至会给这些变量加上前缀 b(实际上写成 bVar),以确保自己不会在这里不小心用到其他类型。
我实在不喜欢隐式转换为布尔值,尤其是当规则又多又复杂的时候。

当然,人们会有不同意见。有些人走得更远,我在工作中的 Java 代码里见过 if (bVar == true)(对我来说太冗余了!);另一些人则过分喜欢紧凑的语法,会写 while (line = getNextLine())(对我来说太含糊了)。

Personally, I chose a consistent approach across languages: I do if (var) (or equivalent) only if var is declared as boolean (or defined as such, in C we don’t have a specific type). I even prefix these variables with a b (so it would be bVar actually) to be sure I won’t accidentally use another type here.
I don’t really like implicit casting to boolean, even less when there are numerous, complex rules.

Of course, people will disagree. Some go farther, I see if (bVar == true) in the Java code at my work (too redundant for my taste!), others love too much compact syntax, going while (line = getNextLine()) (too ambiguous for me).


使用 python 的 map 和其他函数式工具

问题:使用 python 的 map 和其他函数式工具

这个问题相当初级(n00bish),但是我正在尝试学习/理解 python 中的函数式编程。如下代码:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo, bar):
    print foo, bar

map(maptest, foos, bars)

生成:

1.0 1
2.0 2
3.0 3
4.0 None
5.0 None

问:有没有办法使用 map 或 python 中的其他函数式工具,在不用循环等手段的情况下产生以下输出?

1.0 [1,2,3]
2.0 [1,2,3]
3.0 [1,2,3]
4.0 [1,2,3]
5.0 [1,2,3]

顺便提一下,如果foo和bar之间存在依赖关系,则实现将如何更改。例如

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3,4,5]

并打印:

1.0 [2,3,4,5]
2.0 [1,3,4,5]
3.0 [1,2,4,5]
...

PS:我知道如何用 if、循环和/或生成器来朴素地实现,但我想学习如何用函数式工具达到同样的效果。是不是只需在 maptest 中加一条 if 语句,或者在 maptest 内部对 bars 再套一层 filter/map?

This is quite n00bish, but I’m trying to learn/understand functional programming in python. The following code:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo, bar):
    print foo, bar

map(maptest, foos, bars)

produces:

1.0 1
2.0 2
3.0 3
4.0 None
5.0 None

Q. Is there a way to use map or any other functional tools in python to produce the following without loops etc.

1.0 [1,2,3]
2.0 [1,2,3]
3.0 [1,2,3]
4.0 [1,2,3]
5.0 [1,2,3]

Just as a side note how would the implementation change if there is a dependency between foo and bar. e.g.

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3,4,5]

and print:

1.0 [2,3,4,5]
2.0 [1,3,4,5]
3.0 [1,2,4,5]
...

P.S: I know how to do it naively using if, loops and/or generators, but I’d like to learn how to achieve the same using functional tools. Is it just a case of adding an if statement to maptest or apply another filter map to bars internally within maptest?


回答 0

最简单的方法是不要把 bars 传给各个函数,而是直接在 maptest 中访问它:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo):
    print foo, bars

map(maptest, foos)

如果仍使用原始的 maptest 函数,您也可以在 map 中配合 lambda 函数:

map((lambda foo: maptest(foo, bars)), foos)

The easiest way would be not to pass bars through the different functions, but to access it directly from maptest:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo):
    print foo, bars

map(maptest, foos)

With your original maptest function you could also use a lambda function in map:

map((lambda foo: maptest(foo, bars)), foos)
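
Instead of a lambda, functools.partial can freeze the bars argument; a hedged Python 3 sketch (list() is needed because map is lazy in Python 3):

from functools import partial

foos = [1.0, 2.0, 3.0, 4.0, 5.0]
bars = [1, 2, 3]

def maptest(foo, bar):
    print(foo, bar)

# partial fixes bar=bars ahead of time, so map only has to supply foo
list(map(partial(maptest, bar=bars), foos))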

回答 1

您熟悉其他函数式语言吗?也就是说,您是想学习 python 如何做函数式编程,还是想学习函数式编程本身、只是把 python 当作载体?

另外,您了解列表理解吗?

map(f, sequence)

直接等效于(*):

[f(x) for x in sequence]

实际上,我认为 map() 曾一度因为冗余而计划从 python 3.0 中移除(不过这并没有发生)。

map(f, sequence1, sequence2)

大致等于:

[f(x1, x2) for x1, x2 in zip(sequence1, sequence2)]

(在处理序列长度不同的情况时,两者有所区别。如您所见,当其中一个序列用完时,map() 会填入 None,而 zip() 则在最短的序列结束时就停止。)

因此,为了解决您的特定问题,您尝试产生结果:

foos[0], bars
foos[1], bars
foos[2], bars
# etc.

您可以通过编写一个只接受单个参数的函数来做到这一点:它先打印该参数,然后打印 bars:

def maptest(x):
     print x, bars
map(maptest, foos)

或者,您可以创建一个如下所示的列表:

[bars, bars, bars, ] # etc.

并使用原始的maptest:

def maptest(x, y):
    print x, y

一种方法是事先显式构建列表:

barses = [bars] * len(foos)
map(maptest, foos, barses)

或者,您可以引入 itertools 模块。itertools 包含许多巧妙的函数,可帮助您在 python 中进行函数式风格的惰性求值编程。在这种情况下,我们需要 itertools.repeat,当您对其迭代时,它会无限地重复输出其参数。这一点意味着,如果您这样做:

map(maptest, foos, itertools.repeat(bars))

您会得到无穷无尽的输出,因为只要还有一个参数仍在产生输出,map() 就会一直继续。不过,itertools.imap 与 map() 类似,但会在最短的可迭代对象耗尽时立即停止。

itertools.imap(maptest, foos, itertools.repeat(bars))

希望这可以帮助 :-)

(*)在python 3.0中有些不同。在那里,map()本质上返回一个生成器表达式。

Are you familiar with other functional languages? i.e. are you trying to learn how python does functional programming, or are you trying to learn about functional programming and using python as the vehicle?

Also, do you understand list comprehensions?

map(f, sequence)

is directly equivalent (*) to:

[f(x) for x in sequence]

In fact, I think map() was once slated for removal from python 3.0 as being redundant (that didn’t happen).

map(f, sequence1, sequence2)

is mostly equivalent to:

[f(x1, x2) for x1, x2 in zip(sequence1, sequence2)]

(there is a difference in how it handles the case where the sequences are of different length. As you saw, map() fills in None when one of the sequences runs out, whereas zip() stops when the shortest sequence stops)

So, to address your specific question, you’re trying to produce the result:

foos[0], bars
foos[1], bars
foos[2], bars
# etc.

You could do this by writing a function that takes a single argument and prints it, followed by bars:

def maptest(x):
     print x, bars
map(maptest, foos)

Alternatively, you could create a list that looks like this:

[bars, bars, bars, ] # etc.

and use your original maptest:

def maptest(x, y):
    print x, y

One way to do this would be to explicitly build the list beforehand:

barses = [bars] * len(foos)
map(maptest, foos, barses)

Alternatively, you could pull in the itertools module. itertools contains many clever functions that help you do functional-style lazy-evaluation programming in python. In this case, we want itertools.repeat, which will output its argument indefinitely as you iterate over it. This last fact means that if you do:

map(maptest, foos, itertools.repeat(bars))

you will get endless output, since map() keeps going as long as one of the arguments is still producing output. However, itertools.imap is just like map(), but stops as soon as the shortest iterable stops.

itertools.imap(maptest, foos, itertools.repeat(bars))

Hope this helps :-)

(*) It’s a little different in python 3.0. There, map() essentially returns a generator expression.
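
For Python 3 readers: itertools.imap is gone because map itself now stops at the shortest iterable, so the repeat trick works directly, and the old pad-with-None behaviour is spelled itertools.zip_longest. A small sketch under those assumptions:

from itertools import repeat, zip_longest

foos = [1.0, 2.0, 3.0, 4.0, 5.0]
bars = [1, 2, 3]

# Python 3's map stops at the shortest input, so the infinite repeat(bars) is safe here
print(list(map(lambda f, b: (f, b), foos, repeat(bars))))

# The old "pad with None" behaviour of Python 2's map is now spelled zip_longest
print(list(zip_longest(foos, bars)))   # [(1.0, 1), (2.0, 2), (3.0, 3), (4.0, None), (5.0, None)]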


回答 2

这是您要寻找的解决方案:

>>> foos = [1.0, 2.0, 3.0, 4.0, 5.0]
>>> bars = [1, 2, 3]
>>> [(x, bars) for x in foos]
[(1.0, [1, 2, 3]), (2.0, [1, 2, 3]), (3.0, [1, 2, 3]), (4.0, [1, 2, 3]), (5.0, [
1, 2, 3])]

我建议使用列表理解(即 [(x, bars) for x in foos] 这一部分)而不是 map,因为它避免了每次迭代都进行函数调用的开销(这可能相当可观)。如果您只打算在 for 循环中使用结果,那么改用生成器理解可以获得更好的速度:

>>> y = ((x, bars) for x in foos)
>>> for z in y:
...     print z
...
(1.0, [1, 2, 3])
(2.0, [1, 2, 3])
(3.0, [1, 2, 3])
(4.0, [1, 2, 3])
(5.0, [1, 2, 3])

区别在于生成器理解是惰性求值的。

更新 针对此评论:

当然您知道,您并没有复制 bars,所有条目都是同一个 bars 列表。因此,如果您修改其中任何一个(包括原始的 bars),就会同时修改所有这些条目。

我想这是一个正确的观点。我可以想到两种解决方案。最有效的可能是这样的:

tbars = tuple(bars)
[(x, tbars) for x in foos]

由于元组是不可变的,这可以防止通过这个列表理解(或生成器理解,如果您走那条路)的结果来修改 bars。如果您确实需要修改每一个结果,则可以这样做:

from copy import copy
[(x, copy(bars)) for x in foos]

但是,这在内存使用和速度方面都可能有些昂贵,因此除非您确实需要向每个结果中添加内容,否则我不建议这样做。

Here’s the solution you’re looking for:

>>> foos = [1.0, 2.0, 3.0, 4.0, 5.0]
>>> bars = [1, 2, 3]
>>> [(x, bars) for x in foos]
[(1.0, [1, 2, 3]), (2.0, [1, 2, 3]), (3.0, [1, 2, 3]), (4.0, [1, 2, 3]), (5.0, [
1, 2, 3])]

I’d recommend using a list comprehension (the [(x, bars) for x in foos] part) over using map as it avoids the overhead of a function call on every iteration (which can be very significant). If you’re just going to use it in a for loop, you’ll get better speeds by using a generator comprehension:

>>> y = ((x, bars) for x in foos)
>>> for z in y:
...     print z
...
(1.0, [1, 2, 3])
(2.0, [1, 2, 3])
(3.0, [1, 2, 3])
(4.0, [1, 2, 3])
(5.0, [1, 2, 3])

The difference is that the generator comprehension is lazily loaded.

UPDATE In response to this comment:

Of course you know, that you don’t copy bars, all entries are the same bars list. So if you modify any one of them (including original bars), you modify all of them.

I suppose this is a valid point. There are two solutions to this that I can think of. The most efficient is probably something like this:

tbars = tuple(bars)
[(x, tbars) for x in foos]

Since tuples are immutable, this will prevent bars from being modified through the results of this list comprehension (or generator comprehension if you go that route). If you really need to modify each and every one of the results, you can do this:

from copy import copy
[(x, copy(bars)) for x in foos]

However, this can be a bit expensive both in terms of memory usage and in speed, so I’d recommend against it unless you really need to add to each one of them.
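
A short demonstration of the aliasing issue described above, together with the tuple snapshot fix (Python 3 syntax, shortened lists for brevity):

foos = [1.0, 2.0]
bars = [1, 2, 3]

pairs = [(x, bars) for x in foos]   # every tuple holds the *same* bars list object
pairs[0][1].append(99)              # mutating through the first result...
print(pairs[1])                     # (2.0, [1, 2, 3, 99])  ...changes the second one
print(bars)                         # [1, 2, 3, 99]  ...and the original bars as well

tbars = tuple(bars)                 # the answer's fix: an immutable snapshot
safe = [(x, tbars) for x in foos]   # safe[0][1].append(...) would now raise AttributeError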


回答 3

函数式编程是关于创建无副作用的代码。

map 是函数式的列表变换抽象。您可以用它把某种序列转换为另一种序列。

您正在尝试将其用作迭代器。不要那样做 :)

下面的示例演示了如何用 map 构建您想要的列表。还有更简短的解法(我自己会直接用列表理解),但这有助于您更好地理解 map 的作用:

def my_transform_function(input):
    return [input, [1, 2, 3]]

new_list = map(my_transform_function, input_list)

请注意,此时您仅完成了数据操作。现在您可以打印它:

for n,l in new_list:
    print n, l

- 我不太确定您说的"没有循环"是什么意思。fp 并不是要避免循环(不访问列表中的每一项,就无法对每一项进行检查),而是要避免副作用,从而减少 bug。

Functional programming is about creating side-effect-free code.

map is a functional list transformation abstraction. You use it to take a sequence of something and turn it into a sequence of something else.

You are trying to use it as an iterator. Don’t do that. :)

Here is an example of how you might use map to build the list you want. There are shorter solutions (I’d just use comprehensions), but this will help you understand what map does a bit better:

def my_transform_function(input):
    return [input, [1, 2, 3]]

new_list = map(my_transform_function, input_list)

Notice at this point, you’ve only done a data manipulation. Now you can print it:

for n,l in new_list:
    print n, l

— I’m not sure what you mean by ‘without loops.’ fp isn’t about avoiding loops (you can’t examine every item in a list without visiting each one). It’s about avoiding side-effects, thus writing fewer bugs.


回答 4

>>> from itertools import repeat
>>> for foo, bars in zip(foos, repeat(bars)):
...     print foo, bars
... 
1.0 [1, 2, 3]
2.0 [1, 2, 3]
3.0 [1, 2, 3]
4.0 [1, 2, 3]
5.0 [1, 2, 3]
>>> from itertools import repeat
>>> for foo, bars in zip(foos, repeat(bars)):
...     print foo, bars
... 
1.0 [1, 2, 3]
2.0 [1, 2, 3]
3.0 [1, 2, 3]
4.0 [1, 2, 3]
5.0 [1, 2, 3]

回答 5

import itertools

foos=[1.0, 2.0, 3.0, 4.0, 5.0]
bars=[1, 2, 3]

print zip(foos, itertools.cycle([bars]))
import itertools

foos=[1.0, 2.0, 3.0, 4.0, 5.0]
bars=[1, 2, 3]

print zip(foos, itertools.cycle([bars]))

回答 6

以下是该map(function, *sequences)函数的参数概述:

  • function 是函数的名称。
  • sequences是任意数量的序列,通常是列表或元组。 map同时迭代它们并将当前值提供给function。这就是为什么序列数应等于函数的参数数的原因。

听起来您想在迭代 function 的某些参数的同时保持其他参数不变,但不幸的是 map 并不支持这种用法。我找到过一个向 Python 添加此类功能的旧提案,但 map 这个结构如此简洁且根深蒂固,我怀疑这种功能永远都不会被实现。

像其他人建议的那样,使用诸如全局变量或列表理解之类的解决方法。

Here’s an overview of the parameters to the map(function, *sequences) function:

  • function is the name of your function.
  • sequences is any number of sequences, which are usually lists or tuples. map will iterate over them simultaneously and give the current values to function. That’s why the number of sequences should equal the number of parameters to your function.

It sounds like you’re trying to iterate for some of function‘s parameters but keep others constant, and unfortunately map doesn’t support that. I found an old proposal to add such a feature to Python, but the map construct is so clean and well-established that I doubt something like that will ever be implemented.

Use a workaround like global variables or list comprehensions, as others have suggested.
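
One possible workaround (not from the original answer) is to pair the varying values with the constant one and unpack each pair with itertools.starmap; a Python 3 sketch:

from itertools import repeat, starmap

foos = [1.0, 2.0, 3.0, 4.0, 5.0]
bars = [1, 2, 3]

def maptest(foo, bar):
    print(foo, bar)

# zip pairs each foo with the constant bars; starmap unpacks each pair into maptest
list(starmap(maptest, zip(foos, repeat(bars))))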


回答 7

这样可以吗?

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest2(bar):
  print bar

def maptest(foo):
  print foo
  map(maptest2, bars)

map(maptest, foos)

Would this do it?

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest2(bar):
  print bar

def maptest(foo):
  print foo
  map(maptest2, bars)

map(maptest, foos)

回答 8

这个怎么样:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo, bar):
    print foo, bar

map(maptest, foos, [bars]*len(foos))

How about this:

foos = [1.0,2.0,3.0,4.0,5.0]
bars = [1,2,3]

def maptest(foo, bar):
    print foo, bar

map(maptest, foos, [bars]*len(foos))
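
One caveat for anyone trying these answers on Python 3: print becomes a function there, and map returns a lazy iterator, so nothing happens until the result is consumed. A minimal sketch:

foos = [1.0, 2.0, 3.0, 4.0, 5.0]
bars = [1, 2, 3]

def maptest(foo, bar):
    print(foo, bar)

result = map(maptest, foos, [bars] * len(foos))  # nothing is printed yet: map is lazy in Python 3
list(result)                                     # consuming the iterator performs the calls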