Python 实用宝典

Question 1

I am using matplotlib to make some graphs and unfortunately I cannot export them without the white background.

In other words, when I export a plot like this and position it on top of another image, the white background hides what is behind it rather than allowing it to show through. How can I export plots with a transparent background instead?

Question 2

Use the matplotlib savefig function with the keyword argument transparent=True to save the image as a png file.

In [30]: x = np.linspace(0,6,31)

In [31]: y = np.exp(-0.5*x) * np.sin(x)

In [32]: plot(x, y, 'bo-')
Out[32]: [<matplotlib.lines.Line2D at 0x3f29750>]            

In [33]: savefig('demo.png', transparent=True)

Result:

Of course, that plot doesn’t demonstrate the transparency. Here’s a screenshot of the PNG file displayed using the ImageMagick display command. The checkerboard pattern is the background that is visible through the transparent parts of the PNG file.

Question 3

Png files can handle transparency. So you could use this question Save plot to image file instead of displaying it using Matplotlib so as to save you graph as a png file.

And if you want to turn all white pixel transparent, there’s this other question : Using PIL to make all white pixels transparent?

If you want to turn an entire area to transparent, then there’s this question: And then use the PIL library like in this question Python PIL: how to make area transparent in PNG? so as to make your graph transparent.

Question 4

I have some nodes coming from a script that I want to map on to a graph. In the below, I want to use Arrow to go from A to D and probably have the edge colored too in (red or something).

This is basically, like a path from A to D when all other nodes are present. you can imagine each nodes as cities and traveling from A to D requires directions (with arrow heads).

This code below builds the graph

import networkx as nx
import numpy as np
import matplotlib.pyplot as plt

G = nx.Graph()
G.add_edges_from(
    [('A', 'B'), ('A', 'C'), ('D', 'B'), ('E', 'C'), ('E', 'F'),
     ('B', 'H'), ('B', 'G'), ('B', 'F'), ('C', 'G')])

val_map = {'A': 1.0,
           'D': 0.5714285714285714,
           'H': 0.0}

values = [val_map.get(node, 0.25) for node in G.nodes()]

nx.draw(G, cmap = plt.get_cmap('jet'), node_color = values)
plt.show()

but I want something like shown in the image.

Arrow heads of the first image and the edges in red color onto the second image.

Question 5

Fully fleshed out example with arrows for only the red edges:

import networkx as nx
import matplotlib.pyplot as plt

G = nx.DiGraph()
G.add_edges_from(
    [('A', 'B'), ('A', 'C'), ('D', 'B'), ('E', 'C'), ('E', 'F'),
     ('B', 'H'), ('B', 'G'), ('B', 'F'), ('C', 'G')])

val_map = {'A': 1.0,
           'D': 0.5714285714285714,
           'H': 0.0}

values = [val_map.get(node, 0.25) for node in G.nodes()]

# Specify the edges you want here
red_edges = [('A', 'C'), ('E', 'C')]
edge_colours = ['black' if not edge in red_edges else 'red'
                for edge in G.edges()]
black_edges = [edge for edge in G.edges() if edge not in red_edges]

# Need to create a layout when doing
# separate calls to draw nodes and edges
pos = nx.spring_layout(G)
nx.draw_networkx_nodes(G, pos, cmap=plt.get_cmap('jet'), 
                       node_color = values, node_size = 500)
nx.draw_networkx_labels(G, pos)
nx.draw_networkx_edges(G, pos, edgelist=red_edges, edge_color='r', arrows=True)
nx.draw_networkx_edges(G, pos, edgelist=black_edges, arrows=False)
plt.show()

Question 6

I only put this in for completeness. I’ve learned plenty from marius and mdml. Here are the edge weights. Sorry about the arrows. Looks like I’m not the only one saying it can’t be helped. I couldn’t render this with ipython notebook I had to go straight from python which was the problem with getting my edge weights in sooner.

import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
import pylab

G = nx.DiGraph()

G.add_edges_from([('A', 'B'),('C','D'),('G','D')], weight=1)
G.add_edges_from([('D','A'),('D','E'),('B','D'),('D','E')], weight=2)
G.add_edges_from([('B','C'),('E','F')], weight=3)
G.add_edges_from([('C','F')], weight=4)


val_map = {'A': 1.0,
                   'D': 0.5714285714285714,
                              'H': 0.0}

values = [val_map.get(node, 0.45) for node in G.nodes()]
edge_labels=dict([((u,v,),d['weight'])
                 for u,v,d in G.edges(data=True)])
red_edges = [('C','D'),('D','A')]
edge_colors = ['black' if not edge in red_edges else 'red' for edge in G.edges()]

pos=nx.spring_layout(G)
nx.draw_networkx_edge_labels(G,pos,edge_labels=edge_labels)
nx.draw(G,pos, node_color = values, node_size=1500,edge_color=edge_colors,edge_cmap=plt.cm.Reds)
pylab.show()

Question 7

Instead of regular nx.draw you may want to use:

nx.draw_networkx(G[, pos, arrows, with_labels])

For example:

nx.draw_networkx(G, arrows=True, **options)

You can add options by initialising that ** variable like this:

options = {
    'node_color': 'blue',
    'node_size': 100,
    'width': 3,
    'arrowstyle': '-|>',
    'arrowsize': 12,
}

Also some functions support the directed=True parameter In this case this state is the default one:

G = nx.DiGraph(directed=True)

The networkx reference is found here.

Question 8

You need to use a directed graph instead of a graph, i.e.

G = nx.DiGraph()

Then, create a list of the edge colors you want to use and pass those to nx.draw (as shown by @Marius).

Putting this all together, I get the image below. Still not quite the other picture you show (I don’t know where your edge weights are coming from), but much closer! If you want more control of how your output graph looks (e.g. get arrowheads that look like arrows), I’d check out NetworkX with Graphviz.

Question 9

import networkx as nx
import matplotlib.pyplot as plt

g = nx.DiGraph()
g.add_nodes_from([1,2,3,4,5])
g.add_edge(1,2)
g.add_edge(4,2)
g.add_edge(3,5)
g.add_edge(2,3)
g.add_edge(5,4)

nx.draw(g,with_labels=True)
plt.draw()
plt.show()

This is just simple how to draw directed graph using python 3.x using networkx. just simple representation and can be modified and colored etc. See the generated graph here.

Note: It’s just a simple representation. Weighted Edges could be added like

g.add_edges_from([(1,2),(2,5)], weight=2)

and hence plotted again.

Question 10

import networkx as nx
import matplotlib.pyplot as plt

G = nx.DiGraph()
G.add_node("A")
G.add_node("B")
G.add_node("C")
G.add_node("D")
G.add_node("E")
G.add_node("F")
G.add_node("G")
G.add_edge("A","B")
G.add_edge("B","C")
G.add_edge("C","E")
G.add_edge("C","F")
G.add_edge("D","E")
G.add_edge("F","G")

print(G.nodes())
print(G.edges())

pos = nx.spring_layout(G)

nx.draw_networkx_nodes(G, pos)
nx.draw_networkx_labels(G, pos)
nx.draw_networkx_edges(G, pos, edge_color='r', arrows = True)

plt.show()

Question 11

The following code plots to two PostScript (.ps) files, but the second one contains both lines.

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab

plt.subplot(111)
x = [1,10]
y = [30, 1000]
plt.loglog(x, y, basex=10, basey=10, ls="-")
plt.savefig("first.ps")


plt.subplot(111)
x = [10,100]
y = [10, 10000]
plt.loglog(x, y, basex=10, basey=10, ls="-")
plt.savefig("second.ps")

How can I tell matplotlib to start afresh for the second plot?

Question 12

You can use figure to create a new plot, for example, or use close after the first plot.

Question 13

There is a clear figure command, and it should do it for you:

plt.clf()

If you have multiple subplots in the same figure

plt.cla()

clears the current axes.

Question 14

As stated from David Cournapeau, use figure().

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab

plt.figure()
x = [1,10]
y = [30, 1000]
plt.loglog(x, y, basex=10, basey=10, ls="-")
plt.savefig("first.ps")


plt.figure()
x = [10,100]
y = [10, 10000]
plt.loglog(x, y, basex=10, basey=10, ls="-")
plt.savefig("second.ps")

Or subplot(121) / subplot(122) for the same plot, different position.

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab

plt.subplot(121)
x = [1,10]
y = [30, 1000]
plt.loglog(x, y, basex=10, basey=10, ls="-")

plt.subplot(122)
x = [10,100]
y = [10, 10000]
plt.loglog(x, y, basex=10, basey=10, ls="-")
plt.savefig("second.ps")

Question 15

Just enter plt.hold(False) before the first plt.plot, and you can stick to your original code.

Question 16

If you’re using Matplotlib interactively, for example in a web application, (e.g. ipython) you maybe looking for

plt.show()

instead of plt.close() or plt.clf().

Question 17

If none of them are working then check this.. say if you have x and y arrays of data along respective axis. Then check in which cell(jupyter) you have initialized x and y to empty. This is because , maybe you are appending data to x and y without re-initializing them. So plot has old data too. So check that..

Question 18

I have an array of timestamps in the format (HH:MM:SS.mmmmmm) and another array of floating point numbers, each corresponding to a value in the timestamp array.

Can I plot time on the x axis and the numbers on the y-axis using Matplotlib?

I was trying to, but somehow it was only accepting arrays of floats. How can I get it to plot the time? Do I have to modify the format in any way?

Question 19

You must first convert your timestamps to Python datetime objects (use datetime.strptime). Then use date2num to convert the dates to matplotlib format.

Plot the dates and values using plot_date:

dates = matplotlib.dates.date2num(list_of_datetimes)
matplotlib.pyplot.plot_date(dates, values)

Question 20

You can also plot the timestamp, value pairs using pyplot.plot (after parsing them from their string representation). (Tested with matplotlib versions 1.2.0 and 1.3.1.)

Example:

import datetime
import random
import matplotlib.pyplot as plt

# make up some data
x = [datetime.datetime.now() + datetime.timedelta(hours=i) for i in range(12)]
y = [i+random.gauss(0,1) for i,_ in enumerate(x)]

# plot
plt.plot(x,y)
# beautify the x-labels
plt.gcf().autofmt_xdate()

plt.show()

Resulting image:

Here’s the same as a scatter plot:

import datetime
import random
import matplotlib.pyplot as plt

# make up some data
x = [datetime.datetime.now() + datetime.timedelta(hours=i) for i in range(12)]
y = [i+random.gauss(0,1) for i,_ in enumerate(x)]

# plot
plt.scatter(x,y)
# beautify the x-labels
plt.gcf().autofmt_xdate()

plt.show()

Produces an image similar to this:

Question 21

7 years later and this code has helped me. However, my times still were not showing up correctly.

Using Matplotlib 2.0.0 and I had to add the following bit of code from Editing the date formatting of x-axis tick labels in matplotlib by Paul H.

import matplotlib.dates as mdates
myFmt = mdates.DateFormatter('%d')
ax.xaxis.set_major_formatter(myFmt)

I changed the format to (%H:%M) and the time displayed correctly.

All thanks to the community.

Question 22

I had trouble with this using matplotlib version: 2.0.2. Running the example from above I got a centered stacked set of bubbles.

I “fixed” the problem by adding another line:

plt.plot([],[])

The entire code snippet becomes:

import datetime
import random
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


# make up some data
x = [datetime.datetime.now() + datetime.timedelta(minutes=i) for i in range(12)]
y = [i+random.gauss(0,1) for i,_ in enumerate(x)]

# plot
plt.plot([],[])
plt.scatter(x,y)

# beautify the x-labels
plt.gcf().autofmt_xdate()
myFmt = mdates.DateFormatter('%H:%M')
plt.gca().xaxis.set_major_formatter(myFmt)

plt.show()
plt.close()

This produces an image with the bubbles distributed as desired.

Question 23

I want to make a scatterplot (using matplotlib) where the points are shaded according to a third variable. I’ve got very close with this:

plt.scatter(w, M, c=p, marker='s')

where w and M are the datapoints and p is the variable I want to shade with respect to.
However I want to do it in greyscale rather than colour. Can anyone help?

Question 24

There’s no need to manually set the colors. Instead, specify a grayscale colormap…

import numpy as np
import matplotlib.pyplot as plt

# Generate data...
x = np.random.random(10)
y = np.random.random(10)

# Plot...
plt.scatter(x, y, c=y, s=500)
plt.gray()

plt.show()

Or, if you’d prefer a wider range of colormaps, you can also specify the cmap kwarg to scatter. To use the reversed version of any of these, just specify the “_r” version of any of them. E.g. gray_r instead of gray. There are several different grayscale colormaps pre-made (e.g. gray, gist_yarg, binary, etc).

import matplotlib.pyplot as plt
import numpy as np

# Generate data...
x = np.random.random(10)
y = np.random.random(10)

plt.scatter(x, y, c=y, s=500, cmap='gray')
plt.show()

Question 25

In matplotlib grey colors can be given as a string of a numerical value between 0-1.
For example c = '0.1'

Then you can convert your third variable in a value inside this range and to use it to color your points.
In the following example I used the y position of the point as the value that determines the color:

from matplotlib import pyplot as plt

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [125, 32, 54, 253, 67, 87, 233, 56, 67]

color = [str(item/255.) for item in y]

plt.scatter(x, y, s=500, c=color)

plt.show()

Question 26

Sometimes you may need to plot color precisely based on the x-value case. For example, you may have a dataframe with 3 types of variables and some data points. And you want to do following,

Plot points corresponding to Physical variable ‘A’ in RED.
Plot points corresponding to Physical variable ‘B’ in BLUE.
Plot points corresponding to Physical variable ‘C’ in GREEN.

In this case, you may have to write to short function to map the x-values to corresponding color names as a list and then pass on that list to the plt.scatter command.

x=['A','B','B','C','A','B']
y=[15,30,25,18,22,13]

# Function to map the colors as a list from the input list of x variables
def pltcolor(lst):
    cols=[]
    for l in lst:
        if l=='A':
            cols.append('red')
        elif l=='B':
            cols.append('blue')
        else:
            cols.append('green')
    return cols
# Create the colors list using the function above
cols=pltcolor(x)

plt.scatter(x=x,y=y,s=500,c=cols) #Pass on the list created by the function here
plt.grid(True)
plt.show()

Question 27

I want to plot data, then create a new figure and plot data2, and finally come back to the original plot and plot data3, kinda like this:

import numpy as np
import matplotlib as plt

x = arange(5)
y = np.exp(5)
plt.figure()
plt.plot(x, y)

z = np.sin(x)
plt.figure()
plt.plot(x, z)

w = np.cos(x)
plt.figure("""first figure""") # Here's the part I need
plt.plot(x, w)

FYI How do I tell matplotlib that I am done with a plot? does something similar, but not quite! It doesn’t let me get access to that original plot.

Question 28

If you find yourself doing things like this regularly it may be worth investigating the object-oriented interface to matplotlib. In your case:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = np.exp(x)
fig1, ax1 = plt.subplots()
ax1.plot(x, y)
ax1.set_title("Axis 1 title")
ax1.set_xlabel("X-label for axis 1")

z = np.sin(x)
fig2, (ax2, ax3) = plt.subplots(nrows=2, ncols=1) # two axes on figure
ax2.plot(x, z)
ax3.plot(x, -z)

w = np.cos(x)
ax1.plot(x, w) # can continue plotting on the first axis

It is a little more verbose but it’s much clearer and easier to keep track of, especially with several figures each with multiple subplots.

Question 29

When you call figure, simply number the plot.

x = arange(5)
y = np.exp(5)
plt.figure(0)
plt.plot(x, y)

z = np.sin(x)
plt.figure(1)
plt.plot(x, z)

w = np.cos(x)
plt.figure(0) # Here's the part I need
plt.plot(x, w)

Edit: Note that you can number the plots however you want (here, starting from 0) but if you don’t provide figure with a number at all when you create a new one, the automatic numbering will start at 1 (“Matlab Style” according to the docs).

Question 30

However, numbering starts at 1, so:

x = arange(5)
y = np.exp(5)
plt.figure(1)
plt.plot(x, y)

z = np.sin(x)
plt.figure(2)
plt.plot(x, z)

w = np.cos(x)
plt.figure(1) # Here's the part I need, but numbering starts at 1!
plt.plot(x, w)

Also, if you have multiple axes on a figure, such as subplots, use the axes(h) command where h is the handle of the desired axes object to focus on that axes.

(don’t have comment privileges yet, sorry for new answer!)

Question 31

One way I found after some struggling is creating a function which gets data_plot matrix, file name and order as parameter to create boxplots from the given data in the ordered figure (different orders = different figures) and save it under the given file_name.

def plotFigure(data_plot,file_name,order):
    fig = plt.figure(order, figsize=(9, 6))
    ax = fig.add_subplot(111)
    bp = ax.boxplot(data_plot)
    fig.savefig(file_name, bbox_inches='tight')
    plt.close()

Question 32

I have one figure which contains many subplots.

fig = plt.figure(num=None, figsize=(26, 12), dpi=80, facecolor='w', edgecolor='k')
fig.canvas.set_window_title('Window Title')

# Returns the Axes instance
ax = fig.add_subplot(311) 
ax2 = fig.add_subplot(312) 
ax3 = fig.add_subplot(313)

How do I add titles to the subplots?

fig.suptitle adds a title to all graphs and although ax.set_title() exists, the latter does not add any title to my subplots.

Thank you for your help.

Edit: Corrected typo about set_title(). Thanks Rutger Kassies

Question 33

ax.title.set_text('My Plot Title') seems to work too.

fig = plt.figure()
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
ax3 = fig.add_subplot(223)
ax4 = fig.add_subplot(224)
ax1.title.set_text('First Plot')
ax2.title.set_text('Second Plot')
ax3.title.set_text('Third Plot')
ax4.title.set_text('Fourth Plot')
plt.show()

Question 34

ax.set_title() should set the titles for separate subplots:

import matplotlib.pyplot as plt

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]

    fig = plt.figure()
    fig.suptitle("Title for whole figure", fontsize=16)
    ax = plt.subplot("211")
    ax.set_title("Title for first plot")
    ax.plot(data)

    ax = plt.subplot("212")
    ax.set_title("Title for second plot")
    ax.plot(data)

    plt.show()

Can you check if this code works for you? Maybe something overwrites them later?

Question 35

A shorthand answer assuming import matplotlib.pyplot as plt:

plt.gca().set_title('title')

as in:

plt.subplot(221)
plt.gca().set_title('title')
plt.subplot(222)
etc...

Then there is no need for superfluous variables.

Question 36

If you want to make it shorter, you could write :

import matplolib.pyplot as plt
for i in range(4):
    plt.subplot(2,2,i+1).set_title('Subplot n°{}' .format(i+1))
plt.show()

It makes it maybe less clear but you don’t need more lines or variables

Question 37

In case you have multiple images and you want to loop though them and show them 1 by 1 along with titles – this is what you can do. No need to explicitly define ax1, ax2, etc.

The catch is you can define dynamic axes(ax) as in Line 1 of code and you can set its title inside a loop.
The rows of 2D array is length (len) of axis(ax)
Each row has 2 items i.e. It is list within a list (Point No.2)
set_title can be used to set title, once the proper axes(ax) or subplot is selected.

import matplotlib.pyplot as plt    
fig, ax = plt.subplots(2, 2, figsize=(6, 8))  
for i in range(len(ax)): 
    for j in range(len(ax[i])):
        ## ax[i,j].imshow(test_images_gr[0].reshape(28,28))
        ax[i,j].set_title('Title-' + str(i) + str(j))

Question 38

fig, (ax1, ax2, ax3, ax4) = plt.subplots(nrows=1, ncols=4,figsize=(11, 7))

grid = plt.GridSpec(2, 2, wspace=0.2, hspace=0.5)

ax1 = plt.subplot(grid[0, 0])
ax2 = plt.subplot(grid[0, 1:])
ax3 = plt.subplot(grid[1, :1])
ax4 = plt.subplot(grid[1, 1:])

ax1.title.set_text('First Plot')
ax2.title.set_text('Second Plot')
ax3.title.set_text('Third Plot')
ax4.title.set_text('Fourth Plot')

plt.show()

Question 39

A solution I tend to use more and more is this one:

import matplotlib.pyplot as plt

fig, axs = plt.subplots(2, 2)  # 1
for i, ax in enumerate(axs.ravel()): # 2
    ax.set_title("Plot #{}".format(i)) # 3

Create your arbitrary number of axes
axs.ravel() converts your 2-dim object to a 1-dim vector in row-major style
assigns the title to the current axis-object

Question 40

I need to add two subplots to a figure. One subplot needs to be about three times as wide as the second (same height). I accomplished this using GridSpec and the colspan argument but I would like to do this using figure so I can save to PDF. I can adjust the first figure using the figsize argument in the constructor, but how do I change the size of the second plot?

Question 41

Another way is to use the subplots function and pass the width ratio with gridspec_kw:

import numpy as np
import matplotlib.pyplot as plt 

# generate some data
x = np.arange(0, 10, 0.2)
y = np.sin(x)

# plot it
f, (a0, a1) = plt.subplots(1, 2, gridspec_kw={'width_ratios': [3, 1]})
a0.plot(x, y)
a1.plot(y, x)

f.tight_layout()
f.savefig('grid_figure.pdf')

Question 42

You can use gridspec and figure:

import numpy as np
import matplotlib.pyplot as plt 
from matplotlib import gridspec

# generate some data
x = np.arange(0, 10, 0.2)
y = np.sin(x)

# plot it
fig = plt.figure(figsize=(8, 6)) 
gs = gridspec.GridSpec(1, 2, width_ratios=[3, 1]) 
ax0 = plt.subplot(gs[0])
ax0.plot(x, y)
ax1 = plt.subplot(gs[1])
ax1.plot(y, x)

plt.tight_layout()
plt.savefig('grid_figure.pdf')

Question 43

Probably the simplest way is using subplot2grid, described in Customizing Location of Subplot Using GridSpec.

ax = plt.subplot2grid((2, 2), (0, 0))

is equal to

import matplotlib.gridspec as gridspec
gs = gridspec.GridSpec(2, 2)
ax = plt.subplot(gs[0, 0])

so bmu’s example becomes:

import numpy as np
import matplotlib.pyplot as plt

# generate some data
x = np.arange(0, 10, 0.2)
y = np.sin(x)

# plot it
fig = plt.figure(figsize=(8, 6))
ax0 = plt.subplot2grid((1, 3), (0, 0), colspan=2)
ax0.plot(x, y)
ax1 = plt.subplot2grid((1, 3), (0, 2))
ax1.plot(y, x)

plt.tight_layout()
plt.savefig('grid_figure.pdf')

Question 44

I used pyplot‘s axes object to manually adjust the sizes without using GridSpec:

import matplotlib.pyplot as plt
import numpy as np
x = np.arange(0, 10, 0.2)
y = np.sin(x)

# definitions for the axes
left, width = 0.07, 0.65
bottom, height = 0.1, .8
bottom_h = left_h = left+width+0.02

rect_cones = [left, bottom, width, height]
rect_box = [left_h, bottom, 0.17, height]

fig = plt.figure()

cones = plt.axes(rect_cones)
box = plt.axes(rect_box)

cones.plot(x, y)

box.plot(y, x)

plt.show()

Question 45

I created a histogram plot using data from a file and no problem. Now I wanted to superpose data from another file in the same histogram, so I do something like this

n,bins,patchs = ax.hist(mydata1,100)
n,bins,patchs = ax.hist(mydata2,100)

but the problem is that for each interval, only the bar with the highest value appears, and the other is hidden. I wonder how could I plot both histograms at the same time with different colors.

Question 46

Here you have a working example:

import random
import numpy
from matplotlib import pyplot

x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]

bins = numpy.linspace(-10, 10, 100)

pyplot.hist(x, bins, alpha=0.5, label='x')
pyplot.hist(y, bins, alpha=0.5, label='y')
pyplot.legend(loc='upper right')
pyplot.show()

Question 47

The accepted answers gives the code for a histogram with overlapping bars, but in case you want each bar to be side-by-side (as I did), try the variation below:

import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-deep')

x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
bins = np.linspace(-10, 10, 30)

plt.hist([x, y], bins, label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()

Reference: http://matplotlib.org/examples/statistics/histogram_demo_multihist.html

EDIT [2018/03/16]: Updated to allow plotting of arrays of different sizes, as suggested by @stochastic_zeitgeist

Question 48

In the case you have different sample sizes, it may be difficult to compare the distributions with a single y-axis. For example:

import numpy as np
import matplotlib.pyplot as plt

#makes the data
y1 = np.random.normal(-2, 2, 1000)
y2 = np.random.normal(2, 2, 5000)
colors = ['b','g']

#plots the histogram
fig, ax1 = plt.subplots()
ax1.hist([y1,y2],color=colors)
ax1.set_xlim(-10,10)
ax1.set_ylabel("Count")
plt.tight_layout()
plt.show()

In this case, you can plot your two data sets on different axes. To do so, you can get your histogram data using matplotlib, clear the axis, and then re-plot it on two separate axes (shifting the bin edges so that they don’t overlap):

#sets up the axis and gets histogram data
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.hist([y1, y2], color=colors)
n, bins, patches = ax1.hist([y1,y2])
ax1.cla() #clear the axis

#plots the histogram data
width = (bins[1] - bins[0]) * 0.4
bins_shifted = bins + width
ax1.bar(bins[:-1], n[0], width, align='edge', color=colors[0])
ax2.bar(bins_shifted[:-1], n[1], width, align='edge', color=colors[1])

#finishes the plot
ax1.set_ylabel("Count", color=colors[0])
ax2.set_ylabel("Count", color=colors[1])
ax1.tick_params('y', colors=colors[0])
ax2.tick_params('y', colors=colors[1])
plt.tight_layout()
plt.show()

Question 49

As a completion to Gustavo Bezerra’s answer:

If you want each histogram to be normalized (normed for mpl<=2.1 and density for mpl>=3.1) you cannot just use normed/density=True, you need to set the weights for each value instead:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.normal(1, 2, 5000)
y = np.random.normal(-1, 3, 2000)
x_w = np.empty(x.shape)
x_w.fill(1/x.shape[0])
y_w = np.empty(y.shape)
y_w.fill(1/y.shape[0])
bins = np.linspace(-10, 10, 30)

plt.hist([x, y], bins, weights=[x_w, y_w], label=['x', 'y'])
plt.legend(loc='upper right')
plt.show()

As a comparison, the exact same x and y vectors with default weights and density=True:

Question 50

You should use bins from the values returned by hist:

import numpy as np
import matplotlib.pyplot as plt

foo = np.random.normal(loc=1, size=100) # a normal distribution
bar = np.random.normal(loc=-1, size=10000) # a normal distribution

_, bins, _ = plt.hist(foo, bins=50, range=[-6, 6], normed=True)
_ = plt.hist(bar, bins=bins, alpha=0.5, normed=True)

Question 51

Here is a simple method to plot two histograms, with their bars side-by-side, on the same plot when the data has different sizes:

def plotHistogram(p, o):
    """
    p and o are iterables with the values you want to 
    plot the histogram of
    """
    plt.hist([p, o], color=['g','r'], alpha=0.8, bins=50)
    plt.show()

Question 52

It sounds like you might want just a bar graph:

Alternatively, you can use subplots.

Question 53

Just in case you have pandas (import pandas as pd) or are ok with using it:

test = pd.DataFrame([[random.gauss(3,1) for _ in range(400)], 
                     [random.gauss(4,2) for _ in range(400)]])
plt.hist(test.values.T)
plt.show()

Question 54

There is one caveat when you want to plot the histogram from a 2-d numpy array. You need to swap the 2 axes.

import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(size=(2, 300))
# swapped_data.shape == (300, 2)
swapped_data = np.swapaxes(x, axis1=0, axis2=1)
plt.hist(swapped_data, bins=30, label=['x', 'y'])
plt.legend()
plt.show()

Question 55

This question has been answered before, but wanted to add another quick/easy workaround that might help other visitors to this question.

import seasborn as sns 
sns.kdeplot(mydata1)
sns.kdeplot(mydata2)

Some helpful examples are here for kde vs histogram comparison.

Question 56

Inspired by Solomon’s answer, but to stick with the question, which is related to histogram, a clean solution is:

sns.distplot(bar)
sns.distplot(foo)
plt.show()

Make sure to plot the taller one first, otherwise you would need to set plt.ylim(0,0.45) so that the taller histogram is not chopped off.

Question 57

Also an option which is quite similar to joaquin answer:

import random
from matplotlib import pyplot

#random data
x = [random.gauss(3,1) for _ in range(400)]
y = [random.gauss(4,2) for _ in range(400)]

#plot both histograms(range from -10 to 10), bins set to 100
pyplot.hist([x,y], bins= 100, range=[-10,10], alpha=0.5, label=['x', 'y'])
#plot legend
pyplot.legend(loc='upper right')
#show it
pyplot.show()

Gives the following output:

Question 58

I’m learning to use matplotlib by studying examples, and a lot of examples seem to include a line like the following before creating a single plot…

fig, ax = plt.subplots()

Here are some examples…

I see this function used a lot, even though the example is only attempting to create a single chart. Is there some other advantage? The official demo for subplots() also uses f, ax = subplots when creating a single chart, and it only ever references ax after that. This is the code they use.

# Just a figure and one subplot
f, ax = plt.subplots()
ax.plot(x, y)
ax.set_title('Simple plot')

Question 59

plt.subplots() is a function that returns a tuple containing a figure and axes object(s). Thus when using fig, ax = plt.subplots() you unpack this tuple into the variables fig and ax. Having fig is useful if you want to change figure-level attributes or save the figure as an image file later (e.g. with fig.savefig('yourfilename.png')). You certainly don’t have to use the returned figure object but many people do use it later so it’s common to see. Also, all axes objects (the objects that have plotting methods), have a parent figure object anyway, thus:

fig, ax = plt.subplots()

is more concise than this:

fig = plt.figure()
ax = fig.add_subplot(111)

Question 60

Just a supplement here.

The following question is that what if I want more subplots in the figure?

As mentioned in the Doc, we can use fig = plt.subplots(nrows=2, ncols=2) to set a group of subplots with grid(2,2) in one figure object.

Then as we know, the fig, ax = plt.subplots() returns a tuple, let’s try fig, ax1, ax2, ax3, ax4 = plt.subplots(nrows=2, ncols=2) firstly.

ValueError: not enough values to unpack (expected 4, got 2)

It raises a error, but no worry, because we now see that plt.subplots() actually returns a tuple with two elements. The 1st one must be a figure object, and the other one should be a group of subplots objects.

So let’s try this again:

fig, [[ax1, ax2], [ax3, ax4]] = plt.subplots(nrows=2, ncols=2)

and check the type:

type(fig) #<class 'matplotlib.figure.Figure'>
type(ax1) #<class 'matplotlib.axes._subplots.AxesSubplot'>

Of course, if you use parameters as (nrows=1, ncols=4), then the format should be:

fig, [ax1, ax2, ax3, ax4] = plt.subplots(nrows=1, ncols=4)

So just remember to keep the construction of the list as the same as the subplots grid we set in the figure.

Hope this would be helpful for you.

Question 61

As a supplement to the question and above answers there is also an important difference between plt.subplots() and plt.subplot(), notice the missing 's' at the end.

One can use plt.subplots() to make all their subplots at once and it returns the figure and axes (plural of axis) of the subplots as a tuple. A figure can be understood as a canvas where you paint your sketch.

# create a subplot with 2 rows and 1 columns
fig, ax = plt.subplots(2,1)

Whereas, you can use plt.subplot() if you want to add the subplots separately. It returns only the axis of one subplot.

fig = plt.figure() # create the canvas for plotting
ax1 = plt.subplot(2,1,1) 
# (2,1,1) indicates total number of rows, columns, and figure number respectively
ax2 = plt.subplot(2,1,2)

However, plt.subplots() is preferred because it gives you easier options to directly customize your whole figure

# for example, sharing x-axis, y-axis for all subplots can be specified at once
fig, ax = plt.subplots(2,2, sharex=True, sharey=True)

whereas, with plt.subplot(), one will have to specify individually for each axis which can become cumbersome.

Question 62

In addition to the answers above, you can check the type of object using type(plt.subplots()) which returns a tuple, on the other hand, type(plt.subplot()) returns matplotlib.axes._subplots.AxesSubplot which you can’t unpack.

问题：如何从具有透明背景的matplotlib中导出图？

回答 0

回答 1

问题：如何在python中使用networkx绘制有向图？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

问题：如何告诉matplotlib我已经完成了情节？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

问题：使用Matplotlib在Python中绘制时间

回答 0

回答 1

回答 2

回答 3

问题：Matplotlib散点图; 颜色作为第三个变量的函数

回答 0

回答 1

回答 2

问题：我如何告诉Matplotlib创建第二个（新的）图，然后在旧的图上进行更新？

回答 0

回答 1

回答 2

回答 3

问题：如何在Matplotlib中的子图中添加标题？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

问题：Matplotlib不同大小的子图

回答 0

回答 1

回答 2

回答 3

问题：使用matplotlib在单个图表上绘制两个直方图

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

问题：为什么很多示例在Matplotlib / pyplot / python中使用`fig，ax = plt.subplots（）`

回答 0

回答 1

回答 2

回答 3

有趣好用的Python教程