Tag archive: matplotlib

How to display the value of the bar on each bar with pyplot.barh()?

Question: How to display the value of the bar on each bar with pyplot.barh()?

I generated a bar plot, how can I display the value of the bar on each bar?

Current plot:

enter image description here

What I am trying to get:

enter image description here

My code:

import os
import numpy as np
import matplotlib.pyplot as plt

x = [u'INFO', u'CUISINE', u'TYPE_OF_PLACE', u'DRINK', u'PLACE', u'MEAL_TIME', u'DISH', u'NEIGHBOURHOOD']
y = [160, 167, 137, 18, 120, 36, 155, 130]

fig, ax = plt.subplots()    
width = 0.75 # the width of the bars 
ind = np.arange(len(y))  # the x locations for the groups
ax.barh(ind, y, width, color="blue")
ax.set_yticks(ind+width/2)
ax.set_yticklabels(x, minor=False)
plt.title('title')
plt.xlabel('x')
plt.ylabel('y')      
#plt.show()
plt.savefig(os.path.join('test.png'), dpi=300, format='png', bbox_inches='tight') # use format='svg' or 'pdf' for vectorial pictures

Answer 0

Add:

for i, v in enumerate(y):
    ax.text(v + 3, i + .25, str(v), color='blue', fontweight='bold')

result:

enter image description here

Each value v in y serves as both the x-location and the label string passed to ax.text, and conveniently the bars are spaced one unit apart, so the enumeration index i is the y-location.
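
With current Matplotlib defaults, barh centers each bar on its y position, so the label can be centered vertically instead of relying on the 0.25 offset. A minimal sketch under that assumption (the Agg backend line, the 3-unit padding and the name value_labels are illustrative choices, not part of the original answer):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

x = ['INFO', 'CUISINE', 'TYPE_OF_PLACE', 'DRINK', 'PLACE', 'MEAL_TIME', 'DISH', 'NEIGHBOURHOOD']
y = [160, 167, 137, 18, 120, 36, 155, 130]
positions = list(range(len(y)))

fig, ax = plt.subplots()
ax.barh(positions, y, height=0.75, color="blue")
ax.set_yticks(positions)
ax.set_yticklabels(x)

# v is the bar length (x position of the bar end); i is the bar's y position.
value_labels = [
    ax.text(v + 3, i, str(v), va='center', color='blue', fontweight='bold')
    for i, v in enumerate(y)
]
```

The va='center' argument aligns the text with the middle of each bar, whatever the bar height is.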


Answer 1

I have noticed api example code contains an example of barchart with the value of the bar displayed on each bar:

"""
========
Barchart
========

A bar plot with errorbars and height labels on individual bars
"""
import numpy as np
import matplotlib.pyplot as plt

N = 5
men_means = (20, 35, 30, 35, 27)
men_std = (2, 3, 4, 1, 2)

ind = np.arange(N)  # the x locations for the groups
width = 0.35       # the width of the bars

fig, ax = plt.subplots()
rects1 = ax.bar(ind, men_means, width, color='r', yerr=men_std)

women_means = (25, 32, 34, 20, 25)
women_std = (3, 5, 2, 3, 3)
rects2 = ax.bar(ind + width, women_means, width, color='y', yerr=women_std)

# add some text for labels, title and axes ticks
ax.set_ylabel('Scores')
ax.set_title('Scores by group and gender')
ax.set_xticks(ind + width / 2)
ax.set_xticklabels(('G1', 'G2', 'G3', 'G4', 'G5'))

ax.legend((rects1[0], rects2[0]), ('Men', 'Women'))


def autolabel(rects):
    """
    Attach a text label above each bar displaying its height
    """
    for rect in rects:
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width()/2., 1.05*height,
                '%d' % int(height),
                ha='center', va='bottom')

autolabel(rects1)
autolabel(rects2)

plt.show()

output:

enter image description here

FYI, see also: "What is the unit of height variable in 'barh' of matplotlib?" (as of now, there is no easy way to set a fixed height for each bar)
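
Newer Matplotlib releases (3.4 and later, as far as I know) provide Axes.bar_label, which does by itself what autolabel does by hand. A sketch assuming such a version is installed, reusing the sample data from the example above (the error bars are left out for brevity):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

men_means = (20, 35, 30, 35, 27)
women_means = (25, 32, 34, 20, 25)
ind = np.arange(len(men_means))  # the x locations for the groups
width = 0.35                     # the width of the bars

fig, ax = plt.subplots()
rects1 = ax.bar(ind, men_means, width, color='r', label='Men')
rects2 = ax.bar(ind + width, women_means, width, color='y', label='Women')
ax.set_xticks(ind + width / 2)
ax.set_xticklabels(('G1', 'G2', 'G3', 'G4', 'G5'))
ax.legend()

# bar_label takes the container returned by bar()/barh() and
# returns the list of text objects it created.
men_labels = ax.bar_label(rects1, padding=3)
women_labels = ax.bar_label(rects2, padding=3)
```

The same call works on the container returned by barh, in which case the labels are placed at the right end of each bar.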


Answer 2

For anyone wanting to have their label at the base of their bars just divide v by the value of the label like this:

for i, v in enumerate(labels):
    axes.text(i-.25, 
              v/labels[i]+100, 
              labels[i], 
              fontsize=18, 
              color=label_color_list[i])

(note: I added 100 so it wasn’t absolutely at the bottom)

To get a result like this: enter image description here


Answer 3

I know it’s an old thread, but I landed here several times via Google and think no given answer is really satisfying yet. Try using one of the following functions:

EDIT: As I’m getting some likes on this old thread, I wanna share an updated solution as well (basically putting my two previous functions together and automatically deciding whether it’s a bar or hbar plot):

def label_bars(ax, bars, text_format, **kwargs):
    """
    Attaches a label on every bar of a regular or horizontal bar chart
    """
    ys = [bar.get_y() for bar in bars]
    y_is_constant = all(y == ys[0] for y in ys)  # -> regular bar chart, since all bars start on the same y level (0)

    if y_is_constant:
        _label_bar(ax, bars, text_format, **kwargs)
    else:
        _label_barh(ax, bars, text_format, **kwargs)


def _label_bar(ax, bars, text_format, **kwargs):
    """
    Attach a text label to each bar displaying its y value
    """
    max_y_value = ax.get_ylim()[1]
    inside_distance = max_y_value * 0.05
    outside_distance = max_y_value * 0.01

    for bar in bars:
        text = text_format.format(bar.get_height())
        text_x = bar.get_x() + bar.get_width() / 2

        is_inside = bar.get_height() >= max_y_value * 0.15
        if is_inside:
            color = "white"
            text_y = bar.get_height() - inside_distance
        else:
            color = "black"
            text_y = bar.get_height() + outside_distance

        ax.text(text_x, text_y, text, ha='center', va='bottom', color=color, **kwargs)


def _label_barh(ax, bars, text_format, **kwargs):
    """
    Attach a text label to each bar displaying its x value (the bar width)
    Note: label always outside. otherwise it's too hard to control as numbers can be very long
    """
    max_x_value = ax.get_xlim()[1]
    distance = max_x_value * 0.0025

    for bar in bars:
        text = text_format.format(bar.get_width())

        text_x = bar.get_width() + distance
        text_y = bar.get_y() + bar.get_height() / 2

        ax.text(text_x, text_y, text, va='center', **kwargs)

Now you can use them for regular bar plots:

fig, ax = plt.subplots(figsize=(5, 5))
bars = ax.bar(x_pos, values, width=0.5, align="center")
value_format = "{:.1%}"  # displaying values as percentage with one fractional digit
label_bars(ax, bars, value_format)

or for horizontal bar plots:

fig, ax = plt.subplots(figsize=(5, 5))
horizontal_bars = ax.barh(y_pos, values, height=0.5, align="center")  # barh takes the bar thickness as height
value_format = "{:.1%}"  # displaying values as percentage with one fractional digit
label_bars(ax, horizontal_bars, value_format)
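
The text_format argument is an ordinary str.format template that is applied to each bar's value. A small sketch of what a few templates produce; only "{:.1%}" comes from the answer, the other two templates and the sample numbers are hypothetical:

```python
percent_format = "{:.1%}"      # fraction -> percentage with one decimal
thousands_format = "{:,.0f}"   # hypothetical alternative: thousands separators
plain_format = "{:.2f}"        # hypothetical alternative: two decimals

formatted = [
    percent_format.format(0.256),
    thousands_format.format(1234567.0),
    plain_format.format(3.14159),
]
print(formatted)  # ['25.6%', '1,234,567', '3.14']
```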

Answer 4

Use plt.text() to put text in the plot.

Example:

import numpy as np  # needed for np.arange below
import matplotlib.pyplot as plt
N = 5
menMeans = (20, 35, 30, 35, 27)
ind = np.arange(N)

#Creating a figure with some fig size
fig, ax = plt.subplots(figsize = (10,5))
ax.bar(ind,menMeans,width=0.4)
#Now the trick is here.
#plt.text() , you need to give (x,y) location , where you want to put the numbers,
#So here index will give you x pos and data+1 will provide a little gap in y axis.
for index,data in enumerate(menMeans):
    plt.text(x=index , y =data+1 , s=f"{data}" , fontdict=dict(fontsize=20))
plt.tight_layout()
plt.show()

This will show the figure as:

bar chart with values at the top


Answer 5

For pandas people:

ax = s.plot(kind='barh') # s is a Series (float) in [0,1]
[ax.text(v, i, '{:.2f}%'.format(100*v)) for i, v in enumerate(s)];

That’s it. Alternatively, for those who prefer apply over looping with enumerate:

it = iter(range(len(s)))
s.apply(lambda x: ax.text(x, next(it),'{:.2f}%'.format(100*x)));

Also, ax.patches will give you the bars that you would get with ax.bar(...). In case you want to apply the functions of @SaturnFromTitan or techniques of others.
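
As a rough illustration of the ax.patches remark, the rectangles can be read back from the axes and labelled without keeping the return value of the plotting call. This sketch uses plain Matplotlib rather than pandas, and the shares list is hypothetical data:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

shares = [0.12, 0.5, 0.38]  # hypothetical fractions in [0, 1]

fig, ax = plt.subplots()
ax.barh([0, 1, 2], shares)

labels = []
for patch in ax.patches:  # one Rectangle per bar, in plotting order
    x_end = patch.get_width()                        # bar length
    y_mid = patch.get_y() + patch.get_height() / 2   # vertical middle of the bar
    labels.append(ax.text(x_end, y_mid, '{:.2f}%'.format(100 * x_end), va='center'))
```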


Answer 6

I needed the bar labels too. Note that my y-axis shows a zoomed view, set with limits on the y-axis. The default calculation for putting the labels on top of the bars still works using the height (use_global_coordinate=False in the example). But I wanted to show that the labels can also be put at the bottom of the graph in a zoomed view, using global (axes) coordinates via transform=ax.transAxes in matplotlib 3.0.2. Hope it helps someone.

def autolabel(rects, data):
    """
    Attach a text label to each bar: at the bottom of the axes
    (axes coordinates) or above the bar (data coordinates)
    """
    c = 0
    initial = 0.091  # x position of the first label, in axes coordinates
    offset = 0.205   # horizontal spacing between labels, in axes coordinates
    use_global_coordinate = True

    if use_global_coordinate:
        for i in data:
            ax.text(initial+offset*c, 0.05, str(i), horizontalalignment='center',
                    verticalalignment='center', transform=ax.transAxes, fontsize=8)
            c = c+1
    else:
        for rect, i in zip(rects, data):
            height = rect.get_height()
            ax.text(rect.get_x() + rect.get_width()/2., height, str(i), ha='center', va='bottom')

Example output
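
The hard-coded initial and offset values depend on the number of bars and on the axes margins. One possible alternative, sketched here with hypothetical data, is the blended transform returned by ax.get_xaxis_transform(), in which x is given in data coordinates and y in axes coordinates (0 is the bottom, 1 the top):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

data = [1012, 1035, 1021, 1048]  # hypothetical values

fig, ax = plt.subplots()
rects = ax.bar([0, 1, 2, 3], data)
ax.set_ylim(1000, 1050)  # zoomed view: the bar bases are cut off

bottom_labels = []
for rect, value in zip(rects, data):
    x_center = rect.get_x() + rect.get_width() / 2  # data coordinates
    bottom_labels.append(
        ax.text(x_center, 0.05, str(value),          # y = 5% above the axes bottom
                transform=ax.get_xaxis_transform(),
                ha='center', va='center', fontsize=8))
```

Because x follows the bars and y follows the axes, the labels stay at the bottom of the visible area even if the y limits change.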


Answer 7

I was trying to do this with stacked bar plots. The code that worked for me was:

# Code to plot. Notice the variable ax.
ax = df.groupby('target').count().T.plot.bar(stacked=True, figsize=(10, 6))
ax.legend(bbox_to_anchor=(1.1, 1.05))

# Loop to add on each bar a tag in position
for rect in ax.patches:
    height = rect.get_height()
    ypos = rect.get_y() + height/2
    ax.text(rect.get_x() + rect.get_width()/2., ypos,
            '%d' % int(height), ha='center', va='bottom')
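
The snippet above depends on a DataFrame that is not shown. A self-contained sketch of the same idea with plain Matplotlib, where the two count arrays are hypothetical data:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

groups = ['a', 'b', 'c']
lower = np.array([3, 5, 2])   # hypothetical counts for the first class
upper = np.array([4, 1, 6])   # hypothetical counts for the second class

fig, ax = plt.subplots()
ax.bar(groups, lower, label='class 0')
ax.bar(groups, upper, bottom=lower, label='class 1')  # stacked on top of the first
ax.legend()

segment_labels = []
for rect in ax.patches:
    height = rect.get_height()
    ypos = rect.get_y() + height / 2  # middle of this segment
    segment_labels.append(
        ax.text(rect.get_x() + rect.get_width() / 2, ypos,
                '%d' % int(height), ha='center', va='center'))
```

For stacked bars, rect.get_y() is the bottom of each segment, so adding half the height places the label inside the segment it describes.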

Answer 8

Check this link: Matplotlib Gallery. This is how I used the autolabel code snippet:

def autolabel(rects):
    """Attach a text label above each bar in *rects*, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')

temp = df_launch.groupby(['yr_mt','year','month'])['subs_trend'].agg(subs_count='sum').sort_values(['year','month']).reset_index()
_, ax = plt.subplots(1,1, figsize=(30,10))
bar = ax.bar(height=temp['subs_count'],x=temp['yr_mt'] ,color ='g')
autolabel(bar)

ax.set_title('Monthly Change in Subscribers from Launch Date')
ax.set_ylabel('Subscriber Count Change')
ax.set_xlabel('Time')
plt.show()

Remove or adapt border of frame of legend using matplotlib

Question: Remove or adapt border of frame of legend using matplotlib

When plotting a plot using matplotlib:

  1. How to remove the box of the legend?
  2. How to change the color of the border of the legend box?
  3. How to remove only the border of the box of the legend?

Answer 0

When plotting a plot using matplotlib:

How to remove the box of the legend?

plt.legend(frameon=False)

How to change the color of the border of the legend box?

leg = plt.legend()
leg.get_frame().set_edgecolor('b')

How to remove only the border of the box of the legend?

leg = plt.legend()
leg.get_frame().set_linewidth(0.0)
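
The three variants can be checked side by side. This is only a sketch: the helper make_axes and the variable names are made up for illustration, and the Axes.legend method is used instead of plt.legend so that each legend belongs to its own axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt


def make_axes():
    """Hypothetical helper: a fresh axes with one labelled line."""
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1], label='line')
    return ax


# 1. No box at all (neither background nor border).
no_box = make_axes().legend(frameon=False)

# 2. Box kept, border drawn in blue.
blue_edge = make_axes().legend()
blue_edge.get_frame().set_edgecolor('b')

# 3. Box background kept, border line removed.
no_border = make_axes().legend()
no_border.get_frame().set_linewidth(0.0)
```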

Answer 1

One more related question, since it took me forever to find the answer:

How to make the legend background blank (i.e. transparent, not white):

legend = plt.legend()
legend.get_frame().set_facecolor('none')

Warning, you want 'none' (the string). None means the default color instead.
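
The difference is visible in how Matplotlib resolves the color: the string 'none' maps to a fully transparent RGBA value. A short sketch (the variable names transparent and face are illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1], label='line')
legend = ax.legend()
legend.get_frame().set_facecolor('none')

transparent = to_rgba('none')            # RGBA with alpha 0
face = legend.get_frame().get_facecolor()
print(transparent, face)
```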


Surface plots in matplotlib

Question: Surface plots in matplotlib

I have a list of 3-tuples representing a set of points in 3D space. I want to plot a surface that covers all these points.

The plot_surface function in the mplot3d package requires as arguments X,Y and Z to be 2d arrays. Is plot_surface the right function to plot surface and how do I transform my data into the required format?

data = [(x1,y1,z1),(x2,y2,z2),.....,(xn,yn,zn)]

Answer 0

For surfaces it’s a bit different than a list of 3-tuples, you should pass in a grid for the domain in 2d arrays.

If all you have is a list of 3d points, rather than some function f(x, y) -> z, then you will have a problem because there are multiple ways to triangulate that 3d point cloud into a surface.

Here’s a smooth surface example:

import numpy as np
from mpl_toolkits.mplot3d import Axes3D  
# Axes3D import has side effects, it enables using projection='3d' in add_subplot
import matplotlib.pyplot as plt
import random

def fun(x, y):
    return x**2 + y

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x = y = np.arange(-3.0, 3.0, 0.05)
X, Y = np.meshgrid(x, y)
zs = np.array(fun(np.ravel(X), np.ravel(Y)))
Z = zs.reshape(X.shape)

ax.plot_surface(X, Y, Z)

ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')

plt.show()

3d


Answer 1

You can read the data directly from a file and plot it:

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
from matplotlib import cm
import numpy as np
from sys import argv

x,y,z = np.loadtxt('your_file', unpack=True)

fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # Axes3D(fig) is no longer auto-added to the figure in recent Matplotlib
surf = ax.plot_trisurf(x, y, z, cmap=cm.jet, linewidth=0.1)
fig.colorbar(surf, shrink=0.5, aspect=5)
plt.savefig('teste.pdf')
plt.show()

If necessary you can pass vmin and vmax to define the colorbar range, e.g.

surf = ax.plot_trisurf(x, y, z, cmap=cm.jet, linewidth=0.1, vmin=0, vmax=2000)

surface

Bonus Section

I was wondering how to do some interactive plots, in this case with artificial data

from __future__ import print_function
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets
from IPython.display import Image

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits import mplot3d

def f(x, y):
    return np.sin(np.sqrt(x ** 2 + y ** 2))

def plot(i):

    fig = plt.figure()
    ax = plt.axes(projection='3d')

    theta = 2 * np.pi * np.random.random(1000)
    r = i * np.random.random(1000)
    x = np.ravel(r * np.sin(theta))
    y = np.ravel(r * np.cos(theta))
    z = f(x, y)

    ax.plot_trisurf(x, y, z, cmap='viridis', edgecolor='none')
    fig.tight_layout()

interactive_plot = interactive(plot, i=(2, 10))
interactive_plot

Answer 2

I just came across this same problem. I have evenly spaced data that is in three 1-D arrays instead of the 2-D arrays that matplotlib's plot_surface wants. My data happened to be in a pandas.DataFrame, so here is the matplotlib plot_surface example with the modifications to plot three 1-D arrays.

from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import matplotlib.pyplot as plt
import numpy as np

X = np.arange(-5, 5, 0.25)
Y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)

fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') no longer works in recent Matplotlib
surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.coolwarm,
    linewidth=0, antialiased=False)
ax.set_zlim(-1.01, 1.01)

ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))

fig.colorbar(surf, shrink=0.5, aspect=5)
plt.title('Original Code')

That is the original example. Adding this next bit creates the same plot from three 1-D arrays.

# ~~~~ MODIFICATION TO EXAMPLE BEGINS HERE ~~~~ #
import pandas as pd
from scipy.interpolate import griddata
# create 1D-arrays from the 2D-arrays
x = X.reshape(1600)
y = Y.reshape(1600)
z = Z.reshape(1600)
xyz = {'x': x, 'y': y, 'z': z}

# put the data into a pandas DataFrame (this is what my data looks like)
df = pd.DataFrame(xyz, index=range(len(xyz['x']))) 

# re-create the 2D-arrays
x1 = np.linspace(df['x'].min(), df['x'].max(), len(df['x'].unique()))
y1 = np.linspace(df['y'].min(), df['y'].max(), len(df['y'].unique()))
x2, y2 = np.meshgrid(x1, y1)
z2 = griddata((df['x'], df['y']), df['z'], (x2, y2), method='cubic')

fig = plt.figure()
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') no longer works in recent Matplotlib
surf = ax.plot_surface(x2, y2, z2, rstride=1, cstride=1, cmap=cm.coolwarm,
    linewidth=0, antialiased=False)
ax.set_zlim(-1.01, 1.01)

ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))

fig.colorbar(surf, shrink=0.5, aspect=5)
plt.title('Meshgrid Created from 3 1D Arrays')
# ~~~~ MODIFICATION TO EXAMPLE ENDS HERE ~~~~ #

plt.show()

Here are the resulting figures:

enter image description here enter image description here
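
When the three 1-D arrays cover a complete regular grid, as they do in this example, interpolation is not strictly necessary: the 2-D arrays can be rebuilt by indexing. A sketch under that assumption (grid points missing from the data would simply stay NaN, and the shuffling step only simulates rows arriving in arbitrary order):

```python
import numpy as np

# Reference grid, as in the example above.
x1 = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(x1, x1)
Z = np.sin(np.sqrt(X**2 + Y**2))

# Flatten to three 1-D arrays and shuffle the rows.
rng = np.random.default_rng(0)
perm = rng.permutation(X.size)
x, y, z = X.ravel()[perm], Y.ravel()[perm], Z.ravel()[perm]

# Rebuild the grid: each point's column comes from its x value, its row from its y value.
xu, xi = np.unique(x, return_inverse=True)
yu, yi = np.unique(y, return_inverse=True)
Z2 = np.full((yu.size, xu.size), np.nan)
Z2[yi, xi] = z
X2, Y2 = np.meshgrid(xu, yu)
```

X2, Y2 and Z2 can then be passed to plot_surface in place of the interpolated arrays.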


Answer 3

Just to chime in: Emanuel had the answer that I (and probably many others) was looking for. If you have 3D scattered data in three separate arrays, pandas is an incredible help and works much better than the other options. To elaborate, suppose your x, y, z are some arbitrary variables. In my case these were c, gamma, and errors, because I was testing a support vector machine. There are many potential choices for plotting the data:

  • scatter3D(cParams, gammas, avg_errors_array) – this works but is overly simplistic
  • plot_wireframe(cParams, gammas, avg_errors_array) – this works, but will look ugly if your data isn’t sorted nicely, as is potentially the case with massive chunks of real scientific data
  • ax.plot3D(cParams, gammas, avg_errors_array) – similar to wireframe

Wireframe plot of the data

3d scatter of the data

The code looks like this:

    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') no longer works in recent Matplotlib
    ax.set_xlabel('c parameter')
    ax.set_ylabel('gamma parameter')
    ax.set_zlabel('Error rate')
    #ax.plot_wireframe(cParams, gammas, avg_errors_array)
    #ax.plot3D(cParams, gammas, avg_errors_array)
    #ax.scatter3D(cParams, gammas, avg_errors_array, zdir='z',cmap='viridis')

    df = pd.DataFrame({'x': cParams, 'y': gammas, 'z': avg_errors_array})
    surf = ax.plot_trisurf(df.x, df.y, df.z, cmap=cm.jet, linewidth=0.1)
    fig.colorbar(surf, shrink=0.5, aspect=5)    
    plt.savefig('./plots/avgErrs_vs_C_andgamma_type_%s.png'%(k))
    plt.show()

Here is the final output:

plot_trisurf of xyz data


Answer 4

Check the official example. X, Y and Z are indeed 2D arrays; numpy.meshgrid() is a simple way to get a 2D x, y mesh out of 1D x and y values.

http://matplotlib.sourceforge.net/mpl_examples/mplot3d/surface3d_demo.py

Here's a Pythonic way to convert your 3-tuples into three 1D sequences (tuples):

data = [(1,2,3), (10,20,30), (11, 22, 33), (110, 220, 330)]
X,Y,Z = zip(*data)
In [7]: X
Out[7]: (1, 10, 11, 110)
In [8]: Y
Out[8]: (2, 20, 22, 220)
In [9]: Z
Out[9]: (3, 30, 33, 330)
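
If NumPy arrays are wanted rather than tuples, transposing the array built from the list gives the same split. A small sketch using the same sample data:

```python
import numpy as np

data = [(1, 2, 3), (10, 20, 30), (11, 22, 33), (110, 220, 330)]

# Each row of the transposed array holds one coordinate of all points.
X, Y, Z = np.array(data).T
print(X, Y, Z)
```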

Here's matplotlib's Delaunay triangulation (interpolation), which converts 1D x, y, z into something compliant (?):

http://matplotlib.sourceforge.net/api/mlab_api.html#matplotlib.mlab.griddata


Answer 5

In Matlab I did something similar using the delaunay function on the x, y coords only (not the z), then plotting with trimesh or trisurf, using z as the height.

SciPy has the Delaunay class, which is based on the same underlying QHull library as Matlab's delaunay function, so you should get identical results.

From there, it should be a few lines of code to convert this Plotting 3D Polygons in python-matplotlib example into what you wish to achieve, as Delaunay gives you the specification of each triangular polygon.


Answer 6

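Just to add some further thoughts which may help others with irregular-domain problems. For a situation where the user has three vectors/lists x, y, z representing a 2D solution where z is to be plotted on a rectangular grid as a surface, the 'plot_trisurf()' comments by ArtifixR are applicable. A similar example, but with a non-rectangular domain, is: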

import numpy as np  # needed for np.linspace, np.zeros, etc. below
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D

# problem parameters
nu = 50; nv = 50
u = np.linspace(0, 2*np.pi, nu,) 
v = np.linspace(0, np.pi, nv,)

xx = np.zeros((nu,nv),dtype='d')
yy = np.zeros((nu,nv),dtype='d')
zz = np.zeros((nu,nv),dtype='d')

# populate x,y,z arrays
for i in range(nu):
  for j in range(nv):
    xx[i,j] = np.sin(v[j])*np.cos(u[i])
    yy[i,j] = np.sin(v[j])*np.sin(u[i])
    zz[i,j] = np.exp(-4*(xx[i,j]**2 + yy[i,j]**2)) # bell curve

# convert arrays to vectors
x = xx.flatten()
y = yy.flatten()
z = zz.flatten()

# Plot solution surface
fig = plt.figure(figsize=(6,6))
ax = Axes3D(fig)
ax.plot_trisurf(x, y, z, cmap=cm.jet, linewidth=0,
                antialiased=False)
ax.set_title(r'trisurf example',fontsize=16, color='k')
ax.view_init(60, 35)
fig.tight_layout()
plt.show()

上面的代码生成:

非矩形网格问题的曲面图

但是,这可能无法解决所有问题,尤其是在不规则域中定义问题的情况下。同样,在畴具有一个或多个凹面区域的情况下,德劳内三角剖分可能会导致在畴外部生成虚假三角形。在这种情况下,必须从三角测量中删除这些流氓三角形,以实现正确的表面表示。对于这些情况,用户可能必须明确包括delaunay三角剖分计算,以便可以通过编程方式删除这些三角形。在这种情况下,以下代码可以代替以前的绘图代码:


import matplotlib.tri as mtri 
import scipy.spatial
# plot final solution
pts = np.vstack([x, y]).T
tess = scipy.spatial.Delaunay(pts) # tessilation

# Create the matplotlib Triangulation object
xx = tess.points[:, 0]
yy = tess.points[:, 1]
tri = tess.vertices # or tess.simplices depending on scipy version

#############################################################
# NOTE: If 2D domain has concave properties one has to
#       remove delaunay triangles that are exterior to the domain.
#       This operation is problem specific!
#       For simple situations create a polygon of the
#       domain from boundary nodes and identify triangles
#       in 'tri' outside the polygon. Then delete them from
#       'tri'.
#       <ADD THE CODE HERE>
#############################################################

triDat = mtri.Triangulation(x=pts[:, 0], y=pts[:, 1], triangles=tri)

# Plot solution surface
fig = plt.figure(figsize=(6,6))
ax = fig.gca(projection='3d')
ax.plot_trisurf(triDat, z, linewidth=0, edgecolor='none',
                antialiased=False, cmap=cm.jet)
ax.set_title(r'trisurf with delaunay triangulation', 
          fontsize=16, color='k')
plt.show()

下面的示例图说明了解决方案1)带有虚假三角形的溶液,以及2)去除了溶液的位置:

在此处输入图片说明

三角形已删除

我希望以上内容可能对解决方案数据中出现凹形情况的人们有所帮助。

Just to add some further thoughts which may help others with irregular domain type problems. For a situation where the user has three vectors/lists, x,y,z representing a 2D solution where z is to be plotted on a rectangular grid as a surface, the ‘plot_trisurf()’ comments by ArtifixR are applicable. A similar example but with non rectangular domain is:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older Matplotlib

# problem parameters
nu = 50; nv = 50
u = np.linspace(0, 2*np.pi, nu,) 
v = np.linspace(0, np.pi, nv,)

xx = np.zeros((nu,nv),dtype='d')
yy = np.zeros((nu,nv),dtype='d')
zz = np.zeros((nu,nv),dtype='d')

# populate x,y,z arrays
for i in range(nu):
  for j in range(nv):
    xx[i,j] = np.sin(v[j])*np.cos(u[i])
    yy[i,j] = np.sin(v[j])*np.sin(u[i])
    zz[i,j] = np.exp(-4*(xx[i,j]**2 + yy[i,j]**2)) # bell curve

# convert arrays to vectors
x = xx.flatten()
y = yy.flatten()
z = zz.flatten()

# Plot solution surface
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(projection='3d')  # Axes3D(fig) no longer attaches itself to the figure in recent Matplotlib
ax.plot_trisurf(x, y, z, cmap=cm.jet, linewidth=0,
                antialiased=False)
ax.set_title(r'trisurf example',fontsize=16, color='k')
ax.view_init(60, 35)
fig.tight_layout()
plt.show()

The above code produces:

Surface plot for non-rectangular grid problem

However, this may not solve all problems, particularly where the problem is defined on an irregular domain. Also, where the domain has one or more concave areas, the Delaunay triangulation may generate spurious triangles exterior to the domain. In such cases, these rogue triangles have to be removed from the triangulation in order to achieve the correct surface representation. For these situations, the user may have to explicitly include the Delaunay triangulation calculation so that these triangles can be removed programmatically. Under these circumstances, the following code could replace the previous plot code:


import matplotlib.tri as mtri 
import scipy.spatial
# plot final solution
pts = np.vstack([x, y]).T
tess = scipy.spatial.Delaunay(pts)  # tessellation

# Create the matplotlib Triangulation object
xx = tess.points[:, 0]
yy = tess.points[:, 1]
tri = tess.simplices  # named tess.vertices in old SciPy versions

#############################################################
# NOTE: If 2D domain has concave properties one has to
#       remove delaunay triangles that are exterior to the domain.
#       This operation is problem specific!
#       For simple situations create a polygon of the
#       domain from boundary nodes and identify triangles
#       in 'tri' outside the polygon. Then delete them from
#       'tri'.
#       <ADD THE CODE HERE>
#############################################################

triDat = mtri.Triangulation(x=pts[:, 0], y=pts[:, 1], triangles=tri)

# Plot solution surface
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') is not accepted by recent Matplotlib
ax.plot_trisurf(triDat, z, linewidth=0, edgecolor='none',
                antialiased=False, cmap=cm.jet)
ax.set_title(r'trisurf with delaunay triangulation', 
          fontsize=16, color='k')
plt.show()

Example plots are given below illustrating the solution 1) with spurious triangles, and 2) with them removed:

[Figure: surface plot with spurious triangles]

[Figure: surface plot with the spurious triangles removed]

I hope the above may be of help to people with concavity situations in the solution data.
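
As one possible way to fill in the "remove the exterior triangles" step, here is a sketch that uses Matplotlib's own Triangulation and its set_mask method instead of deleting rows from 'tri'. The annular domain and the centroid test are assumptions chosen for illustration; a real problem needs its own inside/outside test.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.tri as mtri

# Hypothetical domain: an annulus with inner radius 0.5 and outer radius 1.0
r = np.linspace(0.5, 1.0, 8)
theta = np.linspace(0, 2 * np.pi, 36, endpoint=False)
rr, tt = np.meshgrid(r, theta)
x = (rr * np.cos(tt)).ravel()
y = (rr * np.sin(tt)).ravel()
z = np.sin(3 * tt).ravel()

# Delaunay triangulation computed by Matplotlib; it also fills the hole
triang = mtri.Triangulation(x, y)

# Centroid of every triangle
xc = x[triang.triangles].mean(axis=1)
yc = y[triang.triangles].mean(axis=1)

# Triangles whose centroid falls inside the hole are spurious: mask them
mask = np.hypot(xc, yc) < 0.5
triang.set_mask(mask)

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(projection="3d")
ax.plot_trisurf(triang, z, cmap="viridis", linewidth=0)
```

Masking keeps the point arrays untouched, so z does not need to be re-indexed.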


Answer 7

It is not possible to directly make a 3D surface using your data. I would recommend building an interpolation model using a tool such as PyKrige. The process includes three steps:

  1. Train an interpolation model using PyKrige
  2. Build a grid from X and Y using meshgrid
  3. Interpolate values for Z

Having created your grid and the corresponding Z values, you are ready to go with plot_surface. Note that, depending on the size of your data, the meshgrid function can run for a while. The workaround is to create evenly spaced samples using np.linspace for the X and Y axes, then apply interpolation to infer the necessary Z values. If so, the interpolated values might differ from the original Z because X and Y have changed.
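
The three steps can be sketched as follows. Note that this sketch swaps in scipy.interpolate.griddata (plain linear interpolation) as a stand-in for a kriging model, and the scattered sample data is invented for illustration.

```python
import numpy as np
from scipy.interpolate import griddata

# Made-up scattered observations (x, y, z)
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 300)
y = rng.uniform(0, 1, 300)
z = np.sin(2 * np.pi * x) * np.cos(2 * np.pi * y)

# Evenly spaced samples along each axis, expanded into a 2D grid
xi = np.linspace(x.min(), x.max(), 50)
yi = np.linspace(y.min(), y.max(), 50)
X, Y = np.meshgrid(xi, yi)

# Interpolate Z on the grid; points outside the convex hull come back as NaN
Z = griddata((x, y), z, (X, Y), method="linear")
```

X, Y and Z now have matching 2D shapes and can be passed to ax.plot_surface(X, Y, Z) on a 3D axes.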


matplotlib taking time when being imported

Question: matplotlib taking time when being imported

I just upgraded to the latest stable release of matplotlib (1.5.1) and every time I import matplotlib I get this message:

/usr/local/lib/python2.7/dist-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')

… which always stalls for a few seconds.

Is this the expected behaviour? Was it the same also before, but just without the printed message?


Answer 0

As tom suggested in the comment above, deleting the files:

fontList.cache
fontList.py3k.cache 
tex.cache 

solves the problem. In my case the files were under:

`~/.matplotlib`

EDITED

A couple of days ago the message appeared again, and deleting the files in the locations mentioned above did not help. As suggested here by T Mudau, there is an extra location with text cache files: ~/.cache/fontconfig
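
Since the cache location differs between systems and versions, it may help to ask matplotlib where it keeps its files. A small sketch using matplotlib.get_cachedir() and matplotlib.get_configdir(); it only lists files and deletes nothing.

```python
import os
import matplotlib

# Directories matplotlib reports for its cache and configuration
cache_dir = matplotlib.get_cachedir()
config_dir = matplotlib.get_configdir()
print("cache dir :", cache_dir)
print("config dir:", config_dir)

# List the cache files without deleting anything
if os.path.isdir(cache_dir):
    for name in sorted(os.listdir(cache_dir)):
        print("  ", name)
```

The fontconfig cache under ~/.cache/fontconfig belongs to fontconfig rather than matplotlib, so it is not reported here.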


Answer 1

Confirmed Hugo’s approach works for Ubuntu 14.04 LTS/matplotlib 1.5.1:

  • deleted ~/.cache/matplotlib/fontList.cache
  • ran code, again the warning was issued (assumption: is rebuilding the cache correctly)
  • ran code again, no more warning (finally)

Answer 2

On OSX Yosemite (version 10.10.15), the following worked for me:

  • remove the cache files from this directory as well: ~/.cache/fontconfig (as per tom’s suggestion)
    rm -rvf ~/.cache/fontconfig/*
  • also removed .cache files in ~/.matplotlib (as per Hugo’s suggestion)
    rm -rvf ~/.matplotlib/*

Answer 3

I ran the python code using sudo just once, and it resolved the warning for me. Now it runs faster. Running without sudo gives no warning at all.

Cheers


Answer 4

I ran the Python code with sudo and it cured it. My guess was that there wasn’t permission to write that table. Good luck!


Answer 5

Hi, you must find the file font_manager.py; in my case it is at: C:\Users\gustavo\Anaconda3\Lib\site-packages\matplotlib\font_manager.py

Then find def win32InstalledFonts(directory=None, fontext='ttf') and replace it with:

def win32InstalledFonts(directory=None, fontext='ttf'):
    """
    Search for fonts in the specified font directory, or use the
    system directories if none given.  A list of TrueType font
    filenames are returned by default, or AFM fonts if
    fontext == 'afm'.
    """

    from six.moves import winreg
    if directory is None:
        directory = win32FontDirectory()

    fontext = get_fontext_synonyms(fontext)

    key, items = None, {}
    for fontdir in MSFontDirectories:
        try:
            local = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, fontdir)
        except OSError:
            continue

        if not local:
            return list_fonts(directory, fontext)
        try:
            for j in range(winreg.QueryInfoKey(local)[1]):
                try:
                    key, direc, any = winreg.EnumValue(local, j)
                    if not is_string_like(direc):
                        continue
                    if not os.path.dirname(direc):
                        direc = os.path.join(directory, direc)
                        direc = direc.split('\0', 1)[0]

                    if os.path.splitext(direc)[1][1:] in fontext:
                        items[direc] = 1
                except EnvironmentError:
                    continue
                except WindowsError:
                    continue
                except MemoryError:
                    continue
            return list(six.iterkeys(items))
        finally:
            winreg.CloseKey(local)
    return None

Answer 6

This worked for me on Ubuntu 16.04 LTS with Python 3.5.2 | Anaconda 4.2.0 (64-bit). I deleted all of the files in ~/.cache/matplotlib/.

sudo rm -r fontList.py3k.cache tex.cache 

At first I thought it wouldn’t work, because I got the warning afterward. But after the cache files were rebuilt, the warning went away. So close your file and reopen it; the warning should be gone.


Answer 7

This worked for me:

sudo apt-get install libfreetype6-dev libxft-dev

Plotting with seaborn using the matplotlib object-oriented interface

Question: Plotting with seaborn using the matplotlib object-oriented interface

I strongly prefer using matplotlib in OOP style:

f, axarr = plt.subplots(2, sharex=True)
axarr[0].plot(...)
axarr[1].plot(...)

This makes it easier to keep track of multiple figures and subplots.

Question: How to use seaborn this way? Or, how to change this example to OOP style? How to tell seaborn plotting functions like lmplot which Figure or Axes it plots to?


Answer 0

It depends a bit on which seaborn function you are using.

The plotting functions in seaborn are broadly divided into two classes

  • “Axes-level” functions, including regplot, boxplot, kdeplot, and many others
  • “Figure-level” functions, including lmplot, factorplot, jointplot and one or two others

The first group is identified by taking an explicit ax argument and returning an Axes object. As this suggests, you can use them in an “object oriented” style by passing your Axes to them:

f, (ax1, ax2) = plt.subplots(2)
sns.regplot(x=x, y=y, ax=ax1)  # recent seaborn versions expect x and y as keywords
sns.kdeplot(x=x, ax=ax2)

Axes-level functions will only draw onto an Axes and won’t otherwise mess with the figure, so they can coexist perfectly happily in an object-oriented matplotlib script.

The second group of functions (Figure-level) are distinguished by the fact that the resulting plot can potentially include several Axes which are always organized in a “meaningful” way. That means that the functions need to have total control over the figure, so it isn’t possible to plot, say, an lmplot onto one that already exists. Calling the function always initializes a figure and sets it up for the specific plot it’s drawing.

However, once you’ve called lmplot, it will return an object of the type FacetGrid. This object has some methods for operating on the resulting plot that know a bit about the structure of the plot. It also exposes the underlying figure and array of axes as the FacetGrid.fig and FacetGrid.axes attributes. The jointplot function is very similar, but it uses a JointGrid object. So you can still use these functions in an object-oriented context, but all of your customization has to come after you’ve called the function.


Matplotlib connect scatterplot points with line - Python

Question: Matplotlib connect scatterplot points with line - Python

I have two lists, dates and values. I want to plot them using matplotlib. The following creates a scatter plot of my data.

import matplotlib.pyplot as plt

plt.scatter(dates,values)
plt.show()

plt.plot(dates, values) creates a line graph.

But what I really want is a scatterplot where the points are connected by a line.

Similar to in R:

plot(dates, values)
lines(dates, value, type="l")

, which gives me a scatterplot of points overlaid with a line connecting the points.

How do I do this in python?


Answer 0

I think @Evert has the right answer:

plt.scatter(dates,values)
plt.plot(dates, values)
plt.show()

Which is pretty much the same as

plt.plot(dates, values, '-o')
plt.show()

or whatever linestyle you prefer.
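
For a self-contained version of the '-o' variant, here is a short sketch; the dates and values are made up, since the question does not show its data.

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical sample data: one value per day for a week
dates = [dt.date(2020, 1, d) for d in range(1, 8)]
values = [3, 5, 4, 6, 8, 7, 9]

fig, ax = plt.subplots()
# '-o' means: solid line ('-') with circle markers ('o') at each point
(line,) = ax.plot(dates, values, "-o")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

With an interactive backend, plt.show() displays the points joined by a line.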


Answer 1

For red lines and points

plt.plot(dates, values, '.r-') 

or for x markers and blue lines

plt.plot(dates, values, 'xb-')

Answer 2

In addition to what is provided in the other answers, the keyword “zorder” allows one to decide the order in which different objects are stacked on top of each other. E.g.:

plt.plot(x,y,zorder=1) 
plt.scatter(x,y,zorder=2)

plots the scatter symbols on top of the line, while

plt.plot(x,y,zorder=2)
plt.scatter(x,y,zorder=1)

plots the line over the scatter symbols.

See, e.g., the zorder demo
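
A runnable sketch of the first variant, with made-up data and colors chosen so the stacking is easy to see:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical sample data
x = [0, 1, 2, 3, 4]
y = [1, 3, 2, 5, 4]

fig, ax = plt.subplots()
# Lower zorder is drawn first, so the line ends up underneath the markers
(line,) = ax.plot(x, y, color="tab:blue", linewidth=3, zorder=1)
points = ax.scatter(x, y, color="tab:red", s=80, zorder=2)
```

Swapping the two zorder values draws the line over the markers instead.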


What is the difference between 'log' and 'symlog'?

Question: What is the difference between 'log' and 'symlog'?

In matplotlib, I can set the axis scaling using either pyplot.xscale() or Axes.set_xscale(). Both functions accept three different scales: 'linear' | 'log' | 'symlog'.

What is the difference between 'log' and 'symlog'? In a simple test I did, they both looked exactly the same.

I know the documentation says they accept different parameters, but I still don’t understand the difference between them. Can someone please explain it? The answer will be the best if it has some sample code and graphics! (also: where does the name ‘symlog’ come from?)


Answer 0

I finally found some time to do some experiments in order to understand the difference between them. Here’s what I discovered:

  • log only allows positive values, and lets you choose how to handle negative ones (mask or clip).
  • symlog means symmetrical log, and allows positive and negative values.
  • symlog lets you set a range around zero within which the plot is linear instead of logarithmic.

I think everything will get a lot easier to understand with graphics and examples, so let’s try them:

import numpy
from matplotlib import pyplot

# Enable interactive mode
pyplot.ion()

# Draw the grid lines
pyplot.grid(True)

# Numbers from -50 to 50, with 0.1 as step
xdomain = numpy.arange(-50,50, 0.1)

# Plots a simple linear function 'f(x) = x'
pyplot.plot(xdomain, xdomain)
# Plots 'sin(x)'
pyplot.plot(xdomain, numpy.sin(xdomain))

# 'linear' is the default mode, so this next line is redundant:
pyplot.xscale('linear')

A graph using 'linear' scaling

# How to treat negative values?
# 'mask' will treat negative values as invalid.
# Note: this keyword was called nonposx when the answer was written and
# 'mask' was then the default; since Matplotlib 3.3 it is nonpositive,
# and recent releases default to 'clip'.
pyplot.xscale('log')
pyplot.xscale('log', nonpositive='mask')

A graph using 'log' scaling and nonpositive='mask'

# 'clip' will map all negative values to a very small positive one
pyplot.xscale('log', nonpositive='clip')

A graph using 'log' scaling and nonpositive='clip'

# 'symlog' scaling, however, handles negative values nicely
pyplot.xscale('symlog')

A graph using 'symlog' scaling

# And you can even set a linear range around zero
pyplot.xscale('symlog', linthresh=20)  # called linthreshx before Matplotlib 3.3

A graph using 'symlog' scaling, but linear within (-20,20)

Just for completeness, I’ve used the following code to save each figure:

# Default dpi is 80
pyplot.savefig('matplotlib_xscale_linear.png', dpi=50, bbox_inches='tight')

Remember you can change the figure size using:

fig = pyplot.gcf()
fig.set_size_inches([4., 3.])
# Default size: [8., 6.]

(If you are unsure about me answering my own question, read this)
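
For a side-by-side comparison in one script, here is a sketch using the object-oriented interface. It assumes Matplotlib 3.3 or later, where the symlog threshold keyword is linthresh; the two-panel layout is my own choice, not part of the experiments above.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Same data as above: numbers from -50 to 50 with a step of 0.1
x = np.arange(-50, 50, 0.1)

fig, (ax_log, ax_symlog) = plt.subplots(1, 2, figsize=(8, 3))
for ax in (ax_log, ax_symlog):
    ax.plot(x, x)          # f(x) = x
    ax.plot(x, np.sin(x))  # sin(x)
    ax.grid(True)

# 'log' can only show the positive part of the x range
ax_log.set_xscale("log")
ax_log.set_title("log")

# 'symlog' shows both signs and is linear within (-20, 20)
ax_symlog.set_xscale("symlog", linthresh=20)
ax_symlog.set_title("symlog, linthresh=20")
```

Saving the figure with fig.savefig(...) gives both panels in a single image.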


Answer 1

symlog is like log but allows you to define a range of values near zero within which the plot is linear, to avoid having the plot go to infinity around zero.

From http://matplotlib.sourceforge.net/api/axes_api.html#matplotlib.axes.Axes.set_xscale

In a log graph, you can never have a zero value, and if you have a value that approaches zero, it will spike down way off the bottom of your graph (infinitely downward) because when you take “log(approaching zero)” you get “approaching negative infinity”.

symlog would help you out in situations where you want to have a log graph, but when the value may sometimes go down towards, or to, zero, but you still want to be able to show that on the graph in a meaningful way. If you need symlog, you’d know.


Answer 2

Here’s an example of behaviour when symlog is necessary:

Initial plot, not scaled. Notice how many dots cluster at x~0

    ax = sns.scatterplot(x= 'Score', y ='Total Amount Deposited', data = df, hue = 'Predicted Category')

Non scaled

Log scaled plot. Everything collapsed.

    ax = sns.scatterplot(x= 'Score', y ='Total Amount Deposited', data = df, hue = 'Predicted Category')

    ax.set_xscale('log')
    ax.set_yscale('log')
    ax.set(xlabel='Score, log', ylabel='Total Amount Deposited, log')

Log scale

Why did it collapse? Because of some values on the x-axis being very close or equal to 0.

Symlog scaled plot. Everything is as it should be.

    ax = sns.scatterplot(x= 'Score', y ='Total Amount Deposited', data = df, hue = 'Predicted Category')

    ax.set_xscale('symlog')
    ax.set_yscale('symlog')
    ax.set(xlabel='Score, symlog', ylabel='Total Amount Deposited, symlog')

Symlog scale


Hide axis values but keep axis tick labels in matplotlib

Question: Hide axis values but keep axis tick labels in matplotlib

I have this image:

plt.plot(sim_1['t'],sim_1['V'],'k')
plt.ylabel('V')
plt.xlabel('t')
plt.show()

enter image description here

I want to hide the numbers; if I use:

plt.axis('off')

…I get this image:

enter image description here

It also hides the labels, V and t. How can I keep the labels while hiding the values?


Answer 0

If you use the matplotlib object-oriented approach, this is a simple task using ax.set_xticklabels() and ax.set_yticklabels():

import matplotlib.pyplot as plt

# Create Figure and Axes instances
fig,ax = plt.subplots(1)

# Make your plot, set your axes labels
ax.plot(sim_1['t'],sim_1['V'],'k')
ax.set_ylabel('V')
ax.set_xlabel('t')

# Turn off tick labels
ax.set_yticklabels([])
ax.set_xticklabels([])

plt.show()

Answer 1

Without subplots, you can universally remove the ticks like this:

plt.xticks([])
plt.yticks([])

Answer 2

This works great. Just paste this before plt.show():

plt.gca().axes.get_yaxis().set_visible(False)

Boom.


Answer 3

Not sure this is the best way, but you can certainly replace the tick labels like this:

import matplotlib.pyplot as plt
x = range(10)
y = range(10)
plt.plot(x,y)
plt.xticks(x, [" "] * len(x))  # one blank label per tick; recent Matplotlib expects a label for every tick
plt.show()

In Python 3.4 this generates a simple line plot with no tick labels on the x-axis. A simple example is here: http://matplotlib.org/examples/ticks_and_spines/ticklabels_demo_rotation.html

This related question also has some better suggestions: Hiding axis text in matplotlib plots

I’m new to python. Your mileage may vary in earlier versions. Maybe others can help?


Answer 4

To remove tick marks entirely, use:

ax.set_yticks([])
ax.set_xticks([])

otherwise ax.set_yticklabels([]) and ax.set_xticklabels([]) will keep tickmarks.
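
The difference can be checked side by side. This sketch uses made-up data in place of sim_1 and also shows ax.tick_params as a third option, which is my addition rather than something from the answers above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(9, 3))
for ax in (ax1, ax2, ax3):
    ax.plot(range(10), range(10), "k")  # stand-in for the sim_1 data
    ax.set_xlabel("t")
    ax.set_ylabel("V")

# 1) Hide the numbers but keep the tick marks
ax1.set_xticklabels([])
ax1.set_yticklabels([])

# 2) Remove tick marks and numbers together
ax2.set_xticks([])
ax2.set_yticks([])

# 3) Hide the numbers through tick parameters; tick marks stay
ax3.tick_params(labelbottom=False, labelleft=False)
```

In all three cases the axis labels V and t remain visible.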


Answer 5

plt.gca().axes.yaxis.set_ticklabels([])

enter image description here


Heatmap in matplotlib with pcolor?

Question: Heatmap in matplotlib with pcolor?

I’d like to make a heatmap like this (shown on FlowingData): heatmap

The source data is here, but random data and labels would be fine to use, i.e.

import numpy
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = numpy.random.rand(4,4)

Making the heatmap is easy enough in matplotlib:

from matplotlib import pyplot as plt
heatmap = plt.pcolor(data)

And I even found a colormap argument that looks about right: heatmap = plt.pcolor(data, cmap=matplotlib.cm.Blues)

But beyond that, I can’t figure out how to display labels for the columns and rows and display the data in the proper orientation (origin at the top left instead of bottom left).

Attempts to manipulate heatmap.axes (e.g. heatmap.axes.set_xticklabels = column_labels) have all failed. What am I missing here?
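
For reference, a minimal sketch of one way to label the 4x4 random example. It is not the asker's code: it calls the set_xticklabels method (rather than assigning to it), puts the ticks at the cell centers and flips the y axis so the origin is at the top left.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

column_labels = list("ABCD")
row_labels = list("WXYZ")
data = np.random.rand(4, 4)

fig, ax = plt.subplots()
heatmap = ax.pcolor(data, cmap=plt.cm.Blues)

# Put one tick at the center of each cell, then label the ticks
ax.set_xticks(np.arange(data.shape[1]) + 0.5)
ax.set_yticks(np.arange(data.shape[0]) + 0.5)
ax.set_xticklabels(column_labels)
ax.set_yticklabels(row_labels)

# Table-like orientation: first row at the top, column labels above
ax.invert_yaxis()
ax.xaxis.tick_top()
```

Note that set_xticklabels is called with the labels; assigning a list to it replaces the method and has no visible effect.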


Answer 0

This is late, but here is my Python implementation of the FlowingData NBA heatmap.

Updated 1/4/2014: thanks, everyone.

# -*- coding: utf-8 -*-
# <nbformat>3.0</nbformat>

# ------------------------------------------------------------------------
# Filename   : heatmap.py
# Date       : 2013-04-19
# Updated    : 2014-01-04
# Author     : @LotzJoe >> Joe Lotz
# Description: My attempt at reproducing the FlowingData graphic in Python
# Source     : http://flowingdata.com/2010/01/21/how-to-make-a-heatmap-a-quick-and-easy-solution/
#
# Other Links:
#     http://stackoverflow.com/questions/14391959/heatmap-in-matplotlib-with-pcolor
#
# ------------------------------------------------------------------------

import matplotlib.pyplot as plt
import pandas as pd
from urllib.request import urlopen  # Python 3 home of urlopen (it was urllib2 in Python 2)
import numpy as np
%pylab inline

page = urlopen("http://datasets.flowingdata.com/ppg2008.csv")
nba = pd.read_csv(page, index_col=0)

# Normalize data columns
nba_norm = (nba - nba.mean()) / (nba.max() - nba.min())

# Sort data according to Points, lowest to highest
# This was just a design choice made by Yau
# inplace=False (default) ->thanks SO user d1337
nba_sort = nba_norm.sort_values('PTS', ascending=True)  # DataFrame.sort() was removed from pandas

nba_sort['PTS'].head(10)

# Plot it out
fig, ax = plt.subplots()
heatmap = ax.pcolor(nba_sort, cmap=plt.cm.Blues, alpha=0.8)

# Format
fig = plt.gcf()
fig.set_size_inches(8, 11)

# turn off the frame
ax.set_frame_on(False)

# put the major ticks at the middle of each cell
ax.set_yticks(np.arange(nba_sort.shape[0]) + 0.5, minor=False)
ax.set_xticks(np.arange(nba_sort.shape[1]) + 0.5, minor=False)

# want a more natural, table-like display
ax.invert_yaxis()
ax.xaxis.tick_top()

# Set the labels

# label source:https://en.wikipedia.org/wiki/Basketball_statistics
labels = [
    'Games', 'Minutes', 'Points', 'Field goals made', 'Field goal attempts', 'Field goal percentage', 'Free throws made', 'Free throws attempts', 'Free throws percentage',
    'Three-pointers made', 'Three-point attempt', 'Three-point percentage', 'Offensive rebounds', 'Defensive rebounds', 'Total rebounds', 'Assists', 'Steals', 'Blocks', 'Turnover', 'Personal foul']

# note I could have used nba_sort.columns but made "labels" instead
ax.set_xticklabels(labels, minor=False)
ax.set_yticklabels(nba_sort.index, minor=False)

# rotate the x-axis tick labels
plt.xticks(rotation=90)

ax.grid(False)

# Turn off all the tick marks (the labels stay visible)
# tick_params replaces the per-tick tick1On/tick2On attributes,
# which newer matplotlib releases dropped
ax = plt.gca()
ax.tick_params(axis='both', which='both', length=0)

The output looks like this:

flowingdata-like nba heatmap

There’s an ipython notebook with all this code here. I’ve learned a lot from ‘overflow so hopefully someone will find this useful.
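
The normalization line is the step in the script that is easiest to misread, so here is a minimal sketch of it on a made-up three-row frame. The column names `A` and `B` and their values are assumptions for illustration only, not NBA data. Each column is centered on its mean and divided by its range, so every normalized column spans exactly one unit and the columns become comparable on one color scale.

```python
import pandas as pd

# Hypothetical toy data standing in for the NBA table.
toy = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 60.0]})

# Same expression as in the script: subtract the column mean,
# then divide by the column range (max - min).
toy_norm = (toy - toy.mean()) / (toy.max() - toy.min())
print(toy_norm)

# After normalization every column has a range of exactly 1.
spans = toy_norm.max() - toy_norm.min()
print(spans)
```

Note that this is mean-centering with range scaling, not min-max scaling, so the values fall in a window of width 1 around zero rather than in [0, 1].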


Answer 1

The python seaborn module is based on matplotlib, and produces a very nice heatmap.

Below is an implementation with seaborn, designed for the ipython/jupyter notebook.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# import the data directly into a pandas dataframe
nba = pd.read_csv("http://datasets.flowingdata.com/ppg2008.csv", index_col='Name  ')
# remove index title
nba.index.name = ""
# normalize data columns
nba_norm = (nba - nba.mean()) / (nba.max() - nba.min())
# relabel columns
labels = ['Games', 'Minutes', 'Points', 'Field goals made', 'Field goal attempts', 'Field goal percentage', 'Free throws made', 
          'Free throws attempts', 'Free throws percentage','Three-pointers made', 'Three-point attempt', 'Three-point percentage', 
          'Offensive rebounds', 'Defensive rebounds', 'Total rebounds', 'Assists', 'Steals', 'Blocks', 'Turnover', 'Personal foul']
nba_norm.columns = labels
# set appropriate font and dpi
sns.set(font_scale=1.2)
sns.set_style({"savefig.dpi": 100})
# plot it out
ax = sns.heatmap(nba_norm, cmap=plt.cm.Blues, linewidths=.1)
# set the x-axis labels on the top
ax.xaxis.tick_top()
# rotate the x-axis labels
plt.xticks(rotation=90)
# get figure (usually obtained via "fig,ax=plt.subplots()" with matplotlib)
fig = ax.get_figure()
# specify dimensions and save
fig.set_size_inches(15, 20)
fig.savefig("nba.png")

The output looks like this:

seaborn nba heatmap

I used the matplotlib Blues color map, but personally find the default colors quite beautiful. I used matplotlib to rotate the x-axis labels, as I couldn’t find the seaborn syntax. As noted by grexor, it was necessary to specify the dimensions (fig.set_size_inches) by trial and error, which I found a bit frustrating.

As noted by Paul H, you can easily add the values to heat maps (annot=True), but in this case I didn’t think it improved the figure. Several code snippets were taken from the excellent answer by joelotz.


Answer 2

Main issue is that you first need to set the location of your x and y ticks. Also, it helps to use the more object-oriented interface to matplotlib. Namely, interact with the axes object directly.

import matplotlib.pyplot as plt
import numpy as np
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = np.random.rand(4,4)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data)

# put the major ticks at the middle of each cell, notice "reverse" use of dimension
ax.set_yticks(np.arange(data.shape[0])+0.5, minor=False)
ax.set_xticks(np.arange(data.shape[1])+0.5, minor=False)


ax.set_xticklabels(row_labels, minor=False)
ax.set_yticklabels(column_labels, minor=False)
plt.show()

Hope that helps.
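
With a square 4x4 array the "reverse" use of the dimensions is invisible, so here is a minimal sketch with a non-square array. The 2x3 data and the label names `r0`, `c0`, and so on are my own assumptions for illustration. Rows of the array run along the y axis, so the y ticks use `shape[0]` and the x ticks use `shape[1]`. The `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

data = np.arange(6).reshape(2, 3)      # 2 rows, 3 columns
row_labels = ["r0", "r1"]              # hypothetical labels, one per row
column_labels = ["c0", "c1", "c2"]     # hypothetical labels, one per column

fig, ax = plt.subplots()
ax.pcolor(data)

# Cell (i, j) covers [j, j+1] on x and [i, i+1] on y, so +0.5 centers the tick.
ax.set_xticks(np.arange(data.shape[1]) + 0.5)  # one x tick per column
ax.set_yticks(np.arange(data.shape[0]) + 0.5)  # one y tick per row
ax.set_xticklabels(column_labels)
ax.set_yticklabels(row_labels)
```

Swapping the two `shape` indices here would place two ticks on an axis that needs three, which is the mistake the comment in the answer warns about.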


Answer 3

Someone edited this question to remove the code I used, so I was forced to add it as an answer. Thanks to all who participated in answering this question! I think most of the other answers are better than this code, I’m just leaving this here for reference purposes.

With thanks to Paul H, and unutbu (who answered this question), I have some pretty nice-looking output:

import matplotlib.pyplot as plt
import numpy as np
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = np.random.rand(4,4)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data, cmap=plt.cm.Blues)

# put the major ticks at the middle of each cell
# (columns, shape[1], run along x; rows, shape[0], run along y)
ax.set_xticks(np.arange(data.shape[1])+0.5, minor=False)
ax.set_yticks(np.arange(data.shape[0])+0.5, minor=False)

# want a more natural, table-like display
ax.invert_yaxis()
ax.xaxis.tick_top()

ax.set_xticklabels(row_labels, minor=False)
ax.set_yticklabels(column_labels, minor=False)
plt.show()

And here’s the output:

Matplotlib HeatMap


Can Pandas plot a histogram of dates?

Question: Can Pandas plot a histogram of dates?

I’ve taken my Series and coerced it to a datetime column of dtype=datetime64[ns] (though only need day resolution…not sure how to change).

import pandas as pd
df = pd.read_csv('somefile.csv')
column = df['date']
column = pd.to_datetime(column, errors='coerce')  # coerce=True in older pandas

but plotting doesn’t work:

ipdb> column.plot(kind='hist')
*** TypeError: ufunc add cannot use operands with types dtype('<M8[ns]') and dtype('float64')

I’d like to plot a histogram that just shows the count of dates by week, month, or year.

Surely there is a way to do this in pandas?


Answer 0

Given this df:

        date
0 2001-08-10
1 2002-08-31
2 2003-08-29
3 2006-06-21
4 2002-03-27
5 2003-07-14
6 2004-06-15
7 2003-08-14
8 2003-07-29

and, if it’s not already the case:

df["date"] = pd.to_datetime(df["date"])  # astype("datetime64") without a unit fails in newer pandas

To show the count of dates by month:

df.groupby(df["date"].dt.month).count().plot(kind="bar")

.dt allows you to access the datetime properties.

Which will give you:

groupby date month

You can replace month by year, day, etc.

If you want to distinguish year and month for instance, just do:

df.groupby([df["date"].dt.year, df["date"].dt.month]).count().plot(kind="bar")

Which gives:

groupby date month year

Is that what you wanted? Is it clear?

Hope this helps!
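
As a quick check of the grouping, here is a minimal sketch that rebuilds the nine sample dates and computes the counts without plotting. The names `counts_by_month` and `counts_by_year_month`, and the renaming of the grouping keys to `year` and `month`, are my own additions for readability.

```python
import pandas as pd

# The nine sample dates from the frame shown above.
df = pd.DataFrame({"date": [
    "2001-08-10", "2002-08-31", "2003-08-29", "2006-06-21", "2002-03-27",
    "2003-07-14", "2004-06-15", "2003-08-14", "2003-07-29",
]})
df["date"] = pd.to_datetime(df["date"])

# Rows per calendar month, with all years pooled together.
month_key = df["date"].dt.month.rename("month")
counts_by_month = df.groupby(month_key)["date"].count()
print(counts_by_month)

# Rows per (year, month) pair, which keeps the years apart.
year_key = df["date"].dt.year.rename("year")
counts_by_year_month = df.groupby([year_key, month_key])["date"].count()
print(counts_by_year_month)
```

Appending `.plot(kind="bar")` to either result draws the corresponding bar chart. Months with no dates do not appear at all, so the bars are not evenly spaced in time.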


Answer 1

I think resample might be what you are looking for. In your case, do:

df.set_index('date', inplace=True)
# '1M' means 1 month and '1W' means 1 week; see the pandas documentation on offset aliases
df.resample('1M').count()  # the how='count' keyword was removed from pandas

It is only doing the counting and not the plot, so you then have to make your own plots.

See the pandas resample documentation for more details.

I have run into similar problems. Hope this helps.
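
Since the counting and the plotting are separate steps, here is a minimal sketch of the counting half on a handful of made-up dates. It uses `dt.to_period("M")` instead of `resample`, which is my substitution so that the example does not depend on which month-end alias the installed pandas version accepts. The name `monthly` is an assumption.

```python
import pandas as pd

# Hypothetical sample dates for illustration.
dates = pd.to_datetime(pd.Series([
    "2001-08-10", "2002-08-31", "2003-08-29", "2006-06-21", "2002-03-27",
    "2003-07-14", "2004-06-15", "2003-08-14", "2003-07-29",
]))

# Map each timestamp to its calendar month and count occurrences.
monthly = dates.dt.to_period("M").value_counts().sort_index()

# Add the empty months as zeros so that bars end up evenly spaced in time.
full_range = pd.period_range(monthly.index.min(), monthly.index.max(), freq="M")
monthly = monthly.reindex(full_range, fill_value=0)
print(monthly.tail())
```

Calling `monthly.plot(kind="bar")` afterwards draws the counts. The zero-filled months are what make the result read like a histogram over time rather than a bar chart of the non-empty months only.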


Answer 2

Rendered example

enter image description here

Example Code

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""Create random datetime object."""

# core modules
from datetime import datetime
import random

# 3rd party modules
import pandas as pd
import matplotlib.pyplot as plt


def visualize(df, column_name='start_date', color='#494949', title=''):
    """
    Visualize a dataframe with a date column.

    Parameters
    ----------
    df : Pandas dataframe
    column_name : str
        Column to visualize
    color : str
    title : str
    """
    plt.figure(figsize=(20, 10))
    ax = (df[column_name].groupby(df[column_name].dt.hour)
                         .count()).plot(kind="bar", color=color)
    ax.set_facecolor('#eeeeee')
    ax.set_xlabel("hour of the day")
    ax.set_ylabel("count")
    ax.set_title(title)
    plt.show()


def create_random_datetime(from_date, to_date, rand_type='uniform'):
    """
    Create random date within timeframe.

    Parameters
    ----------
    from_date : datetime object
    to_date : datetime object
    rand_type : {'uniform'}

    Examples
    --------
    >>> random.seed(28041990)
    >>> create_random_datetime(datetime(1990, 4, 28), datetime(2000, 12, 31))
    datetime.datetime(1998, 12, 13, 23, 38, 0, 121628)
    >>> create_random_datetime(datetime(1990, 4, 28), datetime(2000, 12, 31))
    datetime.datetime(2000, 3, 19, 19, 24, 31, 193940)
    """
    delta = to_date - from_date
    if rand_type == 'uniform':
        rand = random.random()
    else:
        raise NotImplementedError('Unknown random mode \'{}\''
                                  .format(rand_type))
    return from_date + rand * delta


def create_df(n=1000):
    """Create a Pandas dataframe with datetime objects."""
    from_date = datetime(1990, 4, 28)
    to_date = datetime(2000, 12, 31)
    sales = [create_random_datetime(from_date, to_date) for _ in range(n)]
    df = pd.DataFrame({'start_date': sales})
    return df


if __name__ == '__main__':
    import doctest
    doctest.testmod()
    df = create_df()
    visualize(df)

Answer 3

I was able to work around this by (1) plotting with matplotlib instead of using the dataframe directly and (2) using the values attribute. See example:

import matplotlib.pyplot as plt

ax = plt.gca()
ax.hist(column.values)

This doesn’t work if I don’t use values, but I don’t know why it does work.


Answer 4

Here is a solution for when you just want a histogram that looks the way you expect. It doesn’t use groupby; instead it converts the datetime values to integers and changes the labels on the plot. The tick labels could be improved by moving them to even locations. With this approach, a kernel density estimation plot (or any other plot) is also possible.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({"datetime": pd.to_datetime(np.random.randint(1582800000000000000, 1583500000000000000, 100, dtype=np.int64))})
fig, ax = plt.subplots()
df["datetime"].astype(np.int64).plot.hist(ax=ax)
labels = ax.get_xticks().tolist()
labels = pd.to_datetime(labels)
ax.set_xticklabels(labels, rotation=90)
plt.show()

Datetime histogram
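
One possible way to get the tick labels onto even locations is to hand the work to matplotlib's date locators. This is a sketch under my own assumptions rather than part of the answer: it converts the timestamps with `matplotlib.dates.date2num` instead of casting to `int64`, uses a seeded random generator for repeatability, and picks the `%Y-%m-%d` label format arbitrarily. The `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# 100 random nanosecond timestamps in the same range as the answer's example.
rng = np.random.default_rng(0)
stamps = pd.to_datetime(rng.integers(1582800000000000000, 1583500000000000000, 100))

fig, ax = plt.subplots()
# date2num turns datetimes into matplotlib's floating-point day numbers,
# which hist can bin like any other numeric data.
counts, edges, _ = ax.hist(mdates.date2num(stamps.to_numpy()), bins=10)

# Let matplotlib choose evenly spaced date ticks and format them as dates.
ax.xaxis.set_major_locator(mdates.AutoDateLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()  # rotate the labels so they do not overlap
```

Finish with `plt.show()` or `fig.savefig(...)` as needed.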


Answer 5

I think you can solve the problem with the following code, which converts the date column to an integer type and then back to datetime:

df['date'] = df['date'].astype(int)
df['date'] = pd.to_datetime(df['date'], unit='s')

To keep only the date, you can add:

pd.DatetimeIndex(df.date).normalize()
df['date'] = pd.DatetimeIndex(df.date).normalize()

Answer 6

I was just having trouble with this as well. I imagine that since you’re working with dates you want to preserve chronological ordering (like I did.)

The workaround then is

import matplotlib.pyplot as plt    
counts = df['date'].value_counts(sort=False)
plt.bar(counts.index,counts)
plt.show()

If anyone knows of a better way, please speak up.

EDIT: for jean above, here’s a sample of the data [I randomly sampled from the full dataset, hence the trivial histogram data.]

print dates
type(dates),type(dates[0])
dates.hist()
plt.show()

Output:

0    2001-07-10
1    2002-05-31
2    2003-08-29
3    2006-06-21
4    2002-03-27
5    2003-07-14
6    2004-06-15
7    2002-01-17
Name: Date, dtype: object
<class 'pandas.core.series.Series'> <type 'datetime.date'>

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-38-f39e334eece0> in <module>()
      2 print dates
      3 print type(dates),type(dates[0])
----> 4 dates.hist()
      5 plt.show()

/anaconda/lib/python2.7/site-packages/pandas/tools/plotting.pyc in hist_series(self, by, ax, grid, xlabelsize, xrot, ylabelsize, yrot, figsize, bins, **kwds)
   2570         values = self.dropna().values
   2571 
-> 2572         ax.hist(values, bins=bins, **kwds)
   2573         ax.grid(grid)
   2574         axes = np.array([ax])

/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.pyc in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   5620             for xi in x:
   5621                 if len(xi) > 0:
-> 5622                     xmin = min(xmin, xi.min())
   5623                     xmax = max(xmax, xi.max())
   5624             bin_range = (xmin, xmax)

TypeError: can't compare datetime.date to float
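
The traceback shows a Series of dtype object holding `datetime.date` values, which `hist` then tries to compare with floats. A minimal sketch of one way around that, assuming the same eight sample dates: convert the column with `pd.to_datetime` first, then count and sort by the index so the bars come out in chronological order. Grouping by year here is my own choice for the example.

```python
from datetime import date

import pandas as pd

# The eight sample dates from the output above, stored as datetime.date objects.
dates = pd.Series([
    date(2001, 7, 10), date(2002, 5, 31), date(2003, 8, 29), date(2006, 6, 21),
    date(2002, 3, 27), date(2003, 7, 14), date(2004, 6, 15), date(2002, 1, 17),
])
print(dates.dtype)  # object, as in the traceback

# Convert to datetime64 so the .dt accessor and numeric comparisons work.
dates = pd.to_datetime(dates)

# Count per year; sort_index guarantees chronological order of the result.
counts = dates.dt.year.value_counts().sort_index()
print(counts)
```

The sorted `counts` can be passed to `plt.bar(counts.index, counts)` exactly as in the workaround, without relying on the order in which `value_counts(sort=False)` happens to return the groups.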

Answer 7

All of these answers seem overly complex; at least with ‘modern’ pandas, it’s two lines.

df.set_index('date', inplace=True)
df.resample('M').size().plot.bar()