Tag archive: data-visualization

Moving the x-axis to the top of a plot in matplotlib

Question: Moving the x-axis to the top of a plot in matplotlib

Based on this question about heatmaps in matplotlib, I wanted to move the x-axis titles to the top of the plot.

import matplotlib.pyplot as plt
import numpy as np
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = np.random.rand(4,4)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data, cmap=plt.cm.Blues)

# put the major ticks at the middle of each cell
ax.set_xticks(np.arange(data.shape[0])+0.5, minor=False)
ax.set_yticks(np.arange(data.shape[1])+0.5, minor=False)

# want a more natural, table-like display
ax.invert_yaxis()
ax.xaxis.set_label_position('top') # <-- This doesn't work!

ax.set_xticklabels(row_labels, minor=False)
ax.set_yticklabels(column_labels, minor=False)
plt.show()

However, calling matplotlib’s set_label_position (as noted in the code above) doesn’t seem to have the desired effect. Here’s my output:

What am I doing wrong?


Answer 0

Use

ax.xaxis.tick_top()

to place the tick marks at the top of the image. The command

ax.set_xlabel('X LABEL')    
ax.xaxis.set_label_position('top') 

affects the label, not the tick marks.

import matplotlib.pyplot as plt
import numpy as np
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = np.random.rand(4, 4)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data, cmap=plt.cm.Blues)

# put the major ticks at the middle of each cell
ax.set_xticks(np.arange(data.shape[1]) + 0.5, minor=False)
ax.set_yticks(np.arange(data.shape[0]) + 0.5, minor=False)

# want a more natural, table-like display
ax.invert_yaxis()
ax.xaxis.tick_top()

ax.set_xticklabels(column_labels, minor=False)
ax.set_yticklabels(row_labels, minor=False)
plt.show()
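
The distinction between the axis label and the ticks can be checked directly. The following is a minimal sketch, assuming a headless run (the Agg backend is chosen only so that no display is needed); the variable names label_position and ticks_position are introduced here for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(4, 4)
fig, ax = plt.subplots()
ax.pcolor(data, cmap=plt.cm.Blues)

# Ticks and tick labels are moved by tick_top()
ax.xaxis.tick_top()

# The axis label (the title of the axis) is moved by set_label_position()
ax.set_xlabel('X LABEL')
ax.xaxis.set_label_position('top')

# Query where matplotlib now places each element
label_position = ax.xaxis.get_label_position()
ticks_position = ax.xaxis.get_ticks_position()
```

Calling only one of the two methods moves only the corresponding element, which is why the code in the question left the tick labels at the bottom.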


Answer 1

You want set_ticks_position rather than set_label_position:

ax.xaxis.set_ticks_position('top') # the rest is the same

This gives me:


Answer 2

tick_params is very useful for setting tick properties. Labels can be moved to the top with:

    ax.tick_params(labelbottom=False,labeltop=True)
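
As a hedged illustration, tick_params can also move the tick marks themselves through its bottom and top keywords. The sketch below assumes a headless Agg backend and inspects the first major tick only to show the effect; labels_on_top and labels_on_bottom are names made up for this example:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.pcolor(np.random.rand(4, 4), cmap=plt.cm.Blues)

# Move both the tick labels and the tick marks of the x-axis to the top edge
ax.tick_params(axis='x', labelbottom=False, labeltop=True, bottom=False, top=True)

# label1 is the bottom label of a tick on the x-axis, label2 the top one
first_tick = ax.xaxis.get_major_ticks()[0]
labels_on_top = first_tick.label2.get_visible()
labels_on_bottom = first_tick.label1.get_visible()
```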

Answer 3

You’ve got to do some extra massaging if you want the ticks (not labels) to show up on the top and bottom (not just the top). The only way I could do this is with a minor change to unutbu’s code:

import matplotlib.pyplot as plt
import numpy as np
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = np.random.rand(4, 4)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data, cmap=plt.cm.Blues)

# put the major ticks at the middle of each cell
ax.set_xticks(np.arange(data.shape[1]) + 0.5, minor=False)
ax.set_yticks(np.arange(data.shape[0]) + 0.5, minor=False)

# want a more natural, table-like display
ax.invert_yaxis()
ax.xaxis.tick_top()
ax.xaxis.set_ticks_position('both') # THIS IS THE ONLY CHANGE

ax.set_xticklabels(column_labels, minor=False)
ax.set_yticklabels(row_labels, minor=False)
plt.show()

Output:


How to plot a histogram using Matplotlib in Python with a list of data?

Question: How to plot a histogram using Matplotlib in Python with a list of data?

I am trying to plot a histogram using the matplotlib.hist() function but I am not sure how to do it.

I have a list

probability = [0.3602150537634409, 0.42028985507246375, 
  0.373117033603708, 0.36813186813186816, 0.32517482517482516, 
  0.4175257731958763, 0.41025641025641024, 0.39408866995073893, 
  0.4143222506393862, 0.34, 0.391025641025641, 0.3130841121495327, 
  0.35398230088495575]

and a list of names (strings).

How do I use the probabilities as the y-values of the bars and the names as the x-values?


Answer 0

If you want a histogram, you don’t need to attach any ‘names’ to the x-values, because the x-axis shows the data bins:

import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
np.random.seed(42)
x = np.random.normal(size=1000)
plt.hist(x, density=True, bins=30)  # `density=False` would make counts
plt.ylabel('Probability')
plt.xlabel('Data');

You can make your histogram a bit fancier with a PDF line, a title, and a legend:

import scipy.stats as st
plt.hist(x, density=True, bins=30, label="Data")
mn, mx = plt.xlim()
plt.xlim(mn, mx)
kde_xs = np.linspace(mn, mx, 301)
kde = st.gaussian_kde(x)
plt.plot(kde_xs, kde.pdf(kde_xs), label="PDF")
plt.legend(loc="upper left")
plt.ylabel('Probability')
plt.xlabel('Data')
plt.title("Histogram");

However, if you have a limited number of data points, as in the OP, a bar plot makes more sense for representing your data (and then you can attach labels to the x-axis):

x = np.arange(3)
plt.bar(x, height=[1,2,3])
plt.xticks(x, ['a','b','c'])
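
To make the effect of density=True concrete, here is a small sketch that uses np.histogram, which computes the same counts and bin edges that plt.hist draws. The seed and sample size are arbitrary choices for this example:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=1000)

# Raw counts per bin, and the normalized (density) heights for the same bins
counts, edges = np.histogram(x, bins=30)
density, _ = np.histogram(x, bins=30, density=True)
widths = np.diff(edges)

total_count = counts.sum()               # equals the number of observations
total_area = (density * widths).sum()    # area under the normalized bars
```

With density=True it is the total area of the bars, not the sum of their heights, that equals 1.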


Answer 1

If you haven’t installed matplotlib yet, install it with:

> pip install matplotlib

Library import

import matplotlib.pyplot as plot

The histogram data:

plot.hist(weightList,density=1, bins=20) 
plot.axis([50, 110, 0, 0.06]) 
#axis([xmin,xmax,ymin,ymax])
plot.xlabel('Weight')
plot.ylabel('Probability')

Display histogram

plot.show()

The output looks like this:


Answer 2

Though the question appears to ask for a histogram plotted with the matplotlib.hist() function, that function arguably cannot do the job, because the latter part of the question asks to use the given probabilities as the y-values of the bars and the given names (strings) as the x-values.

I’m assuming a sample list of names corresponding to given probabilities to draw the plot. A simple bar plot serves the purpose here for the given problem. The following code can be used:

import matplotlib.pyplot as plt
probability = [0.3602150537634409, 0.42028985507246375, 
  0.373117033603708, 0.36813186813186816, 0.32517482517482516, 
  0.4175257731958763, 0.41025641025641024, 0.39408866995073893, 
  0.4143222506393862, 0.34, 0.391025641025641, 0.3130841121495327, 
  0.35398230088495575]
names = ['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 'name7', 'name8', 'name9',
'name10', 'name11', 'name12', 'name13'] #sample names
plt.bar(names, probability)
plt.xticks(names)
plt.yticks(probability) #This may be included or excluded as per need
plt.xlabel('Names')
plt.ylabel('Probability')

Answer 3

This is a very roundabout way of doing it, but if you want to make a histogram where you already know the bin values but don’t have the source data, you can use the np.random.randint function to generate the correct number of values within the range of each bin for the hist function to graph, for example:

import numpy as np
import matplotlib.pyplot as plt

data = [np.random.randint(0, 9, *desired y value*), np.random.randint(10, 19, *desired y value*), etc..]
plt.hist(data, histtype='stepfilled', bins=[0, 10, etc..])

As for labels, you can align the x ticks with the bins to get something like this:

#The following will align labels to the center of each bar with bin intervals of 10
plt.xticks([5, 15, etc.. ], ['Label 1', 'Label 2', etc.. ])
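
A possibly simpler variant of the same idea, offered as a sketch rather than as this answer's method: hist accepts a weights argument, so one point per bin, weighted by the known bar height, gives the same bars without generating random data. The bin edges, heights, and labels below are made-up example values, and the Agg backend is assumed only so the code runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

edges = np.array([0, 10, 20, 30])        # bin boundaries (example values)
heights = np.array([4, 7, 2])            # known bar heights (example values)
centers = (edges[:-1] + edges[1:]) / 2   # one representative point per bin

fig, ax = plt.subplots()
# Each center falls into exactly one bin and contributes its weight as the bar height
counts, _, _ = ax.hist(centers, bins=edges, weights=heights, histtype='stepfilled')

# Align the labels with the center of each bar
ax.set_xticks(centers)
ax.set_xticklabels(['Label 1', 'Label 2', 'Label 3'])
```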

Answer 4

This is an old question, but none of the previous answers has addressed the real issue, namely that the problem is with the question itself.

First, if the probabilities have been already calculated, i.e. the histogram aggregated data is available in a normalized way then the probabilities should add up to 1. They obviously do not and that means that something is wrong here, either with terminology or with the data or in the way the question is asked.
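
The first point is easy to verify. This short check sums the list from the question; the renormalization at the end is only an illustration of what values that do add up to 1 would look like, not a claim about what the data means:

```python
probability = [0.3602150537634409, 0.42028985507246375,
  0.373117033603708, 0.36813186813186816, 0.32517482517482516,
  0.4175257731958763, 0.41025641025641024, 0.39408866995073893,
  0.4143222506393862, 0.34, 0.391025641025641, 0.3130841121495327,
  0.35398230088495575]

# The values sum to roughly 4.88, far from the 1 expected of a normalized histogram
total = sum(probability)

# Illustration only: dividing by the total forces the values to sum to 1
normalized = [p / total for p in probability]
```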

Second, the fact that the labels are provided (and not intervals) would normally mean that the probabilities are of categorical response variable – and a use of a bar plot for plotting the histogram is best (or some hacking of the pyplot’s hist method), Shayan Shafiq’s answer provides the code.

However, as noted in the first point, those probabilities are not correct, so calling a bar plot of them a “histogram” would be wrong: for some reason (perhaps the classes overlap and observations are counted multiple times?) it does not tell the story of a univariate distribution.

Histogram is by definition a graphical representation of the distribution of univariate variable (see https://www.itl.nist.gov/div898/handbook/eda/section3/histogra.htm , https://en.wikipedia.org/wiki/Histogram ) and is created by drawing bars of sizes representing counts or frequencies of observations in selected classes of the variable of interest. If the variable is measured on a continuous scale those classes are bins (intervals). Important part of histogram creation procedure is making a choice of how to group (or keep without grouping) the categories of responses for a categorical variable, or how to split the domain of possible values into intervals (where to put the bin boundaries) for continuous type variable. All observations should be represented, and each one only once in the plot. That means that the sum of the bar sizes should be equal to the total count of observation (or their areas in case of the variable widths, which is a less common approach). Or, if the histogram is normalised then all probabilities must add up to 1.

If the data itself is a list of “probabilities” as a response, i.e. the observations are probability values (of something) for each object of study then the best answer is simply plt.hist(probability) with maybe binning option, and use of x-labels already available is suspicious.

Then bar plot should not be used as histogram but rather simply

import matplotlib.pyplot as plt
probability = [0.3602150537634409, 0.42028985507246375, 
  0.373117033603708, 0.36813186813186816, 0.32517482517482516, 
  0.4175257731958763, 0.41025641025641024, 0.39408866995073893, 
  0.4143222506393862, 0.34, 0.391025641025641, 0.3130841121495327, 
  0.35398230088495575]
plt.hist(probability)
plt.show()

with the results

matplotlib in such case arrives by default with the following histogram values

(array([1., 1., 1., 1., 1., 2., 0., 2., 0., 4.]),
 array([0.31308411, 0.32380469, 0.33452526, 0.34524584, 0.35596641,
        0.36668698, 0.37740756, 0.38812813, 0.39884871, 0.40956928,
        0.42028986]),
 <a list of 10 Patch objects>)

The result is a tuple: the first array contains the observation counts, i.e. what is shown against the y-axis of the plot (they add up to 13, the total number of observations), and the second array holds the interval boundaries for the x-axis.

One can check that they are equally spaced:

x = plt.hist(probability)[1]
for left, right in zip(x[:-1], x[1:]):
  print(left, right, right-left)
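
The same check can be done without drawing anything. The sketch below uses np.histogram, which by default also uses 10 equal-width bins, and np.diff for the bin widths:

```python
import numpy as np

probability = [0.3602150537634409, 0.42028985507246375,
  0.373117033603708, 0.36813186813186816, 0.32517482517482516,
  0.4175257731958763, 0.41025641025641024, 0.39408866995073893,
  0.4143222506393862, 0.34, 0.391025641025641, 0.3130841121495327,
  0.35398230088495575]

# 10 equal-width bins between min and max by default
counts, edges = np.histogram(probability)
widths = np.diff(edges)   # right edge minus left edge for every bin
```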

Or, for example for 3 bins (my judgment call for 13 observations) one would get this histogram

plt.hist(probability, bins=3)

with the plot data “behind the bars” being

The author of the question needs to clarify what is the meaning of the “probability” list of values – is the “probability” just a name of the response variable (then why are there x-labels ready for the histogram, it makes no sense), or are the list values the probabilities calculated from the data (then the fact they do not add up to 1 makes no sense).


Heatmap in matplotlib with pcolor?

Question: Heatmap in matplotlib with pcolor?

I’d like to make a heatmap like this (shown on FlowingData):

The source data is here, but random data and labels would be fine to use, i.e.

import numpy
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = numpy.random.rand(4,4)

Making the heatmap is easy enough in matplotlib:

from matplotlib import pyplot as plt
heatmap = plt.pcolor(data)

And I even found a colormap argument that looks about right: heatmap = plt.pcolor(data, cmap=matplotlib.cm.Blues)

But beyond that, I can’t figure out how to display labels for the columns and rows and display the data in the proper orientation (origin at the top left instead of bottom left).

Attempts to manipulate heatmap.axes (e.g. heatmap.axes.set_xticklabels = column_labels) have all failed. What am I missing here?


Answer 0

This is late, but here is my python implementation of the flowingdata NBA heatmap.

updated:1/4/2014: thanks everyone

# -*- coding: utf-8 -*-
# <nbformat>3.0</nbformat>

# ------------------------------------------------------------------------
# Filename   : heatmap.py
# Date       : 2013-04-19
# Updated    : 2014-01-04
# Author     : @LotzJoe >> Joe Lotz
# Description: My attempt at reproducing the FlowingData graphic in Python
# Source     : http://flowingdata.com/2010/01/21/how-to-make-a-heatmap-a-quick-and-easy-solution/
#
# Other Links:
#     http://stackoverflow.com/questions/14391959/heatmap-in-matplotlib-with-pcolor
#
# ------------------------------------------------------------------------

import matplotlib.pyplot as plt
import pandas as pd
from urllib.request import urlopen  # Python 3 (the original used urllib2 from Python 2)
import numpy as np
%pylab inline

page = urlopen("http://datasets.flowingdata.com/ppg2008.csv")
nba = pd.read_csv(page, index_col=0)

# Normalize data columns
nba_norm = (nba - nba.mean()) / (nba.max() - nba.min())

# Sort data according to Points, lowest to highest
# This was just a design choice made by Yau
# inplace=False (default) ->thanks SO user d1337
nba_sort = nba_norm.sort_values('PTS', ascending=True)  # DataFrame.sort() was removed in pandas 0.20

nba_sort['PTS'].head(10)

# Plot it out
fig, ax = plt.subplots()
heatmap = ax.pcolor(nba_sort, cmap=plt.cm.Blues, alpha=0.8)

# Format
fig = plt.gcf()
fig.set_size_inches(8, 11)

# turn off the frame
ax.set_frame_on(False)

# put the major ticks at the middle of each cell
ax.set_yticks(np.arange(nba_sort.shape[0]) + 0.5, minor=False)
ax.set_xticks(np.arange(nba_sort.shape[1]) + 0.5, minor=False)

# want a more natural, table-like display
ax.invert_yaxis()
ax.xaxis.tick_top()

# Set the labels

# label source:https://en.wikipedia.org/wiki/Basketball_statistics
labels = [
    'Games', 'Minutes', 'Points', 'Field goals made', 'Field goal attempts', 'Field goal percentage', 'Free throws made', 'Free throws attempts', 'Free throws percentage',
    'Three-pointers made', 'Three-point attempt', 'Three-point percentage', 'Offensive rebounds', 'Defensive rebounds', 'Total rebounds', 'Assists', 'Steals', 'Blocks', 'Turnover', 'Personal foul']

# note I could have used nba_sort.columns but made "labels" instead
ax.set_xticklabels(labels, minor=False)
ax.set_yticklabels(nba_sort.index, minor=False)

# rotate the x-axis tick labels
plt.xticks(rotation=90)

ax.grid(False)

# Turn off all the ticks
ax = plt.gca()

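for t in ax.xaxis.get_major_ticks():
    t.tick1line.set_visible(False)  # replaces tick1On/tick2On, removed in newer matplotlib
    t.tick2line.set_visible(False)
for t in ax.yaxis.get_major_ticks():
    t.tick1line.set_visible(False)
    t.tick2line.set_visible(False)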

The output looks like this:

There’s an ipython notebook with all this code here. I’ve learned a lot from ‘overflow so hopefully someone will find this useful.


Answer 1

The python seaborn module is based on matplotlib, and produces a very nice heatmap.

Below is an implementation with seaborn, designed for the ipython/jupyter notebook.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
# import the data directly into a pandas dataframe
nba = pd.read_csv("http://datasets.flowingdata.com/ppg2008.csv", index_col='Name  ')
# remove index title
nba.index.name = ""
# normalize data columns
nba_norm = (nba - nba.mean()) / (nba.max() - nba.min())
# relabel columns
labels = ['Games', 'Minutes', 'Points', 'Field goals made', 'Field goal attempts', 'Field goal percentage', 'Free throws made', 
          'Free throws attempts', 'Free throws percentage','Three-pointers made', 'Three-point attempt', 'Three-point percentage', 
          'Offensive rebounds', 'Defensive rebounds', 'Total rebounds', 'Assists', 'Steals', 'Blocks', 'Turnover', 'Personal foul']
nba_norm.columns = labels
# set appropriate font and dpi
sns.set(font_scale=1.2)
sns.set_style({"savefig.dpi": 100})
# plot it out
ax = sns.heatmap(nba_norm, cmap=plt.cm.Blues, linewidths=.1)
# set the x-axis labels on the top
ax.xaxis.tick_top()
# rotate the x-axis labels
plt.xticks(rotation=90)
# get figure (usually obtained via "fig,ax=plt.subplots()" with matplotlib)
fig = ax.get_figure()
# specify dimensions and save
fig.set_size_inches(15, 20)
fig.savefig("nba.png")

The output looks like this: I used the matplotlib Blues color map, but personally find the default colors quite beautiful. I used matplotlib to rotate the x-axis labels, as I couldn’t find the seaborn syntax. As noted by grexor, it was necessary to specify the dimensions (fig.set_size_inches) by trial and error, which I found a bit frustrating.

As noted by Paul H, you can easily add the values to heat maps (annot=True), but in this case I didn’t think it improved the figure. Several code snippets were taken from the excellent answer by joelotz.


Answer 2

Main issue is that you first need to set the location of your x and y ticks. Also, it helps to use the more object-oriented interface to matplotlib. Namely, interact with the axes object directly.

import matplotlib.pyplot as plt
import numpy as np
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = np.random.rand(4,4)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data)

# put the major ticks at the middle of each cell, notice "reverse" use of dimension
ax.set_yticks(np.arange(data.shape[0])+0.5, minor=False)
ax.set_xticks(np.arange(data.shape[1])+0.5, minor=False)


# columns of data run along the x-axis, rows along the y-axis
ax.set_xticklabels(column_labels, minor=False)
ax.set_yticklabels(row_labels, minor=False)
plt.show()
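
The steps above can be wrapped into a small function. This is a sketch under stated assumptions: labeled_heatmap is a hypothetical helper written for this example (not part of matplotlib), the Agg backend is used only so the code runs without a display, and a non-square array is used so that the role of each dimension is visible:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def labeled_heatmap(data, row_labels, column_labels):
    """Hypothetical helper: draw `data` with pcolor and label its rows and columns."""
    fig, ax = plt.subplots()
    ax.pcolor(data, cmap=plt.cm.Blues)
    # data has shape (rows, columns): columns run along x, rows along y
    ax.set_xticks(np.arange(data.shape[1]) + 0.5)
    ax.set_yticks(np.arange(data.shape[0]) + 0.5)
    ax.set_xticklabels(column_labels)
    ax.set_yticklabels(row_labels)
    ax.invert_yaxis()     # put the origin at the top left
    ax.xaxis.tick_top()   # show the column labels above the grid
    return fig, ax

fig, ax = labeled_heatmap(np.random.rand(3, 5), list('XYZ'), list('ABCDE'))
```

Setting the tick locations before the tick labels matters: the labels are assigned to the ticks in order, so the locations need to be fixed first.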

Hope that helps.


Answer 3

Someone edited this question to remove the code I used, so I was forced to add it as an answer. Thanks to all who participated in answering this question! I think most of the other answers are better than this code, I’m just leaving this here for reference purposes.

With thanks to Paul H, and unutbu (who answered this question), I have some pretty nice-looking output:

import matplotlib.pyplot as plt
import numpy as np
column_labels = list('ABCD')
row_labels = list('WXYZ')
data = np.random.rand(4,4)
fig, ax = plt.subplots()
heatmap = ax.pcolor(data, cmap=plt.cm.Blues)

# put the major ticks at the middle of each cell
ax.set_xticks(np.arange(data.shape[0])+0.5, minor=False)
ax.set_yticks(np.arange(data.shape[1])+0.5, minor=False)

# want a more natural, table-like display
ax.invert_yaxis()
ax.xaxis.tick_top()

ax.set_xticklabels(row_labels, minor=False)
ax.set_yticklabels(column_labels, minor=False)
plt.show()

And here’s the output:


Plot correlation matrix using pandas

Question: Plot correlation matrix using pandas

I have a data set with a huge number of features, so analysing the correlation matrix has become very difficult. I want to plot the correlation matrix that we get from the dataframe.corr() function in the pandas library. Does pandas provide any built-in function to plot this matrix?


Answer 0

You can use pyplot.matshow() from matplotlib:

import matplotlib.pyplot as plt

plt.matshow(dataframe.corr())
plt.show()

Edit:

In the comments was a request for how to change the axis tick labels. Here’s a deluxe version that is drawn on a bigger figure size, has axis labels to match the dataframe, and a colorbar legend to interpret the color scale.

I’m including how to adjust the size and rotation of the labels, and I’m using a figure ratio that makes the colorbar and the main figure come out the same height.

f = plt.figure(figsize=(19, 15))
plt.matshow(df.corr(), fignum=f.number)
plt.xticks(range(df.shape[1]), df.columns, fontsize=14, rotation=45)
plt.yticks(range(df.shape[1]), df.columns, fontsize=14)
cb = plt.colorbar()
cb.ax.tick_params(labelsize=14)
plt.title('Correlation Matrix', fontsize=16);
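The same figure can also be built with matplotlib's object-oriented interface, which avoids relying on the current-figure state. The sketch below is one possible variant, not the answer's own code: the function name plot_corr_matrix, the fixed color limits of -1 to 1, the 'coolwarm' colormap, and the random sample data are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_corr_matrix(df, figsize=(8, 6)):
    """Draw df.corr() with labelled ticks and a colorbar; return fig, ax, corr."""
    corr = df.corr()
    fig, ax = plt.subplots(figsize=figsize)
    # Fixing the limits to [-1, 1] keeps colors comparable between data sets.
    im = ax.matshow(corr.values, vmin=-1, vmax=1, cmap="coolwarm")

    positions = list(range(len(corr.columns)))
    ax.set_xticks(positions)
    ax.set_xticklabels(corr.columns, rotation=45, ha="left")
    ax.set_yticks(positions)
    ax.set_yticklabels(corr.columns)

    fig.colorbar(im, ax=ax)
    ax.set_title("Correlation Matrix")
    return fig, ax, corr


rng = np.random.RandomState(0)
df = pd.DataFrame(rng.rand(50, 4), columns=list("ABCD"))
fig, ax, corr = plot_corr_matrix(df)
```

Returning the figure and axes lets the caller save or adjust the plot afterwards, for example with fig.savefig.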


Answer 1

If your main goal is to visualize the correlation matrix, rather than creating a plot per se, the convenient pandas styling options are a viable built-in solution:

import pandas as pd
import numpy as np

rs = np.random.RandomState(0)
df = pd.DataFrame(rs.rand(10, 10))
corr = df.corr()
corr.style.background_gradient(cmap='coolwarm')
# 'RdBu_r' & 'BrBG' are other good diverging colormaps

Note that this needs to be in a backend that supports rendering HTML, such as the JupyterLab Notebook. (The automatic light text on dark backgrounds is from an existing PR and not the latest released version, pandas 0.23).


Styling

You can easily limit the digit precision (in pandas 1.3 and later, set_precision() is deprecated in favor of .format(precision=2)):

corr.style.background_gradient(cmap='coolwarm').set_precision(2)

Or get rid of the digits altogether if you prefer the matrix without annotations:

corr.style.background_gradient(cmap='coolwarm').set_properties(**{'font-size': '0pt'})

The styling documentation also includes instructions for more advanced styles, such as how to change the display of the cell the mouse pointer is hovering over. To save the output, you can return the HTML by appending the render() method and then write it to a file (or just take a screenshot for less formal purposes).


Time comparison

In my testing, style.background_gradient() was 4x faster than plt.matshow() and 120x faster than sns.heatmap() with a 10×10 matrix. Unfortunately it doesn’t scale as well as plt.matshow(): the two take about the same time for a 100×100 matrix, and plt.matshow() is 10x faster for a 1000×1000 matrix.


Saving

There are a few possible ways to save the stylized dataframe:

  • Return the HTML by appending the render() method (to_html() in pandas 1.4 and later) and then write the output to a file.
  • Save as an .xlsx file with conditional formatting by appending the to_excel() method.
  • Combine with imgkit to save a bitmap.
  • Take a screenshot (for less formal purposes).

Update for pandas >= 0.24

By setting axis=None, it is now possible to compute the colors based on the entire matrix rather than per column or per row:

corr.style.background_gradient(cmap='coolwarm', axis=None)
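To see what axis=None changes, it helps to look at the normalization behind a color gradient: by default each column is scaled against its own minimum and maximum, whereas axis=None scales every cell against the extremes of the whole matrix. The following is a rough numpy/pandas illustration of that idea only. It is not pandas's implementation, and the helper names scale_per_column and scale_whole_matrix are made up.

```python
import numpy as np
import pandas as pd


def scale_per_column(frame):
    """Min-max scale each column on its own (the idea behind axis=0)."""
    return (frame - frame.min()) / (frame.max() - frame.min())


def scale_whole_matrix(frame):
    """Min-max scale all cells against the global extremes (axis=None)."""
    lo, hi = frame.values.min(), frame.values.max()
    return (frame - lo) / (hi - lo)


rs = np.random.RandomState(0)
corr = pd.DataFrame(rs.rand(10, 10)).corr()

per_column = scale_per_column(corr)   # every column spans the full 0..1 range
whole = scale_whole_matrix(corr)      # only the global extremes hit 0 and 1
```

With per-column scaling, equal colors in different columns can stand for different correlation values; with whole-matrix scaling a given color always means the same value.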


Answer 2

Try this function, which also displays variable names for the correlation matrix:

import matplotlib.pyplot as plt

def plot_corr(df, size=10):
    '''Plot a graphical correlation matrix for each pair of columns in the dataframe.

    Input:
        df: pandas DataFrame
        size: vertical and horizontal size of the plot'''

    corr = df.corr()
    fig, ax = plt.subplots(figsize=(size, size))
    ax.matshow(corr)
    # label every row and column with its variable name
    plt.xticks(range(len(corr.columns)), corr.columns)
    plt.yticks(range(len(corr.columns)), corr.columns)

Answer 3

Seaborn’s heatmap version:

import seaborn as sns
corr = dataframe.corr()
sns.heatmap(corr, 
            xticklabels=corr.columns.values,
            yticklabels=corr.columns.values)

Answer 4

You can observe the relation between features either by drawing a heatmap with seaborn or a scatter matrix with pandas.

Scatter Matrix:

pd.plotting.scatter_matrix(dataframe, alpha=0.3, figsize=(14, 8), diagonal='kde')

If you want to visualize each feature’s skewness as well, use seaborn pairplots.

sns.pairplot(dataframe)

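Seaborn heatmap:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

f, ax = plt.subplots(figsize=(10, 8))
corr = dataframe.corr()
# an all-False mask hides nothing; set entries to True to hide cells
sns.heatmap(corr, mask=np.zeros_like(corr, dtype=bool), cmap=sns.diverging_palette(220, 10, as_cmap=True),
            square=True, ax=ax)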

The output will be a correlation map of the features, as in the example below.

The correlation between grocery and detergents is high. Similarly:

Products With High Correlation:
  1. Grocery and Detergents.
Products With Medium Correlation:
  1. Milk and Grocery
  2. Milk and Detergents_Paper
Products With Low Correlation:
  1. Milk and Deli
  2. Frozen and Fresh.
  3. Frozen and Deli.

From pairplots: You can observe the same set of relations from pairplots or a scatter matrix, and these additionally show whether the data is normally distributed.

Note: The above graph is drawn from the same data that was used for the heatmap.
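The mask passed to sns.heatmap in this answer is all False, so nothing is hidden. Because a correlation matrix is symmetric, a common variation is to hide the redundant upper triangle. The sketch below builds such a mask with numpy and applies it with DataFrame.where; the sample data and column names are made up, and passing the same boolean array as mask= to sns.heatmap is the assumed next step.

```python
import numpy as np
import pandas as pd

# Made-up sample data standing in for the real dataframe.
rs = np.random.RandomState(0)
dataframe = pd.DataFrame(rs.rand(20, 4),
                         columns=["Fresh", "Milk", "Grocery", "Frozen"])
corr = dataframe.corr()

# True marks the cells to hide: everything above the diagonal (k=1 keeps
# the diagonal itself visible).
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

# Hidden cells become NaN; the lower triangle and diagonal are unchanged.
lower_only = corr.where(~mask)
```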


Answer 5

You can use the imshow() method from matplotlib:

import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')

plt.imshow(X.corr(), cmap=plt.cm.Reds, interpolation='nearest')
plt.colorbar()
tick_marks = [i for i in range(len(X.columns))]
plt.xticks(tick_marks, X.columns, rotation='vertical')
plt.yticks(tick_marks, X.columns)
plt.show()

Answer 6

If your dataframe is df, you can simply use:

import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(15, 10))
sns.heatmap(df.corr(), annot=True)

Answer 7

statsmodels graphics also gives a nice view of the correlation matrix:

import statsmodels.api as sm
import matplotlib.pyplot as plt

corr = dataframe.corr()
sm.graphics.plot_corr(corr, xnames=list(corr.columns))
plt.show()

Answer 8

For completeness, the simplest solution I know of with seaborn as of late 2019, if one is using Jupyter:

import seaborn as sns
sns.heatmap(dataframe.corr())

Answer 9

Along with the other methods, it is also useful to draw a pairplot, which gives a scatter plot for every pair of columns:

import pandas as pd
import numpy as np
import seaborn as sns
rs = np.random.RandomState(0)
df = pd.DataFrame(rs.rand(10, 10))
sns.pairplot(df)

Answer 10

Form the correlation matrix; in my case zdf is the dataframe for which I need the correlation matrix.

corrMatrix = zdf.corr()
corrMatrix.to_csv('sm_zscaled_correlation_matrix.csv')
html = corrMatrix.style.background_gradient(cmap='RdBu').set_precision(2).render()

# Write the output to an HTML file.
with open('test.html', 'w') as f:
    print('<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><title>Document</title><style>table{word-break: break-all;}</style></head><body>' + html + '</body></html>', file=f)

Then we can take a screenshot, or convert the HTML to an image file.


How do I convert a numpy array to (and display) an image?

Question: How do I convert a numpy array to (and display) an image?

I have created an array thusly:

import numpy as np
data = np.zeros( (512,512,3), dtype=np.uint8)
data[256,256] = [255,0,0]

What I want this to do is display a single red dot in the center of a 512×512 image. (At least to begin with… I think I can figure out the rest from there)


Answer 0

You could use PIL to create (and display) an image:

from PIL import Image
import numpy as np

w, h = 512, 512
data = np.zeros((h, w, 3), dtype=np.uint8)
data[0:256, 0:256] = [255, 0, 0] # red patch in upper left
img = Image.fromarray(data, 'RGB')
img.save('my.png')
img.show()

Answer 1

The following should work:

from matplotlib import pyplot as plt
plt.imshow(data, interpolation='nearest')
plt.show()

If you are using Jupyter notebook/lab, use this inline command before importing matplotlib:

%matplotlib inline 
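When no window can be opened (for example on a server), the same array can be written straight to an image instead of being shown. A minimal sketch, assuming matplotlib's imsave and an in-memory buffer in place of a real file name; the variable names are illustrative:

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# The array from the question: black 512x512 image with one red pixel.
data = np.zeros((512, 512, 3), dtype=np.uint8)
data[256, 256] = [255, 0, 0]

# Encode the array as PNG into memory; a path such as 'dot.png' works too.
buffer = io.BytesIO()
plt.imsave(buffer, data, format="png")
png_bytes = buffer.getvalue()
```

A single pixel in a 512×512 image is easy to miss on screen, so zooming in or drawing a larger patch may help when checking the result.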

Answer 2

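The shortest path is to use scipy, like this (note that scipy.misc.toimage and scipy.misc.imshow were deprecated in SciPy 1.0 and removed in SciPy 1.2, so this only works with older versions):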

from scipy.misc import toimage
toimage(data).show()

This requires PIL or Pillow to be installed as well.

A similar approach also requiring PIL or Pillow but which may invoke a different viewer is:

from scipy.misc import imshow
imshow(data)

Answer 3

Using pygame, you can open a window, get the surface as an array of pixels, and manipulate as you want from there. You’ll need to copy your numpy array into the surface array, however, which will be much slower than doing actual graphics operations on the pygame surfaces themselves.


Answer 4

How to show images stored in a numpy array, with an example (works in Jupyter notebook)

I know there are simpler answers, but this one will give you an understanding of how images are actually drawn from a numpy array.

Load example

from sklearn.datasets import load_digits
digits = load_digits()
digits.images.shape   #this will give you (1797, 8, 8). 1797 images, each 8 x 8 in size

Display array of one image

digits.images[0]
array([[ 0.,  0.,  5., 13.,  9.,  1.,  0.,  0.],
       [ 0.,  0., 13., 15., 10., 15.,  5.,  0.],
       [ 0.,  3., 15.,  2.,  0., 11.,  8.,  0.],
       [ 0.,  4., 12.,  0.,  0.,  8.,  8.,  0.],
       [ 0.,  5.,  8.,  0.,  0.,  9.,  8.,  0.],
       [ 0.,  4., 11.,  0.,  1., 12.,  7.,  0.],
       [ 0.,  2., 14.,  5., 10., 12.,  0.,  0.],
       [ 0.,  0.,  6., 13., 10.,  0.,  0.,  0.]])

Create empty 10 x 10 subplots for visualizing 100 images

import matplotlib.pyplot as plt
fig, axes = plt.subplots(10,10, figsize=(8,8))

Plotting 100 images

for i,ax in enumerate(axes.flat):
    ax.imshow(digits.images[i])

Result:

What does axes.flat do? It returns a flat iterator over the numpy array of axes, so you can loop over every axis in turn and draw on it. Example:

import numpy as np
x = np.arange(6).reshape(2,3)
x.flat
for item in (x.flat):
    print (item, end=' ')
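The iteration order of .flat can be checked on a small array. This sketch uses only numpy; the variable names are illustrative. It shows that .flat walks the array row by row, which is why enumerate(axes.flat) pairs image i with the i-th subplot counted left to right, top to bottom.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# .flat visits the elements in row-major order.
flat_values = [int(value) for value in x.flat]

# enumerate() pairs a running index with each element, as in the subplot loop.
pairs = [(index, int(value)) for index, value in enumerate(x.flat)]

# The flat iterator can also be indexed like a 1-D array.
fifth = int(x.flat[4])           # row 1, column 1
```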

Answer 5

Using Pillow’s fromarray, for example:

from PIL import Image
import numpy as np

im = np.array(Image.open('image.jpg'))  # load the file into a numpy array
Image.fromarray(im).show()

Answer 6

The Python Imaging Library can display images using Numpy arrays. Take a look at this page for sample code:

EDIT: As the note on the bottom of that page says, you should check the latest release notes which make this much simpler:

http://effbot.org/zone/pil-changes-116.htm


Answer 7

A supplement for doing so with matplotlib, which I found handy for computer vision tasks. Let’s say you have data with dtype = int32:

from matplotlib import pyplot as plot
import numpy as np

fig = plot.figure()
ax = fig.add_subplot(1, 1, 1)
# make sure your data is in H W C, otherwise you can change it by
# data = data.transpose((_, _, _))
data = np.zeros((512,512,3), dtype=np.int32)
data[256,256] = [255,0,0]
ax.imshow(data.astype(np.uint8))
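One caveat with the astype(np.uint8) cast above: values outside 0..255 are not saturated but wrap around, which can produce unexpected colors. A small sketch of the difference, assuming that clipping to the valid range is the behavior you want; the sample values are made up:

```python
import numpy as np

raw = np.array([-20, 0, 128, 255, 300], dtype=np.int32)

# A plain cast wraps modulo 256: -20 becomes 236 and 300 becomes 44.
wrapped = raw.astype(np.uint8)

# Clipping first saturates out-of-range values at 0 and 255 instead.
clipped = np.clip(raw, 0, 255).astype(np.uint8)
```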

How to make IPython notebook matplotlib plot inline

Question: How to make IPython notebook matplotlib plot inline

I am trying to use IPython notebook on MacOS X with Python 2.7.2 and IPython 1.1.0.

I cannot get matplotlib graphics to show up inline.

import matplotlib
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline  

I have also tried %pylab inline and the ipython command line arguments --pylab=inline but this makes no difference.

x = np.linspace(0, 3*np.pi, 500)
plt.plot(x, np.sin(x**2))
plt.title('A simple chirp')
plt.show()

Instead of inline graphics, I get this:

<matplotlib.figure.Figure at 0x110b9c450>

And matplotlib.get_backend() shows that I have the 'module://IPython.kernel.zmq.pylab.backend_inline' backend.


Answer 0

I used %matplotlib inline in the first cell of the notebook and it works. I think you should try:

%matplotlib inline

import matplotlib
import numpy as np
import matplotlib.pyplot as plt

You can also always start all your IPython kernels in inline mode by default by setting the following config options in your config files:

c.IPKernelApp.matplotlib=<CaselessStrEnum>
  Default: None
  Choices: ['auto', 'gtk', 'gtk3', 'inline', 'nbagg', 'notebook', 'osx', 'qt', 'qt4', 'qt5', 'tk', 'wx']
  Configure matplotlib for interactive use with the default matplotlib backend.

Answer 1

If your matplotlib version is above 1.4, it is also possible to use

IPython 3.x and above

%matplotlib notebook

import matplotlib.pyplot as plt

older versions

%matplotlib nbagg

import matplotlib.pyplot as plt

Both will activate the nbagg backend, which enables interactivity.


Answer 2

Ctrl + Enter

%matplotlib inline

Magic Line :D

See: Plotting with Matplotlib.


Answer 3

Use the %pylab inline magic command.


Answer 4

To make matplotlib inline by default in Jupyter (IPython 3):

  1. Edit file ~/.ipython/profile_default/ipython_config.py

  2. Add line c.InteractiveShellApp.matplotlib = 'inline'

Please note that adding this line to ipython_notebook_config.py would not work. Otherwise it works well with Jupyter and IPython 3.1.0


Answer 5

I have to agree with foobarbecue (I don’t have enough recs to be able to simply insert a comment under his post):

It’s now recommended that the notebook isn’t started with the --pylab argument, and according to Fernando Perez (creator of the IPython notebook) %matplotlib inline should be the initial notebook command.

See here: http://nbviewer.ipython.org/github/ipython/ipython/blob/1.x/examples/notebooks/Part%203%20-%20Plotting%20with%20Matplotlib.ipynb


Answer 6

I found a workaround that is quite satisfactory. I installed Anaconda Python and this now works out of the box for me.


Answer 7

I did the Anaconda install, but matplotlib was not plotting.

It started plotting when I did this:

import matplotlib
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline  

Answer 8

You can simulate this problem with a syntax mistake, however, %matplotlib inline won’t resolve the issue.

First an example of the right way to create a plot. Everything works as expected with the imports and magic that eNord9 supplied.

df_randNumbers1 = pd.DataFrame(np.random.randint(0,100,size=(100, 6)), columns=list('ABCDEF'))

df_randNumbers1.loc[:,["A","B"]].plot.kde()

However, by leaving the () off the end of the plot type you receive a somewhat ambiguous non-error.

Erroneous code:

df_randNumbers1.loc[:,["A","B"]].plot.kde

Example error:

<bound method FramePlotMethods.kde of <pandas.tools.plotting.FramePlotMethods object at 0x000001DDAF029588>>

Other than this one line message, there is no stack trace or other obvious reason to think you made a syntax error. The plot doesn’t print.
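The behavior described here is ordinary Python rather than anything specific to pandas: naming a method without parentheses evaluates to the method object, and a notebook simply prints its repr. A self-contained sketch with a hypothetical PlotAccessor class standing in for the real plotting accessor:

```python
class PlotAccessor:
    """Hypothetical stand-in for a plotting accessor such as DataFrame.plot."""

    def __init__(self):
        self.calls = 0

    def kde(self):
        """Pretend to draw a plot and count how often that happened."""
        self.calls += 1
        return "plot drawn"


accessor = PlotAccessor()

not_called = accessor.kde   # no parentheses: only a reference to the method
result = accessor.kde()     # parentheses: the method actually runs
```

Seeing "bound method" in a cell's output is therefore a hint that the call parentheses are missing.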


Answer 9

I had the same problem when I was running the plotting commands in separate cells in Jupyter:

In [1]:  %matplotlib inline
         import matplotlib
         import matplotlib.pyplot as plt
         import numpy as np
In [2]:  x = np.array([1, 3, 4])
         y = np.array([1, 5, 3])
In [3]:  fig = plt.figure()
         <Figure size 432x288 with 0 Axes>                      #this might be the problem
In [4]:  ax = fig.add_subplot(1, 1, 1)
In [5]:  ax.scatter(x, y)
Out[5]:  <matplotlib.collections.PathCollection at 0x12341234>  # CAN'T SEE ANY PLOT :(
In [6]:  plt.show()                                             # STILL CAN'T SEE IT :(

The problem was solved by merging the plotting commands into a single cell:

In [1]:  %matplotlib inline
         import matplotlib
         import matplotlib.pyplot as plt
         import numpy as np
In [2]:  x = np.array([1, 3, 4])
         y = np.array([1, 5, 3])
In [3]:  fig = plt.figure()
         ax = fig.add_subplot(1, 1, 1)
         ax.scatter(x, y)
Out[3]:  <matplotlib.collections.PathCollection at 0x12341234>
         # AND HERE APPEARS THE PLOT AS DESIRED :)

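Matplotlib: plotting with Python

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

Check out our home page for more information.

Matplotlib produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, web application servers, and various graphical user interface toolkits.

Install

For installation instructions and requirements, see INSTALL.rst or the install documentation.

Test

After installation, launch the test suite:

python -m pytest

Read the testing guide for more information and alternatives.

Contribute

You've discovered a bug or something else you want to change: excellent!

You've worked out a way to fix it: even better!

You want to tell us about it: best of all!

Start at the contributing guide!

Contact

Discourse is the discussion forum for general questions and discussions, and our recommended starting point.

Our active mailing lists (which are mirrored on Discourse) are:

Gitter is for coordinating development and asking questions directly related to matplotlib.

Citing Matplotlib

If Matplotlib contributes to a project that leads to publication, please acknowledge this by citing Matplotlib.

A ready-made citation entry is available.

Research notice

Please note that this repository is participating in a study into the sustainability of open source projects. Data will be gathered about this repository for approximately 12 months, starting from June 2021.

Data collected will include the number of contributors, the number of PRs, the time taken to close/merge these PRs, and issues closed.

For more information, please visit the informational page or download the participant information sheet.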

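Dash: Analytical web apps for Python, R, Julia, and Jupyter

Dash is the most downloaded, trusted Python framework for building ML and data science web apps.

Built on top of Plotly.js, React, and Flask, Dash ties modern UI elements like dropdowns, sliders, and graphs directly to your analytical Python code. Read our tutorial, proudly crafted ❤️ by Dash itself.

App Samples

Here's a simple example of a Dash app that ties a dropdown to a D3.js Plotly graph. As a user selects a value in the dropdown, the application code dynamically exports data from Google Finance into a pandas DataFrame. This app was written in just 43 lines of code (view the source).
Dash app code is declarative and reactive, which makes it easy to build complex apps that contain many interactive elements. Here's an example with 5 inputs, 3 outputs, and cross filtering. This app was composed in just 160 lines of code, all of which were Python.
Dash uses Plotly.js for charting. Over 35 chart types are supported, including maps.
Dash isn't just for dashboards. You have full control over the look and feel of your applications. Here's a Dash app that's styled to look like a PDF report.

To learn more about Dash, read the extensive announcement letter or jump in with the user guide.

Dash OSS and Dash Enterprise

With Dash Open Source, Dash apps run on your local laptop or workstation, but cannot be easily accessed by others in your organization.

Scale up with Dash Enterprise when your Dash app is ready for department-wide or company-wide consumption. Or, launch your initiative with Dash Enterprise from the start to unlock developer productivity gains and hands-on acceleration from Plotly's team.

ML Ops features: a one-stop shop for ML Ops, with horizontally scalable hosting, deployment, and authentication for your Dash apps. No IT or DevOps required.

  • App manager: Deploy and manage Dash apps without needing IT or a DevOps team. App Manager gives you point-and-click control over all aspects of your Dash deployments.
  • Kubernetes scaling: Ensure high availability of Dash apps and scale horizontally with Dash Enterprise's Kubernetes architecture. No IT or Helm required.
  • No code auth: Control Dash app access in a few clicks. Dash Enterprise supports LDAP, AD, PKI, Okta, SAML, OpenID Connect, OAuth, SSO, and simple email authentication.
  • Job Queue: The Job Queue is the key to building scalable Dash apps. Move heavy computation from synchronous Dash callbacks to the Job Queue for asynchronous background processing.

Low-code features: low-code Dash app capabilities that supercharge developer productivity.

  • Design Kit: Design like a pro without writing a line of CSS. Easily arrange, style, brand, and customize your Dash apps.
  • Snapshot Engine: Save and share Dash app views as links or PDFs. Or, run a Python job through Dash and have Snapshot Engine email a report when the job is done.
  • Dashboard Toolkit: Drag-and-drop layouts, chart editing, and crossfilter for your Dash apps.
  • Embedding: Natively embed Dash apps in an existing web application or website without the use of IFrames.

Enterprise AI features: everything that your data science team needs to rapidly deliver AI/ML research and business initiatives.

  • AI App Marketplace: Dash Enterprise ships with dozens of Dash app templates for business problems where AI/ML is having the greatest impact.
  • Big Data for Python: Connect to Python's most popular big data back ends: Dask, Databricks, NVIDIA RAPIDS, Snowflake, Postgres, Vaex, and more.
  • GPU & Dask Acceleration: Dash Enterprise puts Python's most popular HPC stack for GPU and parallel CPU computing in the hands of business users.
  • Data Science Workspaces: Be productive from day one. Write and execute Python, R, and Julia code from Dash Enterprise's onboard code editor.

See https://plotly.com/contact-us/ to get in touch.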

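Streamlit: The fastest way to build data apps in Python

Welcome to Streamlit 👋

The fastest way to build and share data apps.

Streamlit lets you turn data scripts into shareable web apps in minutes, not weeks. It's all Python, open-source, and free! Once you've created an app, you can use our free sharing platform to deploy, manage, and share your app with the world.

Installation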

pip install streamlit
streamlit hello

Streamlit can also be installed in a virtual environment on Windows, Mac, and Linux.

A little example

Streamlit makes it incredibly easy to build interactive apps:

import streamlit as st

x = st.slider('Select a value')
st.write(x, 'squared is', x * x)

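A bigger example

Streamlit's simple and focused API lets you build incredibly rich and powerful tools. This demo project lets you browse the entire Udacity self-driving-car dataset and run inference in real time using the YOLO object detection net.

The complete demo is implemented in less than 300 lines of Python. In fact, the app contains only 23 Streamlit calls, which illustrates all the major building blocks of Streamlit. You can try it right now at share.streamlit.io/streamlit/demo-self-driving.

The Streamlit GitHub badge

Streamlit's GitHub badge helps others find and play with your Streamlit app.

Once you deploy your app, you can embed this badge right into your GitHub readme.md as follows: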

[![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://share.streamlit.io/yourGitHubName/yourRepo/yourApp/)

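More information

Streamlit for Teams

Streamlit for Teams is our enterprise solution for deploying, managing, sharing, and collaborating on your Streamlit apps. Streamlit for Teams provides secure single-click deployment, authentication, web editing, versioning, and more. It's currently in closed beta, and you can join the wait-list here.

License

Streamlit is completely free and open-source, and is licensed under the Apache 2.0 license.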

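Superset: Apache Superset is a data visualization and data exploration platform

A modern, enterprise-ready business intelligence web application.

Why Superset?

Superset provides:

  • An intuitive interface for visualizing datasets and crafting interactive dashboards
  • A wide array of beautiful visualizations to showcase your data
  • A code-free visualization builder to extract and present datasets
  • A world-class SQL IDE for preparing data for visualization, including a rich metadata browser
  • A lightweight semantic layer that empowers data analysts to quickly define custom dimensions and metrics
  • Out-of-the-box support for most SQL-speaking databases
  • Seamless, in-memory asynchronous caching and queries
  • An extensible security model that allows configuration of very intricate rules on who can access which product features and datasets
  • Integration with major authentication backends (database, OpenID, LDAP, OAuth, REMOTE_USER, etc.)
  • The ability to add custom visualization plugins
  • An API for programmatic customization
  • A cloud-native architecture designed from the ground up for scale

Supported databases

Superset can query data from any SQL-speaking datastore or data engine (e.g. Presto or Athena) that has a Python DB-API driver and a SQLAlchemy dialect.

A more comprehensive list of supported databases, along with configuration instructions, can be found here.

Want to add support for your datastore or data engine? Read more here about the technical requirements.

Installation and configuration

Extended documentation for Superset

Get involved

Contributor guide

Interested in contributing? Check out our CONTRIBUTING.md to find resources around contributing, along with a detailed guide on how to set up a development environment.

Resources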