标签归档:scatter-plot

如何在Python中用散点图绘制散点图?

问题:如何在Python中用散点图绘制散点图?

在Python中,使用Matplotlib,如何绘制带有圆的散点图?目标是在已经由绘制的一些彩色磁盘周围绘制空圆scatter(),以便突出显示它们,理想情况下不必重新绘制彩色圆。

我试过了facecolors=None,无济于事。

In Python, with Matplotlib, how can a scatter plot with empty circles be plotted? The goal is to draw empty circles around some of the colored disks already plotted by scatter(), so as to highlight them, ideally without having to redraw the colored circles.

I tried facecolors=None, to no avail.


回答 0

从分散的文档中:

Optional kwargs control the Collection properties; in particular:

    edgecolors:
        The string none to plot faces with no outlines
    facecolors:
        The string none to plot unfilled outlines

请尝试以下操作:

import matplotlib.pyplot as plt 
import numpy as np 

x = np.random.randn(60) 
y = np.random.randn(60)

plt.scatter(x, y, s=80, facecolors='none', edgecolors='r')
plt.show()

注意:对于其他类型的地块看到这个帖子的使用markeredgecolormarkerfacecolor

From the documentation for scatter:

Optional kwargs control the Collection properties; in particular:

    edgecolors:
        The string ‘none’ to plot faces with no outlines
    facecolors:
        The string ‘none’ to plot unfilled outlines

Try the following:

import matplotlib.pyplot as plt 
import numpy as np 

x = np.random.randn(60) 
y = np.random.randn(60)

plt.scatter(x, y, s=80, facecolors='none', edgecolors='r')
plt.show()

Note: For other types of plots see this post on the use of markeredgecolor and markerfacecolor.


回答 1

这些行得通吗?

plt.scatter(np.random.randn(100), np.random.randn(100), facecolors='none')

或使用plot()

plt.plot(np.random.randn(100), np.random.randn(100), 'o', mfc='none')

Would these work?

plt.scatter(np.random.randn(100), np.random.randn(100), facecolors='none')

or using plot()

plt.plot(np.random.randn(100), np.random.randn(100), 'o', mfc='none')


回答 2

这是另一种方式:这会在当前轴,图或图像等上添加一个圆:

from matplotlib.patches import Circle  # $matplotlib/patches.py

def circle( xy, radius, color="lightsteelblue", facecolor="none", alpha=1, ax=None ):
    """ add a circle to ax= or current axes
    """
        # from .../pylab_examples/ellipse_demo.py
    e = Circle( xy=xy, radius=radius )
    if ax is None:
        ax = pl.gca()  # ax = subplot( 1,1,1 )
    ax.add_artist(e)
    e.set_clip_box(ax.bbox)
    e.set_edgecolor( color )
    e.set_facecolor( facecolor )  # "none" not None
    e.set_alpha( alpha )

(由于,图片中的圆圈被挤压成椭圆形imshow aspect="auto")。

Here’s another way: this adds a circle to the current axes, plot or image or whatever :

from matplotlib.patches import Circle  # $matplotlib/patches.py

def circle( xy, radius, color="lightsteelblue", facecolor="none", alpha=1, ax=None ):
    """ add a circle to ax= or current axes
    """
        # from .../pylab_examples/ellipse_demo.py
    e = Circle( xy=xy, radius=radius )
    if ax is None:
        ax = pl.gca()  # ax = subplot( 1,1,1 )
    ax.add_artist(e)
    e.set_clip_box(ax.bbox)
    e.set_edgecolor( color )
    e.set_facecolor( facecolor )  # "none" not None
    e.set_alpha( alpha )

(The circles in the picture get squashed to ellipses because imshow aspect="auto" ).


回答 3

在matplotlib 2.0中,有一个名为的参数fillstyle ,可以更好地控制标记的填充方式。就我而言,我已将其与错误栏一起使用,但它可用于一般http://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.errorbar.html中的标记

fillstyle接受以下值:[‘full’| “左” | ‘正确’| “底部” | ‘顶部’| ‘没有’]

使用时有两点要牢记fillstyle

1)如果将mfc设置为任何类型的值,它将具有优先权,因此,如果您将fillstyle设置为“ none”,则它不会生效。因此,请避免同时使用mfc和fillstyle

2)您可能想要控制标记的边缘宽度(使用markeredgewidthmew),因为如果标记相对较小且边缘宽度较厚,则标记看起来会像已填充,即使没有。

以下是使用错误栏的示例:

myplot.errorbar(x=myXval, y=myYval, yerr=myYerrVal, fmt='o', fillstyle='none', ecolor='blue',  mec='blue')

In matplotlib 2.0 there is a parameter called fillstyle which allows better control on the way markers are filled. In my case I have used it with errorbars but it works for markers in general http://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.errorbar.html

fillstyle accepts the following values: [‘full’ | ‘left’ | ‘right’ | ‘bottom’ | ‘top’ | ‘none’]

There are two important things to keep in mind when using fillstyle,

1) If mfc is set to any kind of value it will take priority, hence, if you did set fillstyle to ‘none’ it would not take effect. So avoid using mfc in conjuntion with fillstyle

2) You might want to control the marker edge width (using markeredgewidth or mew) because if the marker is relatively small and the edge width is thick, the markers will look like filled even though they are not.

Following is an example using errorbars:

myplot.errorbar(x=myXval, y=myYval, yerr=myYerrVal, fmt='o', fillstyle='none', ecolor='blue',  mec='blue')

回答 4

基于Gary Kerr的示例,如此处所建议可以使用以下代码创建与指定值相关的空圆:

import matplotlib.pyplot as plt 
import numpy as np 
from matplotlib.markers import MarkerStyle

x = np.random.randn(60) 
y = np.random.randn(60)
z = np.random.randn(60)

g=plt.scatter(x, y, s=80, c=z)
g.set_facecolor('none')
plt.colorbar()
plt.show()

Basend on the example of Gary Kerr and as proposed here one may create empty circles related to specified values with following code:

import matplotlib.pyplot as plt 
import numpy as np 
from matplotlib.markers import MarkerStyle

x = np.random.randn(60) 
y = np.random.randn(60)
z = np.random.randn(60)

g=plt.scatter(x, y, s=80, c=z)
g.set_facecolor('none')
plt.colorbar()
plt.show()

回答 5

因此,我假设您想突出显示符合特定条件的一些要点。您可以使用Prelude的命令对高亮点进行第二次散点图绘制,并用一个空圆进行第一次散点图绘制。确保s参数足够小,以使较大的空圆圈包围较小的填充圆。

另一个选择是不使用散点图,而使用circle / ellipse命令分别绘制补丁。这些位于matplotlib.patches中,是一些有关如何绘制圆形矩形等的示例代码。

So I assume you want to highlight some points that fit a certain criteria. You can use Prelude’s command to do a second scatter plot of the hightlighted points with an empty circle and a first call to plot all the points. Make sure the s paramter is sufficiently small for the larger empty circles to enclose the smaller filled ones.

The other option is to not use scatter and draw the patches individually using the circle/ellipse command. These are in matplotlib.patches, here is some sample code on how to draw circles rectangles etc.


在matplotlib上的散点图中为每个系列设置不同的颜色

问题:在matplotlib上的散点图中为每个系列设置不同的颜色

假设我有三个数据集:

X = [1,2,3,4]
Y1 = [4,8,12,16]
Y2 = [1,4,9,16]

我可以散点图:

from matplotlib import pyplot as plt
plt.scatter(X,Y1,color='red')
plt.scatter(X,Y2,color='blue')
plt.show()

我怎样用10套来做到这一点?

我进行了搜索,可以找到我所要求的任何参考。

编辑:澄清(希望)我的问题

如果我多次调用散点图,则只能在每个散点图上设置相同的颜色。另外,我知道我可以手动设置颜色阵列,但是我敢肯定有更好的方法可以做到这一点。我的问题是:“如何自动散布我的几个数据集,每个数据集具有不同的颜色。

如果有帮助,我可以轻松地为每个数据集分配一个唯一的编号。

Suppose I have three data sets:

X = [1,2,3,4]
Y1 = [4,8,12,16]
Y2 = [1,4,9,16]

I can scatter plot this:

from matplotlib import pyplot as plt
plt.scatter(X,Y1,color='red')
plt.scatter(X,Y2,color='blue')
plt.show()

How can I do this with 10 sets?

I searched for this and could find any reference to what I’m asking.

Edit: clarifying (hopefully) my question

If I call scatter multiple times, I can only set the same color on each scatter. Also, I know I can set a color array manually but I’m sure there is a better way to do this. My question is then, “How can I automatically scatter-plot my several data sets, each with a different color.

If that helps, I can easily assign a unique number to each data set.


回答 0

我不知道“手动”是什么意思。您可以选择一个颜色图并足够容易地创建颜色阵列:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

x = np.arange(10)
ys = [i+x+(i*x)**2 for i in range(10)]

colors = cm.rainbow(np.linspace(0, 1, len(ys)))
for y, c in zip(ys, colors):
    plt.scatter(x, y, color=c)

或者,您可以使用itertools.cycle并指定要循环显示的颜色来制作自己的颜色循环仪,并使用next来获得所需的颜色。例如,使用3种颜色:

import itertools

colors = itertools.cycle(["r", "b", "g"])
for y in ys:
    plt.scatter(x, y, color=next(colors))

想一想,也许最好不要同时使用zip第一个:

colors = iter(cm.rainbow(np.linspace(0, 1, len(ys))))
for y in ys:
    plt.scatter(x, y, color=next(colors))

I don’t know what you mean by ‘manually’. You can choose a colourmap and make a colour array easily enough:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

x = np.arange(10)
ys = [i+x+(i*x)**2 for i in range(10)]

colors = cm.rainbow(np.linspace(0, 1, len(ys)))
for y, c in zip(ys, colors):
    plt.scatter(x, y, color=c)

Or you can make your own colour cycler using itertools.cycle and specifying the colours you want to loop over, using next to get the one you want. For example, with 3 colours:

import itertools

colors = itertools.cycle(["r", "b", "g"])
for y in ys:
    plt.scatter(x, y, color=next(colors))

Come to think of it, maybe it’s cleaner not to use zip with the first one neither:

colors = iter(cm.rainbow(np.linspace(0, 1, len(ys))))
for y in ys:
    plt.scatter(x, y, color=next(colors))

回答 1

在matplotlib中用不同颜色的点绘制图的正常方法是传递颜色列表作为参数。

例如:

import matplotlib.pyplot
matplotlib.pyplot.scatter([1,2,3],[4,5,6],color=['red','green','blue'])

当您有一个列表列表时,您希望每个列表都带有颜色。我认为最优雅的方法是@DSM建议,只需做一个循环进行多次调用即可分散。

但是,如果由于某种原因您只想打一个电话,就可以制作一个大的颜色列表,并具有列表理解力和一些地板分割:

import matplotlib
import numpy as np

X = [1,2,3,4]
Ys = np.array([[4,8,12,16],
      [1,4,9,16],
      [17, 10, 13, 18],
      [9, 10, 18, 11],
      [4, 15, 17, 6],
      [7, 10, 8, 7],
      [9, 0, 10, 11],
      [14, 1, 15, 5],
      [8, 15, 9, 14],
       [20, 7, 1, 5]])
nCols = len(X)  
nRows = Ys.shape[0]

colors = matplotlib.cm.rainbow(np.linspace(0, 1, len(Ys)))

cs = [colors[i//len(X)] for i in range(len(Ys)*len(X))] #could be done with numpy's repmat
Xs=X*nRows #use list multiplication for repetition
matplotlib.pyplot.scatter(Xs,Ys.flatten(),color=cs)

cs = [array([ 0.5,  0. ,  1. ,  1. ]),
 array([ 0.5,  0. ,  1. ,  1. ]),
 array([ 0.5,  0. ,  1. ,  1. ]),
 array([ 0.5,  0. ,  1. ,  1. ]),
 array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
 array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
 array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
 array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
 ...
 array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
          1.00000000e+00]),
 array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
          1.00000000e+00]),
 array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
          1.00000000e+00]),
 array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
          1.00000000e+00])]

The normal way to plot plots with points in different colors in matplotlib is to pass a list of colors as a parameter.

E.g.:

import matplotlib.pyplot
matplotlib.pyplot.scatter([1,2,3],[4,5,6],color=['red','green','blue'])

When you have a list of lists and you want them colored per list. I think the most elegant way is that suggesyted by @DSM, just do a loop making multiple calls to scatter.

But if for some reason you wanted to do it with just one call, you can make a big list of colors, with a list comprehension and a bit of flooring division:

import matplotlib
import numpy as np

X = [1,2,3,4]
Ys = np.array([[4,8,12,16],
      [1,4,9,16],
      [17, 10, 13, 18],
      [9, 10, 18, 11],
      [4, 15, 17, 6],
      [7, 10, 8, 7],
      [9, 0, 10, 11],
      [14, 1, 15, 5],
      [8, 15, 9, 14],
       [20, 7, 1, 5]])
nCols = len(X)  
nRows = Ys.shape[0]

colors = matplotlib.cm.rainbow(np.linspace(0, 1, len(Ys)))

cs = [colors[i//len(X)] for i in range(len(Ys)*len(X))] #could be done with numpy's repmat
Xs=X*nRows #use list multiplication for repetition
matplotlib.pyplot.scatter(Xs,Ys.flatten(),color=cs)

cs = [array([ 0.5,  0. ,  1. ,  1. ]),
 array([ 0.5,  0. ,  1. ,  1. ]),
 array([ 0.5,  0. ,  1. ,  1. ]),
 array([ 0.5,  0. ,  1. ,  1. ]),
 array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
 array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
 array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
 array([ 0.28039216,  0.33815827,  0.98516223,  1.        ]),
 ...
 array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
          1.00000000e+00]),
 array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
          1.00000000e+00]),
 array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
          1.00000000e+00]),
 array([  1.00000000e+00,   1.22464680e-16,   6.12323400e-17,
          1.00000000e+00])]

回答 2

一个简单的解决方法

如果您只有一种类型的集合(例如,没有误差线的散点图),则还可以在绘制它们后更改颜色,这有时更易于执行。

import matplotlib.pyplot as plt
from random import randint
import numpy as np

#Let's generate some random X, Y data X = [ [frst group],[second group] ...]
X = [ [randint(0,50) for i in range(0,5)] for i in range(0,24)]
Y = [ [randint(0,50) for i in range(0,5)] for i in range(0,24)]
labels = range(1,len(X)+1)

fig = plt.figure()
ax = fig.add_subplot(111)
for x,y,lab in zip(X,Y,labels):
        ax.scatter(x,y,label=lab)

您唯一需要的一段代码:

#Now this is actually the code that you need, an easy fix your colors just cut and paste not you need ax.
colormap = plt.cm.gist_ncar #nipy_spectral, Set1,Paired  
colorst = [colormap(i) for i in np.linspace(0, 0.9,len(ax.collections))]       
for t,j1 in enumerate(ax.collections):
    j1.set_color(colorst[t])


ax.legend(fontsize='small')

即使在同一子图中有许多不同的散点图,输出也会为您提供不同的颜色。

An easy fix

If you have only one type of collections (e.g. scatter with no error bars) you can also change the colours after that you have plotted them, this sometimes is easier to perform.

import matplotlib.pyplot as plt
from random import randint
import numpy as np

#Let's generate some random X, Y data X = [ [frst group],[second group] ...]
X = [ [randint(0,50) for i in range(0,5)] for i in range(0,24)]
Y = [ [randint(0,50) for i in range(0,5)] for i in range(0,24)]
labels = range(1,len(X)+1)

fig = plt.figure()
ax = fig.add_subplot(111)
for x,y,lab in zip(X,Y,labels):
        ax.scatter(x,y,label=lab)

The only piece of code that you need:

#Now this is actually the code that you need, an easy fix your colors just cut and paste not you need ax.
colormap = plt.cm.gist_ncar #nipy_spectral, Set1,Paired  
colorst = [colormap(i) for i in np.linspace(0, 0.9,len(ax.collections))]       
for t,j1 in enumerate(ax.collections):
    j1.set_color(colorst[t])


ax.legend(fontsize='small')

The output gives you differnent colors even when you have many different scatter plots in the same subplot.


回答 3

您可以始终plot()像这样使用该函数:

import matplotlib.pyplot as plt

import numpy as np

x = np.arange(10)
ys = [i+x+(i*x)**2 for i in range(10)]
plt.figure()
for y in ys:
    plt.plot(x, y, 'o')
plt.show()

You can always use the plot() function like so:

import matplotlib.pyplot as plt

import numpy as np

x = np.arange(10)
ys = [i+x+(i*x)**2 for i in range(10)]
plt.figure()
for y in ys:
    plt.plot(x, y, 'o')
plt.show()


回答 4

在2013年1月和matplotlib 1.3.1(2013年8月)之前,这个问题有点棘手,您可以在matpplotlib网站上找到最旧的稳定版本。但是在那之后,它是微不足道的。

因为当前版本的matplotlib.pylab.scatter支持分配:颜色名称字符串数组,带有颜色映射的浮点数数组,RGB或RGBA数组。

此答案表示@Oxinabox对在2015年更正2013年版本的我的无尽热情。


您有两个选择,可以在单个调用中使用具有多种颜色的scatter命令。

  1. 作为pylab.scatter命令支持,请使用RGBA数组执行所需的任何颜色;

  2. 早在2013年初,就没有办法这样做,因为该命令仅支持整个散点集合的单一颜色。当我执行10000行项目时,我想出了一个通用的解决方案来绕过它。所以它很俗气,但是我可以做任何形状,颜色,大小和透明的东西。此技巧也可以应用于绘制路径集合,线集合…。

该代码也受到的源代码的启发pyplot.scatter,我只是复制了散点图,而没有触发它绘制。

该命令pyplot.scatter返回一个PatchCollection对象,在文件“matplotlib / collections.py”私有变量_facecolorsCollection类和方法set_facecolors

因此,只要有散点可以绘制,就可以这样做:

# rgbaArr is a N*4 array of float numbers you know what I mean
# X is a N*2 array of coordinates
# axx is the axes object that current draw, you get it from
# axx = fig.gca()

# also import these, to recreate the within env of scatter command 
import matplotlib.markers as mmarkers
import matplotlib.transforms as mtransforms
from matplotlib.collections import PatchCollection
import matplotlib.markers as mmarkers
import matplotlib.patches as mpatches


# define this function
# m is a string of scatter marker, it could be 'o', 's' etc..
# s is the size of the point, use 1.0
# dpi, get it from axx.figure.dpi
def addPatch_point(m, s, dpi):
    marker_obj = mmarkers.MarkerStyle(m)
    path = marker_obj.get_path()
    trans = mtransforms.Affine2D().scale(np.sqrt(s*5)*dpi/72.0)
    ptch = mpatches.PathPatch(path, fill = True, transform = trans)
    return ptch

patches = []
# markerArr is an array of maker string, ['o', 's'. 'o'...]
# sizeArr is an array of size float, [1.0, 1.0. 0.5...]

for m, s in zip(markerArr, sizeArr):
    patches.append(addPatch_point(m, s, axx.figure.dpi))

pclt = PatchCollection(
                patches,
                offsets = zip(X[:,0], X[:,1]),
                transOffset = axx.transData)

pclt.set_transform(mtransforms.IdentityTransform())
pclt.set_edgecolors('none') # it's up to you
pclt._facecolors = rgbaArr

# in the end, when you decide to draw
axx.add_collection(pclt)
# and call axx's parent to draw_idle()

This question is a bit tricky before Jan 2013 and matplotlib 1.3.1 (Aug 2013), which is the oldest stable version you can find on matpplotlib website. But after that it is quite trivial.

Because present version of matplotlib.pylab.scatter support assigning: array of colour name string, array of float number with colour map, array of RGB or RGBA.

this answer is dedicate to @Oxinabox’s endless passion for correcting the 2013 version of myself in 2015.


you have two option of using scatter command with multiple colour in a single call.

  1. as pylab.scatter command support use RGBA array to do whatever colour you want;

  2. back in early 2013, there is no way to do so, since the command only support single colour for the whole scatter point collection. When I was doing my 10000-line project I figure out a general solution to bypass it. so it is very tacky, but I can do it in whatever shape, colour, size and transparent. this trick also could be apply to draw path collection, line collection….

the code is also inspired by the source code of pyplot.scatter, I just duplicated what scatter does without trigger it to draw.

the command pyplot.scatter return a PatchCollection Object, in the file “matplotlib/collections.py” a private variable _facecolors in Collection class and a method set_facecolors.

so whenever you have a scatter points to draw you can do this:

# rgbaArr is a N*4 array of float numbers you know what I mean
# X is a N*2 array of coordinates
# axx is the axes object that current draw, you get it from
# axx = fig.gca()

# also import these, to recreate the within env of scatter command 
import matplotlib.markers as mmarkers
import matplotlib.transforms as mtransforms
from matplotlib.collections import PatchCollection
import matplotlib.markers as mmarkers
import matplotlib.patches as mpatches


# define this function
# m is a string of scatter marker, it could be 'o', 's' etc..
# s is the size of the point, use 1.0
# dpi, get it from axx.figure.dpi
def addPatch_point(m, s, dpi):
    marker_obj = mmarkers.MarkerStyle(m)
    path = marker_obj.get_path()
    trans = mtransforms.Affine2D().scale(np.sqrt(s*5)*dpi/72.0)
    ptch = mpatches.PathPatch(path, fill = True, transform = trans)
    return ptch

patches = []
# markerArr is an array of maker string, ['o', 's'. 'o'...]
# sizeArr is an array of size float, [1.0, 1.0. 0.5...]

for m, s in zip(markerArr, sizeArr):
    patches.append(addPatch_point(m, s, axx.figure.dpi))

pclt = PatchCollection(
                patches,
                offsets = zip(X[:,0], X[:,1]),
                transOffset = axx.transData)

pclt.set_transform(mtransforms.IdentityTransform())
pclt.set_edgecolors('none') # it's up to you
pclt._facecolors = rgbaArr

# in the end, when you decide to draw
axx.add_collection(pclt)
# and call axx's parent to draw_idle()

回答 5

这对我有用:

对于每个系列,请使用随机的RGB颜色生成器

c = color[np.random.random_sample(), np.random.random_sample(), np.random.random_sample()]

This works for me:

for each series, use a random rgb colour generator

c = color[np.random.random_sample(), np.random.random_sample(), np.random.random_sample()]

回答 6

对于大型数据集和有限数量的颜色,一种更快的解决方案是使用Pandas和groupby函数:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import time


# a generic set of data with associated colors
nsamples=1000
x=np.random.uniform(0,10,nsamples)
y=np.random.uniform(0,10,nsamples)
colors={0:'r',1:'g',2:'b',3:'k'}
c=[colors[i] for i in np.round(np.random.uniform(0,3,nsamples),0)]

plt.close('all')

# "Fast" Scatter plotting
starttime=time.time()
# 1) make a dataframe
df=pd.DataFrame()
df['x']=x
df['y']=y
df['c']=c
plt.figure()
# 2) group the dataframe by color and loop
for g,b in df.groupby(by='c'):
    plt.scatter(b['x'],b['y'],color=g)
print('Fast execution time:', time.time()-starttime)

# "Slow" Scatter plotting
starttime=time.time()
plt.figure()
# 2) group the dataframe by color and loop
for i in range(len(x)):
    plt.scatter(x[i],y[i],color=c[i])
print('Slow execution time:', time.time()-starttime)

plt.show()

A MUCH faster solution for large dataset and limited number of colors is the use of Pandas and the groupby function:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import time


# a generic set of data with associated colors
nsamples=1000
x=np.random.uniform(0,10,nsamples)
y=np.random.uniform(0,10,nsamples)
colors={0:'r',1:'g',2:'b',3:'k'}
c=[colors[i] for i in np.round(np.random.uniform(0,3,nsamples),0)]

plt.close('all')

# "Fast" Scatter plotting
starttime=time.time()
# 1) make a dataframe
df=pd.DataFrame()
df['x']=x
df['y']=y
df['c']=c
plt.figure()
# 2) group the dataframe by color and loop
for g,b in df.groupby(by='c'):
    plt.scatter(b['x'],b['y'],color=g)
print('Fast execution time:', time.time()-starttime)

# "Slow" Scatter plotting
starttime=time.time()
plt.figure()
# 2) group the dataframe by color and loop
for i in range(len(x)):
    plt.scatter(x[i],y[i],color=c[i])
print('Slow execution time:', time.time()-starttime)

plt.show()

Matplotlib散点图在每个数据点具有不同的文本

问题:Matplotlib散点图在每个数据点具有不同的文本

我正在尝试绘制散点图,并用列表中的不同数字注释数据点。因此,例如,我想绘制yvs x并使用中的相应数字进行注释n

y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]
ax = fig.add_subplot(111)
ax1.scatter(z, y, fmt='o')

有任何想法吗?

I am trying to make a scatter plot and annotate data points with different numbers from a list. So, for example, I want to plot y vs x and annotate with corresponding numbers from n.

y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]
ax = fig.add_subplot(111)
ax1.scatter(z, y, fmt='o')

Any ideas?


回答 0

我不知道有任何采用数组或列表的绘图方法,但可以annotate()在对中的值进行迭代时使用n

y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]

fig, ax = plt.subplots()
ax.scatter(z, y)

for i, txt in enumerate(n):
    ax.annotate(txt, (z[i], y[i]))

的格式设置选项很多annotate(),请参见matplotlib网站:

I’m not aware of any plotting method which takes arrays or lists but you could use annotate() while iterating over the values in n.

y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]

fig, ax = plt.subplots()
ax.scatter(z, y)

for i, txt in enumerate(n):
    ax.annotate(txt, (z[i], y[i]))

There are a lot of formatting options for annotate(), see the matplotlib website:


回答 1

在版本低于matplotlib 2.0的版本中,ax.scatter绘制没有标记的文本不是必需的。在2.0版中,您需要ax.scatter为文本设置适当的范围和标记。

y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]

fig, ax = plt.subplots()

for i, txt in enumerate(n):
    ax.annotate(txt, (z[i], y[i]))

在此链接中,您可以找到3d中的示例。

In version’s earlier than matplotlib 2.0, ax.scatter is not necessary to plot text without markers. In version 2.0 you’ll need ax.scatter to set the proper range and markers for text.

y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]

fig, ax = plt.subplots()

for i, txt in enumerate(n):
    ax.annotate(txt, (z[i], y[i]))

And in this link you can find an example in 3d.


回答 2

如果有人试图将上述解决方案应用于.scatter()而不是.subplot(),

我尝试运行以下代码

y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]

fig, ax = plt.scatter(z, y)

for i, txt in enumerate(n):
    ax.annotate(txt, (z[i], y[i]))

但是遇到错误,指出“无法解压缩不可迭代的PathCollection对象”,该错误专门指向代码行无花果,ax = plt.scatter(z,y)

我最终使用以下代码解决了该错误

plt.scatter(z, y)

for i, txt in enumerate(n):
    plt.annotate(txt, (z[i], y[i]))

我没想到我应该更清楚地知道.scatter()和.subplot()之间没有区别。

In case anyone is trying to apply the above solutions to a .scatter() instead of a .subplot(),

I tried running the following code

y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]

fig, ax = plt.scatter(z, y)

for i, txt in enumerate(n):
    ax.annotate(txt, (z[i], y[i]))

But ran into errors stating “cannot unpack non-iterable PathCollection object”, with the error specifically pointing at codeline fig, ax = plt.scatter(z, y)

I eventually solved the error using the following code

plt.scatter(z, y)

for i, txt in enumerate(n):
    plt.annotate(txt, (z[i], y[i]))

I didn’t expect there to be a difference between .scatter() and .subplot() I should have known better.


回答 3

您也可以使用pyplot.text(请参阅此处)。

def plot_embeddings(M_reduced, word2Ind, words):
""" Plot in a scatterplot the embeddings of the words specified in the list "words".
    Include a label next to each point.
"""
for word in words:
    x, y = M_reduced[word2Ind[word]]
    plt.scatter(x, y, marker='x', color='red')
    plt.text(x+.03, y+.03, word, fontsize=9)
plt.show()

M_reduced_plot_test = np.array([[1, 1], [-1, -1], [1, -1], [-1, 1], [0, 0]])
word2Ind_plot_test = {'test1': 0, 'test2': 1, 'test3': 2, 'test4': 3, 'test5': 4}
words = ['test1', 'test2', 'test3', 'test4', 'test5']
plot_embeddings(M_reduced_plot_test, word2Ind_plot_test, words)

You may also use pyplot.text (see here).

def plot_embeddings(M_reduced, word2Ind, words):
    """ 
        Plot in a scatterplot the embeddings of the words specified in the list "words".
        Include a label next to each point.
    """
    for word in words:
        x, y = M_reduced[word2Ind[word]]
        plt.scatter(x, y, marker='x', color='red')
        plt.text(x+.03, y+.03, word, fontsize=9)
    plt.show()

M_reduced_plot_test = np.array([[1, 1], [-1, -1], [1, -1], [-1, 1], [0, 0]])
word2Ind_plot_test = {'test1': 0, 'test2': 1, 'test3': 2, 'test4': 3, 'test5': 4}
words = ['test1', 'test2', 'test3', 'test4', 'test5']
plot_embeddings(M_reduced_plot_test, word2Ind_plot_test, words)


回答 4

Python 3.6及更高版本:

coordinates = [('a',1,2), ('b',3,4), ('c',5,6)]
for x in coordinates: plt.annotate(x[0], (x[1], x[2]))

Python 3.6+:

coordinates = [('a',1,2), ('b',3,4), ('c',5,6)]
for x in coordinates: plt.annotate(x[0], (x[1], x[2]))

回答 5

作为一个使用列表理解和numpy的班轮:

[ax.annotate(x[0], (x[1], x[2])) for x in np.array([n,z,y]).T]

设置与Rutger的答案相同。

As a one liner using list comprehension and numpy:

[ax.annotate(x[0], (x[1], x[2])) for x in np.array([n,z,y]).T]

setup is ditto to Rutger’s answer.


回答 6

我想补充一点,您甚至可以使用箭头/文本框来标注标签。这是我的意思:

import random
import matplotlib.pyplot as plt


y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]

fig, ax = plt.subplots()
ax.scatter(z, y)

ax.annotate(n[0], (z[0], y[0]), xytext=(z[0]+0.05, y[0]+0.3), 
    arrowprops=dict(facecolor='red', shrink=0.05))

ax.annotate(n[1], (z[1], y[1]), xytext=(z[1]-0.05, y[1]-0.3), 
    arrowprops = dict(  arrowstyle="->",
                        connectionstyle="angle3,angleA=0,angleB=-90"))

ax.annotate(n[2], (z[2], y[2]), xytext=(z[2]-0.05, y[2]-0.3), 
    arrowprops = dict(arrowstyle="wedge,tail_width=0.5", alpha=0.1))

ax.annotate(n[3], (z[3], y[3]), xytext=(z[3]+0.05, y[3]-0.2), 
    arrowprops = dict(arrowstyle="fancy"))

ax.annotate(n[4], (z[4], y[4]), xytext=(z[4]-0.1, y[4]-0.2),
    bbox=dict(boxstyle="round", alpha=0.1), 
    arrowprops = dict(arrowstyle="simple"))

plt.show()

将生成以下图形:

I would love to add that you can even use arrows /text boxes to annotate the labels. Here is what I mean:

import random
import matplotlib.pyplot as plt


y = [2.56422, 3.77284, 3.52623, 3.51468, 3.02199]
z = [0.15, 0.3, 0.45, 0.6, 0.75]
n = [58, 651, 393, 203, 123]

fig, ax = plt.subplots()
ax.scatter(z, y)

ax.annotate(n[0], (z[0], y[0]), xytext=(z[0]+0.05, y[0]+0.3), 
    arrowprops=dict(facecolor='red', shrink=0.05))

ax.annotate(n[1], (z[1], y[1]), xytext=(z[1]-0.05, y[1]-0.3), 
    arrowprops = dict(  arrowstyle="->",
                        connectionstyle="angle3,angleA=0,angleB=-90"))

ax.annotate(n[2], (z[2], y[2]), xytext=(z[2]-0.05, y[2]-0.3), 
    arrowprops = dict(arrowstyle="wedge,tail_width=0.5", alpha=0.1))

ax.annotate(n[3], (z[3], y[3]), xytext=(z[3]+0.05, y[3]-0.2), 
    arrowprops = dict(arrowstyle="fancy"))

ax.annotate(n[4], (z[4], y[4]), xytext=(z[4]-0.1, y[4]-0.2),
    bbox=dict(boxstyle="round", alpha=0.1), 
    arrowprops = dict(arrowstyle="simple"))

plt.show()

Which will generate the following graph: