分类目录归档:知识问答

为什么在matplotlib图中我的xlabel被截断了?

问题:为什么在matplotlib图中我的xlabel被截断了?

我正在使用matplotlib具有“很高”的xlabel的位置绘制数据集(这是在TeX中渲染的公式,其中包含一个分数,因此其高度等于几行文本)。

无论如何,当我绘制数字时,公式的底部总是被切除。更改图形大小似乎无济于事,而且我还无法弄清楚如何将x轴“向上”移动以为xlabel腾出空间。诸如此类的东西是一个合理的临时解决方案,但最好的办法是使matplotlib自动识别标签已被剪切并相应地调整大小。

这是我的意思的示例:

import matplotlib.pyplot as plt

plt.figure()
plt.ylabel(r'$\ln\left(\frac{x_a-x_b}{x_a-x_c}\right)$')
plt.xlabel(r'$\ln\left(\frac{x_a-x_d}{x_a-x_e}\right)$')
plt.show()

虽然您可以看到整个ylabel,但xlabel在底部被切除了。

如果这是机器特定的问题,我将在带有matplotlib 1.0.0的OSX 10.6.8上运行此问题

I am plotting a dataset using matplotlib where I have an xlabel that is quite “tall” (it’s a formula rendered in TeX that contains a fraction and is therefore has the height equivalent of a couple of lines of text).

In any case, the bottom of the formula is always cut off when I draw the figures. Changing figure size doesn’t seem to help this, and I haven’t been able to figure out how to shift the x-axis “up” to make room for the xlabel. Something like that would be a reasonable temporary solution, but what would be nice would be to have a way to make matplotlib recognize automatically that the label is cut off and resize accordingly.

Here’s an example of what I mean:

import matplotlib.pyplot as plt

plt.figure()
plt.ylabel(r'$\ln\left(\frac{x_a-x_b}{x_a-x_c}\right)$')
plt.xlabel(r'$\ln\left(\frac{x_a-x_d}{x_a-x_e}\right)$')
plt.show()

while you can see the entire ylabel, the xlabel is cut off at the bottom.

In the case this is a machine-specific problem, I am running this on OSX 10.6.8 with matplotlib 1.0.0


回答 0

采用:

import matplotlib.pyplot as plt

plt.gcf().subplots_adjust(bottom=0.15)

为标签腾出空间。

编辑:

自从我给出答案以来,matplotlib增加了tight_layout()功能。所以我建议使用它:

plt.tight_layout()

应该为xlabel腾出空间。

Use:

import matplotlib.pyplot as plt

plt.gcf().subplots_adjust(bottom=0.15)

to make room for the label.

Edit:

Since i gave the answer, matplotlib has added the tight_layout() function. So i suggest to use it:

plt.tight_layout()

should make room for the xlabel.


回答 1

一个简单的选项是将matplotlib配置为自动调整绘图大小。它非常适合我,我不确定为什么默认情况下未激活它。

方法一

在您的matplotlibrc文件中进行设置

figure.autolayout : True

有关自定义matplotlibrc文件的更多信息,请参见此处:http : //matplotlib.org/users/customizing.html

方法2

像这样在运行时更新rcParams

from matplotlib import rcParams
rcParams.update({'figure.autolayout': True})

使用这种方法的优点是您的代码将在配置不同的计算机上生成相同的图形。

An easy option is to configure matplotlib to automatically adjust the plot size. It works perfectly for me and I’m not sure why it’s not activated by default.

Method 1

Set this in your matplotlibrc file

figure.autolayout : True

See here for more information on customizing the matplotlibrc file: http://matplotlib.org/users/customizing.html

Method 2

Update the rcParams during runtime like this

from matplotlib import rcParams
rcParams.update({'figure.autolayout': True})

The advantage of using this approach is that your code will produce the same graphs on differently-configured machines.


回答 2

如果要将其存储到文件中,可以使用bbox_inches="tight"参数来解决:

plt.savefig('myfile.png', bbox_inches = "tight")

In case you want to store it to a file, you solve it using bbox_inches="tight" argument:

plt.savefig('myfile.png', bbox_inches = "tight")

回答 3

plot.tight_layout()所有更改放在图表上,show()savefig()将解决问题。

Putting plot.tight_layout() after all changes on the graph, just before show() or savefig() will solve the problem.


回答 4

plt.autoscale() 为我工作。

plt.autoscale() worked for me.


回答 5

您还可以$HOME/.matplotlib/matplotlib_rc按照以下方式将自定义填充设置为默认值。在下面的示例中,我同时修改了底部和左侧的填充:

# The figure subplot parameters.  All dimensions are a fraction of the
# figure width or height
figure.subplot.left  : 0.1 #left side of the subplots of the figure
#figure.subplot.right : 0.9 
figure.subplot.bottom : 0.15
...

You can also set custom padding as defaults in your $HOME/.matplotlib/matplotlib_rc as follows. In the example below I have modified both the bottom and left out-of-the-box padding:

# The figure subplot parameters.  All dimensions are a fraction of the
# figure width or height
figure.subplot.left  : 0.1 #left side of the subplots of the figure
#figure.subplot.right : 0.9 
figure.subplot.bottom : 0.15
...

Python / NumPy中的meshgrid的用途是什么?

问题:Python / NumPy中的meshgrid的用途是什么?

有人可以向我解释meshgridNumpy 中功能的目的是什么?我知道它会为绘图创建某种坐标网格,但是我真的看不到它的直接好处。

我正在研究Sebastian Raschka的“ Python机器学习”,他正在使用它来绘制决策边界。请参阅此处的输入11 。

我也从官方文档中尝试过此代码,但是再次,输出对我来说真的没有意义。

x = np.arange(-5, 5, 1)
y = np.arange(-5, 5, 1)
xx, yy = np.meshgrid(x, y, sparse=True)
z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)
h = plt.contourf(x,y,z)

请,如果可能的话,还请给我展示很多真实的例子。

Can someone explain to me what is the purpose of meshgrid function in Numpy? I know it creates some kind of grid of coordinates for plotting, but I can’t really see the direct benefit of it.

I am studying “Python Machine Learning” from Sebastian Raschka, and he is using it for plotting the decision borders. See input 11 here.

I have also tried this code from official documentation, but, again, the output doesn’t really make sense to me.

x = np.arange(-5, 5, 1)
y = np.arange(-5, 5, 1)
xx, yy = np.meshgrid(x, y, sparse=True)
z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)
h = plt.contourf(x,y,z)

Please, if possible, also show me a lot of real-world examples.


回答 0

目的meshgrid是根据x值数组和y值数组创建矩形网格。

因此,例如,如果我们要创建一个网格,在x和y方向上每个介于0和4之间的整数值处都有一个点。要创建矩形网格,我们需要xy点的每个组合。

这将是25分,对吧?因此,如果我们想为所有这些点创建一个x和y数组,则可以执行以下操作。

x[0,0] = 0    y[0,0] = 0
x[0,1] = 1    y[0,1] = 0
x[0,2] = 2    y[0,2] = 0
x[0,3] = 3    y[0,3] = 0
x[0,4] = 4    y[0,4] = 0
x[1,0] = 0    y[1,0] = 1
x[1,1] = 1    y[1,1] = 1
...
x[4,3] = 3    y[4,3] = 4
x[4,4] = 4    y[4,4] = 4

这将导致以下xy矩阵,使得每个矩阵中对应元素的配对给出网格中一个点的x和y坐标。

x =   0 1 2 3 4        y =   0 0 0 0 0
      0 1 2 3 4              1 1 1 1 1
      0 1 2 3 4              2 2 2 2 2
      0 1 2 3 4              3 3 3 3 3
      0 1 2 3 4              4 4 4 4 4

然后,我们可以绘制这些图形以验证它们是否为网格:

plt.plot(x,y, marker='.', color='k', linestyle='none')

在此处输入图片说明

显然,这特别繁琐,尤其对于x和的范围y。相反,meshgrid实际上可以为我们生成此代码:我们只需指定唯一值xy值即可。

xvalues = np.array([0, 1, 2, 3, 4]);
yvalues = np.array([0, 1, 2, 3, 4]);

现在,当我们调用时meshgrid,我们将自动获得先前的输出。

xx, yy = np.meshgrid(xvalues, yvalues)

plt.plot(xx, yy, marker='.', color='k', linestyle='none')

在此处输入图片说明

这些矩形网格的创建对于许多任务很有用。在您的帖子中提供的示例中,这只是一种sin(x**2 + y**2) / (x**2 + y**2)x和的值范围内对函数()进行采样的方法y

由于此函数已在矩形网格上采样,因此现在可以将其可视化为“图像”。

在此处输入图片说明

此外,现在可以将结果传递给期望在矩形网格上获得数据的函数(例如contourf

The purpose of meshgrid is to create a rectangular grid out of an array of x values and an array of y values.

So, for example, if we want to create a grid where we have a point at each integer value between 0 and 4 in both the x and y directions. To create a rectangular grid, we need every combination of the x and y points.

This is going to be 25 points, right? So if we wanted to create an x and y array for all of these points, we could do the following.

x[0,0] = 0    y[0,0] = 0
x[0,1] = 1    y[0,1] = 0
x[0,2] = 2    y[0,2] = 0
x[0,3] = 3    y[0,3] = 0
x[0,4] = 4    y[0,4] = 0
x[1,0] = 0    y[1,0] = 1
x[1,1] = 1    y[1,1] = 1
...
x[4,3] = 3    y[4,3] = 4
x[4,4] = 4    y[4,4] = 4

This would result in the following x and y matrices, such that the pairing of the corresponding element in each matrix gives the x and y coordinates of a point in the grid.

x =   0 1 2 3 4        y =   0 0 0 0 0
      0 1 2 3 4              1 1 1 1 1
      0 1 2 3 4              2 2 2 2 2
      0 1 2 3 4              3 3 3 3 3
      0 1 2 3 4              4 4 4 4 4

We can then plot these to verify that they are a grid:

plt.plot(x,y, marker='.', color='k', linestyle='none')

enter image description here

Obviously, this gets very tedious especially for large ranges of x and y. Instead, meshgrid can actually generate this for us: all we have to specify are the unique x and y values.

xvalues = np.array([0, 1, 2, 3, 4]);
yvalues = np.array([0, 1, 2, 3, 4]);

Now, when we call meshgrid, we get the previous output automatically.

xx, yy = np.meshgrid(xvalues, yvalues)

plt.plot(xx, yy, marker='.', color='k', linestyle='none')

enter image description here

Creation of these rectangular grids is useful for a number of tasks. In the example that you have provided in your post, it is simply a way to sample a function (sin(x**2 + y**2) / (x**2 + y**2)) over a range of values for x and y.

Because this function has been sampled on a rectangular grid, the function can now be visualized as an “image”.

enter image description here

Additionally, the result can now be passed to functions which expect data on rectangular grid (i.e. contourf)


回答 1

由Microsoft Excel提供: 

在此处输入图片说明

Courtesy of Microsoft Excel: 

enter image description here


回答 2

实际上np.meshgrid,文档中已经提到的目的:

np.meshgrid

从坐标向量返回坐标矩阵。

给定一维坐标数组x1,x2,…,xn,制作ND坐标数组以对ND网格上的ND标量/矢量场进行矢量化评估。

因此,其主要目的是创建坐标矩阵。

您可能只是问自己:

为什么我们需要创建坐标矩阵?

使用Python / NumPy需要坐标矩阵的原因是,从坐标到值没有直接关系,除非坐标从零开始并且是纯正整数。然后,您可以只使用数组的索引作为索引。但是,如果不是这种情况,则您需要以某种方式将坐标存储在数据旁边。那就是网格进来的地方。

假设您的数据是:

1  2  1
2  5  2
1  2  1

但是,每个值代表一个水平2公里,垂直3公里的区域。假设您的原点是左上角,并且您想要一个表示可以使用的距离的数组:

import numpy as np
h, v = np.meshgrid(np.arange(3)*3, np.arange(3)*2)

其中v是:

array([[0, 0, 0],
       [2, 2, 2],
       [4, 4, 4]])

和h:

array([[0, 3, 6],
       [0, 3, 6],
       [0, 3, 6]])

所以,如果你有两个指标,比方说xy(这就是为什么的返回值meshgrid通常是xxxs,而不是x在这种情况下,我选择h了水平!),那么你可以得到该点的x坐标,在Y点和坐标通过使用以下方法在那时的价值:

h[x, y]    # horizontal coordinate
v[x, y]    # vertical coordinate
data[x, y]  # value

这样可以更轻松地跟踪坐标(更重要的是)您可以将其传递给需要知道坐标的函数。

稍长的解释

但是,np.meshgrid本身并不经常直接使用,大多数人只是使用类似对象np.mgrid或中的一种np.ogrid。这里np.mgrid代表sparse=Falsenp.ogridsparse=True情况(我指的是的sparse参数np.meshgrid)。请注意,np.meshgridnp.ogrid和之间有显着差异 np.mgrid:返回的前两个值(如果有两个或多个)将颠倒。通常这无关紧要,但是您应该根据上下文提供有意义的变量名称。

例如,在2D网格的情况下,matplotlib.pyplot.imshow命名第一个返回的项np.meshgrid x和第二个返回项是有意义的,y而对于np.mgrid和则相反np.ogrid

np.ogrid 和稀疏的网格

>>> import numpy as np
>>> yy, xx = np.ogrid[-5:6, -5:6]
>>> xx
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5]])
>>> yy
array([[-5],
       [-4],
       [-3],
       [-2],
       [-1],
       [ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5]])

正如已经说过的,与相比np.meshgrid,输出是相反的,这就是为什么我解压缩yy, xx而不是xx, yy

>>> xx, yy = np.meshgrid(np.arange(-5, 6), np.arange(-5, 6), sparse=True)
>>> xx
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5]])
>>> yy
array([[-5],
       [-4],
       [-3],
       [-2],
       [-1],
       [ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5]])

这已经看起来像座标,特别是2D图的x和y线。

可视化:

yy, xx = np.ogrid[-5:6, -5:6]
plt.figure()
plt.title('ogrid (sparse meshgrid)')
plt.grid()
plt.xticks(xx.ravel())
plt.yticks(yy.ravel())
plt.scatter(xx, np.zeros_like(xx), color="blue", marker="*")
plt.scatter(np.zeros_like(yy), yy, color="red", marker="x")

在此处输入图片说明

np.mgrid 和密集/充实的网格

>>> yy, xx = np.mgrid[-5:6, -5:6]
>>> xx
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5]])
>>> yy
array([[-5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5],
       [-4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4],
       [-3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3],
       [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1],
       [ 2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2],
       [ 3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3],
       [ 4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4],
       [ 5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5]])

此处同样适用:与相比,输出反转np.meshgrid

>>> xx, yy = np.meshgrid(np.arange(-5, 6), np.arange(-5, 6))
>>> xx
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5]])
>>> yy
array([[-5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5],
       [-4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4],
       [-3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3],
       [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1],
       [ 2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2],
       [ 3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3],
       [ 4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4],
       [ 5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5]])

ogrid这些数组不同的是,它们在-5 <= xx <= 5中包含所有 xxyy坐标;-5 <= yy <= 5格。

yy, xx = np.mgrid[-5:6, -5:6]
plt.figure()
plt.title('mgrid (dense meshgrid)')
plt.grid()
plt.xticks(xx[0])
plt.yticks(yy[:, 0])
plt.scatter(xx, yy, color="red", marker="x")

在此处输入图片说明

功能性

它不仅限于二维,而且这些函数适用于任意尺寸(嗯,Python中为函数提供了最大数量的参数,而NumPy允许最大数量的尺寸):

>>> x1, x2, x3, x4 = np.ogrid[:3, 1:4, 2:5, 3:6]
>>> for i, x in enumerate([x1, x2, x3, x4]):
...     print('x{}'.format(i+1))
...     print(repr(x))
x1
array([[[[0]]],


       [[[1]]],


       [[[2]]]])
x2
array([[[[1]],

        [[2]],

        [[3]]]])
x3
array([[[[2],
         [3],
         [4]]]])
x4
array([[[[3, 4, 5]]]])

>>> # equivalent meshgrid output, note how the first two arguments are reversed and the unpacking
>>> x2, x1, x3, x4 = np.meshgrid(np.arange(1,4), np.arange(3), np.arange(2, 5), np.arange(3, 6), sparse=True)
>>> for i, x in enumerate([x1, x2, x3, x4]):
...     print('x{}'.format(i+1))
...     print(repr(x))
# Identical output so it's omitted here.

即使这些也适用于一维,也有两个(更为常见的)一维网格创建功能:

除了startand stop参数外,它还支持step参数(即使是代表步骤数的复杂步骤):

>>> x1, x2 = np.mgrid[1:10:2, 1:10:4j]
>>> x1  # The dimension with the explicit step width of 2
array([[1., 1., 1., 1.],
       [3., 3., 3., 3.],
       [5., 5., 5., 5.],
       [7., 7., 7., 7.],
       [9., 9., 9., 9.]])
>>> x2  # The dimension with the "number of steps"
array([[ 1.,  4.,  7., 10.],
       [ 1.,  4.,  7., 10.],
       [ 1.,  4.,  7., 10.],
       [ 1.,  4.,  7., 10.],
       [ 1.,  4.,  7., 10.]])

应用领域

您专门询问了目的,实际上,如果需要坐标系,这些网格非常有用。

例如,如果您有一个NumPy函数,它可以在两个维度上计算距离:

def distance_2d(x_point, y_point, x, y):
    return np.hypot(x-x_point, y-y_point)

您想知道每个点的距离:

>>> ys, xs = np.ogrid[-5:5, -5:5]
>>> distances = distance_2d(1, 2, xs, ys)  # distance to point (1, 2)
>>> distances
array([[9.21954446, 8.60232527, 8.06225775, 7.61577311, 7.28010989,
        7.07106781, 7.        , 7.07106781, 7.28010989, 7.61577311],
       [8.48528137, 7.81024968, 7.21110255, 6.70820393, 6.32455532,
        6.08276253, 6.        , 6.08276253, 6.32455532, 6.70820393],
       [7.81024968, 7.07106781, 6.40312424, 5.83095189, 5.38516481,
        5.09901951, 5.        , 5.09901951, 5.38516481, 5.83095189],
       [7.21110255, 6.40312424, 5.65685425, 5.        , 4.47213595,
        4.12310563, 4.        , 4.12310563, 4.47213595, 5.        ],
       [6.70820393, 5.83095189, 5.        , 4.24264069, 3.60555128,
        3.16227766, 3.        , 3.16227766, 3.60555128, 4.24264069],
       [6.32455532, 5.38516481, 4.47213595, 3.60555128, 2.82842712,
        2.23606798, 2.        , 2.23606798, 2.82842712, 3.60555128],
       [6.08276253, 5.09901951, 4.12310563, 3.16227766, 2.23606798,
        1.41421356, 1.        , 1.41421356, 2.23606798, 3.16227766],
       [6.        , 5.        , 4.        , 3.        , 2.        ,
        1.        , 0.        , 1.        , 2.        , 3.        ],
       [6.08276253, 5.09901951, 4.12310563, 3.16227766, 2.23606798,
        1.41421356, 1.        , 1.41421356, 2.23606798, 3.16227766],
       [6.32455532, 5.38516481, 4.47213595, 3.60555128, 2.82842712,
        2.23606798, 2.        , 2.23606798, 2.82842712, 3.60555128]])

如果通过密集网格而不是开放网格,则输出将是相同的。NumPys广播使之成为可能!

让我们可视化结果:

plt.figure()
plt.title('distance to point (1, 2)')
plt.imshow(distances, origin='lower', interpolation="none")
plt.xticks(np.arange(xs.shape[1]), xs.ravel())  # need to set the ticks manually
plt.yticks(np.arange(ys.shape[0]), ys.ravel())
plt.colorbar()

在此处输入图片说明

这也是当NumPys mgridogrid变得非常方便,因为它可以让你轻松更改网格的分辨率:

ys, xs = np.ogrid[-5:5:200j, -5:5:200j]
# otherwise same code as above

在此处输入图片说明

但是,由于imshow不支持xy输入,因此必须手动更改报价。如果接受x和,这将非常方便。y坐标,对吗?

使用NumPy编写自然处理网格的函数很容易。此外,NumPy,SciPy,matplotlib中有几个函数希望您传递到网格中。

我喜欢图片,因此让我们来探索一下matplotlib.pyplot.contour

ys, xs = np.mgrid[-5:5:200j, -5:5:200j]
density = np.sin(ys)-np.cos(xs)
plt.figure()
plt.contour(xs, ys, density)

在此处输入图片说明

注意如何正确设置坐标!如果您只是传入,则不会是这种情况density

或举另一个使用天体模型的有趣示例(这次我不太在乎坐标,我只是使用它们来创建一些网格):

from astropy.modeling import models
z = np.zeros((100, 100))
y, x = np.mgrid[0:100, 0:100]
for _ in range(10):
    g2d = models.Gaussian2D(amplitude=100, 
                           x_mean=np.random.randint(0, 100), 
                           y_mean=np.random.randint(0, 100), 
                           x_stddev=3, 
                           y_stddev=3)
    z += g2d(x, y)
    a2d = models.AiryDisk2D(amplitude=70, 
                            x_0=np.random.randint(0, 100), 
                            y_0=np.random.randint(0, 100), 
                            radius=5)
    z += a2d(x, y)

在此处输入图片说明

尽管那只是为了“外观”,但与Scipy中的功能模型和拟合(例如scipy.interpolate.interp2dscipy.interpolate.griddata甚至显示示例使用np.mgrid)相关的一些功能仍需要网格。其中大多数都适用于开放式网格和密集型网格,但是有些仅适用于其中之一。

Actually the purpose of np.meshgrid is already mentioned in the documentation:

np.meshgrid

Return coordinate matrices from coordinate vectors.

Make N-D coordinate arrays for vectorized evaluations of N-D scalar/vector fields over N-D grids, given one-dimensional coordinate arrays x1, x2,…, xn.

So it’s primary purpose is to create a coordinates matrices.

You probably just asked yourself:

Why do we need to create coordinate matrices?

The reason you need coordinate matrices with Python/NumPy is that there is no direct relation from coordinates to values, except when your coordinates start with zero and are purely positive integers. Then you can just use the indices of an array as the index. However when that’s not the case you somehow need to store coordinates alongside your data. That’s where grids come in.

Suppose your data is:

1  2  1
2  5  2
1  2  1

However, each value represents a 2 kilometers wide region horizontally and 3 kilometers vertically. Suppose your origin is the upper left corner and you want arrays that represent the distance you could use:

import numpy as np
h, v = np.meshgrid(np.arange(3)*3, np.arange(3)*2)

where v is:

array([[0, 0, 0],
       [2, 2, 2],
       [4, 4, 4]])

and h:

array([[0, 3, 6],
       [0, 3, 6],
       [0, 3, 6]])

So if you have two indices, let’s say x and y (that’s why the return value of meshgrid is usually xx or xs instead of x in this case I chose h for horizontally!) then you can get the x coordinate of the point, the y coordinate of the point and the value at that point by using:

h[x, y]    # horizontal coordinate
v[x, y]    # vertical coordinate
data[x, y]  # value

That makes it much easier to keep track of coordinates and (even more importantly) you can pass them to functions that need to know the coordinates.

A slightly longer explanation

However, np.meshgrid itself isn’t often used directly, mostly one just uses one of similar objects np.mgrid or np.ogrid. Here np.mgrid represents the sparse=False and np.ogrid the sparse=True case (I refer to the sparse argument of np.meshgrid). Note that there is a significant difference between np.meshgrid and np.ogrid and np.mgrid: The first two returned values (if there are two or more) are reversed. Often this doesn’t matter but you should give meaningful variable names depending on the context.

For example, in case of a 2D grid and matplotlib.pyplot.imshow it makes sense to name the first returned item of np.meshgrid x and the second one y while it’s the other way around for np.mgrid and np.ogrid.

np.ogrid and sparse grids

>>> import numpy as np
>>> yy, xx = np.ogrid[-5:6, -5:6]
>>> xx
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5]])
>>> yy
array([[-5],
       [-4],
       [-3],
       [-2],
       [-1],
       [ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5]])
       

As already said the output is reversed when compared to np.meshgrid, that’s why I unpacked it as yy, xx instead of xx, yy:

>>> xx, yy = np.meshgrid(np.arange(-5, 6), np.arange(-5, 6), sparse=True)
>>> xx
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5]])
>>> yy
array([[-5],
       [-4],
       [-3],
       [-2],
       [-1],
       [ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5]])

This already looks like coordinates, specifically the x and y lines for 2D plots.

Visualized:

yy, xx = np.ogrid[-5:6, -5:6]
plt.figure()
plt.title('ogrid (sparse meshgrid)')
plt.grid()
plt.xticks(xx.ravel())
plt.yticks(yy.ravel())
plt.scatter(xx, np.zeros_like(xx), color="blue", marker="*")
plt.scatter(np.zeros_like(yy), yy, color="red", marker="x")

enter image description here

np.mgrid and dense/fleshed out grids

>>> yy, xx = np.mgrid[-5:6, -5:6]
>>> xx
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5]])
>>> yy
array([[-5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5],
       [-4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4],
       [-3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3],
       [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1],
       [ 2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2],
       [ 3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3],
       [ 4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4],
       [ 5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5]])
       

The same applies here: The output is reversed compared to np.meshgrid:

>>> xx, yy = np.meshgrid(np.arange(-5, 6), np.arange(-5, 6))
>>> xx
array([[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5],
       [-5, -4, -3, -2, -1,  0,  1,  2,  3,  4,  5]])
>>> yy
array([[-5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5],
       [-4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4],
       [-3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3],
       [-2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1],
       [ 2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2],
       [ 3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3],
       [ 4,  4,  4,  4,  4,  4,  4,  4,  4,  4,  4],
       [ 5,  5,  5,  5,  5,  5,  5,  5,  5,  5,  5]])
       

Unlike ogrid these arrays contain all xx and yy coordinates in the -5 <= xx <= 5; -5 <= yy <= 5 grid.

yy, xx = np.mgrid[-5:6, -5:6]
plt.figure()
plt.title('mgrid (dense meshgrid)')
plt.grid()
plt.xticks(xx[0])
plt.yticks(yy[:, 0])
plt.scatter(xx, yy, color="red", marker="x")

enter image description here

Functionality

It’s not only limited to 2D, these functions work for arbitrary dimensions (well, there is a maximum number of arguments given to function in Python and a maximum number of dimensions that NumPy allows):

>>> x1, x2, x3, x4 = np.ogrid[:3, 1:4, 2:5, 3:6]
>>> for i, x in enumerate([x1, x2, x3, x4]):
...     print('x{}'.format(i+1))
...     print(repr(x))
x1
array([[[[0]]],


       [[[1]]],


       [[[2]]]])
x2
array([[[[1]],

        [[2]],

        [[3]]]])
x3
array([[[[2],
         [3],
         [4]]]])
x4
array([[[[3, 4, 5]]]])

>>> # equivalent meshgrid output, note how the first two arguments are reversed and the unpacking
>>> x2, x1, x3, x4 = np.meshgrid(np.arange(1,4), np.arange(3), np.arange(2, 5), np.arange(3, 6), sparse=True)
>>> for i, x in enumerate([x1, x2, x3, x4]):
...     print('x{}'.format(i+1))
...     print(repr(x))
# Identical output so it's omitted here.

Even if these also work for 1D there are two (much more common) 1D grid creation functions:

Besides the start and stop argument it also supports the step argument (even complex steps that represent the number of steps):

>>> x1, x2 = np.mgrid[1:10:2, 1:10:4j]
>>> x1  # The dimension with the explicit step width of 2
array([[1., 1., 1., 1.],
       [3., 3., 3., 3.],
       [5., 5., 5., 5.],
       [7., 7., 7., 7.],
       [9., 9., 9., 9.]])
>>> x2  # The dimension with the "number of steps"
array([[ 1.,  4.,  7., 10.],
       [ 1.,  4.,  7., 10.],
       [ 1.,  4.,  7., 10.],
       [ 1.,  4.,  7., 10.],
       [ 1.,  4.,  7., 10.]])
       

Applications

You specifically asked about the purpose and in fact, these grids are extremely useful if you need a coordinate system.

For example if you have a NumPy function that calculates the distance in two dimensions:

def distance_2d(x_point, y_point, x, y):
    return np.hypot(x-x_point, y-y_point)
    

And you want to know the distance of each point:

>>> ys, xs = np.ogrid[-5:5, -5:5]
>>> distances = distance_2d(1, 2, xs, ys)  # distance to point (1, 2)
>>> distances
array([[9.21954446, 8.60232527, 8.06225775, 7.61577311, 7.28010989,
        7.07106781, 7.        , 7.07106781, 7.28010989, 7.61577311],
       [8.48528137, 7.81024968, 7.21110255, 6.70820393, 6.32455532,
        6.08276253, 6.        , 6.08276253, 6.32455532, 6.70820393],
       [7.81024968, 7.07106781, 6.40312424, 5.83095189, 5.38516481,
        5.09901951, 5.        , 5.09901951, 5.38516481, 5.83095189],
       [7.21110255, 6.40312424, 5.65685425, 5.        , 4.47213595,
        4.12310563, 4.        , 4.12310563, 4.47213595, 5.        ],
       [6.70820393, 5.83095189, 5.        , 4.24264069, 3.60555128,
        3.16227766, 3.        , 3.16227766, 3.60555128, 4.24264069],
       [6.32455532, 5.38516481, 4.47213595, 3.60555128, 2.82842712,
        2.23606798, 2.        , 2.23606798, 2.82842712, 3.60555128],
       [6.08276253, 5.09901951, 4.12310563, 3.16227766, 2.23606798,
        1.41421356, 1.        , 1.41421356, 2.23606798, 3.16227766],
       [6.        , 5.        , 4.        , 3.        , 2.        ,
        1.        , 0.        , 1.        , 2.        , 3.        ],
       [6.08276253, 5.09901951, 4.12310563, 3.16227766, 2.23606798,
        1.41421356, 1.        , 1.41421356, 2.23606798, 3.16227766],
       [6.32455532, 5.38516481, 4.47213595, 3.60555128, 2.82842712,
        2.23606798, 2.        , 2.23606798, 2.82842712, 3.60555128]])
        

The output would be identical if one passed in a dense grid instead of an open grid. NumPys broadcasting makes it possible!

Let’s visualize the result:

plt.figure()
plt.title('distance to point (1, 2)')
plt.imshow(distances, origin='lower', interpolation="none")
plt.xticks(np.arange(xs.shape[1]), xs.ravel())  # need to set the ticks manually
plt.yticks(np.arange(ys.shape[0]), ys.ravel())
plt.colorbar()

enter image description here

And this is also when NumPys mgrid and ogrid become very convenient because it allows you to easily change the resolution of your grids:

ys, xs = np.ogrid[-5:5:200j, -5:5:200j]
# otherwise same code as above

enter image description here

However, since imshow doesn’t support x and y inputs one has to change the ticks by hand. It would be really convenient if it would accept the x and y coordinates, right?

It’s easy to write functions with NumPy that deal naturally with grids. Furthermore, there are several functions in NumPy, SciPy, matplotlib that expect you to pass in the grid.

I like images so let’s explore matplotlib.pyplot.contour:

ys, xs = np.mgrid[-5:5:200j, -5:5:200j]
density = np.sin(ys)-np.cos(xs)
plt.figure()
plt.contour(xs, ys, density)

enter image description here

Note how the coordinates are already correctly set! That wouldn’t be the case if you just passed in the density.

Or to give another fun example using astropy models (this time I don’t care much about the coordinates, I just use them to create some grid):

from astropy.modeling import models
z = np.zeros((100, 100))
y, x = np.mgrid[0:100, 0:100]
for _ in range(10):
    g2d = models.Gaussian2D(amplitude=100, 
                           x_mean=np.random.randint(0, 100), 
                           y_mean=np.random.randint(0, 100), 
                           x_stddev=3, 
                           y_stddev=3)
    z += g2d(x, y)
    a2d = models.AiryDisk2D(amplitude=70, 
                            x_0=np.random.randint(0, 100), 
                            y_0=np.random.randint(0, 100), 
                            radius=5)
    z += a2d(x, y)
    

enter image description here

Although that’s just “for the looks” several functions related to functional models and fitting (for example scipy.interpolate.interp2d, scipy.interpolate.griddata even show examples using np.mgrid) in Scipy, etc. require grids. Most of these work with open grids and dense grids, however some only work with one of them.


回答 3

假设您有一个功能:

def sinus2d(x, y):
    return np.sin(x) + np.sin(y)

例如,您想要查看0到2 * pi范围内的图像。你会怎么做?有np.meshgrid来自于:

xx, yy = np.meshgrid(np.linspace(0,2*np.pi,100), np.linspace(0,2*np.pi,100))
z = sinus2d(xx, yy) # Create the image on this grid

这样的情节看起来像:

import matplotlib.pyplot as plt
plt.imshow(z, origin='lower', interpolation='none')
plt.show()

在此处输入图片说明

所以np.meshgrid只是一个方便。原则上,可以通过以下方式完成此操作:

z2 = sinus2d(np.linspace(0,2*np.pi,100)[:,None], np.linspace(0,2*np.pi,100)[None,:])

但是您需要了解自己的尺寸(假设您有两个以上…)和正确的广播。np.meshgrid为您完成所有这一切。

此外,例如,如果您想进行插值但排除某些值,则meshgrid允许您将坐标与数据一起删除:

condition = z>0.6
z_new = z[condition] # This will make your array 1D

那么您现在将如何进行插值?你可以给xy一个插值函数,scipy.interpolate.interp2d因此您需要一种方法来知道删除了哪些坐标:

x_new = xx[condition]
y_new = yy[condition]

然后您仍然可以使用“正确的”坐标进行插值(在没有网状网格的情况下进行尝试,您将获得很多额外的代码):

from scipy.interpolate import interp2d
interpolated = interp2d(x_new, y_new, z_new)

并且原始网格物体允许您再次在原始网格物体上获得插值:

interpolated_grid = interpolated(xx[0], yy[:, 0]).reshape(xx.shape)

这些只是我使用过的一些示例,meshgrid可能还会更多。

Suppose you have a function:

def sinus2d(x, y):
    return np.sin(x) + np.sin(y)

and you want, for example, to see what it looks like in the range 0 to 2*pi. How would you do it? There np.meshgrid comes in:

xx, yy = np.meshgrid(np.linspace(0,2*np.pi,100), np.linspace(0,2*np.pi,100))
z = sinus2d(xx, yy) # Create the image on this grid

and such a plot would look like:

import matplotlib.pyplot as plt
plt.imshow(z, origin='lower', interpolation='none')
plt.show()

enter image description here

So np.meshgrid is just a convenience. In principle the same could be done by:

z2 = sinus2d(np.linspace(0,2*np.pi,100)[:,None], np.linspace(0,2*np.pi,100)[None,:])

but there you need to be aware of your dimensions (suppose you have more than two …) and the right broadcasting. np.meshgrid does all of this for you.

Also meshgrid allows you to delete coordinates together with the data if you, for example, want to do an interpolation but exclude certain values:

condition = z>0.6
z_new = z[condition] # This will make your array 1D

so how would you do the interpolation now? You can give x and y to an interpolation function like scipy.interpolate.interp2d so you need a way to know which coordinates were deleted:

x_new = xx[condition]
y_new = yy[condition]

and then you can still interpolate with the “right” coordinates (try it without the meshgrid and you will have a lot of extra code):

from scipy.interpolate import interp2d
interpolated = interp2d(x_new, y_new, z_new)

and the original meshgrid allows you to get the interpolation on the original grid again:

interpolated_grid = interpolated(xx[0], yy[:, 0]).reshape(xx.shape)

These are just some examples where I used the meshgrid there might be a lot more.


回答 4

meshgrid帮助从两个数组的所有成对点的两个一维数组中创建一个矩形网格。

x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 1, 2, 3, 4])

现在,如果您定义了函数f(x,y),并且想将此函数应用于数组’x’和’y’的所有可能的点组合,则可以执行以下操作:

f(*np.meshgrid(x, y))

假设,如果您的函数仅产生两个元素的乘积,那么这就是可以有效地为大型数组获得笛卡尔积的方法。

这里引用

meshgrid helps in creating a rectangular grid from two 1-D arrays of all pairs of points from the two arrays.

x = np.array([0, 1, 2, 3, 4])
y = np.array([0, 1, 2, 3, 4])

Now, if you have defined a function f(x,y) and you wanna apply this function to all the possible combination of points from the arrays ‘x’ and ‘y’, then you can do this:

f(*np.meshgrid(x, y))

Say, if your function just produces the product of two elements, then this is how a cartesian product can be achieved, efficiently for large arrays.

Referred from here


回答 5

基本思想

给定可能的x值xs(将其视为图的x轴上的刻度线)和可能的y值ys,将meshgrid生成对应的(x,y)网格点集-与类似set((x, y) for x in xs for y in yx)。例如,如果xs=[1,2,3]ys=[4,5,6],我们将获得一组坐标{(1,4), (2,4), (3,4), (1,5), (2,5), (3,5), (1,6), (2,6), (3,6)}

返回值的形式

但是,meshgrid返回的表示形式在两个方面与上述表达式不同:

首先meshgrid在2d数组中布置网格点:行对应于不同的y值,列对应于不同的x值-如在中list(list((x, y) for x in xs) for y in ys),将得到以下数组:

   [[(1,4), (2,4), (3,4)],
    [(1,5), (2,5), (3,5)],
    [(1,6), (2,6), (3,6)]]

其次,分别meshgrid返回x和y坐标(即在两个不同的numpy 2d数组中):

   xcoords, ycoords = (
       array([[1, 2, 3],
              [1, 2, 3],
              [1, 2, 3]]),
       array([[4, 4, 4],
              [5, 5, 5],
              [6, 6, 6]]))
   # same thing using np.meshgrid:
   xcoords, ycoords = np.meshgrid([1,2,3], [4,5,6])
   # same thing without meshgrid:
   xcoords = np.array([xs] * len(ys)
   ycoords = np.array([ys] * len(xs)).T

注意,np.meshgrid也可以生成更大尺寸的网格。给定xs,ys和zs,您将把xcoords,ycoords,zcoords作为3d数组返回。meshgrid还支持维度的反向排序以及结果的稀疏表示。

应用领域

我们为什么要这种形式的输出?

在网格上的每个点上应用一个函数: 一种动机是像(+,-,*,/,**)这样的二进制运算符对于numpy数组作为元素操作进行了重载。这意味着,如果我有一个def f(x, y): return (x - y) ** 2可以在两个标量上使用的函数,那么我也可以将其应用于两个numpy数组以获取按元素排列的结果数组:例如f(xcoords, ycoords)f(*np.meshgrid(xs, ys))在上面的示例中给出以下内容:

array([[ 9,  4,  1],
       [16,  9,  4],
       [25, 16,  9]])

高维外部产品:我不确定这有多有效,但是您可以通过以下方式获得高维外部产品:np.prod(np.meshgrid([1,2,3], [1,2], [1,2,3,4]), axis=0)

在matplotlib云图:我遇到meshgrid调查时绘制等高线图与matplotlib用于绘制的决策边界。为此,使用生成一个网格meshgrid,在每个网格点评估函数(例如,如上所示),然后将xcoords,ycoords和计算出的f值(即zcoords)传递给contourf函数。

Basic Idea

Given possible x values, xs, (think of them as the tick-marks on the x-axis of a plot) and possible y values, ys, meshgrid generates the corresponding set of (x, y) grid points—analogous to set((x, y) for x in xs for y in yx). For example, if xs=[1,2,3] and ys=[4,5,6], we’d get the set of coordinates {(1,4), (2,4), (3,4), (1,5), (2,5), (3,5), (1,6), (2,6), (3,6)}.

Form of the Return Value

However, the representation that meshgrid returns is different from the above expression in two ways:

First, meshgrid lays out the grid points in a 2d array: rows correspond to different y-values, columns correspond to different x-values—as in list(list((x, y) for x in xs) for y in ys), which would give the following array:

   [[(1,4), (2,4), (3,4)],
    [(1,5), (2,5), (3,5)],
    [(1,6), (2,6), (3,6)]]

Second, meshgrid returns the x and y coordinates separately (i.e. in two different numpy 2d arrays):

   xcoords, ycoords = (
       array([[1, 2, 3],
              [1, 2, 3],
              [1, 2, 3]]),
       array([[4, 4, 4],
              [5, 5, 5],
              [6, 6, 6]]))
   # same thing using np.meshgrid:
   xcoords, ycoords = np.meshgrid([1,2,3], [4,5,6])
   # same thing without meshgrid:
   xcoords = np.array([xs] * len(ys)
   ycoords = np.array([ys] * len(xs)).T

Note, np.meshgrid can also generate grids for higher dimensions. Given xs, ys, and zs, you’d get back xcoords, ycoords, zcoords as 3d arrays. meshgrid also supports reverse ordering of the dimensions as well as sparse representation of the result.

Applications

Why would we want this form of output?

Apply a function at every point on a grid: One motivation is that binary operators like (+, -, *, /, **) are overloaded for numpy arrays as elementwise operations. This means that if I have a function def f(x, y): return (x - y) ** 2 that works on two scalars, I can also apply it on two numpy arrays to get an array of elementwise results: e.g. f(xcoords, ycoords) or f(*np.meshgrid(xs, ys)) gives the following on the above example:

array([[ 9,  4,  1],
       [16,  9,  4],
       [25, 16,  9]])

Higher dimensional outer product: I’m not sure how efficient this is, but you can get high-dimensional outer products this way: np.prod(np.meshgrid([1,2,3], [1,2], [1,2,3,4]), axis=0).

Contour plots in matplotlib: I came across meshgrid when investigating drawing contour plots with matplotlib for plotting decision boundaries. For this, you generate a grid with meshgrid, evaluate the function at each grid point (e.g. as shown above), and then pass the xcoords, ycoords, and computed f-values (i.e. zcoords) into the contourf function.


将大纪元时间转换为日期时间

问题:将大纪元时间转换为日期时间

我从其他人那里得到的回应是像这样的大纪元时间格式

start_time = 1234566
end_time = 1234578

我想将那个纪元秒转换为MySQL格式时间,以便可以将差异存储在MySQL数据库中。

我试过了:

>>> import time
>>> time.gmtime(123456)
time.struct_time(tm_year=1970, tm_mon=1, tm_mday=2, tm_hour=10, tm_min=17, tm_sec=36, tm_wday=4, tm_yday=2, tm_isdst=0)

以上结果不是我所期望的。我希望它像

2012-09-12 21:00:00

请提出我该如何实现?

另外,为什么我TypeError: a float is required

>>> getbbb_class.end_time = 1347516459425
>>> mend = time.gmtime(getbbb_class.end_time).tm_hour
Traceback (most recent call last):
  ...
TypeError: a float is required

I am getting a response from the rest is an Epoch time format like

start_time = 1234566
end_time = 1234578

I want to convert that epoch seconds in MySQL format time so that I could store the differences in my MySQL database.

I tried:

>>> import time
>>> time.gmtime(123456)
time.struct_time(tm_year=1970, tm_mon=1, tm_mday=2, tm_hour=10, tm_min=17, tm_sec=36, tm_wday=4, tm_yday=2, tm_isdst=0)

The above result is not what I am expecting. I want it be like

2012-09-12 21:00:00

Please suggest how can I achieve this?

Also, Why I am getting TypeError: a float is required for

>>> getbbb_class.end_time = 1347516459425
>>> mend = time.gmtime(getbbb_class.end_time).tm_hour
Traceback (most recent call last):
  ...
TypeError: a float is required

回答 0

要将时间值(浮点型或整数型)转换为格式化的字符串,请使用:

time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(1347517370))

To convert your time value (float or int) to a formatted string, use:

time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(1347517370))

回答 1

您也可以使用datetime

>>> import datetime
>>> datetime.datetime.fromtimestamp(1347517370).strftime('%c')
  '2012-09-13 02:22:50'

You can also use datetime:

>>> import datetime
>>> datetime.datetime.fromtimestamp(1347517370).strftime('%c')
  '2012-09-13 02:22:50'

回答 2

>>> import datetime
>>> datetime.datetime.fromtimestamp(1347517370).strftime('%Y-%m-%d %H:%M:%S')
'2012-09-13 14:22:50' # Local time

要获取UTC:

>>> datetime.datetime.utcfromtimestamp(1347517370).strftime('%Y-%m-%d %H:%M:%S')
  '2012-09-13 06:22:50'
>>> import datetime
>>> datetime.datetime.fromtimestamp(1347517370).strftime('%Y-%m-%d %H:%M:%S')
'2012-09-13 14:22:50' # Local time

To get UTC:

>>> datetime.datetime.utcfromtimestamp(1347517370).strftime('%Y-%m-%d %H:%M:%S')
  '2012-09-13 06:22:50'

回答 3

这就是你所需要的

In [1]: time.time()
Out[1]: 1347517739.44904

In [2]: time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(time.time()))
Out[2]: '2012-09-13 06:31:43'

请输入一个,float而不是一个int,而另一个TypeError应消失。

mend = time.gmtime(float(getbbb_class.end_time)).tm_hour

This is what you need

In [1]: time.time()
Out[1]: 1347517739.44904

In [2]: time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(time.time()))
Out[2]: '2012-09-13 06:31:43'

Please input a float instead of an int and that other TypeError should go away.

mend = time.gmtime(float(getbbb_class.end_time)).tm_hour

回答 4

尝试这个:

>>> import time
>>> time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(1347517119))
'2012-09-12 23:18:39'

同样在MySQL中,您可以FROM_UNIXTIME喜欢:

INSERT INTO tblname VALUES (FROM_UNIXTIME(1347517119))

对于第二个问题,可能是因为它getbbb_class.end_time是一个字符串。您可以将其转换为数字,例如:float(getbbb_class.end_time)

Try this:

>>> import time
>>> time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(1347517119))
'2012-09-12 23:18:39'

Also in MySQL, you can FROM_UNIXTIME like:

INSERT INTO tblname VALUES (FROM_UNIXTIME(1347517119))

For your 2nd question, it is probably because getbbb_class.end_time is a string. You can convert it to numeric like: float(getbbb_class.end_time)


回答 5

#This adds 10 seconds from now.
from datetime import datetime
import commands

date_string_command="date +%s"
utc = commands.getoutput(date_string_command)
a_date=datetime.fromtimestamp(float(int(utc))).strftime('%Y-%m-%d %H:%M:%S')
print('a_date:'+a_date)
utc = int(utc)+10
b_date=datetime.fromtimestamp(float(utc)).strftime('%Y-%m-%d %H:%M:%S')
print('b_date:'+b_date)

这有点罗word,但它来自UNIX中的date命令。

#This adds 10 seconds from now.
from datetime import datetime
import commands

date_string_command="date +%s"
utc = commands.getoutput(date_string_command)
a_date=datetime.fromtimestamp(float(int(utc))).strftime('%Y-%m-%d %H:%M:%S')
print('a_date:'+a_date)
utc = int(utc)+10
b_date=datetime.fromtimestamp(float(utc)).strftime('%Y-%m-%d %H:%M:%S')
print('b_date:'+b_date)

This is a little more wordy but it comes from date command in unix.


回答 6

首先是man gmtime时代的一些信息

The ctime(), gmtime() and localtime() functions all take an argument of data type time_t which represents calendar  time.   When  inter-
       preted  as  an absolute time value, it represents the number of seconds elapsed since 00:00:00 on January 1, 1970, Coordinated Universal
       Time (UTC).

了解时代应该如何。

>>> time.time()
1347517171.6514659
>>> time.gmtime(time.time())
(2012, 9, 13, 6, 19, 34, 3, 257, 0)

只要确保您要传递的arg time.gmtime()为整数即可。

First a bit of info in epoch from man gmtime

The ctime(), gmtime() and localtime() functions all take an argument of data type time_t which represents calendar  time.   When  inter-
       preted  as  an absolute time value, it represents the number of seconds elapsed since 00:00:00 on January 1, 1970, Coordinated Universal
       Time (UTC).

to understand how epoch should be.

>>> time.time()
1347517171.6514659
>>> time.gmtime(time.time())
(2012, 9, 13, 6, 19, 34, 3, 257, 0)

just ensure the arg you are passing to time.gmtime() is integer.


如何将2D float numpy数组转换为2D int numpy数组?

问题:如何将2D float numpy数组转换为2D int numpy数组?

如何将实际的numpy数组转换为int numpy数组?尝试直接使用map映射到数组,但是没有用。

How to convert real numpy array to int numpy array? Tried using map directly to array but it did not work.


回答 0

使用astype方法。

>>> x = np.array([[1.0, 2.3], [1.3, 2.9]])
>>> x
array([[ 1. ,  2.3],
       [ 1.3,  2.9]])
>>> x.astype(int)
array([[1, 2],
       [1, 2]])

Use the astype method.

>>> x = np.array([[1.0, 2.3], [1.3, 2.9]])
>>> x
array([[ 1. ,  2.3],
       [ 1.3,  2.9]])
>>> x.astype(int)
array([[1, 2],
       [1, 2]])

回答 1

一些用于控制舍入的numpy函数:rintfloortruncceil。取决于您希望如何将浮点数向上,向下或最接近的整数取整。

>>> x = np.array([[1.0,2.3],[1.3,2.9]])
>>> x
array([[ 1. ,  2.3],
       [ 1.3,  2.9]])
>>> y = np.trunc(x)
>>> y
array([[ 1.,  2.],
       [ 1.,  2.]])
>>> z = np.ceil(x)
>>> z
array([[ 1.,  3.],
       [ 2.,  3.]])
>>> t = np.floor(x)
>>> t
array([[ 1.,  2.],
       [ 1.,  2.]])
>>> a = np.rint(x)
>>> a
array([[ 1.,  2.],
       [ 1.,  3.]])

要将其中之一转换为int或将numpy中的其他类型转换astype(由BrenBern回答):

a.astype(int)
array([[1, 2],
       [1, 3]])

>>> y.astype(int)
array([[1, 2],
       [1, 2]])

Some numpy functions for how to control the rounding: rint, floor,trunc, ceil. depending how u wish to round the floats, up, down, or to the nearest int.

>>> x = np.array([[1.0,2.3],[1.3,2.9]])
>>> x
array([[ 1. ,  2.3],
       [ 1.3,  2.9]])
>>> y = np.trunc(x)
>>> y
array([[ 1.,  2.],
       [ 1.,  2.]])
>>> z = np.ceil(x)
>>> z
array([[ 1.,  3.],
       [ 2.,  3.]])
>>> t = np.floor(x)
>>> t
array([[ 1.,  2.],
       [ 1.,  2.]])
>>> a = np.rint(x)
>>> a
array([[ 1.,  2.],
       [ 1.,  3.]])

To make one of this in to int, or one of the other types in numpy, astype (as answered by BrenBern):

a.astype(int)
array([[1, 2],
       [1, 3]])

>>> y.astype(int)
array([[1, 2],
       [1, 2]])

回答 2

您可以使用np.int_

>>> x = np.array([[1.0, 2.3], [1.3, 2.9]])
>>> x
array([[ 1. ,  2.3],
       [ 1.3,  2.9]])
>>> np.int_(x)
array([[1, 2],
       [1, 2]])

you can use np.int_:

>>> x = np.array([[1.0, 2.3], [1.3, 2.9]])
>>> x
array([[ 1. ,  2.3],
       [ 1.3,  2.9]])
>>> np.int_(x)
array([[1, 2],
       [1, 2]])

回答 3

如果不确定您的输入将是Numpy数组,则可以使用asarraywith dtype=int代替astype

>>> np.asarray([1,2,3,4], dtype=int)
array([1, 2, 3, 4])

如果输入数组已经具有正确的dtype,请asarray避免使用数组复制,astype而不要这样做(除非您指定copy=False):

>>> a = np.array([1,2,3,4])
>>> a is np.asarray(a)  # no copy :)
True
>>> a is a.astype(int)  # copy :(
False
>>> a is a.astype(int, copy=False)  # no copy :)
True

If you’re not sure your input is going to be a Numpy array, you can use asarray with dtype=int instead of astype:

>>> np.asarray([1,2,3,4], dtype=int)
array([1, 2, 3, 4])

If the input array already has the correct dtype, asarray avoids the array copy while astype does not (unless you specify copy=False):

>>> a = np.array([1,2,3,4])
>>> a is np.asarray(a)  # no copy :)
True
>>> a is a.astype(int)  # copy :(
False
>>> a is a.astype(int, copy=False)  # no copy :)
True

如何获取Python当前模块中所有类的列表?

问题:如何获取Python当前模块中所有类的列表?

我见过很多人从一个模块中提取所有类的示例,通常是这样的:

# foo.py
class Foo:
    pass

# test.py
import inspect
import foo

for name, obj in inspect.getmembers(foo):
    if inspect.isclass(obj):
        print obj

太棒了

但是我无法找到如何从当前模块中获取所有类。

# foo.py
import inspect

class Foo:
    pass

def print_classes():
    for name, obj in inspect.getmembers(???): # what do I do here?
        if inspect.isclass(obj):
            print obj

# test.py
import foo

foo.print_classes()

这可能确实很明显,但是我什么也找不到。谁能帮我吗?

I’ve seen plenty of examples of people extracting all of the classes from a module, usually something like:

# foo.py
class Foo:
    pass

# test.py
import inspect
import foo

for name, obj in inspect.getmembers(foo):
    if inspect.isclass(obj):
        print obj

Awesome.

But I can’t find out how to get all of the classes from the current module.

# foo.py
import inspect

class Foo:
    pass

def print_classes():
    for name, obj in inspect.getmembers(???): # what do I do here?
        if inspect.isclass(obj):
            print obj

# test.py
import foo

foo.print_classes()

This is probably something really obvious, but I haven’t been able to find anything. Can anyone help me out?


回答 0

尝试这个:

import sys
current_module = sys.modules[__name__]

在您的情况下:

import sys, inspect
def print_classes():
    for name, obj in inspect.getmembers(sys.modules[__name__]):
        if inspect.isclass(obj):
            print(obj)

甚至更好:

clsmembers = inspect.getmembers(sys.modules[__name__], inspect.isclass)

因为inspect.getmembers()带谓语。

Try this:

import sys
current_module = sys.modules[__name__]

In your context:

import sys, inspect
def print_classes():
    for name, obj in inspect.getmembers(sys.modules[__name__]):
        if inspect.isclass(obj):
            print(obj)

And even better:

clsmembers = inspect.getmembers(sys.modules[__name__], inspect.isclass)

Because inspect.getmembers() takes a predicate.


回答 1

关于什么

g = globals().copy()
for name, obj in g.iteritems():

What about

g = globals().copy()
for name, obj in g.iteritems():

?


回答 2

我不知道是否有“适当的”方法来执行此操作,但是您的代码片段import foo处在正确的轨道上:只需将其添加到foo.py中,do inspect.getmembers(foo),它就可以正常工作。

I don’t know if there’s a ‘proper’ way to do it, but your snippet is on the right track: just add import foo to foo.py, do inspect.getmembers(foo), and it should work fine.


回答 3

我能够从dir内置plus中获得所需的一切getattr

# Works on pretty much everything, but be mindful that 
# you get lists of strings back

print dir(myproject)
print dir(myproject.mymodule)
print dir(myproject.mymodule.myfile)
print dir(myproject.mymodule.myfile.myclass)

# But, the string names can be resolved with getattr, (as seen below)

虽然,它的确看起来像个毛线球:

def list_supported_platforms():
    """
        List supported platforms (to match sys.platform)

        @Retirms:
            list str: platform names
    """
    return list(itertools.chain(
        *list(
            # Get the class's constant
            getattr(
                # Get the module's first class, which we wrote
                getattr(
                    # Get the module
                    getattr(platforms, item),
                    dir(
                        getattr(platforms, item)
                    )[0]
                ),
                'SYS_PLATFORMS'
            )
            # For each include in platforms/__init__.py 
            for item in dir(platforms)
            # Ignore magic, ourselves (index.py) and a base class.
            if not item.startswith('__') and item not in ['index', 'base']
        )
    ))

I was able to get all I needed from the dir built in plus getattr.

# Works on pretty much everything, but be mindful that 
# you get lists of strings back

print dir(myproject)
print dir(myproject.mymodule)
print dir(myproject.mymodule.myfile)
print dir(myproject.mymodule.myfile.myclass)

# But, the string names can be resolved with getattr, (as seen below)

Though, it does come out looking like a hairball:

def list_supported_platforms():
    """
        List supported platforms (to match sys.platform)

        @Retirms:
            list str: platform names
    """
    return list(itertools.chain(
        *list(
            # Get the class's constant
            getattr(
                # Get the module's first class, which we wrote
                getattr(
                    # Get the module
                    getattr(platforms, item),
                    dir(
                        getattr(platforms, item)
                    )[0]
                ),
                'SYS_PLATFORMS'
            )
            # For each include in platforms/__init__.py 
            for item in dir(platforms)
            # Ignore magic, ourselves (index.py) and a base class.
            if not item.startswith('__') and item not in ['index', 'base']
        )
    ))

回答 4

import pyclbr
print(pyclbr.readmodule(__name__).keys())

请注意,stdlib的Python类浏览器模块使用静态源分析,因此它仅适用于由实际.py文件支持的模块。

import pyclbr
print(pyclbr.readmodule(__name__).keys())

Note that the stdlib’s Python class browser module uses static source analysis, so it only works for modules that are backed by a real .py file.


回答 5

如果要拥有属于当前模块的所有类,则可以使用以下方法:

import sys, inspect
def print_classes():
    is_class_member = lambda member: inspect.isclass(member) and member.__module__ == __name__
    clsmembers = inspect.getmembers(sys.modules[__name__], is_class_member)

如果您使用Nadia的答案并且要在模块上导入其他类,则这些类也将被导入。

因此,这就是为什么member.__module__ == __name__要添加到上使用的谓词的原因is_class_member。该语句检查该类是否确实属于该模块。

谓词是一个函数(可调用),它返回布尔值。

If you want to have all the classes, that belong to the current module, you could use this :

import sys, inspect
def print_classes():
    is_class_member = lambda member: inspect.isclass(member) and member.__module__ == __name__
    clsmembers = inspect.getmembers(sys.modules[__name__], is_class_member)

If you use Nadia’s answer and you were importing other classes on your module, that classes will be being imported too.

So that’s why member.__module__ == __name__ is being added to the predicate used on is_class_member. This statement checks that the class really belongs to the module.

A predicate is a function (callable), that returns a boolean value.


回答 6

另一个可在Python 2和3中使用的解决方案:

#foo.py
import sys

class Foo(object):
    pass

def print_classes():
    current_module = sys.modules[__name__]
    for key in dir(current_module):
        if isinstance( getattr(current_module, key), type ):
            print(key)

# test.py
import foo
foo.print_classes()

Another solution which works in Python 2 and 3:

#foo.py
import sys

class Foo(object):
    pass

def print_classes():
    current_module = sys.modules[__name__]
    for key in dir(current_module):
        if isinstance( getattr(current_module, key), type ):
            print(key)

# test.py
import foo
foo.print_classes()

回答 7

这是我用来获取当前模块中已定义(即未导入)的所有类的行。根据PEP-8,它有点长,但是您可以根据需要进行更改。

import sys
import inspect

classes = [name for name, obj in inspect.getmembers(sys.modules[__name__], inspect.isclass) 
          if obj.__module__ is __name__]

这为您提供了一个类名列表。如果您想要类对象本身,只需保留obj即可。

classes = [obj for name, obj in inspect.getmembers(sys.modules[__name__], inspect.isclass)
          if obj.__module__ is __name__]

根据我的经验,这是更有用的。

This is the line that I use to get all of the classes that have been defined in the current module (ie not imported). It’s a little long according to PEP-8 but you can change it as you see fit.

import sys
import inspect

classes = [name for name, obj in inspect.getmembers(sys.modules[__name__], inspect.isclass) 
          if obj.__module__ is __name__]

This gives you a list of the class names. If you want the class objects themselves just keep obj instead.

classes = [obj for name, obj in inspect.getmembers(sys.modules[__name__], inspect.isclass)
          if obj.__module__ is __name__]

This is has been more useful in my experience.


回答 8

import Foo 
dir(Foo)

import collections
dir(collections)
import Foo 
dir(Foo)

import collections
dir(collections)

回答 9

我认为您可以做这样的事情。

class custom(object):
    __custom__ = True
class Alpha(custom):
    something = 3
def GetClasses():
    return [x for x in globals() if hasattr(globals()[str(x)], '__custom__')]
print(GetClasses())`

如果您需要自己的类

I think that you can do something like this.

class custom(object):
    __custom__ = True
class Alpha(custom):
    something = 3
def GetClasses():
    return [x for x in globals() if hasattr(globals()[str(x)], '__custom__')]
print(GetClasses())`

if you need own classes


回答 10

我经常发现自己在编写命令行实用程序,其中第一个参数旨在引用许多不同类中的一个。例如./something.py feature command —-arguments,where Feature是一个类,并且command是该类上的一个方法。这是一个使这变得容易的基类。

假设该基类与所有子类都位于一个目录中。然后ArgBaseClass(foo = bar).load_subclasses(),您可以拨打电话,这将返回字典。例如,如果目录如下所示:

  • arg_base_class.py
  • feature.py

假设feature.py工具class Feature(ArgBaseClass),则上述调用load_subclasses将返回{ 'feature' : <Feature object> }。相同的kwargsfoo = bar)将传递给Feature该类。

#!/usr/bin/env python3
import os, pkgutil, importlib, inspect

class ArgBaseClass():
    # Assign all keyword arguments as properties on self, and keep the kwargs for later.
    def __init__(self, **kwargs):
        self._kwargs = kwargs
        for (k, v) in kwargs.items():
            setattr(self, k, v)
        ms = inspect.getmembers(self, predicate=inspect.ismethod)
        self.methods = dict([(n, m) for (n, m) in ms if not n.startswith('_')])

    # Add the names of the methods to a parser object.
    def _parse_arguments(self, parser):
        parser.add_argument('method', choices=list(self.methods))
        return parser

    # Instantiate one of each of the subclasses of this class.
    def load_subclasses(self):
        module_dir = os.path.dirname(__file__)
        module_name = os.path.basename(os.path.normpath(module_dir))
        parent_class = self.__class__
        modules = {}
        # Load all the modules it the package:
        for (module_loader, name, ispkg) in pkgutil.iter_modules([module_dir]):
            modules[name] = importlib.import_module('.' + name, module_name)

        # Instantiate one of each class, passing the keyword arguments.
        ret = {}
        for cls in parent_class.__subclasses__():
            path = cls.__module__.split('.')
            ret[path[-1]] = cls(**self._kwargs)
        return ret

I frequently find myself writing command line utilities wherein the first argument is meant to refer to one of many different classes. For example ./something.py feature command —-arguments, where Feature is a class and command is a method on that class. Here’s a base class that makes this easy.

The assumption is that this base class resides in a directory alongside all of its subclasses. You can then call ArgBaseClass(foo = bar).load_subclasses() which will return a dictionary. For example, if the directory looks like this:

  • arg_base_class.py
  • feature.py

Assuming feature.py implements class Feature(ArgBaseClass), then the above invocation of load_subclasses will return { 'feature' : <Feature object> }. The same kwargs (foo = bar) will be passed into the Feature class.

#!/usr/bin/env python3
import os, pkgutil, importlib, inspect

class ArgBaseClass():
    # Assign all keyword arguments as properties on self, and keep the kwargs for later.
    def __init__(self, **kwargs):
        self._kwargs = kwargs
        for (k, v) in kwargs.items():
            setattr(self, k, v)
        ms = inspect.getmembers(self, predicate=inspect.ismethod)
        self.methods = dict([(n, m) for (n, m) in ms if not n.startswith('_')])

    # Add the names of the methods to a parser object.
    def _parse_arguments(self, parser):
        parser.add_argument('method', choices=list(self.methods))
        return parser

    # Instantiate one of each of the subclasses of this class.
    def load_subclasses(self):
        module_dir = os.path.dirname(__file__)
        module_name = os.path.basename(os.path.normpath(module_dir))
        parent_class = self.__class__
        modules = {}
        # Load all the modules it the package:
        for (module_loader, name, ispkg) in pkgutil.iter_modules([module_dir]):
            modules[name] = importlib.import_module('.' + name, module_name)

        # Instantiate one of each class, passing the keyword arguments.
        ret = {}
        for cls in parent_class.__subclasses__():
            path = cls.__module__.split('.')
            ret[path[-1]] = cls(**self._kwargs)
        return ret

是否有Python等效于C#null-coalescing运算符?

问题:是否有Python等效于C#null-coalescing运算符?

在C#中,有一个null-coalescing运算符(写为??),允许在赋值期间轻松(简短)进行null检查:

string s = null;
var other = s ?? "some default value";

有python等效项吗?

我知道我可以做到:

s = None
other = s if s else "some default value"

但是,还有没有更短的方法(我不需要重复s)?

In C# there’s a null-coalescing operator (written as ??) that allows for easy (short) null checking during assignment:

string s = null;
var other = s ?? "some default value";

Is there a python equivalent?

I know that I can do:

s = None
other = s if s else "some default value"

But is there an even shorter way (where I don’t need to repeat s)?


回答 0

other = s or "some default value"

好的,必须弄清楚or操作员的工作方式。它是一个布尔运算符,因此可以在布尔上下文中工作。如果值不是布尔值,则出于运算符的目的,它们将转换为布尔值。

请注意,or运算符不会仅返回TrueFalse。相反,如果第一个操作数的计算结果为true,则返回第一个操作数;如果第一个操作数的计算结果为false,则返回第二个操作数。

在这种情况下,如果表达式为true ,则x or y返回x它,True或者当转换为boolean时,表达式为true。否则,返回y。在大多数情况下,这将与C♯的空值运算符相同,但请记住:

42    or "something"    # returns 42
0     or "something"    # returns "something"
None  or "something"    # returns "something"
False or "something"    # returns "something"
""    or "something"    # returns "something"

如果您使用变量s来保存对类实例的引用或None(只要您的类未定义member __nonzero__()__len__()),则使用与null-coalescing运算符相同的语义是安全的。

实际上,拥有Python的这种副作用甚至可能是有用的。由于您知道哪些值的计算结果为false,因此可以使用它来触发默认值,而无需None专门使用(例如,错误对象)。

在某些语言中,此行为称为Elvis运算符

other = s or "some default value"

Ok, it must be clarified how the or operator works. It is a boolean operator, so it works in a boolean context. If the values are not boolean, they are converted to boolean for the purposes of the operator.

Note that the or operator does not return only True or False. Instead, it returns the first operand if the first operand evaluates to true, and it returns the second operand if the first operand evaluates to false.

In this case, the expression x or y returns x if it is True or evaluates to true when converted to boolean. Otherwise, it returns y. For most cases, this will serve for the very same purpose of C♯’s null-coalescing operator, but keep in mind:

42    or "something"    # returns 42
0     or "something"    # returns "something"
None  or "something"    # returns "something"
False or "something"    # returns "something"
""    or "something"    # returns "something"

If you use your variable s to hold something that is either a reference to the instance of a class or None (as long as your class does not define members __nonzero__() and __len__()), it is secure to use the same semantics as the null-coalescing operator.

In fact, it may even be useful to have this side-effect of Python. Since you know what values evaluates to false, you can use this to trigger the default value without using None specifically (an error object, for example).

In some languages this behavior is referred to as the Elvis operator.


回答 1

严格来说

other = s if s is not None else "default value"

否则,s = False将变为"default value",这可能不是预期的。

如果要缩短此时间,请尝试:

def notNone(s,d):
    if s is None:
        return d
    else:
        return s

other = notNone(s, "default value")

Strictly,

other = s if s is not None else "default value"

Otherwise, s = False will become "default value", which may not be what was intended.

If you want to make this shorter, try:

def notNone(s,d):
    if s is None:
        return d
    else:
        return s

other = notNone(s, "default value")

回答 2

这是一个将返回第一个参数not的函数None

def coalesce(*arg):
  return reduce(lambda x, y: x if x is not None else y, arg)

# Prints "banana"
print coalesce(None, "banana", "phone", None)

reduce()即使第一个参数不是None,也可能不必要地遍历所有参数,因此您也可以使用以下版本:

def coalesce(*arg):
  for el in arg:
    if el is not None:
      return el
  return None

Here’s a function that will return the first argument that isn’t None:

def coalesce(*arg):
  return reduce(lambda x, y: x if x is not None else y, arg)

# Prints "banana"
print coalesce(None, "banana", "phone", None)

reduce() might needlessly iterate over all the arguments even if the first argument is not None, so you can also use this version:

def coalesce(*arg):
  for el in arg:
    if el is not None:
      return el
  return None

回答 3

我知道已经解决了,但是在处理对象时还有另一种选择。

如果您的对象可能是:

{
   name: {
      first: "John",
      last: "Doe"
   }
}

您可以使用:

obj.get(property_name, value_if_null)

喜欢:

obj.get("name", {}).get("first", "Name is missing") 

通过添加{}作为默认值,如果缺少“名称”,则返回一个空对象并将其传递到下一个get。这类似于C#中的null-safe-navigation,就像obj?.name?.first

I realize this is answered, but there is another option when you’re dealing with objects.

If you have an object that might be:

{
   name: {
      first: "John",
      last: "Doe"
   }
}

You can use:

obj.get(property_name, value_if_null)

Like:

obj.get("name", {}).get("first", "Name is missing") 

By adding {} as the default value, if “name” is missing, an empty object is returned and passed through to the next get. This is similar to null-safe-navigation in C#, which would be like obj?.name?.first.


回答 4

除了朱利诺(Juliano)有关“或”行为的答案:“快速”

>>> 1 or 5/0
1

因此有时对于某些事情,它可能是一个有用的捷径

object = getCachedVersion() or getFromDB()

In addition to Juliano’s answer about behavior of “or”: it’s “fast”

>>> 1 or 5/0
1

So sometimes it’s might be a useful shortcut for things like

object = getCachedVersion() or getFromDB()

回答 5

对于单个值,除了@Bothwells答案(我更喜欢)之外,为了对函数返回值进行空值检查,您可以使用新的walrus-operator(自python3.8起):

def test():
    return

a = 2 if (x:= test()) is None else x

因此,test无需对函数进行两次评估(如中的所示a = 2 if test() is None else test()

Addionally to @Bothwells answer (which I prefer) for single values, in order to null-checking assingment of function return values, you can use new walrus-operator (since python3.8):

def test():
    return

a = 2 if (x:= test()) is None else x

Thus, test function does not need to be evaluated two times (as in a = 2 if test() is None else test())


回答 6

如果您需要嵌套多个null合并操作,例如:

model?.data()?.first()

这不是用容易解决的问题or。它也不能解决.get()需要字典类型或类似类型(无论如何都不能嵌套)的问题,或者getattr()当NoneType没有属性时会抛出异常。

考虑向该语言添加空合并的相关点是PEP 505,与该文档相关的讨论在python-ideas线程中。

In case you need to nest more than one null coalescing operation such as:

model?.data()?.first()

This is not a problem easily solved with or. It also cannot be solved with .get() which requires a dictionary type or similar (and cannot be nested anyway) or getattr() which will throw an exception when NoneType doesn’t have the attribute.

The relevant pip considering adding null coalescing to the language is PEP 505 and the discussion relevant to the document is in the python-ideas thread.


回答 7

关于@Hugh Bothwell,@ mortehu和@glglgl的回答。

设置测试数据集

import random

dataset = [random.randint(0,15) if random.random() > .6 else None for i in range(1000)]

定义实施

def not_none(x, y=None):
    if x is None:
        return y
    return x

def coalesce1(*arg):
  return reduce(lambda x, y: x if x is not None else y, arg)

def coalesce2(*args):
    return next((i for i in args if i is not None), None)

使测试功能

def test_func(dataset, func):
    default = 1
    for i in dataset:
        func(i, default)

使用Python 2.7在Mac i7 @ 2.7Ghz上的结果

>>> %timeit test_func(dataset, not_none)
1000 loops, best of 3: 224 µs per loop

>>> %timeit test_func(dataset, coalesce1)
1000 loops, best of 3: 471 µs per loop

>>> %timeit test_func(dataset, coalesce2)
1000 loops, best of 3: 782 µs per loop

显然,该not_none函数可以正确回答OP的问题并处理“虚假”问题。它也是最快,最容易阅读的。如果将逻辑应用于很多地方,显然这是最好的方法。

如果您有一个问题想要在迭代中找到第一个非空值,那么@mortehu的响应就是解决方法。但这是与OP 不同的解决方案,尽管它可以部分解决这种情况。它不能采用可迭代的AND默认值。最后一个参数将是返回的默认值,但在这种情况下,您将不会传递可迭代的值,而且也不清楚最后一个参数是否是value的默认值。

然后,您可以在下面做,但我仍将使用not_null单值用例。

def coalesce(*args, **kwargs):
    default = kwargs.get('default')
    return next((a for a in arg if a is not None), default)

Regarding answers by @Hugh Bothwell, @mortehu and @glglgl.

Setup Dataset for testing

import random

dataset = [random.randint(0,15) if random.random() > .6 else None for i in range(1000)]

Define implementations

def not_none(x, y=None):
    if x is None:
        return y
    return x

def coalesce1(*arg):
  return reduce(lambda x, y: x if x is not None else y, arg)

def coalesce2(*args):
    return next((i for i in args if i is not None), None)

Make test function

def test_func(dataset, func):
    default = 1
    for i in dataset:
        func(i, default)

Results on mac i7 @2.7Ghz using python 2.7

>>> %timeit test_func(dataset, not_none)
1000 loops, best of 3: 224 µs per loop

>>> %timeit test_func(dataset, coalesce1)
1000 loops, best of 3: 471 µs per loop

>>> %timeit test_func(dataset, coalesce2)
1000 loops, best of 3: 782 µs per loop

Clearly the not_none function answers the OP’s question correctly and handles the “falsy” problem. It is also the fastest and easiest to read. If applying the logic in many places, it is clearly the best way to go.

If you have a problem where you want to find the 1st non-null value in a iterable, then @mortehu’s response is the way to go. But it is a solution to a different problem than OP, although it can partially handle that case. It cannot take an iterable AND a default value. The last argument would be the default value returned, but then you wouldn’t be passing in an iterable in that case as well as it isn’t explicit that the last argument is a default to value.

You could then do below, but I’d still use not_null for the single value use case.

def coalesce(*args, **kwargs):
    default = kwargs.get('default')
    return next((a for a in arg if a is not None), default)

回答 8

对于像我这样偶然发现此问题的可行解决方案的人,当变量可能未定义时,我得到的最接近的是:

if 'variablename' in globals() and ((variablename or False) == True):
  print('variable exists and it\'s true')
else:
  print('variable doesn\'t exist, or it\'s false')

请注意,在检查全局变量时需要一个字符串,但是之后在检查值时将使用实际变量。

有关变量存在的更多信息: 如何检查变量是否存在?

For those like me that stumbled here looking for a viable solution to this issue, when the variable might be undefined, the closest i got is:

if 'variablename' in globals() and ((variablename or False) == True):
  print('variable exists and it\'s true')
else:
  print('variable doesn\'t exist, or it\'s false')

Note that a string is needed when checking in globals, but afterwards the actual variable is used when checking for value.

More on variable existence: How do I check if a variable exists?


回答 9

Python has a get function that its very useful to return a value of an existent key, if the key exist;
if not it will return a default value.

def main():
    names = ['Jack','Maria','Betsy','James','Jack']
    names_repeated = dict()
    default_value = 0

    for find_name in names:
        names_repeated[find_name] = names_repeated.get(find_name, default_value) + 1

如果您在字典中找不到名称,它将返回default_value,如果名称存在,则将任何现有值加1。

希望这可以帮助

Python has a get function that its very useful to return a value of an existent key, if the key exist;
if not it will return a default value.

def main():
    names = ['Jack','Maria','Betsy','James','Jack']
    names_repeated = dict()
    default_value = 0

    for find_name in names:
        names_repeated[find_name] = names_repeated.get(find_name, default_value) + 1

if you cannot find the name inside the dictionary, it will return the default_value, if the name exist then it will add any existing value with 1.

hope this can help


回答 10

我发现下面的两个功能在处理许多可变测试用例时非常有用。

def nz(value, none_value, strict=True):
    ''' This function is named after an old VBA function. It returns a default
        value if the passed in value is None. If strict is False it will
        treat an empty string as None as well.

        example:
        x = None
        nz(x,"hello")
        --> "hello"
        nz(x,"")
        --> ""
        y = ""   
        nz(y,"hello")
        --> ""
        nz(y,"hello", False)
        --> "hello" '''

    if value is None and strict:
        return_val = none_value
    elif strict and value is not None:
        return_val = value
    elif not strict and not is_not_null(value):
        return_val = none_value
    else:
        return_val = value
    return return_val 

def is_not_null(value):
    ''' test for None and empty string '''
    return value is not None and len(str(value)) > 0

The two functions below I have found to be very useful when dealing with many variable testing cases.

def nz(value, none_value, strict=True):
    ''' This function is named after an old VBA function. It returns a default
        value if the passed in value is None. If strict is False it will
        treat an empty string as None as well.

        example:
        x = None
        nz(x,"hello")
        --> "hello"
        nz(x,"")
        --> ""
        y = ""   
        nz(y,"hello")
        --> ""
        nz(y,"hello", False)
        --> "hello" '''

    if value is None and strict:
        return_val = none_value
    elif strict and value is not None:
        return_val = value
    elif not strict and not is_not_null(value):
        return_val = none_value
    else:
        return_val = value
    return return_val 

def is_not_null(value):
    ''' test for None and empty string '''
    return value is not None and len(str(value)) > 0

namedtuple和可选关键字参数的默认值

问题:namedtuple和可选关键字参数的默认值

我正在尝试将冗长的空心“数据”类转换为命名元组。我的Class目前看起来像这样:

class Node(object):
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

转换为namedtuple它后看起来像:

from collections import namedtuple
Node = namedtuple('Node', 'val left right')

但是这里有一个问题。我的原始类允许我只传递一个值,并通过对named / keyword参数使用默认值来处理默认值。就像是:

class BinaryTree(object):
    def __init__(self, val):
        self.root = Node(val)

但这在我的重构命名元组的情况下不起作用,因为它希望我传递所有字段。我当然可以替换Node(val)to 的出现,Node(val, None, None)但是这并不是我喜欢的。

那么,是否存在一个可以使我的重写成功而又不增加很多代码复杂性(元编程)的好技巧,还是我应该吞下药丸并继续进行“搜索并替换”?:)

I’m trying to convert a longish hollow “data” class into a named tuple. My class currently looks like this:

class Node(object):
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

After conversion to namedtuple it looks like:

from collections import namedtuple
Node = namedtuple('Node', 'val left right')

But there is a problem here. My original class allowed me to pass in just a value and took care of the default by using default values for the named/keyword arguments. Something like:

class BinaryTree(object):
    def __init__(self, val):
        self.root = Node(val)

But this doesn’t work in the case of my refactored named tuple since it expects me to pass all the fields. I can of course replace the occurrences of Node(val) to Node(val, None, None) but it isn’t to my liking.

So does there exist a good trick which can make my re-write successful without adding a lot of code complexity (metaprogramming) or should I just swallow the pill and go ahead with the “search and replace”? :)


回答 0

Python 3.7

使用默认参数。

>>> from collections import namedtuple
>>> fields = ('val', 'left', 'right')
>>> Node = namedtuple('Node', fields, defaults=(None,) * len(fields))
>>> Node()
Node(val=None, left=None, right=None)

或者更好的是,使用新的dataclasses库,它比namedtuple好得多。

>>> from dataclasses import dataclass
>>> from typing import Any
>>> @dataclass
... class Node:
...     val: Any = None
...     left: 'Node' = None
...     right: 'Node' = None
>>> Node()
Node(val=None, left=None, right=None)

在Python 3.7之前

设置Node.__new__.__defaults__为默认值。

>>> from collections import namedtuple
>>> Node = namedtuple('Node', 'val left right')
>>> Node.__new__.__defaults__ = (None,) * len(Node._fields)
>>> Node()
Node(val=None, left=None, right=None)

在Python 2.6之前

设置Node.__new__.func_defaults为默认值。

>>> from collections import namedtuple
>>> Node = namedtuple('Node', 'val left right')
>>> Node.__new__.func_defaults = (None,) * len(Node._fields)
>>> Node()
Node(val=None, left=None, right=None)

订购

在所有版本的Python中,如果您设置的默认值少于namedtuple中的默认值,则默认值将应用于最右边的参数。这使您可以将一些参数保留为必需参数。

>>> Node.__new__.__defaults__ = (1,2)
>>> Node()
Traceback (most recent call last):
  ...
TypeError: __new__() missing 1 required positional argument: 'val'
>>> Node(3)
Node(val=3, left=1, right=2)

适用于Python 2.6到3.6的包装器

这是给您的包装器,甚至可以让您(可选)将默认值设置为以外的其他值None。这不支持必需的参数。

import collections
def namedtuple_with_defaults(typename, field_names, default_values=()):
    T = collections.namedtuple(typename, field_names)
    T.__new__.__defaults__ = (None,) * len(T._fields)
    if isinstance(default_values, collections.Mapping):
        prototype = T(**default_values)
    else:
        prototype = T(*default_values)
    T.__new__.__defaults__ = tuple(prototype)
    return T

例:

>>> Node = namedtuple_with_defaults('Node', 'val left right')
>>> Node()
Node(val=None, left=None, right=None)
>>> Node = namedtuple_with_defaults('Node', 'val left right', [1, 2, 3])
>>> Node()
Node(val=1, left=2, right=3)
>>> Node = namedtuple_with_defaults('Node', 'val left right', {'right':7})
>>> Node()
Node(val=None, left=None, right=7)
>>> Node(4)
Node(val=4, left=None, right=7)

Python 3.7

Use the defaults parameter.

>>> from collections import namedtuple
>>> fields = ('val', 'left', 'right')
>>> Node = namedtuple('Node', fields, defaults=(None,) * len(fields))
>>> Node()
Node(val=None, left=None, right=None)

Or better yet, use the new dataclasses library, which is much nicer than namedtuple.

>>> from dataclasses import dataclass
>>> from typing import Any
>>> @dataclass
... class Node:
...     val: Any = None
...     left: 'Node' = None
...     right: 'Node' = None
>>> Node()
Node(val=None, left=None, right=None)

Before Python 3.7

Set Node.__new__.__defaults__ to the default values.

>>> from collections import namedtuple
>>> Node = namedtuple('Node', 'val left right')
>>> Node.__new__.__defaults__ = (None,) * len(Node._fields)
>>> Node()
Node(val=None, left=None, right=None)

Before Python 2.6

Set Node.__new__.func_defaults to the default values.

>>> from collections import namedtuple
>>> Node = namedtuple('Node', 'val left right')
>>> Node.__new__.func_defaults = (None,) * len(Node._fields)
>>> Node()
Node(val=None, left=None, right=None)

Order

In all versions of Python, if you set fewer default values than exist in the namedtuple, the defaults are applied to the rightmost parameters. This allows you to keep some arguments as required arguments.

>>> Node.__new__.__defaults__ = (1,2)
>>> Node()
Traceback (most recent call last):
  ...
TypeError: __new__() missing 1 required positional argument: 'val'
>>> Node(3)
Node(val=3, left=1, right=2)

Wrapper for Python 2.6 to 3.6

Here’s a wrapper for you, which even lets you (optionally) set the default values to something other than None. This does not support required arguments.

import collections
def namedtuple_with_defaults(typename, field_names, default_values=()):
    T = collections.namedtuple(typename, field_names)
    T.__new__.__defaults__ = (None,) * len(T._fields)
    if isinstance(default_values, collections.Mapping):
        prototype = T(**default_values)
    else:
        prototype = T(*default_values)
    T.__new__.__defaults__ = tuple(prototype)
    return T

Example:

>>> Node = namedtuple_with_defaults('Node', 'val left right')
>>> Node()
Node(val=None, left=None, right=None)
>>> Node = namedtuple_with_defaults('Node', 'val left right', [1, 2, 3])
>>> Node()
Node(val=1, left=2, right=3)
>>> Node = namedtuple_with_defaults('Node', 'val left right', {'right':7})
>>> Node()
Node(val=None, left=None, right=7)
>>> Node(4)
Node(val=4, left=None, right=7)

回答 1

我将namedtuple子类化,并覆盖了该__new__方法:

from collections import namedtuple

class Node(namedtuple('Node', ['value', 'left', 'right'])):
    __slots__ = ()
    def __new__(cls, value, left=None, right=None):
        return super(Node, cls).__new__(cls, value, left, right)

这样可以保留直观的类型层次结构,而伪装成类的工厂函数则不会创建。

I subclassed namedtuple and overrode the __new__ method:

from collections import namedtuple

class Node(namedtuple('Node', ['value', 'left', 'right'])):
    __slots__ = ()
    def __new__(cls, value, left=None, right=None):
        return super(Node, cls).__new__(cls, value, left, right)

This preserves an intuitive type hierarchy, which the creation of a factory function disguised as a class does not.


回答 2

将其包装在函数中。

NodeT = namedtuple('Node', 'val left right')

def Node(val, left=None, right=None):
  return NodeT(val, left, right)

Wrap it in a function.

NodeT = namedtuple('Node', 'val left right')

def Node(val, left=None, right=None):
  return NodeT(val, left, right)

回答 3

随着typing.NamedTuple在Python 3.6.1+,你可以同时提供一个默认值和类型标注为NamedTuple场。使用typing.Any,如果你只需要前者:

from typing import Any, NamedTuple


class Node(NamedTuple):
    val: Any
    left: 'Node' = None
    right: 'Node' = None

用法:

>>> Node(1)
Node(val=1, left=None, right=None)
>>> n = Node(1)
>>> Node(2, left=n)
Node(val=2, left=Node(val=1, left=None, right=None), right=None)

另外,如果您既需要默认值又需要可选的可变性,则Python 3.7将具有数据类(PEP 557),这些数据类在某些(很多情况下可以替换namedtuple。


旁注:Python中当前注释规范(:参数和变量之后的表达式以及->函数之后的表达式)的一个怪癖是它们在定义时间*进行评估。因此,由于“一旦执行了整个类的主体,就定义了类名称”,因此'Node'上面的类字段中的注释必须是字符串,以避免NameError。

这种类型的提示称为“正向引用”([1][2]),在PEP 563中, Python 3.7+将具有__future__导入(默认情况下在4.0中启用),该导入将允许使用正向引用没有报价,则推迟评估。

*在运行时不评估仅AFAICT局部变量注释。(来源:PEP 526

With typing.NamedTuple in Python 3.6.1+ you can provide both a default value and a type annotation to a NamedTuple field. Use typing.Any if you only need the former:

from typing import Any, NamedTuple


class Node(NamedTuple):
    val: Any
    left: 'Node' = None
    right: 'Node' = None

Usage:

>>> Node(1)
Node(val=1, left=None, right=None)
>>> n = Node(1)
>>> Node(2, left=n)
Node(val=2, left=Node(val=1, left=None, right=None), right=None)

Also, in case you need both default values and optional mutability, Python 3.7 is going to have data classes (PEP 557) that can in some (many?) cases replace namedtuples.


Sidenote: one quirk of the current specification of annotations (expressions after : for parameters and variables and after -> for functions) in Python is that they are evaluated at definition time*. So, since “class names become defined once the entire body of the class has been executed”, the annotations for 'Node' in the class fields above must be strings to avoid NameError.

This kind of type hints is called “forward reference” ([1], [2]), and with PEP 563 Python 3.7+ is going to have a __future__ import (to be enabled by default in 4.0) that will allow to use forward references without quotes, postponing their evaluation.

* AFAICT only local variable annotations are not evaluated at runtime. (source: PEP 526)


回答 4

这是直接来自docs的示例

可以使用_replace()定制原型实例来实现默认值:

>>> Account = namedtuple('Account', 'owner balance transaction_count')
>>> default_account = Account('<owner name>', 0.0, 0)
>>> johns_account = default_account._replace(owner='John')
>>> janes_account = default_account._replace(owner='Jane')

因此,OP的示例为:

from collections import namedtuple
Node = namedtuple('Node', 'val left right')
default_node = Node(None, None, None)
example = default_node._replace(val="whut")

但是,我更喜欢这里给出的其他一些答案。我只是想添加此内容以保持完整性。

This is an example straight from the docs:

Default values can be implemented by using _replace() to customize a prototype instance:

>>> Account = namedtuple('Account', 'owner balance transaction_count')
>>> default_account = Account('<owner name>', 0.0, 0)
>>> johns_account = default_account._replace(owner='John')
>>> janes_account = default_account._replace(owner='Jane')

So, the OP’s example would be:

from collections import namedtuple
Node = namedtuple('Node', 'val left right')
default_node = Node(None, None, None)
example = default_node._replace(val="whut")

However, I like some of the other answers given here better. I just wanted to add this for completeness.


回答 5

我不确定仅内置的namedtuple是否有简单的方法。有一个很好的模块,称为recordtype,具有以下功能:

>>> from recordtype import recordtype
>>> Node = recordtype('Node', [('val', None), ('left', None), ('right', None)])
>>> Node(3)
Node(val=3, left=None, right=None)
>>> Node(3, 'L')
Node(val=3, left=L, right=None)

I’m not sure if there’s an easy way with just the built-in namedtuple. There’s a nice module called recordtype that has this functionality:

>>> from recordtype import recordtype
>>> Node = recordtype('Node', [('val', None), ('left', None), ('right', None)])
>>> Node(3)
Node(val=3, left=None, right=None)
>>> Node(3, 'L')
Node(val=3, left=L, right=None)

回答 6

这是一个受Justinfay的回答启发的更紧凑的版本:

from collections import namedtuple
from functools import partial

Node = namedtuple('Node', ('val left right'))
Node.__new__ = partial(Node.__new__, left=None, right=None)

Here is a more compact version inspired by justinfay’s answer:

from collections import namedtuple
from functools import partial

Node = namedtuple('Node', ('val left right'))
Node.__new__ = partial(Node.__new__, left=None, right=None)

回答 7

在python3.7 +中,有一个全新的defaults =关键字参数。

默认值可以是默认值,也可以是None默认值的可迭代值。由于具有默认值的字段必须位于任何没有默认值的字段之后,因此默认值将应用于最右边的参数。举例来说,如果所述字段名是['x', 'y', 'z']与默认值(1, 2),然后x将所需要的参数,y将默认为1,和z将默认2

用法示例:

$ ./python
Python 3.7.0b1+ (heads/3.7:4d65430, Feb  1 2018, 09:28:35) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import namedtuple
>>> nt = namedtuple('nt', ('a', 'b', 'c'), defaults=(1, 2))
>>> nt(0)
nt(a=0, b=1, c=2)
>>> nt(0, 3)  
nt(a=0, b=3, c=2)
>>> nt(0, c=3)
nt(a=0, b=1, c=3)

In python3.7+ there’s a brand new defaults= keyword argument.

defaults can be None or an iterable of default values. Since fields with a default value must come after any fields without a default, the defaults are applied to the rightmost parameters. For example, if the fieldnames are ['x', 'y', 'z'] and the defaults are (1, 2), then x will be a required argument, y will default to 1, and z will default to 2.

Example usage:

$ ./python
Python 3.7.0b1+ (heads/3.7:4d65430, Feb  1 2018, 09:28:35) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import namedtuple
>>> nt = namedtuple('nt', ('a', 'b', 'c'), defaults=(1, 2))
>>> nt(0)
nt(a=0, b=1, c=2)
>>> nt(0, 3)  
nt(a=0, b=3, c=2)
>>> nt(0, c=3)
nt(a=0, b=1, c=3)

回答 8

简短,简单,不会导致人们使用isinstance不当:

class Node(namedtuple('Node', ('val', 'left', 'right'))):
    @classmethod
    def make(cls, val, left=None, right=None):
        return cls(val, left, right)

# Example
x = Node.make(3)
x._replace(right=Node.make(4))

Short, simple, and doesn’t lead people to use isinstance improperly:

class Node(namedtuple('Node', ('val', 'left', 'right'))):
    @classmethod
    def make(cls, val, left=None, right=None):
        return cls(val, left, right)

# Example
x = Node.make(3)
x._replace(right=Node.make(4))

回答 9

一个稍微扩展的示例,使用以下命令初始化所有缺少的参数None

from collections import namedtuple

class Node(namedtuple('Node', ['value', 'left', 'right'])):
    __slots__ = ()
    def __new__(cls, *args, **kwargs):
        # initialize missing kwargs with None
        all_kwargs = {key: kwargs.get(key) for key in cls._fields}
        return super(Node, cls).__new__(cls, *args, **all_kwargs)

A slightly extended example to initialize all missing arguments with None:

from collections import namedtuple

class Node(namedtuple('Node', ['value', 'left', 'right'])):
    __slots__ = ()
    def __new__(cls, *args, **kwargs):
        # initialize missing kwargs with None
        all_kwargs = {key: kwargs.get(key) for key in cls._fields}
        return super(Node, cls).__new__(cls, *args, **all_kwargs)

回答 10

Python 3.7:的介绍 defaults在namedtuple定义中 param。

文档中显示的示例:

>>> Account = namedtuple('Account', ['type', 'balance'], defaults=[0])
>>> Account._fields_defaults
{'balance': 0}
>>> Account('premium')
Account(type='premium', balance=0)

在这里阅读更多。

Python 3.7: introduction of defaults param in namedtuple definition.

Example as shown in the documentation:

>>> Account = namedtuple('Account', ['type', 'balance'], defaults=[0])
>>> Account._fields_defaults
{'balance': 0}
>>> Account('premium')
Account(type='premium', balance=0)

Read more here.


回答 11

您还可以使用以下命令:

import inspect

def namedtuple_with_defaults(type, default_value=None, **kwargs):
    args_list = inspect.getargspec(type.__new__).args[1:]
    params = dict([(x, default_value) for x in args_list])
    params.update(kwargs)

    return type(**params)

基本上,这使您可以构造具有默认值的任何命名元组,并仅覆盖所需的参数,例如:

import collections

Point = collections.namedtuple("Point", ["x", "y"])
namedtuple_with_defaults(Point)
>>> Point(x=None, y=None)

namedtuple_with_defaults(Point, x=1)
>>> Point(x=1, y=None)

You can also use this:

import inspect

def namedtuple_with_defaults(type, default_value=None, **kwargs):
    args_list = inspect.getargspec(type.__new__).args[1:]
    params = dict([(x, default_value) for x in args_list])
    params.update(kwargs)

    return type(**params)

This basically gives you the possibility to construct any named tuple with a default value and override just the parameters you need, for example:

import collections

Point = collections.namedtuple("Point", ["x", "y"])
namedtuple_with_defaults(Point)
>>> Point(x=None, y=None)

namedtuple_with_defaults(Point, x=1)
>>> Point(x=1, y=None)

回答 12

@Denis和@Mark的组合方法:

from collections import namedtuple
import inspect

class Node(namedtuple('Node', 'left right val')):
    __slots__ = ()
    def __new__(cls, *args, **kwargs):
        args_list = inspect.getargspec(super(Node, cls).__new__).args[len(args)+1:]
        params = {key: kwargs.get(key) for key in args_list + kwargs.keys()}
        return super(Node, cls).__new__(cls, *args, **params) 

那应该支持创建带有位置参数和混合大小写的元组。测试用例:

>>> print Node()
Node(left=None, right=None, val=None)

>>> print Node(1,2,3)
Node(left=1, right=2, val=3)

>>> print Node(1, right=2)
Node(left=1, right=2, val=None)

>>> print Node(1, right=2, val=100)
Node(left=1, right=2, val=100)

>>> print Node(left=1, right=2, val=100)
Node(left=1, right=2, val=100)

>>> print Node(left=1, right=2)
Node(left=1, right=2, val=None)

还支持TypeError:

>>> Node(1, left=2)
TypeError: __new__() got multiple values for keyword argument 'left'

Combining approaches of @Denis and @Mark:

from collections import namedtuple
import inspect

class Node(namedtuple('Node', 'left right val')):
    __slots__ = ()
    def __new__(cls, *args, **kwargs):
        args_list = inspect.getargspec(super(Node, cls).__new__).args[len(args)+1:]
        params = {key: kwargs.get(key) for key in args_list + kwargs.keys()}
        return super(Node, cls).__new__(cls, *args, **params) 

That should support creating the tuple with positional arguments and also with mixed cases. Test cases:

>>> print Node()
Node(left=None, right=None, val=None)

>>> print Node(1,2,3)
Node(left=1, right=2, val=3)

>>> print Node(1, right=2)
Node(left=1, right=2, val=None)

>>> print Node(1, right=2, val=100)
Node(left=1, right=2, val=100)

>>> print Node(left=1, right=2, val=100)
Node(left=1, right=2, val=100)

>>> print Node(left=1, right=2)
Node(left=1, right=2, val=None)

but also support TypeError:

>>> Node(1, left=2)
TypeError: __new__() got multiple values for keyword argument 'left'

回答 13

我发现此版本更易于阅读:

from collections import namedtuple

def my_tuple(**kwargs):
    defaults = {
        'a': 2.0,
        'b': True,
        'c': "hello",
    }
    default_tuple = namedtuple('MY_TUPLE', ' '.join(defaults.keys()))(*defaults.values())
    return default_tuple._replace(**kwargs)

这并不是很有效,因为它需要两次创建对象,但是您可以通过在模块内定义默认的duple并让函数执行替换行来更改它。

I find this version easier to read:

from collections import namedtuple

def my_tuple(**kwargs):
    defaults = {
        'a': 2.0,
        'b': True,
        'c': "hello",
    }
    default_tuple = namedtuple('MY_TUPLE', ' '.join(defaults.keys()))(*defaults.values())
    return default_tuple._replace(**kwargs)

This is not as efficient as it requires creation of the object twice but you could change that by defining the default duple inside the module and just having the function do the replace line.


回答 14

由于您是namedtuple作为数据类使用的,因此应注意python 3.7 @dataclass为此会引入一个装饰器-当然,它具有默认值。

来自docs的示例

@dataclass
class C:
    a: int       # 'a' has no default value
    b: int = 0   # assign a default value for 'b'

比黑客更干净,可读性和可用性更高namedtuple。不难预测,namedtuple随着3.7的采用,s的使用将下降。

Since you are using namedtuple as a data class, you should be aware that python 3.7 will introduce a @dataclass decorator for this very purpose — and of course it has default values.

An example from the docs:

@dataclass
class C:
    a: int       # 'a' has no default value
    b: int = 0   # assign a default value for 'b'

Much cleaner, readable and usable than hacking namedtuple. It is not hard to predict that usage of namedtuples will drop with the adoption of 3.7.


回答 15

受到对另一个问题的答案的启发,是我建议的基于元类的解决方案,并使用super(正确处理将来的子缩放)。这与Justinfay的答案非常相似。

from collections import namedtuple

NodeTuple = namedtuple("NodeTuple", ("val", "left", "right"))

class NodeMeta(type):
    def __call__(cls, val, left=None, right=None):
        return super(NodeMeta, cls).__call__(val, left, right)

class Node(NodeTuple, metaclass=NodeMeta):
    __slots__ = ()

然后:

>>> Node(1, Node(2, Node(4)),(Node(3, None, Node(5))))
Node(val=1, left=Node(val=2, left=Node(val=4, left=None, right=None), right=None), right=Node(val=3, left=None, right=Node(val=5, left=None, right=None)))

Inspired by this answer to a different question, here is my proposed solution based on a metaclass and using super (to handle future subcalssing correctly). It is quite similar to justinfay’s answer.

from collections import namedtuple

NodeTuple = namedtuple("NodeTuple", ("val", "left", "right"))

class NodeMeta(type):
    def __call__(cls, val, left=None, right=None):
        return super(NodeMeta, cls).__call__(val, left, right)

class Node(NodeTuple, metaclass=NodeMeta):
    __slots__ = ()

Then:

>>> Node(1, Node(2, Node(4)),(Node(3, None, Node(5))))
Node(val=1, left=Node(val=2, left=Node(val=4, left=None, right=None), right=None), right=Node(val=3, left=None, right=Node(val=5, left=None, right=None)))

回答 16

jterrace使用recordtype的答案很好,但是该库的作者建议使用他的namedlist项目,该项目提供了mutable(namedlist)和immutable(namedtuple)实现。

from namedlist import namedtuple
>>> Node = namedtuple('Node', ['val', ('left', None), ('right', None)])
>>> Node(3)
Node(val=3, left=None, right=None)
>>> Node(3, 'L')
Node(val=3, left=L, right=None)

The answer by jterrace to use recordtype is great, but the author of the library recommends to use his namedlist project, which provides both mutable (namedlist) and immutable (namedtuple) implementations.

from namedlist import namedtuple
>>> Node = namedtuple('Node', ['val', ('left', None), ('right', None)])
>>> Node(3)
Node(val=3, left=None, right=None)
>>> Node(3, 'L')
Node(val=3, left=L, right=None)

回答 17

这是一个简短的,简单的通用答案,带有带有默认参数的命名元组的漂亮语法:

import collections

def dnamedtuple(typename, field_names, **defaults):
    fields = sorted(field_names.split(), key=lambda x: x in defaults)
    T = collections.namedtuple(typename, ' '.join(fields))
    T.__new__.__defaults__ = tuple(defaults[field] for field in fields[-len(defaults):])
    return T

用法:

Test = dnamedtuple('Test', 'one two three', two=2)
Test(1, 3)  # Test(one=1, three=3, two=2)

缩小:

def dnamedtuple(tp, fs, **df):
    fs = sorted(fs.split(), key=df.__contains__)
    T = collections.namedtuple(tp, ' '.join(fs))
    T.__new__.__defaults__ = tuple(df[i] for i in fs[-len(df):])
    return T

Here’s a short, simple generic answer with a nice syntax for a named tuple with default arguments:

import collections

def dnamedtuple(typename, field_names, **defaults):
    fields = sorted(field_names.split(), key=lambda x: x in defaults)
    T = collections.namedtuple(typename, ' '.join(fields))
    T.__new__.__defaults__ = tuple(defaults[field] for field in fields[-len(defaults):])
    return T

Usage:

Test = dnamedtuple('Test', 'one two three', two=2)
Test(1, 3)  # Test(one=1, three=3, two=2)

Minified:

def dnamedtuple(tp, fs, **df):
    fs = sorted(fs.split(), key=df.__contains__)
    T = collections.namedtuple(tp, ' '.join(fs))
    T.__new__.__defaults__ = tuple(df[i] for i in fs[-len(df):])
    return T

回答 18

使用NamedTupleAdvanced Enum (aenum)库中的类并使用class语法,这非常简单:

from aenum import NamedTuple

class Node(NamedTuple):
    val = 0
    left = 1, 'previous Node', None
    right = 2, 'next Node', None

一个潜在的缺点是,对于__doc__具有默认值的任何属性都需要一个字符串(对于简单属性是可选的)。在使用中它看起来像:

>>> Node()
Traceback (most recent call last):
  ...
TypeError: values not provided for field(s): val

>>> Node(3)
Node(val=3, left=None, right=None)

它具有以下优点justinfay's answer

from collections import namedtuple

class Node(namedtuple('Node', ['value', 'left', 'right'])):
    __slots__ = ()
    def __new__(cls, value, left=None, right=None):
        return super(Node, cls).__new__(cls, value, left, right)

是简单的,以及metaclass基于基础而不是exec基础。

Using the NamedTuple class from my Advanced Enum (aenum) library, and using the class syntax, this is quite simple:

from aenum import NamedTuple

class Node(NamedTuple):
    val = 0
    left = 1, 'previous Node', None
    right = 2, 'next Node', None

The one potential drawback is the requirement for a __doc__ string for any attribute with a default value (it’s optional for simple attributes). In use it looks like:

>>> Node()
Traceback (most recent call last):
  ...
TypeError: values not provided for field(s): val

>>> Node(3)
Node(val=3, left=None, right=None)

The advantages this has over justinfay's answer:

from collections import namedtuple

class Node(namedtuple('Node', ['value', 'left', 'right'])):
    __slots__ = ()
    def __new__(cls, value, left=None, right=None):
        return super(Node, cls).__new__(cls, value, left, right)

is simplicity, as well as being metaclass based instead of exec based.


回答 19

另一个解决方案:

import collections


def defaultargs(func, defaults):
    def wrapper(*args, **kwargs):
        for key, value in (x for x in defaults[len(args):] if len(x) == 2):
            kwargs.setdefault(key, value)
        return func(*args, **kwargs)
    return wrapper


def namedtuple(name, fields):
    NamedTuple = collections.namedtuple(name, [x[0] for x in fields])
    NamedTuple.__new__ = defaultargs(NamedTuple.__new__, [(NamedTuple,)] + fields)
    return NamedTuple

用法:

>>> Node = namedtuple('Node', [
...     ('val',),
...     ('left', None),
...     ('right', None),
... ])
__main__.Node

>>> Node(1)
Node(val=1, left=None, right=None)

>>> Node(1, 2, right=3)
Node(val=1, left=2, right=3)

Another solution:

import collections


def defaultargs(func, defaults):
    def wrapper(*args, **kwargs):
        for key, value in (x for x in defaults[len(args):] if len(x) == 2):
            kwargs.setdefault(key, value)
        return func(*args, **kwargs)
    return wrapper


def namedtuple(name, fields):
    NamedTuple = collections.namedtuple(name, [x[0] for x in fields])
    NamedTuple.__new__ = defaultargs(NamedTuple.__new__, [(NamedTuple,)] + fields)
    return NamedTuple

Usage:

>>> Node = namedtuple('Node', [
...     ('val',),
...     ('left', None),
...     ('right', None),
... ])
__main__.Node

>>> Node(1)
Node(val=1, left=None, right=None)

>>> Node(1, 2, right=3)
Node(val=1, left=2, right=3)

回答 20

这是Mark Lodato的包装器的一种不太灵活但更简洁的版本:它使用字段和默认值作为字典。

import collections
def namedtuple_with_defaults(typename, fields_dict):
    T = collections.namedtuple(typename, ' '.join(fields_dict.keys()))
    T.__new__.__defaults__ = tuple(fields_dict.values())
    return T

例:

In[1]: fields = {'val': 1, 'left': 2, 'right':3}

In[2]: Node = namedtuple_with_defaults('Node', fields)

In[3]: Node()
Out[3]: Node(val=1, left=2, right=3)

In[4]: Node(4,5,6)
Out[4]: Node(val=4, left=5, right=6)

In[5]: Node(val=10)
Out[5]: Node(val=10, left=2, right=3)

Here’s a less flexible, but more concise version of Mark Lodato’s wrapper: It takes the fields and defaults as a dictionary.

import collections
def namedtuple_with_defaults(typename, fields_dict):
    T = collections.namedtuple(typename, ' '.join(fields_dict.keys()))
    T.__new__.__defaults__ = tuple(fields_dict.values())
    return T

Example:

In[1]: fields = {'val': 1, 'left': 2, 'right':3}

In[2]: Node = namedtuple_with_defaults('Node', fields)

In[3]: Node()
Out[3]: Node(val=1, left=2, right=3)

In[4]: Node(4,5,6)
Out[4]: Node(val=4, left=5, right=6)

In[5]: Node(val=10)
Out[5]: Node(val=10, left=2, right=3)

函数调用超时

问题:函数调用超时

我正在Python中调用一个函数,该函数可能会停滞并迫使我重新启动脚本。

如何调用该函数或将其包装在其中,以便如果花费的时间超过5秒,脚本将取消该函数并执行其他操作?

I’m calling a function in Python which I know may stall and force me to restart the script.

How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it and does something else?


回答 0

如果您在UNIX上运行,则可以使用信号包:

In [1]: import signal

# Register an handler for the timeout
In [2]: def handler(signum, frame):
   ...:     print("Forever is over!")
   ...:     raise Exception("end of time")
   ...: 

# This function *may* run for an indetermined time...
In [3]: def loop_forever():
   ...:     import time
   ...:     while 1:
   ...:         print("sec")
   ...:         time.sleep(1)
   ...:         
   ...:         

# Register the signal function handler
In [4]: signal.signal(signal.SIGALRM, handler)
Out[4]: 0

# Define a timeout for your function
In [5]: signal.alarm(10)
Out[5]: 0

In [6]: try:
   ...:     loop_forever()
   ...: except Exception, exc: 
   ...:     print(exc)
   ....: 
sec
sec
sec
sec
sec
sec
sec
sec
Forever is over!
end of time

# Cancel the timer if the function returned before timeout
# (ok, mine won't but yours maybe will :)
In [7]: signal.alarm(0)
Out[7]: 0

调用后10秒钟,将调用alarm.alarm(10)处理程序。这引发了一个异常,您可以从常规Python代码中拦截该异常。

该模块不能很好地与线程配合使用(但是,谁可以呢?)

请注意,由于发生超时时会引发异常,因此它可能最终在函数内部被捕获并被忽略,例如一个这样的函数:

def loop_forever():
    while 1:
        print('sec')
        try:
            time.sleep(10)
        except:
            continue

You may use the signal package if you are running on UNIX:

In [1]: import signal

# Register an handler for the timeout
In [2]: def handler(signum, frame):
   ...:     print("Forever is over!")
   ...:     raise Exception("end of time")
   ...: 

# This function *may* run for an indetermined time...
In [3]: def loop_forever():
   ...:     import time
   ...:     while 1:
   ...:         print("sec")
   ...:         time.sleep(1)
   ...:         
   ...:         

# Register the signal function handler
In [4]: signal.signal(signal.SIGALRM, handler)
Out[4]: 0

# Define a timeout for your function
In [5]: signal.alarm(10)
Out[5]: 0

In [6]: try:
   ...:     loop_forever()
   ...: except Exception, exc: 
   ...:     print(exc)
   ....: 
sec
sec
sec
sec
sec
sec
sec
sec
Forever is over!
end of time

# Cancel the timer if the function returned before timeout
# (ok, mine won't but yours maybe will :)
In [7]: signal.alarm(0)
Out[7]: 0

10 seconds after the call signal.alarm(10), the handler is called. This raises an exception that you can intercept from the regular Python code.

This module doesn’t play well with threads (but then, who does?)

Note that since we raise an exception when timeout happens, it may end up caught and ignored inside the function, for example of one such function:

def loop_forever():
    while 1:
        print('sec')
        try:
            time.sleep(10)
        except:
            continue

回答 1

您可以multiprocessing.Process用来精确地做到这一点。

import multiprocessing
import time

# bar
def bar():
    for i in range(100):
        print "Tick"
        time.sleep(1)

if __name__ == '__main__':
    # Start bar as a process
    p = multiprocessing.Process(target=bar)
    p.start()

    # Wait for 10 seconds or until process finishes
    p.join(10)

    # If thread is still active
    if p.is_alive():
        print "running... let's kill it..."

        # Terminate
        p.terminate()
        p.join()

You can use multiprocessing.Process to do exactly that.

Code

import multiprocessing
import time

# bar
def bar():
    for i in range(100):
        print "Tick"
        time.sleep(1)

if __name__ == '__main__':
    # Start bar as a process
    p = multiprocessing.Process(target=bar)
    p.start()

    # Wait for 10 seconds or until process finishes
    p.join(10)

    # If thread is still active
    if p.is_alive():
        print "running... let's kill it..."

        # Terminate - may not work if process is stuck for good
        p.terminate()
        # OR Kill - will work for sure, no chance for process to finish nicely however
        # p.kill()

        p.join()

回答 2

如何调用该函数或将其包装起来,以便如果花费的时间超过5秒钟,脚本将取消该函数?

我发布了要点,用装饰器和来解决此问题threading.Timer。这是一个细分。

导入和设置以实现兼容性

它已经通过Python 2和3进行了测试。它也应该在Unix / Linux和Windows下运行。

首先是进口。这些尝试使代码保持一致,而不管Python版本如何:

from __future__ import print_function
import sys
import threading
from time import sleep
try:
    import thread
except ImportError:
    import _thread as thread

使用版本无关代码:

try:
    range, _print = xrange, print
    def print(*args, **kwargs): 
        flush = kwargs.pop('flush', False)
        _print(*args, **kwargs)
        if flush:
            kwargs.get('file', sys.stdout).flush()            
except NameError:
    pass

现在,我们已经从标准库中导入了我们的功能。

exit_after 装饰工

接下来,我们需要一个函数来终止main()子线程:

def quit_function(fn_name):
    # print to stderr, unbuffered in Python 2.
    print('{0} took too long'.format(fn_name), file=sys.stderr)
    sys.stderr.flush() # Python 3 stderr is likely buffered.
    thread.interrupt_main() # raises KeyboardInterrupt

这是装饰器本身:

def exit_after(s):
    '''
    use as decorator to exit process if 
    function takes longer than s seconds
    '''
    def outer(fn):
        def inner(*args, **kwargs):
            timer = threading.Timer(s, quit_function, args=[fn.__name__])
            timer.start()
            try:
                result = fn(*args, **kwargs)
            finally:
                timer.cancel()
            return result
        return inner
    return outer

用法

这是直接回答您有关5秒后退出的问题的用法!:

@exit_after(5)
def countdown(n):
    print('countdown started', flush=True)
    for i in range(n, -1, -1):
        print(i, end=', ', flush=True)
        sleep(1)
    print('countdown finished')

演示:

>>> countdown(3)
countdown started
3, 2, 1, 0, countdown finished
>>> countdown(10)
countdown started
10, 9, 8, 7, 6, countdown took too long
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in inner
  File "<stdin>", line 6, in countdown
KeyboardInterrupt

第二个函数调用将不会结束,而是该过程应退出并回溯!

KeyboardInterrupt 并不总是停止休眠线程

请注意,在Windows上的Python 2上,睡眠不会总是被键盘中断打断,例如:

@exit_after(1)
def sleep10():
    sleep(10)
    print('slept 10 seconds')

>>> sleep10()
sleep10 took too long         # Note that it hangs here about 9 more seconds
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in inner
  File "<stdin>", line 3, in sleep10
KeyboardInterrupt

除非它明确检查PyErr_CheckSignals(),否则也不太可能中断在扩展程序中运行的代码,请参见 Cython,Python和KeyboardInterrupt被忽略

在任何情况下,我都避免将线程休眠超过一秒钟-这是处理器时间的永恒。

如何调用该函数或将其包装在其中,以便如果花费的时间超过5秒,脚本将取消该函数并执行其他操作?

要捕获它并执行其他操作,可以捕获KeyboardInterrupt。

>>> try:
...     countdown(10)
... except KeyboardInterrupt:
...     print('do something else')
... 
countdown started
10, 9, 8, 7, 6, countdown took too long
do something else

How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it?

I posted a gist that solves this question/problem with a decorator and a threading.Timer. Here it is with a breakdown.

Imports and setups for compatibility

It was tested with Python 2 and 3. It should also work under Unix/Linux and Windows.

First the imports. These attempt to keep the code consistent regardless of the Python version:

from __future__ import print_function
import sys
import threading
from time import sleep
try:
    import thread
except ImportError:
    import _thread as thread

Use version independent code:

try:
    range, _print = xrange, print
    def print(*args, **kwargs): 
        flush = kwargs.pop('flush', False)
        _print(*args, **kwargs)
        if flush:
            kwargs.get('file', sys.stdout).flush()            
except NameError:
    pass

Now we have imported our functionality from the standard library.

exit_after decorator

Next we need a function to terminate the main() from the child thread:

def quit_function(fn_name):
    # print to stderr, unbuffered in Python 2.
    print('{0} took too long'.format(fn_name), file=sys.stderr)
    sys.stderr.flush() # Python 3 stderr is likely buffered.
    thread.interrupt_main() # raises KeyboardInterrupt

And here is the decorator itself:

def exit_after(s):
    '''
    use as decorator to exit process if 
    function takes longer than s seconds
    '''
    def outer(fn):
        def inner(*args, **kwargs):
            timer = threading.Timer(s, quit_function, args=[fn.__name__])
            timer.start()
            try:
                result = fn(*args, **kwargs)
            finally:
                timer.cancel()
            return result
        return inner
    return outer

Usage

And here’s the usage that directly answers your question about exiting after 5 seconds!:

@exit_after(5)
def countdown(n):
    print('countdown started', flush=True)
    for i in range(n, -1, -1):
        print(i, end=', ', flush=True)
        sleep(1)
    print('countdown finished')

Demo:

>>> countdown(3)
countdown started
3, 2, 1, 0, countdown finished
>>> countdown(10)
countdown started
10, 9, 8, 7, 6, countdown took too long
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in inner
  File "<stdin>", line 6, in countdown
KeyboardInterrupt

The second function call will not finish, instead the process should exit with a traceback!

KeyboardInterrupt does not always stop a sleeping thread

Note that sleep will not always be interrupted by a keyboard interrupt, on Python 2 on Windows, e.g.:

@exit_after(1)
def sleep10():
    sleep(10)
    print('slept 10 seconds')

>>> sleep10()
sleep10 took too long         # Note that it hangs here about 9 more seconds
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 11, in inner
  File "<stdin>", line 3, in sleep10
KeyboardInterrupt

nor is it likely to interrupt code running in extensions unless it explicitly checks for PyErr_CheckSignals(), see Cython, Python and KeyboardInterrupt ignored

I would avoid sleeping a thread more than a second, in any case – that’s an eon in processor time.

How do I call the function or what do I wrap it in so that if it takes longer than 5 seconds the script cancels it and does something else?

To catch it and do something else, you can catch the KeyboardInterrupt.

>>> try:
...     countdown(10)
... except KeyboardInterrupt:
...     print('do something else')
... 
countdown started
10, 9, 8, 7, 6, countdown took too long
do something else

回答 3

我有一个不同的建议,它是一个纯函数(具有与线程建议相同的API),并且似乎可以正常工作(基于此线程的建议)

def timeout(func, args=(), kwargs={}, timeout_duration=1, default=None):
    import signal

    class TimeoutError(Exception):
        pass

    def handler(signum, frame):
        raise TimeoutError()

    # set the timeout handler
    signal.signal(signal.SIGALRM, handler) 
    signal.alarm(timeout_duration)
    try:
        result = func(*args, **kwargs)
    except TimeoutError as exc:
        result = default
    finally:
        signal.alarm(0)

    return result

I have a different proposal which is a pure function (with the same API as the threading suggestion) and seems to work fine (based on suggestions on this thread)

def timeout(func, args=(), kwargs={}, timeout_duration=1, default=None):
    import signal

    class TimeoutError(Exception):
        pass

    def handler(signum, frame):
        raise TimeoutError()

    # set the timeout handler
    signal.signal(signal.SIGALRM, handler) 
    signal.alarm(timeout_duration)
    try:
        result = func(*args, **kwargs)
    except TimeoutError as exc:
        result = default
    finally:
        signal.alarm(0)

    return result

回答 4

在搜索单元测试的超时调用时,我遇到了这个线程。我没有在答案或第三方软件包中找到任何简单的东西,因此我在下面编写了装饰器,您可以直接进入代码:

import multiprocessing.pool
import functools

def timeout(max_timeout):
    """Timeout decorator, parameter in seconds."""
    def timeout_decorator(item):
        """Wrap the original function."""
        @functools.wraps(item)
        def func_wrapper(*args, **kwargs):
            """Closure for function."""
            pool = multiprocessing.pool.ThreadPool(processes=1)
            async_result = pool.apply_async(item, args, kwargs)
            # raises a TimeoutError if execution exceeds max_timeout
            return async_result.get(max_timeout)
        return func_wrapper
    return timeout_decorator

然后,使测试或您喜欢的任何功能超时就这么简单:

@timeout(5.0)  # if execution takes longer than 5 seconds, raise a TimeoutError
def test_base_regression(self):
    ...

I ran across this thread when searching for a timeout call on unit tests. I didn’t find anything simple in the answers or 3rd party packages so I wrote the decorator below you can drop right into code:

import multiprocessing.pool
import functools

def timeout(max_timeout):
    """Timeout decorator, parameter in seconds."""
    def timeout_decorator(item):
        """Wrap the original function."""
        @functools.wraps(item)
        def func_wrapper(*args, **kwargs):
            """Closure for function."""
            pool = multiprocessing.pool.ThreadPool(processes=1)
            async_result = pool.apply_async(item, args, kwargs)
            # raises a TimeoutError if execution exceeds max_timeout
            return async_result.get(max_timeout)
        return func_wrapper
    return timeout_decorator

Then it’s as simple as this to timeout a test or any function you like:

@timeout(5.0)  # if execution takes longer than 5 seconds, raise a TimeoutError
def test_base_regression(self):
    ...

回答 5

stopit在pypi上找到软件包似乎可以很好地处理超时问题。

我喜欢@stopit.threading_timeoutable装饰器,它增加了一个timeout向装饰的函数参数,该参数完成了您所期望的操作,从而停止了该函数。

在pypi上查看:https ://pypi.python.org/pypi/stopit

The stopit package, found on pypi, seems to handle timeouts well.

I like the @stopit.threading_timeoutable decorator, which adds a timeout parameter to the decorated function, which does what you expect, it stops the function.

Check it out on pypi: https://pypi.python.org/pypi/stopit


回答 6

有很多建议,但没有一个建议使用current.futures,我认为这是处理此问题的最清晰的方法。

from concurrent.futures import ProcessPoolExecutor

# Warning: this does not terminate function if timeout
def timeout_five(fnc, *args, **kwargs):
    with ProcessPoolExecutor() as p:
        f = p.submit(fnc, *args, **kwargs)
        return f.result(timeout=5)

超级简单的阅读和维护。

我们创建一个池,提交一个进程,然后等待最多5秒钟,然后引发一个TimeoutError,您可以根据需要捕获并处理它。

原生于python 3.2+,并反向移植到2.7(点安装期货)。

线程和进程之间的切换很简单,只要更换ProcessPoolExecutorThreadPoolExecutor

如果您想在超时时终止进程,建议您调查Pebble

There are a lot of suggestions, but none using concurrent.futures, which I think is the most legible way to handle this.

from concurrent.futures import ProcessPoolExecutor

# Warning: this does not terminate function if timeout
def timeout_five(fnc, *args, **kwargs):
    with ProcessPoolExecutor() as p:
        f = p.submit(fnc, *args, **kwargs)
        return f.result(timeout=5)

Super simple to read and maintain.

We make a pool, submit a single process and then wait up to 5 seconds before raising a TimeoutError that you could catch and handle however you needed.

Native to python 3.2+ and backported to 2.7 (pip install futures).

Switching between threads and processes is as simple as replacing ProcessPoolExecutor with ThreadPoolExecutor.

If you want to terminate the Process on timeout I would suggest looking into Pebble.


回答 7

出色,易于使用且可靠的PyPi项目超时装饰器https://pypi.org/project/timeout-decorator/

安装方式

pip install timeout-decorator

用法

import time
import timeout_decorator

@timeout_decorator.timeout(5)
def mytest():
    print "Start"
    for i in range(1,10):
        time.sleep(1)
        print "%d seconds have passed" % i

if __name__ == '__main__':
    mytest()

Great, easy to use and reliable PyPi project timeout-decorator (https://pypi.org/project/timeout-decorator/)

installation:

pip install timeout-decorator

Usage:

import time
import timeout_decorator

@timeout_decorator.timeout(5)
def mytest():
    print "Start"
    for i in range(1,10):
        time.sleep(1)
        print "%d seconds have passed" % i

if __name__ == '__main__':
    mytest()

回答 8

我是wrapt_timeout_decorator的作者

乍一看,这里介绍的大多数解决方案在Linux上都无法正常工作-因为我们有fork()和signal()-但是在Windows上,情况看起来有些不同。当涉及到Linux上的子线程时,您将无法再使用Signals。

为了在Windows下产生一个进程,它必须是可挑选的-许多修饰函数或Class方法不是。

因此,您需要使用更好的Pickler,如莳萝和多进程(而不是Pickle和多进程)-这就是为什么您不能使用ProcessPoolExecutor(或仅在功能有限的情况下)的原因。

对于超时本身-您需要定义超时的含义-因为在Windows上将花费大量(且不确定)的时间来生成该进程。如果超时时间短,这可能会很棘手。让我们假设,生成该过程大约需要0.5秒(很容易!!!)。如果您给出0.2秒的超时时间,应该怎么办?函数是否应在0.5 + 0.2秒后超时(让方法运行0.2秒)?还是被调用的进程应该在0.2秒后超时(在这种情况下,修饰的函数将始终超时,因为在那个时间内它甚至没有生成)?

嵌套的装饰器也很讨厌,您不能在子线程中使用Signals。如果要创建真正的通用跨平台装饰器,则需要考虑所有这些因素(并进行测试)。

其他问题会将异常传递回调用者以及日志记录问题(如果在修饰的函数中使用-不支持在另一个进程中记录文件)

我试图涵盖所有的极端情况,您可能会考虑wrapt_timeout_decorator包,或者至少测试那里使用的单元测试启发的自己的解决方案。

@Alexis Eggermont-很遗憾,我没有足够的意见要发表-也许其他人可以通知您-我想我已经解决了您的导入问题。

I am the author of wrapt_timeout_decorator

Most of the solutions presented here work wunderfully under Linux on the first glance – because we have fork() and signals() – but on windows the things look a bit different. And when it comes to subthreads on Linux, You cant use Signals anymore.

In order to spawn a process under Windows, it needs to be picklable – and many decorated functions or Class methods are not.

So You need to use a better pickler like dill and multiprocess (not pickle and multiprocessing) – thats why You cant use ProcessPoolExecutor (or only with limited functionality).

For the timeout itself – You need to define what timeout means – because on Windows it will take considerable (and not determinable) time to spawn the process. This can be tricky on short timeouts. Lets assume, spawning the process takes about 0.5 seconds (easily !!!). If You give a timeout of 0.2 seconds what should happen ? Should the function time out after 0.5 + 0.2 seconds (so let the method run for 0.2 seconds)? Or should the called process time out after 0.2 seconds (in that case, the decorated function will ALWAYS timeout, because in that time it is not even spawned) ?

Also nested decorators can be nasty and You cant use Signals in a subthread. If You want to create a truly universal, cross-platform decorator, all this needs to be taken into consideration (and tested).

Other issues are passing exceptions back to the caller, as well as logging issues (if used in the decorated function – logging to files in another process is NOT supported)

I tried to cover all edge cases, You might look into the package wrapt_timeout_decorator, or at least test Your own solutions inspired by the unittests used there.

@Alexis Eggermont – unfortunately I dont have enough points to comment – maybe someone else can notify You – I think I solved Your import issue.


回答 9

timeout-decorator不能在Windows系统上正常运行,因为Windows不能signal很好地支持。

如果您在Windows系统中使用timeout-decorator,则会得到以下信息

AttributeError: module 'signal' has no attribute 'SIGALRM'

有些人建议使用use_signals=False但对我没有用。

作者@bitranox创建了以下软件包:

pip install https://github.com/bitranox/wrapt-timeout-decorator/archive/master.zip

代码样例:

import time
from wrapt_timeout_decorator import *

@timeout(5)
def mytest(message):
    print(message)
    for i in range(1,10):
        time.sleep(1)
        print('{} seconds have passed'.format(i))

def main():
    mytest('starting')


if __name__ == '__main__':
    main()

给出以下异常:

TimeoutError: Function mytest timed out after 5 seconds

timeout-decorator don’t work on windows system as , windows didn’t support signal well.

If you use timeout-decorator in windows system you will get the following

AttributeError: module 'signal' has no attribute 'SIGALRM'

Some suggested to use use_signals=False but didn’t worked for me.

Author @bitranox created the following package:

pip install https://github.com/bitranox/wrapt-timeout-decorator/archive/master.zip

Code Sample:

import time
from wrapt_timeout_decorator import *

@timeout(5)
def mytest(message):
    print(message)
    for i in range(1,10):
        time.sleep(1)
        print('{} seconds have passed'.format(i))

def main():
    mytest('starting')


if __name__ == '__main__':
    main()

Gives the following exception:

TimeoutError: Function mytest timed out after 5 seconds

回答 10

我们可以使用相同的信号。我认为以下示例对您有用。与线程相比,它非常简单。

import signal

def timeout(signum, frame):
    raise myException

#this is an infinite loop, never ending under normal circumstances
def main():
    print 'Starting Main ',
    while 1:
        print 'in main ',

#SIGALRM is only usable on a unix platform
signal.signal(signal.SIGALRM, timeout)

#change 5 to however many seconds you need
signal.alarm(5)

try:
    main()
except myException:
    print "whoops"

We can use signals for the same. I think the below example will be useful for you. It is very simple compared to threads.

import signal

def timeout(signum, frame):
    raise myException

#this is an infinite loop, never ending under normal circumstances
def main():
    print 'Starting Main ',
    while 1:
        print 'in main ',

#SIGALRM is only usable on a unix platform
signal.signal(signal.SIGALRM, timeout)

#change 5 to however many seconds you need
signal.alarm(5)

try:
    main()
except myException:
    print "whoops"

回答 11

#!/usr/bin/python2
import sys, subprocess, threading
proc = subprocess.Popen(sys.argv[2:])
timer = threading.Timer(float(sys.argv[1]), proc.terminate)
timer.start()
proc.wait()
timer.cancel()
exit(proc.returncode)
#!/usr/bin/python2
import sys, subprocess, threading
proc = subprocess.Popen(sys.argv[2:])
timer = threading.Timer(float(sys.argv[1]), proc.terminate)
timer.start()
proc.wait()
timer.cancel()
exit(proc.returncode)

回答 12

我需要不会被time.sleep(基于线程的方法无法做到)阻止的可嵌套定时中断(SIGALARM无法做到)。我最终从这里复制并修改了代码:http : //code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/

代码本身:

#!/usr/bin/python

# lightly modified version of http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/


"""alarm.py: Permits multiple SIGALRM events to be queued.

Uses a `heapq` to store the objects to be called when an alarm signal is
raised, so that the next alarm is always at the top of the heap.
"""

import heapq
import signal
from time import time

__version__ = '$Revision: 2539 $'.split()[1]

alarmlist = []

__new_alarm = lambda t, f, a, k: (t + time(), f, a, k)
__next_alarm = lambda: int(round(alarmlist[0][0] - time())) if alarmlist else None
__set_alarm = lambda: signal.alarm(max(__next_alarm(), 1))


class TimeoutError(Exception):
    def __init__(self, message, id_=None):
        self.message = message
        self.id_ = id_


class Timeout:
    ''' id_ allows for nested timeouts. '''
    def __init__(self, id_=None, seconds=1, error_message='Timeout'):
        self.seconds = seconds
        self.error_message = error_message
        self.id_ = id_
    def handle_timeout(self):
        raise TimeoutError(self.error_message, self.id_)
    def __enter__(self):
        self.this_alarm = alarm(self.seconds, self.handle_timeout)
    def __exit__(self, type, value, traceback):
        try:
            cancel(self.this_alarm) 
        except ValueError:
            pass


def __clear_alarm():
    """Clear an existing alarm.

    If the alarm signal was set to a callable other than our own, queue the
    previous alarm settings.
    """
    oldsec = signal.alarm(0)
    oldfunc = signal.signal(signal.SIGALRM, __alarm_handler)
    if oldsec > 0 and oldfunc != __alarm_handler:
        heapq.heappush(alarmlist, (__new_alarm(oldsec, oldfunc, [], {})))


def __alarm_handler(*zargs):
    """Handle an alarm by calling any due heap entries and resetting the alarm.

    Note that multiple heap entries might get called, especially if calling an
    entry takes a lot of time.
    """
    try:
        nextt = __next_alarm()
        while nextt is not None and nextt <= 0:
            (tm, func, args, keys) = heapq.heappop(alarmlist)
            func(*args, **keys)
            nextt = __next_alarm()
    finally:
        if alarmlist: __set_alarm()


def alarm(sec, func, *args, **keys):
    """Set an alarm.

    When the alarm is raised in `sec` seconds, the handler will call `func`,
    passing `args` and `keys`. Return the heap entry (which is just a big
    tuple), so that it can be cancelled by calling `cancel()`.
    """
    __clear_alarm()
    try:
        newalarm = __new_alarm(sec, func, args, keys)
        heapq.heappush(alarmlist, newalarm)
        return newalarm
    finally:
        __set_alarm()


def cancel(alarm):
    """Cancel an alarm by passing the heap entry returned by `alarm()`.

    It is an error to try to cancel an alarm which has already occurred.
    """
    __clear_alarm()
    try:
        alarmlist.remove(alarm)
        heapq.heapify(alarmlist)
    finally:
        if alarmlist: __set_alarm()

和用法示例:

import alarm
from time import sleep

try:
    with alarm.Timeout(id_='a', seconds=5):
        try:
            with alarm.Timeout(id_='b', seconds=2):
                sleep(3)
        except alarm.TimeoutError as e:
            print 'raised', e.id_
        sleep(30)
except alarm.TimeoutError as e:
    print 'raised', e.id_
else:
    print 'nope.'

I had a need for nestable timed interrupts (which SIGALARM can’t do) that won’t get blocked by time.sleep (which the thread-based approach can’t do). I ended up copying and lightly modifying code from here: http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/

The code itself:

#!/usr/bin/python

# lightly modified version of http://code.activestate.com/recipes/577600-queue-for-managing-multiple-sigalrm-alarms-concurr/


"""alarm.py: Permits multiple SIGALRM events to be queued.

Uses a `heapq` to store the objects to be called when an alarm signal is
raised, so that the next alarm is always at the top of the heap.
"""

import heapq
import signal
from time import time

__version__ = '$Revision: 2539 $'.split()[1]

alarmlist = []

__new_alarm = lambda t, f, a, k: (t + time(), f, a, k)
__next_alarm = lambda: int(round(alarmlist[0][0] - time())) if alarmlist else None
__set_alarm = lambda: signal.alarm(max(__next_alarm(), 1))


class TimeoutError(Exception):
    def __init__(self, message, id_=None):
        self.message = message
        self.id_ = id_


class Timeout:
    ''' id_ allows for nested timeouts. '''
    def __init__(self, id_=None, seconds=1, error_message='Timeout'):
        self.seconds = seconds
        self.error_message = error_message
        self.id_ = id_
    def handle_timeout(self):
        raise TimeoutError(self.error_message, self.id_)
    def __enter__(self):
        self.this_alarm = alarm(self.seconds, self.handle_timeout)
    def __exit__(self, type, value, traceback):
        try:
            cancel(self.this_alarm) 
        except ValueError:
            pass


def __clear_alarm():
    """Clear an existing alarm.

    If the alarm signal was set to a callable other than our own, queue the
    previous alarm settings.
    """
    oldsec = signal.alarm(0)
    oldfunc = signal.signal(signal.SIGALRM, __alarm_handler)
    if oldsec > 0 and oldfunc != __alarm_handler:
        heapq.heappush(alarmlist, (__new_alarm(oldsec, oldfunc, [], {})))


def __alarm_handler(*zargs):
    """Handle an alarm by calling any due heap entries and resetting the alarm.

    Note that multiple heap entries might get called, especially if calling an
    entry takes a lot of time.
    """
    try:
        nextt = __next_alarm()
        while nextt is not None and nextt <= 0:
            (tm, func, args, keys) = heapq.heappop(alarmlist)
            func(*args, **keys)
            nextt = __next_alarm()
    finally:
        if alarmlist: __set_alarm()


def alarm(sec, func, *args, **keys):
    """Set an alarm.

    When the alarm is raised in `sec` seconds, the handler will call `func`,
    passing `args` and `keys`. Return the heap entry (which is just a big
    tuple), so that it can be cancelled by calling `cancel()`.
    """
    __clear_alarm()
    try:
        newalarm = __new_alarm(sec, func, args, keys)
        heapq.heappush(alarmlist, newalarm)
        return newalarm
    finally:
        __set_alarm()


def cancel(alarm):
    """Cancel an alarm by passing the heap entry returned by `alarm()`.

    It is an error to try to cancel an alarm which has already occurred.
    """
    __clear_alarm()
    try:
        alarmlist.remove(alarm)
        heapq.heapify(alarmlist)
    finally:
        if alarmlist: __set_alarm()

and a usage example:

import alarm
from time import sleep

try:
    with alarm.Timeout(id_='a', seconds=5):
        try:
            with alarm.Timeout(id_='b', seconds=2):
                sleep(3)
        except alarm.TimeoutError as e:
            print 'raised', e.id_
        sleep(30)
except alarm.TimeoutError as e:
    print 'raised', e.id_
else:
    print 'nope.'

回答 13

这是对给定的基于线程的解决方案的一点改进。

下面的代码支持异常

def runFunctionCatchExceptions(func, *args, **kwargs):
    try:
        result = func(*args, **kwargs)
    except Exception, message:
        return ["exception", message]

    return ["RESULT", result]


def runFunctionWithTimeout(func, args=(), kwargs={}, timeout_duration=10, default=None):
    import threading
    class InterruptableThread(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)
            self.result = default
        def run(self):
            self.result = runFunctionCatchExceptions(func, *args, **kwargs)
    it = InterruptableThread()
    it.start()
    it.join(timeout_duration)
    if it.isAlive():
        return default

    if it.result[0] == "exception":
        raise it.result[1]

    return it.result[1]

在5秒钟的超时时间内调用它:

result = timeout(remote_calculate, (myarg,), timeout_duration=5)

Here is a slight improvement to the given thread-based solution.

The code below supports exceptions:

def runFunctionCatchExceptions(func, *args, **kwargs):
    try:
        result = func(*args, **kwargs)
    except Exception, message:
        return ["exception", message]

    return ["RESULT", result]


def runFunctionWithTimeout(func, args=(), kwargs={}, timeout_duration=10, default=None):
    import threading
    class InterruptableThread(threading.Thread):
        def __init__(self):
            threading.Thread.__init__(self)
            self.result = default
        def run(self):
            self.result = runFunctionCatchExceptions(func, *args, **kwargs)
    it = InterruptableThread()
    it.start()
    it.join(timeout_duration)
    if it.isAlive():
        return default

    if it.result[0] == "exception":
        raise it.result[1]

    return it.result[1]

Invoking it with a 5 second timeout:

result = timeout(remote_calculate, (myarg,), timeout_duration=5)

将subprocess.Popen调用的输出存储在字符串中

问题:将subprocess.Popen调用的输出存储在字符串中

我正在尝试在Python中进行系统调用,并将输出存储到我可以在Python程序中操作的字符串中。

#!/usr/bin/python
import subprocess
p2 = subprocess.Popen("ntpq -p")

我尝试了一些事情,包括此处的一些建议:

检索subprocess.call()的输出

但没有任何运气。

I’m trying to make a system call in Python and store the output to a string that I can manipulate in the Python program.

#!/usr/bin/python
import subprocess
p2 = subprocess.Popen("ntpq -p")

I’ve tried a few things including some of the suggestions here:

Retrieving the output of subprocess.call()

but without any luck.


回答 0

在Python 2.7或Python 3中

Popen您可以使用subprocess.check_output()函数将命令的输出存储在字符串中,而不是直接创建对象:

from subprocess import check_output
out = check_output(["ntpq", "-p"])

在Python 2.4-2.6中

使用communicate方法。

import subprocess
p = subprocess.Popen(["ntpq", "-p"], stdout=subprocess.PIPE)
out, err = p.communicate()

out 是你想要的。

有关其他答案的重要说明

请注意我如何传递命令。该"ntpq -p"示例提出了另一回事。由于Popen不调用外壳程序,因此您将使用命令和选项列表["ntpq", "-p"]

In Python 2.7 or Python 3

Instead of making a Popen object directly, you can use the subprocess.check_output() function to store output of a command in a string:

from subprocess import check_output
out = check_output(["ntpq", "-p"])

In Python 2.4-2.6

Use the communicate method.

import subprocess
p = subprocess.Popen(["ntpq", "-p"], stdout=subprocess.PIPE)
out, err = p.communicate()

out is what you want.

Important note about the other answers

Note how I passed in the command. The "ntpq -p" example brings up another matter. Since Popen does not invoke the shell, you would use a list of the command and options—["ntpq", "-p"].


回答 1

这为我重定向标准输出工作(stderr可以类似地处理):

from subprocess import Popen, PIPE
pipe = Popen(path, stdout=PIPE)
text = pipe.communicate()[0]

如果对您不起作用,请确切说明您遇到的问题。

This worked for me for redirecting stdout (stderr can be handled similarly):

from subprocess import Popen, PIPE
pipe = Popen(path, stdout=PIPE)
text = pipe.communicate()[0]

If it doesn’t work for you, please specify exactly the problem you’re having.


回答 2

假设这pwd只是一个示例,可以通过以下方法进行:

import subprocess

p = subprocess.Popen("pwd", stdout=subprocess.PIPE)
result = p.communicate()[0]
print result

看到子文档另一个例子和更多信息。

Assuming that pwd is just an example, this is how you can do it:

import subprocess

p = subprocess.Popen("pwd", stdout=subprocess.PIPE)
result = p.communicate()[0]
print result

See the subprocess documentation for another example and more information.


回答 3

subprocess.Popen:http : //docs.python.org/2/library/subprocess.html#subprocess.Popen

import subprocess

command = "ntpq -p"  # the shell command
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=None, shell=True)

#Launch the shell command:
output = process.communicate()

print output[0]

在Popen构造函数中,如果shellTrue,则应以字符串而不是序列的形式传递命令。否则,只需将命令拆分为一个列表:

command = ["ntpq", "-p"]  # the shell command
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=None)

如果您还需要将标准错误读取到Popen初始化中,则可以将stderr设置为subprocess.PIPEsubprocess.STDOUT

import subprocess

command = "ntpq -p"  # the shell command
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)

#Launch the shell command:
output, error = process.communicate()

subprocess.Popen: http://docs.python.org/2/library/subprocess.html#subprocess.Popen

import subprocess

command = "ntpq -p"  # the shell command
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=None, shell=True)

#Launch the shell command:
output = process.communicate()

print output[0]

In the Popen constructor, if shell is True, you should pass the command as a string rather than as a sequence. Otherwise, just split the command into a list:

command = ["ntpq", "-p"]  # the shell command
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=None)

If you need to read also the standard error, into the Popen initialization, you can set stderr to subprocess.PIPE or to subprocess.STDOUT:

import subprocess

command = "ntpq -p"  # the shell command
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)

#Launch the shell command:
output, error = process.communicate()

回答 4

对于Python 2.7+,惯用的答案是使用 subprocess.check_output()

您还应该在调用子流程时注意对参数的处理,因为这可能会造成一些混乱。

如果args只是单个命令而没有自己的args(或您已shell=True设置),则它可以是字符串。否则,它必须是一个列表。

例如…调用ls命令,这很好:

from subprocess import check_call
check_call('ls')

是这样的:

from subprocess import check_call
check_call(['ls',])

但是,如果要将一些args传递给shell命令,则不能这样做:

from subprocess import check_call
check_call('ls -al')

相反,您必须将其作为列表传递:

from subprocess import check_call
check_call(['ls', '-al'])

shlex.split()函数有时在创建子进程之前将字符串拆分成类似shell的语法很有用…就像这样:

from subprocess import check_call
import shlex
check_call(shlex.split('ls -al'))

for Python 2.7+ the idiomatic answer is to use subprocess.check_output()

You should also note the handling of arguments when invoking a subprocess, as it can be a little confusing….

If args is just single command with no args of its own (or you have shell=True set), it can be a string. Otherwise it must be a list.

for example… to invoke the ls command, this is fine:

from subprocess import check_call
check_call('ls')

so is this:

from subprocess import check_call
check_call(['ls',])

however, if you want to pass some args to the shell command, you can’t do this:

from subprocess import check_call
check_call('ls -al')

instead, you must pass it as a list:

from subprocess import check_call
check_call(['ls', '-al'])

the shlex.split() function can sometimes be useful to split a string into shell-like syntax before creating a subprocesses… like this:

from subprocess import check_call
import shlex
check_call(shlex.split('ls -al'))

回答 5

这完全适合我:

import subprocess
try:
    #prints results and merges stdout and std
    result = subprocess.check_output("echo %USERNAME%", stderr=subprocess.STDOUT, shell=True)
    print result
    #causes error and merges stdout and stderr
    result = subprocess.check_output("copy testfds", stderr=subprocess.STDOUT, shell=True)
except subprocess.CalledProcessError, ex: # error code <> 0 
    print "--------error------"
    print ex.cmd
    print ex.message
    print ex.returncode
    print ex.output # contains stdout and stderr together 

This works perfectly for me:

import subprocess
try:
    #prints results and merges stdout and std
    result = subprocess.check_output("echo %USERNAME%", stderr=subprocess.STDOUT, shell=True)
    print result
    #causes error and merges stdout and stderr
    result = subprocess.check_output("copy testfds", stderr=subprocess.STDOUT, shell=True)
except subprocess.CalledProcessError, ex: # error code <> 0 
    print "--------error------"
    print ex.cmd
    print ex.message
    print ex.returncode
    print ex.output # contains stdout and stderr together 

回答 6

这对我来说是完美的。您将在元组中获得返回码,stdout和stderr。

from subprocess import Popen, PIPE

def console(cmd):
    p = Popen(cmd, shell=True, stdout=PIPE)
    out, err = p.communicate()
    return (p.returncode, out, err)

例如:

result = console('ls -l')
print 'returncode: %s' % result[0]
print 'output: %s' % result[1]
print 'error: %s' % result[2]

This was perfect for me. You will get the return code, stdout and stderr in a tuple.

from subprocess import Popen, PIPE

def console(cmd):
    p = Popen(cmd, shell=True, stdout=PIPE)
    out, err = p.communicate()
    return (p.returncode, out, err)

For Example:

result = console('ls -l')
print 'returncode: %s' % result[0]
print 'output: %s' % result[1]
print 'error: %s' % result[2]

回答 7

接受的答案仍然是好的,只是对新功能的一些评论。从python 3.6开始,您可以直接在中处理编码check_output,请参阅文档。现在返回一个字符串对象:

import subprocess 
out = subprocess.check_output(["ls", "-l"], encoding="utf-8")

在python 3.7中,capture_output向subprocess.run()添加了一个参数,该参数为我们执行了一些Popen / PIPE处理,请参见python docs

import subprocess 
p2 = subprocess.run(["ls", "-l"], capture_output=True, encoding="utf-8")
p2.stdout

The accepted answer is still good, just a few remarks on newer features. Since python 3.6, you can handle encoding directly in check_output, see documentation. This returns a string object now:

import subprocess 
out = subprocess.check_output(["ls", "-l"], encoding="utf-8")

In python 3.7, a parameter capture_output was added to subprocess.run(), which does some of the Popen/PIPE handling for us, see the python docs :

import subprocess 
p2 = subprocess.run(["ls", "-l"], capture_output=True, encoding="utf-8")
p2.stdout

回答 8

 import os   
 list = os.popen('pwd').read()

在这种情况下,列表中将只有一个元素。

 import os   
 list = os.popen('pwd').read()

In this case you will only have one element in the list.


回答 9

我根据此处的其他答案编写了一个小函数:

def pexec(*args):
    return subprocess.Popen(args, stdout=subprocess.PIPE).communicate()[0].rstrip()

用法:

changeset = pexec('hg','id','--id')
branch = pexec('hg','id','--branch')
revnum = pexec('hg','id','--num')
print('%s : %s (%s)' % (revnum, changeset, branch))

I wrote a little function based on the other answers here:

def pexec(*args):
    return subprocess.Popen(args, stdout=subprocess.PIPE).communicate()[0].rstrip()

Usage:

changeset = pexec('hg','id','--id')
branch = pexec('hg','id','--branch')
revnum = pexec('hg','id','--num')
print('%s : %s (%s)' % (revnum, changeset, branch))

回答 10

在Python 3.7中,capture_output为引入了新的关键字参数subprocess.run。启用简短的说明:

import subprocess

p = subprocess.run("echo 'hello world!'", capture_output=True, shell=True, encoding="utf8")
assert p.stdout == 'hello world!\n'

In Python 3.7 a new keyword argument capture_output was introduced for subprocess.run. Enabling the short and simple:

import subprocess

p = subprocess.run("echo 'hello world!'", capture_output=True, shell=True, encoding="utf8")
assert p.stdout == 'hello world!\n'

回答 11

import subprocess
output = str(subprocess.Popen("ntpq -p",shell = True,stdout = subprocess.PIPE, 
stderr = subprocess.STDOUT).communicate()[0])

这是一线解决方案

import subprocess
output = str(subprocess.Popen("ntpq -p",shell = True,stdout = subprocess.PIPE, 
stderr = subprocess.STDOUT).communicate()[0])

This is one line solution


回答 12

以下内容在单个变量中捕获了进程的stdout和stderr。它与Python 2和3兼容:

from subprocess import check_output, CalledProcessError, STDOUT

command = ["ls", "-l"]
try:
    output = check_output(command, stderr=STDOUT).decode()
    success = True 
except CalledProcessError as e:
    output = e.output.decode()
    success = False

如果您的命令是字符串而不是数组,请在此前缀:

import shlex
command = shlex.split(command)

The following captures stdout and stderr of the process in a single variable. It is Python 2 and 3 compatible:

from subprocess import check_output, CalledProcessError, STDOUT

command = ["ls", "-l"]
try:
    output = check_output(command, stderr=STDOUT).decode()
    success = True 
except CalledProcessError as e:
    output = e.output.decode()
    success = False

If your command is a string rather than an array, prefix this with:

import shlex
command = shlex.split(command)

回答 13

对于python 3.5,我根据先前的答案提出了功能。日志可能会被删除,以为拥有它很高兴

import shlex
from subprocess import check_output, CalledProcessError, STDOUT


def cmdline(command):
    log("cmdline:{}".format(command))
    cmdArr = shlex.split(command)
    try:
        output = check_output(cmdArr,  stderr=STDOUT).decode()
        log("Success:{}".format(output))
    except (CalledProcessError) as e:
        output = e.output.decode()
        log("Fail:{}".format(output))
    except (Exception) as e:
        output = str(e);
        log("Fail:{}".format(e))
    return str(output)


def log(msg):
    msg = str(msg)
    d_date = datetime.datetime.now()
    now = str(d_date.strftime("%Y-%m-%d %H:%M:%S"))
    print(now + " " + msg)
    if ("LOG_FILE" in globals()):
        with open(LOG_FILE, "a") as myfile:
            myfile.write(now + " " + msg + "\n")

For python 3.5 I put up function based on previous answer. Log may be removed, thought it’s nice to have

import shlex
from subprocess import check_output, CalledProcessError, STDOUT


def cmdline(command):
    log("cmdline:{}".format(command))
    cmdArr = shlex.split(command)
    try:
        output = check_output(cmdArr,  stderr=STDOUT).decode()
        log("Success:{}".format(output))
    except (CalledProcessError) as e:
        output = e.output.decode()
        log("Fail:{}".format(output))
    except (Exception) as e:
        output = str(e);
        log("Fail:{}".format(e))
    return str(output)


def log(msg):
    msg = str(msg)
    d_date = datetime.datetime.now()
    now = str(d_date.strftime("%Y-%m-%d %H:%M:%S"))
    print(now + " " + msg)
    if ("LOG_FILE" in globals()):
        with open(LOG_FILE, "a") as myfile:
            myfile.write(now + " " + msg + "\n")

回答 14

使用ckeck_output方法subprocess

import subprocess
address = 192.168.x.x
res = subprocess.check_output(['ping', address, '-c', '3'])

最后解析字符串

for line in res.splitlines():

希望对您有所帮助,编码愉快

Use ckeck_output method of subprocess

import subprocess
address = 192.168.x.x
res = subprocess.check_output(['ping', address, '-c', '3'])

Finally parse the string

for line in res.splitlines():

Hope it helps, happy coding


Matplotlib使刻度标签的字体变小

问题:Matplotlib使刻度标签的字体变小

在matplotlib图中,如何使用ax1.set_xticklabels()较小的刻度标签来设置字体大小?

此外,如何将其从水平旋转到垂直?

In a matplotlib figure, how can I make the font size for the tick labels using ax1.set_xticklabels() smaller?

Further, how can one rotate it from horizontal to vertical?


回答 0

请注意,较新版本的MPL具有此任务的快捷方式。该问题的其他答案中显示了一个示例:https : //stackoverflow.com/a/11386056/42346

下面的代码仅用于说明目的,不一定进行优化。

import matplotlib.pyplot as plt
import numpy as np

def xticklabels_example():
    fig = plt.figure() 

    x = np.arange(20)
    y1 = np.cos(x)
    y2 = (x**2)
    y3 = (x**3)
    yn = (y1,y2,y3)
    COLORS = ('b','g','k')

    for i,y in enumerate(yn):
        ax = fig.add_subplot(len(yn),1,i+1)

        ax.plot(x, y, ls='solid', color=COLORS[i]) 

        if i != len(yn) - 1:
            # all but last 
            ax.set_xticklabels( () )
        else:
            for tick in ax.xaxis.get_major_ticks():
                tick.label.set_fontsize(14) 
                # specify integer or one of preset strings, e.g.
                #tick.label.set_fontsize('x-small') 
                tick.label.set_rotation('vertical')

    fig.suptitle('Matplotlib xticklabels Example')
    plt.show()

if __name__ == '__main__':
    xticklabels_example()

在此处输入图片说明

Please note that newer versions of MPL have a shortcut for this task. An example is shown in the other answer to this question: https://stackoverflow.com/a/11386056/42346

The code below is for illustrative purposes and may not necessarily be optimized.

import matplotlib.pyplot as plt
import numpy as np

def xticklabels_example():
    fig = plt.figure() 

    x = np.arange(20)
    y1 = np.cos(x)
    y2 = (x**2)
    y3 = (x**3)
    yn = (y1,y2,y3)
    COLORS = ('b','g','k')

    for i,y in enumerate(yn):
        ax = fig.add_subplot(len(yn),1,i+1)

        ax.plot(x, y, ls='solid', color=COLORS[i]) 

        if i != len(yn) - 1:
            # all but last 
            ax.set_xticklabels( () )
        else:
            for tick in ax.xaxis.get_major_ticks():
                tick.label.set_fontsize(14) 
                # specify integer or one of preset strings, e.g.
                #tick.label.set_fontsize('x-small') 
                tick.label.set_rotation('vertical')

    fig.suptitle('Matplotlib xticklabels Example')
    plt.show()

if __name__ == '__main__':
    xticklabels_example()

enter image description here


回答 1

实际上有一个更简单的方法。我刚刚发现:

import matplotlib.pyplot as plt
# We prepare the plot  
fig, ax = plt.subplots()

# We change the fontsize of minor ticks label 
ax.tick_params(axis='both', which='major', labelsize=10)
ax.tick_params(axis='both', which='minor', labelsize=8)

但是,这只能回答label部分问题的大小。

There is a simpler way actually. I just found:

import matplotlib.pyplot as plt
# We prepare the plot  
fig, ax = plt.subplots()

# We change the fontsize of minor ticks label 
ax.tick_params(axis='both', which='major', labelsize=10)
ax.tick_params(axis='both', which='minor', labelsize=8)

This only answers to the size of label part of your question though.


回答 2

要同时指定字体大小和旋转度,请尝试以下操作:

plt.xticks(fontsize=14, rotation=90)

To specify both font size and rotation at the same time, try this:

plt.xticks(fontsize=14, rotation=90)

回答 3

或者,您可以执行以下操作:

import matplotlib as mpl
label_size = 8
mpl.rcParams['xtick.labelsize'] = label_size 

Alternatively, you can just do:

import matplotlib as mpl
label_size = 8
mpl.rcParams['xtick.labelsize'] = label_size 

回答 4

plt.tick_params(axis='both', which='minor', labelsize=12)
plt.tick_params(axis='both', which='minor', labelsize=12)

回答 5

另一种选择

我有两个并排的图,想分别调整刻度线标签。

上面的解决方案很接近,但是对我来说却没有用。我从此matplotlib 页面找到了解决方案。

ax.xaxis.set_tick_params(labelsize=20)

这可以解决问题,并且很直接。在我的用例中,需要调整的是右图。自从创建新的刻度标签以来,对于左侧的图,我可以在设置标签的相同过程中调整字体。

ax1.set_xticklabels(ax1_x, fontsize=15)
ax1.set_yticklabels(ax1_y, fontsize=15)

因此我使用了正确的情节,

ax2.xaxis.set_tick_params(labelsize=24)
ax2.yaxis.set_tick_params(labelsize=24)

轻微的微妙…我知道…但是我希望这可以帮助某人:)

如果有人知道如何调整数量级标签的字体大小,则加分。

在此处输入图片说明

Another alternative

I have two plots side by side and would like to adjust tick labels separately.

The above solutions were close however they were not working out for me. I found my solution from this matplotlib page.

ax.xaxis.set_tick_params(labelsize=20)

This did the trick and was straight to the point. For my use case, it was the plot on the right that needed to be adjusted. For the plot on the left since I was creating new tick labels I was able to adjust the font in the same process as seting the labels.

ie

ax1.set_xticklabels(ax1_x, fontsize=15)
ax1.set_yticklabels(ax1_y, fontsize=15)

thus I used for the right plot,

ax2.xaxis.set_tick_params(labelsize=24)
ax2.yaxis.set_tick_params(labelsize=24)

A minor subtlety… I know… but I hope this helps someone :)

Bonus points if anyone knows how to adjust the font size of the order of magnitude label.

enter image description here


回答 6

在当前版本的Matplotlib中,您可以执行axis.set_xticklabels(labels, fontsize='small')

In current versions of Matplotlib, you can do axis.set_xticklabels(labels, fontsize='small').


回答 7

您还可以使用如下一行来更改标签显示参数(如fontsize):

zed = [tick.label.set_fontsize(14) for tick in ax.yaxis.get_major_ticks()]

You can also change label display parameters like fontsize with a line like this:

zed = [tick.label.set_fontsize(14) for tick in ax.yaxis.get_major_ticks()]

回答 8

对于较小的字体,我使用

ax1.set_xticklabels(xticklabels, fontsize=7)

而且有效!

For smaller font, I use

ax1.set_xticklabels(xticklabels, fontsize=7)

and it works!


回答 9

以下为我工作:

ax2.xaxis.set_tick_params(labelsize=7)
ax2.yaxis.set_tick_params(labelsize=7)

上面的优点是您无需提供的array,也不需要labels使用上的任何数据axes

The following worked for me:

ax2.xaxis.set_tick_params(labelsize=7)
ax2.yaxis.set_tick_params(labelsize=7)

The advantage of the above is you do not need to provide the array of labels and works with any data on the axes.