# Tag Archive: numpy

## 1. Preparation

(Option 1) If your goal with Python is data analysis, you can install Anaconda directly (see: Anaconda, a Great Helper for Python Data Analysis and Mining); it bundles Python and pip.

(Option 2) We also recommend the VSCode editor for writing small Python projects: A Detailed Guide to VSCode, the Best Companion for Python Programming.

On Windows, open Cmd (Start - Run - CMD); on macOS, open Terminal (Command + Space, then type Terminal), and run the following command to install the dependency:

`pip install autograd`

## 2. Computing the Derivative of a Function

With autograd installed, you can obtain the derivative of an ordinary Python function by wrapping it with `grad`. Start with a one-line function:

```
# 公众号 Python实用宝典
from autograd import grad

def oneline(x):
    y = x / 2
    return y

grad_oneline = grad(oneline)
print(grad_oneline(3.0))
```

Running it prints the derivative of x/2, which is 0.5 for any input:

```
(base) G:\push\20220724>python 1.py
0.5
```

The same works for a more complex function such as `tanh`:

```
import autograd.numpy as np
from autograd import grad

def tanh(x):
    y = np.exp(-2.0 * x)
    return (1.0 - y) / (1.0 + y)

grad_tanh = grad(tanh)
print(grad_tanh(1.0))
```

```
(base) G:\push\20220724>python 1.py
0.419974341614026
```

Because the gradient is itself a function, you can evaluate it across a whole range of points and plot it; `elementwise_grad` handles array inputs:

```
import autograd.numpy as np
import matplotlib.pyplot as plt
from autograd import elementwise_grad

def tanh(x):
    y = np.exp(-2.0 * x)
    return (1.0 - y) / (1.0 + y)

x = np.linspace(-7, 7, 200)
plt.plot(x, tanh(x), label="tanh(x)")
plt.plot(x, elementwise_grad(tanh)(x), label="tanh'(x)")
plt.legend()
plt.show()
```

## 3. Implementing a Logistic Regression Model

```
import autograd.numpy as np

# Build a toy dataset.
inputs = np.array([[0.52, 1.12,  0.77],
                   [0.88, -1.08, 0.15],
                   [0.52, 0.06, -1.30],
                   [0.74, -2.49, 1.39]])
targets = np.array([True, True, False, True])

def sigmoid(x):
    return 0.5 * (np.tanh(x / 2.) + 1)

def logistic_predictions(weights, inputs):
    # Outputs probability of a label being true according to logistic model.
    return sigmoid(np.dot(inputs, weights))
```

```
def training_loss(weights):
    # Training loss is the negative log-likelihood of the training labels.
    preds = logistic_predictions(weights, inputs)
    label_probabilities = preds * targets + (1 - preds) * (1 - targets)
    return -np.sum(np.log(label_probabilities))
```

```
from autograd import grad

# Define a function that returns gradients of training loss using Autograd.
training_gradient_fun = grad(training_loss)

# Optimize weights using gradient descent.
weights = np.array([0.0, 0.0, 0.0])
print("Initial loss:", training_loss(weights))
for i in range(100):
    weights -= training_gradient_fun(weights) * 0.01

print("Trained loss:", training_loss(weights))
```

```
(base) G:\push\20220724>python regress.py
Initial loss: 2.772588722239781
Trained loss: 1.067270675787016
```


Python实用宝典 ( pythondict.com )

# Why You Should Use NumPy Arrays for Big Data Processing in Python

NumPy is a core module for scientific computing in Python. It provides a highly efficient array object, along with tools for working with these arrays. A NumPy array consists of many values, all of the same type.

Python's core library provides the List type. Lists are among the most common Python data types; they are resizable and can hold elements of different types, which is very convenient.

NumPy's data structures perform better in the following respects:

1. Memory size: NumPy data structures take up less memory.
2. Performance: NumPy is implemented in C under the hood and is faster than lists.
3. Operations: optimized algebraic operations and other methods are built in.

## 1. NumPy Arrays Use Less Memory

A Python list of integers costs roughly

64 + 8 * len(lst) + len(lst) * 28 bytes

(the list object itself, one 8-byte pointer per element, and one 28-byte int object per element), while a NumPy array of int64 values costs roughly

96 + len(a) * 8 bytes

(a fixed header plus a flat 8-byte slot per element).
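These numbers are easy to check on your own interpreter with `sys.getsizeof` (exact byte counts vary across Python and NumPy versions; the comparison is what matters):

```python
import sys
import numpy as np

n = 1000
lst = list(range(256, 256 + n))    # values above 256 avoid CPython's small-int cache
arr = np.arange(n, dtype=np.int64)

# A list stores the list object plus one separate int object per element.
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(v) for v in lst)
# An array stores a small fixed header plus one flat 8-byte slot per element.
array_bytes = sys.getsizeof(arr)

print(list_bytes, array_bytes)   # the list side is several times larger
```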

## 2. NumPy Arrays Are Faster and Have Built-in Computation Methods

```
import time
import numpy as np

size_of_vec = 1000

def pure_python_version():
    t1 = time.time()
    X = range(size_of_vec)
    Y = range(size_of_vec)
    Z = [X[i] + Y[i] for i in range(len(X))]
    return time.time() - t1

def numpy_version():
    t1 = time.time()
    X = np.arange(size_of_vec)
    Y = np.arange(size_of_vec)
    Z = X + Y
    return time.time() - t1

t1 = pure_python_version()
t2 = numpy_version()
print(t1, t2)
print("Numpy is in this example " + str(t1/t2) + " faster!")
```

```
0.00048732757568359375 0.0002491474151611328
Numpy is in this example 1.955980861244019 faster!
```

```
import numpy as np
from timeit import Timer

size_of_vec = 1000
X_list = range(size_of_vec)
Y_list = range(size_of_vec)
X = np.arange(size_of_vec)
Y = np.arange(size_of_vec)

def pure_python_version():
    Z = [X_list[i] + Y_list[i] for i in range(len(X_list))]

def numpy_version():
    Z = X + Y

timer_obj1 = Timer("pure_python_version()",
                   "from __main__ import pure_python_version")
timer_obj2 = Timer("numpy_version()",
                   "from __main__ import numpy_version")

print(timer_obj1.timeit(10))
print(timer_obj2.timeit(10))  # Runs Faster!

print(timer_obj1.repeat(repeat=3, number=10))
print(timer_obj2.repeat(repeat=3, number=10)) # repeat to prove it!
```

```
0.0029753120616078377
0.00014940369874238968
[0.002683573868125677, 0.002754641231149435, 0.002803879790008068]
[6.536301225423813e-05, 2.9387418180704117e-05, 2.9171351343393326e-05]
```



# Slicing a NumPy 2D Array, or How to Extract an m×m Submatrix from an n×n Array (n > m)?

## Question: Slicing a NumPy 2D array, or how to extract an m×m submatrix from an n×n array (n > m)?


I want to slice a NumPy nxn array. I want to extract an arbitrary selection of m rows and columns of that array (i.e. without any pattern in the numbers of rows/columns), making it a new, mxm array. For this example let us say the array is 4×4 and I want to extract a 2×2 array from it.

Here is our array:

```
from numpy import *
x = range(16)
x = reshape(x,(4,4))

print(x)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
```

The line and columns to remove are the same. The easiest case is when I want to extract a 2×2 submatrix that is at the beginning or at the end, i.e. :

```
In : x[0:2,0:2]
Out:
array([[0, 1],
       [4, 5]])

In : x[2:,2:]
Out:
array([[10, 11],
       [14, 15]])
```

But what if I need to remove another mixture of rows/columns? What if I need to remove the first and third lines/rows, thus extracting the submatrix `[[5,7],[13,15]]`? There can be any composition of rows/lines. I read somewhere that I just need to index my array using arrays/lists of indices for both rows and columns, but that doesn’t seem to work:

```
In : x[[1,3],[1,3]]
Out: array([ 5, 15])
```

I found one way, which is:

```
In : x[[1,3]][:,[1,3]]
Out:
array([[ 5,  7],
       [13, 15]])
```

First issue with this is that it is hardly readable, although I can live with that. If someone has a better solution, I’d certainly like to hear it.

The other thing is that I read on a forum that indexing arrays with arrays forces NumPy to make a copy of the desired array, so when dealing with large arrays this could become a problem. Why is that so / how does this mechanism work?

## Answer 0

As Sven mentioned, `x[[[0],[2]],[1,3]]` will give back the 0 and 2 rows matched with the 1 and 3 columns, while `x[[0,2],[1,3]]` will return the values x[0,1] and x[2,3] in an array.

There is a helpful function for doing the first example I gave, `numpy.ix_`. You can do the same thing as my first example with `x[numpy.ix_([0,2],[1,3])]`. This can save you from having to enter in all of those extra brackets.
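A quick sketch of `numpy.ix_` on the question's 4×4 array:

```python
import numpy as np

x = np.arange(16).reshape(4, 4)

# ix_ builds an open mesh, so rows [0, 2] are crossed with
# columns [1, 3] instead of being paired up elementwise.
sub = x[np.ix_([0, 2], [1, 3])]
print(sub)
# [[ 1  3]
#  [ 9 11]]
```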

## Answer 1


To answer this question, we have to look at how indexing a multidimensional array works in NumPy. Let's first say you have the array `x` from your question. The buffer assigned to `x` will contain 16 ascending integers from 0 to 15. If you access one element, say `x[i,j]`, NumPy has to figure out the memory location of this element relative to the beginning of the buffer. This is done by calculating in effect `i*x.shape[1]+j` (and multiplying with the size of an int to get an actual memory offset).

If you extract a subarray by basic slicing like `y = x[0:2,0:2]`, the resulting object will share the underlying buffer with `x`. But what happens if you access `y[i,j]`? NumPy can't use `i*y.shape[1]+j` to calculate the offset into the array, because the data belonging to `y` is not consecutive in memory.

NumPy solves this problem by introducing strides. When calculating the memory offset for accessing `x[i,j]`, what is actually calculated is `i*x.strides[0]+j*x.strides[1]` (and this already includes the factor for the size of an int):

```
x.strides
(16, 4)
```

When `y` is extracted like above, NumPy does not create a new buffer, but it does create a new array object referencing the same buffer (otherwise `y` would just be equal to `x`). The new array object will have a different shape than `x` and maybe a different starting offset into the buffer, but will share the strides with `x` (in this case at least):

```
y.shape
(2, 2)
y.strides
(16, 4)
```

This way, computing the memory offset for `y[i,j]` will yield the correct result.

But what should NumPy do for something like `z=x[[1,3]]`? The strides mechanism won’t allow correct indexing if the original buffer is used for `z`. NumPy theoretically could add some more sophisticated mechanism than the strides, but this would make element access relatively expensive, somehow defying the whole idea of an array. In addition, a view wouldn’t be a really lightweight object anymore.
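Both behaviours are easy to check with `numpy.shares_memory` (a sketch):

```python
import numpy as np

x = np.arange(16).reshape(4, 4)

y = x[0:2, 0:2]      # basic slicing: a view described by shape/strides
z = x[[1, 3]]        # fancy indexing: NumPy has to copy into a new buffer

print(y.strides == x.strides)      # True: y reuses x's strides
print(np.shares_memory(x, y))      # True
print(np.shares_memory(x, z))      # False
```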

This is covered in depth in the NumPy documentation on indexing.

Oh, and nearly forgot about your actual question: Here is how to make the indexing with multiple lists work as expected:

```
x[[[1],[3]],[1,3]]
```

This is because the index arrays are broadcasted to a common shape. Of course, for this particular example, you can also make do with basic slicing:

```
x[1::2, 1::2]
```

## Answer 2


I don’t think that `x[[1,3]][:,[1,3]]` is hardly readable. If you want to be more clear on your intent, you can do:

```
a[[1,3],:][:,[1,3]]
```

I am not an expert in slicing, but typically, if you try to slice into an array and the values are contiguous, you get back a view where the stride value is changed.

e.g. In your inputs 33 and 34, although you get a 2×2 array, the stride is 4. Thus, when you index the next row, the pointer moves to the correct position in memory.

Clearly, this mechanism doesn't carry well into the case of an array of indices. Hence, NumPy will have to make the copy. After all, many other matrix math functions rely on size, stride and contiguous memory allocation.

## Answer 3


If you want to skip every other row and every other column, then you can do it with basic slicing:

```
In : x=np.arange(16).reshape((4,4))
In : x[1:4:2,1:4:2]
Out:
array([[ 5,  7],
       [13, 15]])
```

This returns a view, not a copy of your array.

```
In : y=x[1:4:2,1:4:2]

In : y[0,0]=100

In : x   # <---- Notice x[1,1] has changed
Out:
array([[  0,   1,   2,   3],
       [  4, 100,   6,   7],
       [  8,   9,  10,  11],
       [ 12,  13,  14,  15]])
```

while `z=x[(1,3),:][:,(1,3)]` uses advanced indexing and thus returns a copy:

```
In : x=np.arange(16).reshape((4,4))
In : z=x[(1,3),:][:,(1,3)]

In : z
Out:
array([[ 5,  7],
       [13, 15]])

In : z[0,0]=0
```

Note that `x` is unchanged:

```
In : x
Out:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
```

If you wish to select arbitrary rows and columns, then you can’t use basic slicing. You’ll have to use advanced indexing, using something like `x[rows,:][:,columns]`, where `rows` and `columns` are sequences. This of course is going to give you a copy, not a view, of your original array. This is as one should expect, since a numpy array uses contiguous memory (with constant strides), and there would be no way to generate a view with arbitrary rows and columns (since that would require non-constant strides).

## Answer 4


With numpy, you can pass a slice for each component of the index – so, your `x[0:2,0:2]` example above works.

If you just want to evenly skip columns or rows, you can pass slices with three components (i.e. start, stop, step).

```
>>> x[1:4:2, 1:4:2]
array([[ 5,  7],
       [13, 15]])
```

Which is basically: slice in the first dimension, with start at index 1, stop when the index is equal to or greater than 4, and add 2 to the index in each pass. The same for the second dimension. Again: this only works for constant steps.

The syntax you found does something quite different internally: what `x[[1,3]][:,[1,3]]` actually does is create a new array including only rows 1 and 3 from the original array (done with the `x[[1,3]]` part), and then re-slice that, creating a third array, including only columns 1 and 3 of the previous array.

## Answer 5


I asked a similar question here: Writing in a sub-ndarray of a ndarray in the most pythonic way (Python 2).

Following the solution from that post, for your case it looks like:

```
columns_to_keep = [1,3]
rows_to_keep = [1,3]
```

And using `ix_`:

```
x[np.ix_(rows_to_keep, columns_to_keep)]
```

Which is:

```
array([[ 5,  7],
       [13, 15]])
```

## Answer 6


I'm not sure how efficient this is, but you can use `range()` to slice along both axes:

```
x = np.arange(16).reshape((4,4))
x[range(1,3), :][:,range(1,3)]
```

# How to Catch NumPy Warnings Like Exceptions (Not Just for Testing)?

## Question: How to catch NumPy warnings like exceptions (not just for testing)?


I have to make a Lagrange polynomial in Python for a project I’m doing. I’m doing a barycentric style one to avoid using an explicit for-loop as opposed to a Newton’s divided difference style one. The problem I have is that I need to catch a division by zero, but Python (or maybe numpy) just makes it a warning instead of a normal exception.

So, what I need to know how to do is to catch this warning as if it were an exception. The related questions to this I found on this site were answered not in the way I needed. Here’s my code:

```
import numpy as np
import matplotlib.pyplot as plt
import warnings

class Lagrange:
    def __init__(self, xPts, yPts):
        self.xPts = np.array(xPts)
        self.yPts = np.array(yPts)
        self.degree = len(xPts)-1
        self.weights = np.array([np.product([x_j - x_i for x_j in xPts if x_j != x_i]) for x_i in xPts])

    def __call__(self, x):
        warnings.filterwarnings("error")
        try:
            bigNumerator = np.product(x - self.xPts)
            numerators = np.array([bigNumerator/(x - x_j) for x_j in self.xPts])
            return sum(numerators/self.weights*self.yPts)
        except Exception as e: # Catch division by 0. Only possible in 'numerators' array
            return self.yPts[np.where(self.xPts == x)]

L = Lagrange([-1,0,1],[1,0,1]) # Creates quadratic poly L(x) = x^2

L(1) # This should catch an error, then return 1.
```

When this code is executed, the output I get is:

```
Warning: divide by zero encountered in int_scalars
```

That’s the warning I want to catch. It should occur inside the list comprehension.

## Answer 0


It seems that your configuration is using the `print` option for `numpy.seterr`:

```
>>> import numpy as np
>>> np.array([1])/0   #'warn' mode
__main__:1: RuntimeWarning: divide by zero encountered in divide
array([0])
>>> np.seterr(all='print')
{'over': 'warn', 'divide': 'warn', 'invalid': 'warn', 'under': 'ignore'}
>>> np.array([1])/0   #'print' mode
Warning: divide by zero encountered in divide
array([0])
```

This means that the warning you see is not a real warning, but it’s just some characters printed to `stdout`(see the documentation for `seterr`). If you want to catch it you can:

1. Use `numpy.seterr(all='raise')` which will directly raise the exception. This however changes the behaviour of all the operations, so it’s a pretty big change in behaviour.
2. Use `numpy.seterr(all='warn')`, which will transform the printed warning in a real warning and you’ll be able to use the above solution to localize this change in behaviour.
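A minimal sketch of the first option, restoring the previous error state afterwards so the change stays local:

```python
import numpy as np

old_settings = np.seterr(all='raise')   # every FP error now raises FloatingPointError
try:
    np.array([1.0]) / 0.0
except FloatingPointError as e:
    print('caught:', e)
finally:
    np.seterr(**old_settings)           # restore the previous behaviour
```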

Once you actually have a warning, you can use the `warnings` module to control how the warnings should be treated:

```
>>> import warnings
>>>
>>> warnings.filterwarnings('error')
>>>
>>> try:
...     warnings.warn(Warning())
... except Warning:
...     print('Warning was raised as an exception!')
...
Warning was raised as an exception!
```

Read the documentation for `filterwarnings` carefully, since it allows you to filter only the warning you want and has other options. I'd also consider looking at `catch_warnings`, which is a context manager that automatically restores the original warning filters:

```
>>> import warnings
>>> with warnings.catch_warnings():
...     warnings.filterwarnings('error')
...     try:
...         warnings.warn(Warning())
...     except Warning: print('Raised!')
...
Raised!
>>> try:
...     warnings.warn(Warning())
... except Warning: print('Not raised!')
...
__main__:2: Warning:
```

## Answer 1


If you already know where the warning is likely to occur then it’s often cleaner to use the `numpy.errstate` context manager, rather than `numpy.seterr` which treats all subsequent warnings of the same type the same regardless of where they occur within your code:

```
import numpy as np

a = np.r_[1.]
with np.errstate(divide='raise'):
    try:
        a / 0   # this gets caught and handled as an exception
    except FloatingPointError:
        print('oh no!')
a / 0           # this prints a RuntimeWarning as usual
```

### Edit:

In my original example I had `a = np.r_[0.]`, but apparently there was a change in numpy's behaviour such that division-by-zero is handled differently in cases where the numerator is all-zeros. For example, in numpy 1.16.4:

```
all_zeros = np.array([0., 0.])
not_all_zeros = np.array([1., 0.])

with np.errstate(divide='raise'):
    not_all_zeros / 0.  # Raises FloatingPointError

with np.errstate(divide='raise'):
    all_zeros / 0.  # No exception raised

with np.errstate(invalid='raise'):
    all_zeros / 0.  # Raises FloatingPointError
```

The corresponding warning messages are also different: `1. / 0.` is logged as `RuntimeWarning: divide by zero encountered in true_divide`, whereas `0. / 0.` is logged as `RuntimeWarning: invalid value encountered in true_divide`. I'm not sure why exactly this change was made, but I suspect it has to do with the fact that the result of `0. / 0.` is not representable as a number (numpy returns a NaN in this case) whereas `1. / 0.` and `-1. / 0.` return +Inf and -Inf respectively, per the IEEE 754 standard.

If you want to catch both types of error you can always pass `np.errstate(divide='raise', invalid='raise')`, or `all='raise'` if you want to raise an exception on any kind of floating point error.

## Answer 2


To elaborate on @Bakuriu’s answer above, I’ve found that this enables me to catch a runtime warning in a similar fashion to how I would catch an error warning, printing out the warning nicely:

```
import warnings
import numpy as np

with warnings.catch_warnings():
    warnings.filterwarnings('error')
    try:
        # the operation that may emit a RuntimeWarning, e.g.:
        answer = np.float64(1.0) / 0.0
    except Warning as e:
        print('error found:', e)
```

You can play around with the placement of `warnings.catch_warnings()` depending on how big an umbrella you want to cast with catching errors this way.
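If you only want NumPy's runtime warnings escalated rather than every warning in the block, a narrower filter is a reasonable sketch:

```python
import warnings
import numpy as np

with warnings.catch_warnings():
    # Escalate only RuntimeWarning, the category NumPy uses for
    # divide-by-zero, overflow and invalid-value warnings.
    warnings.simplefilter('error', category=RuntimeWarning)
    try:
        np.array([1.0]) / 0.0
    except RuntimeWarning as e:
        print('caught:', e)
```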

## Answer 3

To make every NumPy floating-point error raise an exception directly:

```
numpy.seterr(all='raise')
```

# Plotting a 2D Heat Map with Matplotlib

## Question: Plotting a 2D heat map with Matplotlib

Using Matplotlib, I want to plot a 2D heat map. My data is an n-by-n Numpy array, each with a value between 0 and 1. So for the (i, j) element of this array, I want to plot a square at the (i, j) coordinate in my heat map, whose color is proportional to the element’s value in the array.

How can I do this?

## Answer 0

The `imshow()` function with parameters `interpolation='nearest'` and `cmap='hot'` should do what you want.

```
import matplotlib.pyplot as plt
import numpy as np

a = np.random.random((16, 16))
plt.imshow(a, cmap='hot', interpolation='nearest')
plt.show()
```

## Answer 1

Seaborn takes care of a lot of the manual work and automatically plots a gradient at the side of the chart etc.

```
import numpy as np
import seaborn as sns
import matplotlib.pylab as plt

uniform_data = np.random.rand(10, 12)
ax = sns.heatmap(uniform_data, linewidth=0.5)
plt.show()
```

Or, you can even plot upper / lower left / right triangles of square matrices, for example a correlation matrix which is square and symmetric, so plotting all values would be redundant anyway.

```
corr = np.corrcoef(np.random.randn(10, 200))
mask = np.zeros_like(corr)
mask[np.triu_indices_from(mask)] = True  # mask the redundant upper triangle
with sns.axes_style("white"):
    ax = sns.heatmap(corr, mask=mask, vmax=.3, square=True, cmap="YlGnBu")
    plt.show()
```

## Answer 2

For a 2D `numpy` array, simply using `imshow()` may help you:

```
import matplotlib.pyplot as plt
import numpy as np

def heatmap2d(arr: np.ndarray):
    plt.imshow(arr, cmap='viridis')
    plt.colorbar()
    plt.show()

test_array = np.arange(100 * 100).reshape(100, 100)
heatmap2d(test_array)
```

This code produces a continuous heatmap.

You can choose another built-in `colormap` from here.

## Answer 3

I would use matplotlib's pcolor/pcolormesh function since it allows nonuniform spacing of the data.

Example taken from matplotlib:

```
import matplotlib.pyplot as plt
import numpy as np

# generate 2 2d grids for the x & y bounds
y, x = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))

z = (1 - x / 2. + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)
# x and y are bounds, so z should be the value *inside* those bounds.
# Therefore, remove the last value from the z array.
z = z[:-1, :-1]
z_min, z_max = -np.abs(z).max(), np.abs(z).max()

fig, ax = plt.subplots()

c = ax.pcolormesh(x, y, z, cmap='RdBu', vmin=z_min, vmax=z_max)
ax.set_title('pcolormesh')
# set the limits of the plot to the limits of the data
ax.axis([x.min(), x.max(), y.min(), y.max()])
fig.colorbar(c, ax=ax)

plt.show()
```

## Answer 4


Here’s how to do it from a csv:

```
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import griddata

# Load the whitespace-separated x, y, z columns (see the dat.xyz format below)
dat = np.genfromtxt('dat.xyz')
X_dat = dat[:,0]
Y_dat = dat[:,1]
Z_dat = dat[:,2]

# Convert from pandas dataframes to numpy arrays
X, Y, Z, = np.array([]), np.array([]), np.array([])
for i in range(len(X_dat)):
    X = np.append(X, X_dat[i])
    Y = np.append(Y, Y_dat[i])
    Z = np.append(Z, Z_dat[i])

# create x-y points to be used in heatmap
xi = np.linspace(X.min(), X.max(), 1000)
yi = np.linspace(Y.min(), Y.max(), 1000)

# Z is a matrix of x-y values
zi = griddata((X, Y), Z, (xi[None,:], yi[:,None]), method='cubic')

# I control the range of my colorbar by removing data
# outside of my range of interest
zmin = 3
zmax = 12
zi[(zi<zmin) | (zi>zmax)] = None

# Create the contour plot
CS = plt.contourf(xi, yi, zi, 15, cmap=plt.cm.rainbow,
                  vmax=zmax, vmin=zmin)
plt.colorbar()
plt.show()
```

where `dat.xyz` is in the form

```
x1 y1 z1
x2 y2 z2
...
```

# Should I Use scipy.pi, numpy.pi, or math.pi?

## Question: Should I use scipy.pi, numpy.pi, or math.pi?

In a project using SciPy and NumPy, should I use `scipy.pi`, `numpy.pi`, or `math.pi`?

## Answer 0


```
>>> import math
>>> import numpy as np
>>> import scipy
>>> math.pi == np.pi == scipy.pi
True
```

So it doesn’t matter, they are all the same value.

The only reason all three modules provide a `pi` value is so if you are using just one of the three modules, you can conveniently have access to pi without having to import another module. They’re not providing different values for pi.

## Answer 1


One thing to note is that not all libraries will use the same meaning for pi, of course, so it never hurts to know what you’re using. For example, the symbolic math library Sympy’s representation of pi is not the same as math and numpy:

```
import math
import numpy
import scipy
import sympy

print(math.pi == numpy.pi)
> True
print(math.pi == scipy.pi)
> True
print(math.pi == sympy.pi)
> False
```

# How to Normalize a NumPy Array to Within a Certain Range?

## Question: How to normalize a NumPy array to within a certain range?


After doing some processing on an audio or image array, it needs to be normalized within a range before it can be written back to a file. This can be done like so:

```
# Normalize audio channels to between -1.0 and +1.0
audio[:,0] = audio[:,0]/abs(audio[:,0]).max()
audio[:,1] = audio[:,1]/abs(audio[:,1]).max()

# Normalize image to between 0 and 255
image = image/(image.max()/255.0)
```

Is there a less verbose, convenience function way to do this? `matplotlib.colors.Normalize()` doesn’t seem to be related.

## Answer 0

```
audio /= np.max(np.abs(audio), axis=0)
image *= (255.0/image.max())
```

Using `/=` and `*=` allows you to eliminate an intermediate temporary array, thus saving some memory. Multiplication is less expensive than division, so

```
image *= 255.0/image.max()    # Uses 1 division and image.size multiplications
```

is marginally faster than

```
image /= image.max()/255.0    # Uses 1+image.size divisions
```

Since we are using basic numpy methods here, I think this is about as efficient a solution in numpy as can be.

In-place operations do not change the dtype of the container array. Since the desired normalized values are floats, the `audio` and `image` arrays need to have a floating-point dtype before the in-place operations are performed. If they are not already of floating-point dtype, you'll need to convert them using `astype`. For example,

```
image = image.astype('float64')
```
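A sketch of what goes wrong without the cast, assuming an integer-typed `image` array:

```python
import numpy as np

image = np.array([[0, 64], [128, 255]])   # integer dtype
try:
    image /= image.max() / 255.0          # in-place true division on ints
except TypeError as e:
    print('need a float dtype first:', e)

image = image.astype('float64')           # cast once, then scale in place
image *= 255.0 / image.max()
```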

## Answer 1


If the array contains both positive and negative data, I’d go with:

```
import numpy as np

a = np.random.rand(3,2)

# Normalised [0,1]
b = (a - np.min(a))/np.ptp(a)

# Normalised [0,255] as integer: don't forget the parenthesis before astype(int)
c = (255*(a - np.min(a))/np.ptp(a)).astype(int)

# Normalised [-1,1]
d = 2.*(a - np.min(a))/np.ptp(a)-1
```

If the array contains `nan`, one solution could be to just remove them as:

```
def nan_ptp(a):
    return np.ptp(a[np.isfinite(a)])

b = (a - np.nanmin(a))/nan_ptp(a)
```

However, depending on the context you might want to treat `nan` differently: e.g. interpolate the value, replace it with e.g. 0, or raise an error.
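For instance, a sketch of the replace-with-zero option before min-max scaling (the `nan=` keyword of `np.nan_to_num` needs NumPy 1.17+):

```python
import numpy as np

a = np.array([0.5, np.nan, 2.0, 1.0])

cleaned = np.nan_to_num(a, nan=0.0)              # NaN -> 0.0
b = (cleaned - cleaned.min()) / np.ptp(cleaned)  # values now in [0, 1]
print(b)
```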

Finally, worth mentioning even if it’s not OP’s question, standardization:

```
e = (a - np.mean(a)) / np.std(a)
```

## Answer 2


You can also rescale using `sklearn`. The advantages are that you can normalize the standard deviation in addition to mean-centering the data, and that you can do this along either axis: by features or by records.

```
from sklearn.preprocessing import scale
X = scale(X, axis=0, with_mean=True, with_std=True, copy=True)
```

The keyword arguments `axis`, `with_mean`, and `with_std` are self-explanatory, and are shown in their default state. The argument `copy` performs the operation in-place if it is set to `False`. Documentation here.
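A small sketch of the `axis` choice, with made-up data:

```python
import numpy as np
from sklearn.preprocessing import scale

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

by_feature = scale(X, axis=0)  # default: standardize each column (feature)
by_record = scale(X, axis=1)   # standardize each row (record) independently
```

After `axis=0`, every column has zero mean and unit variance; after `axis=1`, every row does.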

## 回答 3


You can use the "i" (as in `idiv`, `imul`, ...) version, and it doesn't look half bad:

```
image /= (image.max()/255.0)
```

For the other case you can write a function to normalize an n-dimensional array by columns:

```
def normalize_columns(arr):
    rows, cols = arr.shape
    for col in xrange(cols):
        arr[:,col] /= abs(arr[:,col]).max()
```
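The column loop above can also be written without an explicit loop via broadcasting; a sketch with made-up data:

```python
import numpy as np

def normalize_columns(arr):
    # Broadcasting divides each column by its own maximum absolute value
    arr /= np.abs(arr).max(axis=0)
    return arr

a = np.array([[1.0, -4.0],
              [2.0,  2.0]])
result = normalize_columns(a)
# result: [[ 0.5, -1. ], [ 1. ,  0.5]]
```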

## 回答 4


You are trying to min-max scale the values of `audio` between -1 and +1 and `image` between 0 and 255.

Using `sklearn.preprocessing.minmax_scale` should easily solve your problem.

e.g.:

```
audio_scaled = minmax_scale(audio, feature_range=(-1,1))
```

and

```
shape = image.shape
image_scaled = minmax_scale(image.ravel(), feature_range=(0,255)).reshape(shape)
```

Note: not to be confused with the operation that scales the *norm* (length) of a vector to a certain value (usually 1), which is also commonly referred to as normalization.
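For contrast, a minimal sketch of that other kind of normalization, which rescales a vector to unit length rather than rescaling its value range:

```python
import numpy as np

v = np.array([3.0, 4.0])
unit = v / np.linalg.norm(v)   # scales the vector's *length* to 1
# unit is [0.6, 0.8]; its L2 norm is 1.0
```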

## 回答 5


A simple solution is using the scalers offered by the sklearn.preprocessing library.

```
scaler = sk.MinMaxScaler(feature_range=(0, 250))
scaler = scaler.fit(X)
X_scaled = scaler.transform(X)
# Checking reconstruction
X_rec = scaler.inverse_transform(X_scaled)
```

The error `X_rec - X` will be zero. You can adjust the `feature_range` for your needs, or even use a standard scaler, `sk.StandardScaler()`.
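A runnable sketch of the round trip, assuming `sk` is an alias for `sklearn.preprocessing` (the answer doesn't show the import) and using made-up data:

```python
import numpy as np
from sklearn import preprocessing as sk   # assumed meaning of the answer's `sk`

X = np.array([[1.0], [2.0], [4.0]])

scaler = sk.MinMaxScaler(feature_range=(0, 250)).fit(X)
X_scaled = scaler.transform(X)
X_rec = scaler.inverse_transform(X_scaled)   # recovers X up to float rounding
```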

## 回答 6


I tried following this, and got the error

```
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''
```

The `numpy` array I was trying to normalize was an `integer` array. It seems they deprecated this kind of implicit type casting in versions > `1.10`, and you have to use `numpy.true_divide()` to resolve it.

```
arr = np.array(img)
arr = np.true_divide(arr, [255.0], out=None)
```

`img` was a `PIL.Image` object.
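A self-contained sketch of the same failure and fix, with made-up integer data standing in for `np.array(img)`:

```python
import numpy as np

arr = np.array([[0, 128], [64, 255]])   # integer dtype, like an 8-bit image

# arr /= 255.0   # would raise the TypeError above: int can't hold the float result

arr = np.true_divide(arr, 255.0)        # returns a new float64 array instead
# arr now spans [0.0, 1.0]
```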

# 遍历一个numpy数组

## 问题：遍历一个numpy数组


Is there a less verbose alternative to this:

```
for x in xrange(array.shape[0]):
    for y in xrange(array.shape[1]):
        do_stuff(x, y)
```

I came up with this:

```
for x, y in itertools.product(*map(xrange, array.shape)):
    do_stuff(x, y)
```

Which saves one indentation, but is still pretty ugly.

I’m hoping for something that looks like this pseudocode:

```
for x, y in array.indices:
    do_stuff(x, y)
```

Does anything like that exist?

## 回答 0


I think you’re looking for the ndenumerate.

```
>>> a = numpy.array([[1,2],[3,4],[5,6]])
>>> for (x,y), value in numpy.ndenumerate(a):
...     print x, y
...
0 0
0 1
1 0
1 1
2 0
2 1
```

Regarding performance: it is a bit slower than a list comprehension.

```
X = np.zeros((100, 100, 100))

%timeit list([((i,j,k), X[i,j,k]) for i in range(X.shape[0]) for j in range(X.shape[1]) for k in range(X.shape[2])])
1 loop, best of 3: 376 ms per loop

%timeit list(np.ndenumerate(X))
1 loop, best of 3: 570 ms per loop
```

If you are worried about the performance you could optimise a bit further by looking at the implementation of `ndenumerate`, which does two things: converting to an array and looping. If you know you have an array, you can use the `coords` attribute of the flat iterator.

```
a = X.flat
%timeit list([(a.coords, x) for x in a])
1 loop, best of 3: 305 ms per loop
```

## 回答 1


If you only need the indices, you could try `numpy.ndindex`:

```
>>> a = numpy.arange(9).reshape(3, 3)
>>> [(x, y) for x, y in numpy.ndindex(a.shape)]
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
```

## 回答 2


See `nditer`:

```
import numpy as np
Y = np.array([3,4,5,6])
for y in np.nditer(Y, op_flags=['readwrite']):
    y += 3

Y == np.array([6, 7, 8, 9])
```

`y = 3` would not work, use `y *= 0` and `y += 3` instead.
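For reference, a sketch of the same in-place iteration in current NumPy, which recommends closing the iterator via a context manager and writing back through the `[...]` view:

```python
import numpy as np

Y = np.array([3, 4, 5, 6])

with np.nditer(Y, op_flags=['readwrite']) as it:
    for y in it:
        y[...] += 3   # write back through the 0-d view; plain `y = 3` would only rebind

# Y is now [6, 7, 8, 9]
```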

# Cython：“严重错误：numpy / arrayobject.h：没有此类文件或目录”

## 问题：Cython：“严重错误：numpy / arrayobject.h：没有此类文件或目录”


I’m trying to speed up the answer here using Cython. I try to compile the code (after doing the `cygwinccompiler.py` hack explained here), but get a `fatal error: numpy/arrayobject.h: No such file or directory...compilation terminated` error. Can anyone tell me if it’s a problem with my code, or some esoteric subtlety with Cython?

Below is my code.

```
import numpy as np
import scipy as sp
cimport numpy as np
cimport cython

cdef inline np.ndarray[np.int, ndim=1] fbincount(np.ndarray[np.int_t, ndim=1] x):
    cdef int m = np.amax(x)+1
    cdef int n = x.size
    cdef unsigned int i
    cdef np.ndarray[np.int_t, ndim=1] c = np.zeros(m, dtype=np.int)

    for i in xrange(n):
        c[<unsigned int>x[i]] += 1

    return c

cdef packed struct Point:
    np.float64_t f0, f1

@cython.boundscheck(False)
def sparsemaker(np.ndarray[np.float_t, ndim=2] X not None,
                np.ndarray[np.float_t, ndim=2] Y not None,
                np.ndarray[np.float_t, ndim=2] Z not None):

    cdef np.ndarray[np.float64_t, ndim=1] counts, factor
    cdef np.ndarray[np.int_t, ndim=1] row, col, repeats
    cdef np.ndarray[Point] indices

    cdef int x_, y_

    _, row = np.unique(X, return_inverse=True); x_ = _.size
    _, col = np.unique(Y, return_inverse=True); y_ = _.size
    indices = np.rec.fromarrays([row,col])
    _, repeats = np.unique(indices, return_inverse=True)
    counts = 1. / fbincount(repeats)
    Z.flat *= counts.take(repeats)

    return sp.sparse.csr_matrix((Z.flat,(row,col)), shape=(x_, y_)).toarray()
```

## 回答 0


In your `setup.py`, the `Extension` should have the argument `include_dirs=[numpy.get_include()]`.

Also, you are missing `np.import_array()` in your code.

Example setup.py:

```
from distutils.core import setup, Extension
from Cython.Build import cythonize
import numpy

setup(
    ext_modules=[
        Extension("my_module", ["my_module.c"],
                  include_dirs=[numpy.get_include()]),
    ],
)

# Or, if you use cythonize() to make the ext_modules list,
# include_dirs can be passed to setup()

setup(
    ext_modules=cythonize("my_module.pyx"),
    include_dirs=[numpy.get_include()]
)
```

## 回答 1


For a one-file project like yours, another alternative is to use `pyximport`. You don’t need to create a `setup.py` … you don’t need to even open a command line if you use IPython … it’s all very convenient. In your case, try running these commands in IPython or in a normal Python script:

```
import numpy
import pyximport
pyximport.install(setup_args={"script_args": ["--compiler=mingw32"],
                              "include_dirs": numpy.get_include()},
                  reload_support=True)

import my_pyx_module

print my_pyx_module.some_function(...)
...
```

You may need to edit the compiler of course. This makes import and reload work the same for `.pyx` files as they work for `.py` files.

## 回答 2

The error means that a numpy header file isn’t being found during compilation.

Try doing `export CFLAGS=-I/usr/lib/python2.7/site-packages/numpy/core/include/`, and then compiling. This is a problem with a few different packages. There’s a bug filed in ArchLinux for the same issue: https://bugs.archlinux.org/task/22326

## 回答 3

### Simple answer


A far simpler way is to add the path to your `distutils.cfg` file. On Windows 7, its path is by default `C:\Python27\Lib\distutils\`. You just add the following contents and it should work:

```
[build_ext]
include_dirs= C:\Python27\Lib\site-packages\numpy\core\include
```

### Entire config file

To give you an example of how the config file could look, my entire file reads:

```
[build]
compiler = mingw32

[build_ext]
include_dirs= C:\Python27\Lib\site-packages\numpy\core\include
compiler = mingw32
```

## 回答 4

You should be able to do this within the `cythonize()` function as mentioned here, but it doesn't work because of a known issue.

## 回答 5


If you are too lazy to write setup files and figure out the path for include directories, try cyper. It can compile your Cython code and set `include_dirs` for Numpy automatically.

Load your code into a string, then simply run `cymodule = cyper.inline(code_string)`, then your function is available as `cymodule.sparsemaker` instantaneously. Something like this

```
code = open(your_pyx_file).read()
cymodule = cyper.inline(code)

cymodule.sparsemaker(...)
# do what you want with your function
```

You can install cyper via `pip install cyper`.

# 如何在Pandas DataFrame中将True / False映射到1/0？

## 问题：如何在Pandas DataFrame中将True / False映射到1/0？

I have a column in python pandas DataFrame that has boolean True/False values, but for further calculations I need 1/0 representation. Is there a quick pandas/numpy way to do that?

## 回答 0


A succinct way to convert a single column of boolean values to a column of integers 1 or 0:

```
df["somecolumn"] = df["somecolumn"].astype(int)
```
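A quick sketch of the conversion with made-up data:

```python
import pandas as pd

df = pd.DataFrame({"somecolumn": [True, False, True]})
df["somecolumn"] = df["somecolumn"].astype(int)
# the column now holds 1, 0, 1 as integers
```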

## 回答 1


Just multiply your DataFrame by 1 (int):

```
In [1]: data = pd.DataFrame([[True, False, True], [False, False, True]])

In [2]: print data
       0      1     2
0   True  False  True
1  False  False  True

In [3]: print data*1
   0  1  2
0  1  0  1
1  0  0  1
```

## 回答 2


`True` is `1` in Python, and likewise `False` is `0`*:

```
>>> True == 1
True
>>> False == 0
True
```

You should be able to perform any operations you want on them by just treating them as though they were numbers, as they are numbers:

```
>>> issubclass(bool, int)
True
>>> True * 5
5
```

So to answer your question, no work necessary – you already have what you are looking for.

* Note that I use *is* as an English word here, not the Python keyword `is`: `True` will not be the same object as any random `1`.
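Because `bool` is a subclass of `int`, boolean values already behave as 1 and 0 in arithmetic; a minimal sketch:

```python
flags = [True, True, False, True]

count = sum(flags)      # True counts as 1, False as 0, so this counts the Trues
double = True + True    # booleans add like integers
```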

## 回答 3


You can also do this directly on DataFrames:

```
In [1]: df = DataFrame(dict(A = True, B = False), index=range(3))

In [2]: df
Out[2]:
      A      B
0  True  False
1  True  False
2  True  False

In [3]: df.dtypes
Out[3]:
A    bool
B    bool
dtype: object

In [4]: df.astype(int)
Out[4]:
   A  B
0  1  0
1  1  0
2  1  0

In [5]: df.astype(int).dtypes
Out[5]:
A    int64
B    int64
dtype: object
```

## 回答 4


You can use a transformation for your data frame:

```
df = pd.DataFrame(my_data condition)
```

Then, transforming True/False into 1/0:

```
df = df*1
```

## 回答 5


Use `Series.view` to convert booleans to integers:

```
df["somecolumn"] = df["somecolumn"].view('i1')
```
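Note that `Series.view` has since been deprecated in recent pandas releases; `astype('i1')` produces the same `int8` result and is the safer spelling today. A sketch with made-up data:

```python
import pandas as pd

s = pd.Series([True, False, True])
out = s.astype('i1')   # int8, the same result the deprecated .view('i1') gave
# out holds 1, 0, 1 with dtype int8
```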