Tag Archives: deep-learning

Best way to save a trained model in PyTorch?

Question: Best way to save a trained model in PyTorch?

I was looking for alternative ways to save a trained model in PyTorch. So far, I have found two alternatives.

  1. torch.save() to save a model and torch.load() to load a model.
  2. model.state_dict() to save a trained model and model.load_state_dict() to load the saved model.

I have come across this discussion where approach 2 is recommended over approach 1.

My question is, why is the second approach preferred? Is it only because torch.nn modules have those two functions and we are encouraged to use them?


Answer 0

I’ve found this page on their GitHub repo; I’ll just paste the content here.


Recommended approach for saving a model

There are two main approaches for serializing and restoring a model.

The first (recommended) saves and loads only the model parameters:

torch.save(the_model.state_dict(), PATH)

Then later:

the_model = TheModelClass(*args, **kwargs)
the_model.load_state_dict(torch.load(PATH))

The second saves and loads the entire model:

torch.save(the_model, PATH)

Then later:

the_model = torch.load(PATH)

However, in this case, the serialized data is bound to the specific classes and the exact directory structure used, so it can break in various ways when used in other projects, or after some serious refactors.
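
A minimal, self-contained sketch of the two approaches (the class name and file paths here are illustrative, not from the original post):

import torch
import torch.nn as nn

# Toy model, standing in for TheModelClass above.
class TheModelClass(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)

    def forward(self, x):
        return self.fc(x)

model = TheModelClass()

# First approach (recommended): save/load only the parameters.
torch.save(model.state_dict(), "params.pth")
restored = TheModelClass()  # the class definition must be available
restored.load_state_dict(torch.load("params.pth"))

# Second approach: pickle the entire model object.
torch.save(model, "whole_model.pth")
restored2 = torch.load("whole_model.pth")  # tied to the class and directory structure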


Answer 1

It depends on what you want to do.

Case # 1: Save the model to use it yourself for inference: You save the model, you restore it, and then you change the model to evaluation mode. This is done because you usually have BatchNorm and Dropout layers that by default are in train mode on construction:

torch.save(model.state_dict(), filepath)

#Later to restore:
model.load_state_dict(torch.load(filepath))
model.eval()

Case # 2: Save model to resume training later: If you need to keep training the model that you are about to save, you need to save more than just the model. You also need to save the state of the optimizer, epochs, score, etc. You would do it like this:

state = {
    'epoch': epoch,
    'state_dict': model.state_dict(),
    'optimizer': optimizer.state_dict(),
    ...
}
torch.save(state, filepath)

To resume training you would do things like: state = torch.load(filepath), and then, to restore the state of each individual object, something like this:

model.load_state_dict(state['state_dict'])
optimizer.load_state_dict(state['optimizer'])

Since you are resuming training, DO NOT call model.eval() once you restore the states when loading.
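
A minimal sketch of the corresponding restore-and-resume side, assuming the dict keys from the snippet above (the toy model and checkpoint path are illustrative):

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(5, 2)  # illustrative toy model; use your real architecture
optimizer = optim.SGD(model.parameters(), lr=0.01)

state = torch.load('checkpoint.pth')  # the dict saved with torch.save(state, filepath)
model.load_state_dict(state['state_dict'])
optimizer.load_state_dict(state['optimizer'])
start_epoch = state['epoch'] + 1  # pick up where training stopped

model.train()  # keep training mode; do not call model.eval() here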

Case # 3: Model to be used by someone else with no access to your code: In TensorFlow you can create a .pb file that defines both the architecture and the weights of the model. This is very handy, especially when using TensorFlow Serving. The equivalent way to do this in PyTorch would be:

torch.save(model, filepath)

# Then later:
model = torch.load(filepath)

This approach is still not bulletproof, and since PyTorch is still undergoing a lot of changes, I wouldn’t recommend it.


Answer 2

The pickle Python library implements binary protocols for serializing and de-serializing a Python object.

When you import torch (or when you use PyTorch) it will import pickle for you and you don’t need to call pickle.dump() and pickle.load() directly, which are the methods to save and to load the object.

In fact, torch.save() and torch.load() will wrap pickle.dump() and pickle.load() for you.

The state_dict the other answer mentioned deserves just a few more notes.

What state_dict do we have inside PyTorch? There are actually two state_dicts.

A PyTorch model is a torch.nn.Module; it has a model.parameters() call to get the learnable parameters (w and b). These learnable parameters, once randomly set, will update over time as we learn. Learnable parameters are the first state_dict.

The second state_dict is the optimizer state dict. You recall that the optimizer is used to improve our learnable parameters. But the optimizer state_dict is fixed; there is nothing to learn in there.

Because state_dict objects are Python dictionaries, they can be easily saved, updated, altered, and restored, adding a great deal of modularity to PyTorch models and optimizers.

Let’s create a super simple model to explain this:

import torch
import torch.optim as optim

model = torch.nn.Linear(5, 2)

# Initialize optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())

print("Model weight:")    
print(model.weight)

print("Model bias:")    
print(model.bias)

print("---")
print("Optimizer's state_dict:")
for var_name in optimizer.state_dict():
    print(var_name, "\t", optimizer.state_dict()[var_name])

This code will output the following:

Model's state_dict:
weight   torch.Size([2, 5])
bias     torch.Size([2])
Model weight:
Parameter containing:
tensor([[ 0.1328,  0.1360,  0.1553, -0.1838, -0.0316],
        [ 0.0479,  0.1760,  0.1712,  0.2244,  0.1408]], requires_grad=True)
Model bias:
Parameter containing:
tensor([ 0.4112, -0.0733], requires_grad=True)
---
Optimizer's state_dict:
state    {}
param_groups     [{'lr': 0.001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [140695321443856, 140695321443928]}]

Note that this is a minimal model. You may try adding a stack of sequential layers:

model = torch.nn.Sequential(
          torch.nn.Linear(D_in, H),
          torch.nn.Conv2d(A, B, C),  # D_in, H, A, B, C, D_out are placeholder sizes
          torch.nn.Linear(H, D_out),
        )

Note that only layers with learnable parameters (convolutional layers, linear layers, etc.) and registered buffers (batchnorm layers) have entries in the model’s state_dict.

Non-learnable things belong to the optimizer object’s state_dict, which contains information about the optimizer’s state, as well as the hyperparameters used.

The rest of the story is the same. In the inference phase (the phase when we use the model after training to make predictions), we predict based on the parameters we learned. So for inference, we just need to save the parameters model.state_dict().

torch.save(model.state_dict(), filepath)

And to use it later: model.load_state_dict(torch.load(filepath)) followed by model.eval().

Note: don’t forget the last line, model.eval(); it is crucial after loading the model.

Also, don’t try to save with torch.save(model.parameters(), filepath); model.parameters() is just a generator object.

On the other hand, torch.save(model, filepath) saves the model object itself, but keep in mind the model does not include the optimizer’s state_dict. Check the other excellent answer by @Jadiel de Armas for saving the optimizer’s state dict.


Answer 3

A common PyTorch convention is to save models using either a .pt or .pth file extension.

Save/Load Entire Model Save:

path = "username/directory/lstmmodelgpu.pth"
torch.save(trainer, path)

Load:

# Model class must be defined somewhere

model = torch.load(PATH)
model.eval()

Answer 4

If you want to save the model and resume training later:

Single GPU: Save:

state = {
        'epoch': epoch,
        'state_dict': model.state_dict(),
        'optimizer': optimizer.state_dict(),
}
savepath='checkpoint.t7'
torch.save(state,savepath)

Load:

checkpoint = torch.load('checkpoint.t7')
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])
epoch = checkpoint['epoch']

Multiple GPU: Save

state = {
        'epoch': epoch,
        'state_dict': model.module.state_dict(),
        'optimizer': optimizer.state_dict(),
}
savepath='checkpoint.t7'
torch.save(state,savepath)

Load:

checkpoint = torch.load('checkpoint.t7')
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])
epoch = checkpoint['epoch']

#Don't call DataParallel before loading the model otherwise you will get an error

model = nn.DataParallel(model) #ignore the line if you want to load on Single GPU
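
One related wrinkle worth noting: the multi-GPU snippet above saves model.module.state_dict(), so its keys load directly into an unwrapped model. If a checkpoint was instead saved from the nn.DataParallel wrapper itself, its keys carry a 'module.' prefix; a hedged sketch for stripping it (variable names are illustrative):

import torch

checkpoint = torch.load('checkpoint.t7')
state_dict = checkpoint['state_dict']

# Remove the 'module.' prefix that nn.DataParallel adds to parameter names.
state_dict = {(k[len('module.'):] if k.startswith('module.') else k): v
              for k, v in state_dict.items()}
model.load_state_dict(state_dict)  # model here is the plain, unwrapped module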

AutoKeras: an AutoML library for deep learning


Official website: autokeras.com

AutoKeras: an AutoML system based on Keras. It is developed by the DATA Lab at Texas A&M University. The goal of AutoKeras is to make machine learning accessible to everyone.

Learning resources

  • A short example:

import autokeras as ak

clf = ak.ImageClassifier()
clf.fit(x_train, y_train)
results = clf.predict(x_test)
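
A slightly fuller usage sketch (not from the official README), assuming AutoKeras 1.x, with MNIST as stand-in data and illustrative max_trials/epochs values:

import autokeras as ak
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Try at most one candidate architecture and train briefly (illustrative settings).
clf = ak.ImageClassifier(max_trials=1)
clf.fit(x_train, y_train, epochs=1)
print(clf.evaluate(x_test, y_test))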


Installation

To install the package, please use pip as follows:

pip3 install autokeras

Please follow the installation guide for more details.

Note: currently, AutoKeras is only compatible with Python >= 3.5 and TensorFlow >= 2.3.0.

Community

Stay up to date:

Twitter: you can also follow us on Twitter @autokeras for the latest news.

Email: subscribe to our email list to receive announcements.

Questions and discussions

GitHub Discussions: please ask your questions on our GitHub Discussions, a forum hosted on GitHub. We will monitor and answer the questions there.

Instant communication

Slack: request an invitation, and use the #autokeras channel for communication.

QQ group: join our QQ group 1150366085. Password: akqqgroup.

Online meetings: join the online meeting Google group; the calendar events will appear on your Google Calendar.

Contributing code

We are committed to keeping everything about AutoKeras open to the public. Everyone can easily join as a developer. Here is how we manage the project:

  • Triage the issues: we pick the critical issues to work on from GitHub issues. They are added to this Project, and some of them are then added to the milestones used to plan releases.
  • Assign the tasks: we assign the tasks to people during the online meetings.
  • Discuss: we have discussions in multiple places. Code reviews happen on GitHub; questions can be asked on Slack or during the meetings.

Please join our Slack and send Haifeng Jin a message, or drop by our online meetings and talk to us. We will help you get started!

Refer to our Contributing Guide to learn the best practices.

Thank all the contributors!

Donations

We accept financial support on Open Collective. Thank every backer for supporting us!


Cite this work

Haifeng Jin, Qingquan Song, and Xia Hu. "Auto-Keras: An Efficient Neural Architecture Search System." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019. (Download)

BibLaTeX entry:

@inproceedings{jin2019auto,
  title={Auto-Keras: An Efficient Neural Architecture Search System},
  author={Jin, Haifeng and Song, Qingquan and Hu, Xia},
  booktitle={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  pages={1946--1956},
  year={2019},
  organization={ACM}
}

Acknowledgements

The authors gratefully acknowledge the Defense Advanced Research Projects Agency (DARPA) D3M program administered through AFRL contract FA8750-17-2-0116, the Texas A&M College of Engineering, and Texas A&M University.

numerical-linear-algebra: a free online Jupyter-notebook textbook for the fast.ai Computational Linear Algebra course

Computational Linear Algebra for Coders

This course is focused on the question: how do we do matrix computations with acceptable speed and acceptable accuracy?

This course was taught in the University of San Francisco’s Masters of Science in Analytics program, summer 2017 (for graduate students studying to become data scientists). The course is taught in Python with Jupyter notebooks, using libraries such as scikit-learn and NumPy for most lessons, as well as Numba (a library that compiles Python to C for faster performance) and PyTorch (an alternative to NumPy for the GPU) in a few lessons.

Accompanying the notebooks is a playlist of lecture videos, available on YouTube. If you are ever confused by a lecture or it goes too quickly, check out the beginning of the next video, where I review concepts from the previous lecture, often explaining things from a new perspective or with different illustrations, and answer questions.

Getting help

You can ask questions or share your thoughts and resources in the Computational Linear Algebra category on our fast.ai discussion forums.

Table of contents

The listing below links to the notebooks in this repository, rendered through the nbviewer service. Topics covered:

0. Course Logistics (Video 1)

1. Why are we here? (Video 1)

We start with a high-level overview of some foundational concepts in numerical linear algebra.

2. Topic Modeling with NMF and SVD (Video 2 and Video 3)

We will use the newsgroups dataset to try to identify the topics of different posts. We use a term-document matrix that represents the frequency of the vocabulary in the documents. We factor it using NMF, and then with singular value decomposition (SVD).
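
A rough sketch of that pipeline using scikit-learn (this is not the course's own code; the hyperparameters are illustrative and a recent scikit-learn is assumed):

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import NMF, TruncatedSVD

newsgroups = fetch_20newsgroups(remove=('headers', 'footers', 'quotes'))
vectorizer = TfidfVectorizer(stop_words='english', max_features=2000)
vectors = vectorizer.fit_transform(newsgroups.data)  # term-document matrix

# Factor the matrix two ways: NMF and truncated SVD.
nmf = NMF(n_components=5).fit(vectors)
svd = TruncatedSVD(n_components=5).fit(vectors)

# Print the top words of each NMF "topic".
words = vectorizer.get_feature_names_out()
for topic in nmf.components_:
    print([words[i] for i in topic.argsort()[-8:]])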

3. Background Removal with Robust PCA (Video 3, Video 4, and Video 5)

Another application of SVD is to identify the people and remove the background of a surveillance video. We will cover robust PCA, which uses randomized SVD; randomized SVD in turn uses LU factorization.

4. Compressed Sensing with Robust Regression (Video 6 and Video 7)

Compressed sensing is critical to allowing CT scans with lower radiation: the image can be reconstructed with less data. Here we will learn the technique and apply it to CT images.

5. Predicting Health Outcomes with Linear Regressions (Video 8)

6. How to Implement Linear Regression (Video 8)

7. PageRank with Eigen Decompositions (Video 9 and Video 10)

We have applied SVD to topic modeling, background removal, and linear regression. SVD is intimately connected to the eigen decomposition, so we will now learn how to calculate eigenvalues for a large matrix. We will use DBpedia data, a large dataset of Wikipedia links, because here the principal eigenvector gives the relative importance of different Wikipedia pages (this is the basic idea of Google's PageRank algorithm). We will look at three different methods for calculating eigenvectors, of increasing complexity (and increasing usefulness!).

8. Implementing QR Factorization (Video 10)


Why is this course taught in such a weird order?

This course is structured with a top-down teaching method, which is different from how most math courses operate. Typically, in a bottom-up approach, you first learn all the separate components you will be using, and then you gradually build them up into more complex structures. The problem with this is that students often lose motivation, don't have a sense of the "big picture", and don't know what they will need.

Harvard professor David Perkins has a book, Making Learning Whole, in which he uses baseball as an analogy. We don't require kids to memorize all the rules of baseball and understand all the technical details before we let them play the game. Rather, they start playing with a general sense of it, and then gradually learn more rules/details as time goes on.

If you took the fast.ai deep learning course, that is what we used. You can hear more about my teaching philosophy in this blog post or this talk I gave at the San Francisco Machine Learning meetup.

All that to say: don't worry if you don't understand everything at the beginning! You're not supposed to. We will start using some "black boxes" or matrix decompositions that haven't yet been explained, and then we'll dig into the lower-level details later.

To start, focus on what things do, not what they are.

TensorLayer: a deep learning and reinforcement learning library for scientists and engineers 🔥

TensorLayer is a deep learning and reinforcement learning library based on TensorFlow, designed for researchers and engineers. It provides a large collection of customizable neural layers to build advanced AI models quickly, on top of which the community has open-sourced a mass of tutorials and applications. TensorLayer won the Best Open Source Software Award from the ACM Multimedia Society in 2017. This project can also be found on iHub and Gitee.

News

🔥 3.0.0 will support multiple backends, such as TensorFlow, MindSpore, and PaddlePaddle, allowing users to run the code on different hardware like NVIDIA GPUs and Huawei Ascend. We need more people to join the dev team; if you are interested, please email hao.dong@pku.edu.cn

🔥 Reinforcement learning zoo: low-level APIs for professional usage, high-level APIs for simple usage, and a corresponding Springer textbook

🔥 Sipeed Maix-EMC: run TensorLayer models on a low-cost AI chip (e.g., K210) (alpha version)

Design features

TensorLayer is a brand-new deep learning library designed with simplicity, flexibility, and high performance in mind.

  • Simplicity: TensorLayer has high-level layer/model abstractions that are easy to learn. You can learn, in minutes, how deep learning can benefit your AI tasks, through the massive examples.
  • Flexibility: TensorLayer APIs are transparent and flexible, inspired by the emerging PyTorch library. Compared with the Keras abstraction, TensorLayer makes it much easier to build and train complex AI models.
  • Zero-cost abstraction: while being simple to use, TensorLayer does not require you to make any compromise in the performance of TensorFlow (check the following benchmark section for more details).

TensorLayer stands at a unique spot among the TensorFlow wrappers. Other wrappers, like Keras and TFLearn, hide many of TensorFlow's powerful features and provide little support for writing custom AI models. Inspired by PyTorch, the TensorLayer API is simple, flexible, and Pythonic, making it easy to learn while being flexible enough to cope with complex AI tasks. TensorLayer has a fast-growing community. It has been used by researchers and engineers all over the world, including from Peking University, Imperial College London, UC Berkeley, Carnegie Mellon University, and Stanford University, as well as companies such as Google, Microsoft, Alibaba, Tencent, Xiaomi, and Bloomberg.

Multilingual documentation

TensorLayer has extensive documentation for both beginners and professionals. The documentation is available in both English and Chinese.

English Documentation
Chinese Documentation
Chinese Book

If you want to try the experimental features on the master branch, you can find the latest documentation here.

Extensive examples

A large number of examples that use TensorLayer can be found here, as well as in the following places:

Getting started

TensorLayer 2.0 relies on TensorFlow, NumPy, and more. To use GPUs, CUDA and cuDNN are required.

Install TensorFlow:

pip3 install tensorflow-gpu==2.0.0-rc1 # TensorFlow GPU (version 2.0 RC1)
pip3 install tensorflow # CPU version

Install the stable release of TensorLayer:

pip3 install tensorlayer

Install the unstable development version of TensorLayer:

pip3 install git+https://github.com/tensorlayer/tensorlayer.git

If you want to install the additional dependencies, you can also run:

pip3 install --upgrade tensorlayer[all]              # all additional dependencies
pip3 install --upgrade tensorlayer[extra]            # only the `extra` dependencies
pip3 install --upgrade tensorlayer[contrib_loggers]  # only the `contrib_loggers` dependencies

If you are a TensorFlow 1.X user, you can use TensorLayer 1.11.0:

# For last stable version of TensorLayer 1.X
pip3 install --upgrade tensorlayer==1.11.0
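
Once installed, a minimal model might look like the following sketch of the TensorLayer 2.x functional API (the layer sizes are illustrative, and the exact API is version-dependent, so treat this as an assumption rather than canonical usage):

import tensorflow as tf
import tensorlayer as tl

# A tiny MLP for 784-dimensional inputs (e.g. flattened MNIST), built functionally.
ni = tl.layers.Input([None, 784])
nn = tl.layers.Dense(n_units=800, act=tf.nn.relu)(ni)
nn = tl.layers.Dense(n_units=10)(nn)
net = tl.models.Model(inputs=ni, outputs=nn)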

Performance benchmark

The following table shows the training speeds of VGG16 using TensorLayer and native TensorFlow on a Titan Xp.

Mode      | Lib             | Data format  | Max GPU memory (MB) | Max CPU memory (MB) | Avg CPU memory (MB) | Runtime (sec)
AutoGraph | TensorFlow 2.0  | channel last | 11833               | 2161                | 2136                | 74
AutoGraph | TensorLayer 2.0 | channel last | 11833               | 2187                | 2169                | 76
Graph     | Keras           | channel last | 8677                | 2580                | 2576                | 101
Eager     | TensorFlow 2.0  | channel last | 8723                | 2052                | 2024                | 97
Eager     | TensorLayer 2.0 | channel last | 8723                | 2010                | 2007                | 95

Getting involved

Please read the Contributor Guideline before submitting your PRs.

We suggest users report bugs using GitHub issues. Users can also discuss how to use TensorLayer in the following Slack channel.

Citing TensorLayer

If you find TensorLayer useful for your project, please cite the following papers:

@article{tensorlayer2017,
    author  = {Dong, Hao and Supratak, Akara and Mai, Luo and Liu, Fangde and Oehmichen, Axel and Yu, Simiao and Guo, Yike},
    journal = {ACM Multimedia},
    title   = {{TensorLayer: A Versatile Library for Efficient Deep Learning Development}},
    url     = {http://tensorlayer.org},
    year    = {2017}
}

@inproceedings{tensorlayer2021,
  title={Tensorlayer 3.0: A Deep Learning Library Compatible With Multiple Backends},
  author={Lai, Cheng and Han, Jiarong and Dong, Hao},
  booktitle={2021 IEEE International Conference on Multimedia \& Expo Workshops (ICMEW)},
  pages={1--3},
  year={2021},
  organization={IEEE}
}

Understanding Keras LSTMs

Question: Understanding Keras LSTMs

I am trying to reconcile my understanding of LSTMs with the ideas pointed out in this post by Christopher Olah, as implemented in Keras. I am following the blog written by Jason Brownlee for the Keras tutorial. What I am mainly confused about is:

  1. The reshaping of the data series into [samples, time steps, features] and,
  2. The stateful LSTMs

Let’s concentrate on the above two questions with reference to the code pasted below:

# reshape into X=t and Y=t+1
look_back = 3
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], look_back, 1))
testX = numpy.reshape(testX, (testX.shape[0], look_back, 1))
########################
# The IMPORTANT BIT
##########################
# create and fit the LSTM network
batch_size = 1
model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(100):
    model.fit(trainX, trainY, nb_epoch=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()

Note: create_dataset takes a sequence of length N and returns an (N - look_back) array, each element of which is a sequence of length look_back.

What is Time Steps and Features?

As can be seen, trainX is a 3-D array with Time_steps and Feature being the last two dimensions respectively (3 and 1 in this particular code). With respect to the image below, does this mean that we are considering the many to one case, where the number of pink boxes is 3? Or does it literally mean the chain length is 3 (i.e. only 3 green boxes are considered)?

Does the features argument become relevant when we consider multivariate series? e.g. modelling two financial stocks simultaneously?

Stateful LSTMs

Do stateful LSTMs mean that we save the cell memory values between runs of batches? If this is the case, batch_size is one, and the memory is reset between the training runs, so what was the point of saying that it was stateful? I’m guessing this is related to the fact that training data is not shuffled, but I’m not sure how.

Any thoughts? Image reference: http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Edit 1:

A bit confused about @van’s comment about the red and green boxes being equal. So just to confirm, do the following API calls correspond to the unrolled diagrams? Especially noting the second diagram (batch_size was arbitrarily chosen):

Edit 2:

For people who have done Udacity’s deep learning course and are still confused about the time_step argument, look at the following discussion: https://discussions.udacity.com/t/rnn-lstm-use-implementation/163169

Update:

It turns out model.add(TimeDistributed(Dense(vocab_len))) was what I was looking for. Here is an example: https://github.com/sachinruk/ShakespeareBot

Update2:

I have summarised most of my understanding of LSTMs here: https://www.youtube.com/watch?v=ywinX5wgdEU


Answer 0

First of all, you chose great tutorials (1, 2) to start.

What Time-step means: Time-steps==3 in X.shape (describing the data shape) means there are three pink boxes. Since in Keras each step requires an input, the number of green boxes should usually equal the number of red boxes, unless you hack the structure.

many to many vs. many to one: In Keras, there is a return_sequences parameter when you initialize LSTM or GRU or SimpleRNN. When return_sequences is False (the default), then it is many to one, as shown in the picture. Its return shape is (batch_size, hidden_unit_length), which represents the last state. When return_sequences is True, then it is many to many. Its return shape is (batch_size, time_step, hidden_unit_length). A sketch after this answer makes the shapes concrete.

Does the features argument become relevant: the feature argument means “how big is your red box”, or what the input dimension is at each step. If you want to predict from, say, 8 kinds of market information, then you can generate your data with feature==8.

Stateful: you can look up the source code. When initializing the state, if stateful==True, the state from the last training batch will be used as the initial state; otherwise a new state is generated. I haven’t turned on stateful yet. However, I disagree that batch_size can only be 1 when stateful==True.

Currently, you generate your data from collected data. Imagine your stock information is coming in as a stream: rather than waiting a day to collect everything sequentially, you would like to generate input data online while training/predicting with the network. If you have 400 stocks sharing the same network, then you can set batch_size==400.
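
A small sketch that makes those return shapes concrete (TensorFlow 2.x Keras; the sizes mirror the question's look_back=3, features=1 setup, with 4 hidden units):

from tensorflow.keras.layers import Input, LSTM

inputs = Input(shape=(3, 1))  # (time_steps=3, features=1)

# return_sequences=False (default): many to one, shape (batch_size, hidden_units)
print(LSTM(4)(inputs).shape)                          # (None, 4)

# return_sequences=True: many to many, shape (batch_size, time_steps, hidden_units)
print(LSTM(4, return_sequences=True)(inputs).shape)   # (None, 3, 4)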


Answer 1

As a complement to the accepted answer, this answer shows Keras behaviors and how to achieve each pictured case.

General Keras behavior

The standard keras internal processing is always a many to many as in the following picture (where I used features=2, pressure and temperature, just as an example):

ManyToMany

In this image, I increased the number of steps to 5, to avoid confusion with the other dimensions.

For this example:

  • We have N oil tanks
  • We spent 5 hours taking measures hourly (time steps)
  • We measured two features:
    • Pressure P
    • Temperature T

Our input array should then be something shaped as (N,5,2):

        [     Step1      Step2      Step3      Step4      Step5
Tank A:    [[Pa1,Ta1], [Pa2,Ta2], [Pa3,Ta3], [Pa4,Ta4], [Pa5,Ta5]],
Tank B:    [[Pb1,Tb1], [Pb2,Tb2], [Pb3,Tb3], [Pb4,Tb4], [Pb5,Tb5]],
  ....
Tank N:    [[Pn1,Tn1], [Pn2,Tn2], [Pn3,Tn3], [Pn4,Tn4], [Pn5,Tn5]],
        ]

Inputs for sliding windows

Often, LSTM layers are supposed to process entire sequences. Dividing them into windows may not be the best idea. The layer has internal states about how a sequence is evolving as it steps forward. Windows eliminate the possibility of learning long sequences, limiting all sequences to the window size.

In windows, each window is part of a long original sequence, but Keras will see each of them as an independent sequence:

        [     Step1    Step2    Step3    Step4    Step5
Window  A:  [[P1,T1], [P2,T2], [P3,T3], [P4,T4], [P5,T5]],
Window  B:  [[P2,T2], [P3,T3], [P4,T4], [P5,T5], [P6,T6]],
Window  C:  [[P3,T3], [P4,T4], [P5,T5], [P6,T6], [P7,T7]],
  ....
        ]

Notice that in this case, you have initially only one sequence, but you’re dividing it into many sequences to create windows.
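
A hedged NumPy sketch of building such windows from one long sequence (the sequence length, window size, and feature count are illustrative):

import numpy as np

sequence = np.random.rand(100, 2)  # one long sequence: 100 steps, 2 features (P, T)
window = 5

# Stack overlapping windows; Keras will treat each one as an independent sequence.
windows = np.stack([sequence[i:i + window]
                    for i in range(len(sequence) - window + 1)])
print(windows.shape)  # (96, 5, 2) -> (num_windows, steps, features)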

The concept of “what is a sequence” is abstract. The important parts are:

  • you can have batches with many individual sequences
  • what makes the sequences be sequences is that they evolve in steps (usually time steps)

Achieving each case with “single layers”

Achieving standard many to many:

StandardManyToMany

You can achieve many to many with a simple LSTM layer, using return_sequences=True:

outputs = LSTM(units, return_sequences=True)(inputs)

#output_shape -> (batch_size, steps, units)

Achieving many to one:

Using the exact same layer, keras will do the exact same internal preprocessing, but when you use return_sequences=False (or simply ignore this argument), keras will automatically discard the steps previous to the last:

ManyToOne

outputs = LSTM(units)(inputs)

#output_shape -> (batch_size, units) --> steps were discarded, only the last was returned

Achieving one to many

Now, this is not supported by Keras LSTM layers alone. You will have to create your own strategy to multiply the steps. There are two good approaches:

  • Create a constant multi-step input by repeating a tensor
  • Use a stateful=True to recurrently take the output of one step and serve it as the input of the next step (needs output_features == input_features)

One to many with repeat vector

In order to fit to keras standard behavior, we need inputs in steps, so, we simply repeat the inputs for the length we want:

OneToManyRepeat

outputs = RepeatVector(steps)(inputs) #where inputs is (batch,features)
outputs = LSTM(units,return_sequences=True)(outputs)

#output_shape -> (batch_size, steps, units)

Understanding stateful = True

Now comes one of the possible usages of stateful=True (besides avoiding loading data that can’t fit your computer’s memory at once)

Stateful allows us to input “parts” of the sequences in stages. The difference is:

  • In stateful=False, the second batch contains whole new sequences, independent from the first batch
  • In stateful=True, the second batch continues the first batch, extending the same sequences.

It’s like dividing the sequences in windows too, with these two main differences:

  • these windows do not superpose!!
  • stateful=True will see these windows connected as a single long sequence

In stateful=True, every new batch will be interpreted as continuing the previous batch (until you call model.reset_states()).

  • Sequence 1 in batch 2 will continue sequence 1 in batch 1.
  • Sequence 2 in batch 2 will continue sequence 2 in batch 1.
  • Sequence n in batch 2 will continue sequence n in batch 1.

Example of inputs, batch 1 contains steps 1 and 2, batch 2 contains steps 3 to 5:

                   BATCH 1                           BATCH 2
        [     Step1      Step2        |    [    Step3      Step4      Step5
Tank A:    [[Pa1,Ta1], [Pa2,Ta2],     |       [Pa3,Ta3], [Pa4,Ta4], [Pa5,Ta5]],
Tank B:    [[Pb1,Tb1], [Pb2,Tb2],     |       [Pb3,Tb3], [Pb4,Tb4], [Pb5,Tb5]],
  ....                                |
Tank N:    [[Pn1,Tn1], [Pn2,Tn2],     |       [Pn3,Tn3], [Pn4,Tn4], [Pn5,Tn5]],
        ]                                  ]

Notice the alignment of tanks in batch 1 and batch 2! That’s why we need shuffle=False (unless we are using only one sequence, of course).

You can have any number of batches, indefinitely. (For having variable lengths in each batch, use input_shape=(None,features).)

One to many with stateful=True

For our case here, we are going to use only 1 step per batch, because we want to get one output step and make it be an input.

Please notice that the behavior in the picture is not “caused by” stateful=True. We will force that behavior in a manual loop below. In this example, stateful=True is what “allows” us to stop the sequence, manipulate what we want, and continue from where we stopped.

OneToManyStateful

Honestly, the repeat approach is probably a better choice for this case. But since we’re looking into stateful=True, this is a good example. The best way to use this is the next “many to many” case.

Layer:

outputs = LSTM(units=features, 
               stateful=True, 
               return_sequences=True, #just to keep a nice output shape even with length 1
               input_shape=(None,features))(inputs) 
    #units = features because we want to use the outputs as inputs
    #None because we want variable length

#output_shape -> (batch_size, steps, units) 

Now, we’re going to need a manual loop for predictions:

input_data = someDataWithShape((batch, 1, features))

#important, we're starting new sequences, not continuing old ones:
model.reset_states()

output_sequence = []
last_step = input_data
for i in steps_to_predict:

    new_step = model.predict(last_step)
    output_sequence.append(new_step)
    last_step = new_step

 #end of the sequences
 model.reset_states()

Many to many with stateful=True

Now, here, we get a very nice application: given an input sequence, try to predict its future unknown steps.

We’re using the same method as in the “one to many” above, with the difference that:

  • we will use the sequence itself to be the target data, one step ahead
  • we know part of the sequence (so we discard this part of the results).

ManyToManyStateful

Layer (same as above):

outputs = LSTM(units=features, 
               stateful=True, 
               return_sequences=True, 
               input_shape=(None,features))(inputs) 
    #units = features because we want to use the outputs as inputs
    #None because we want variable length

#output_shape -> (batch_size, steps, units) 

Training:

We are going to train our model to predict the next step of the sequences:

totalSequences = someSequencesShaped((batch, steps, features))
    #batch size is usually 1 in these cases (often you have only one Tank in the example)

X = totalSequences[:,:-1] #the entire known sequence, except the last step
Y = totalSequences[:,1:] #one step ahead of X

#loop for resetting states at the start/end of the sequences:
for epoch in range(epochs):
    model.reset_states()
    model.train_on_batch(X,Y)

Predicting:

The first stage of our predicting involves “adjusting the states”. That’s why we’re going to predict the entire sequence again, even if we already know this part of it:

model.reset_states() #starting a new sequence
predicted = model.predict(totalSequences)
firstNewStep = predicted[:,-1:] #the last step of the predictions is the first future step

Now we go to the loop as in the one to many case, but don’t reset states here! We want the model to know which step of the sequence it is at (and it knows it’s at the first new step because of the prediction we just made above).

output_sequence = [firstNewStep]
last_step = firstNewStep
for i in steps_to_predict:

    new_step = model.predict(last_step)
    output_sequence.append(new_step)
    last_step = new_step

 #end of the sequences
 model.reset_states()

This approach has been used in other answers and example files.

Achieving complex configurations

In all examples above, I showed the behavior of “one layer”.

You can, of course, stack many layers on top of each other, not necessarily all following the same pattern, and create your own models.

One interesting example that has been appearing is the “autoencoder” that has a “many to one encoder” followed by a “one to many” decoder:

Encoder:

inputs = Input((steps,features))

#a few many to many layers:
outputs = LSTM(hidden1,return_sequences=True)(inputs)
outputs = LSTM(hidden2,return_sequences=True)(outputs)    

#many to one layer:
outputs = LSTM(hidden3)(outputs)

encoder = Model(inputs,outputs)

Decoder:

Using the “repeat” method;

inputs = Input((hidden3,))

#repeat to make one to many:
outputs = RepeatVector(steps)(inputs)

#a few many to many layers:
outputs = LSTM(hidden4,return_sequences=True)(outputs)

#last layer
outputs = LSTM(features,return_sequences=True)(outputs)

decoder = Model(inputs,outputs)

Autoencoder:

inputs = Input((steps,features))
outputs = encoder(inputs)
outputs = decoder(outputs)

autoencoder = Model(inputs,outputs)

Train with fit(X,X)

Additional explanations

If you want details about how steps are calculated in LSTMs, or details about the stateful=True cases above, you can read more in this answer: Doubts regarding `Understanding Keras LSTMs`


Answer 2

When you have return_sequences=True in the last RNN layer, you cannot use a simple Dense layer; use TimeDistributed instead.

Here is an example piece of code this might help others.

words = keras.layers.Input(batch_shape=(None, self.maxSequenceLength), name = "input")

    # Build a matrix of size vocabularySize x EmbeddingDimension 
    # where each row corresponds to a "word embedding" vector.
    # This layer will replace each word-id with a word-vector of size Embedding Dimension.
    embeddings = keras.layers.embeddings.Embedding(self.vocabularySize, self.EmbeddingDimension,
        name = "embeddings")(words)
    # Pass the word-vectors to the LSTM layer.
    # We are setting the hidden-state size to 512.
    # The output will be batchSize x maxSequenceLength x hiddenStateSize
    hiddenStates = keras.layers.GRU(512, return_sequences = True, 
                                        input_shape=(self.maxSequenceLength,
                                        self.EmbeddingDimension),
                                        name = "rnn")(embeddings)
    hiddenStates2 = keras.layers.GRU(128, return_sequences = True, 
                                        input_shape=(self.maxSequenceLength, self.EmbeddingDimension),
                                        name = "rnn2")(hiddenStates)

    denseOutput = TimeDistributed(keras.layers.Dense(self.vocabularySize), 
        name = "linear")(hiddenStates2)
    predictions = TimeDistributed(keras.layers.Activation("softmax"), 
        name = "softmax")(denseOutput)  

    # Build the computational graph by specifying the input, and output of the network.
    model = keras.models.Model(input = words, output = predictions)
    # model.compile(loss='kullback_leibler_divergence', \
    model.compile(loss='sparse_categorical_crossentropy', \
        optimizer = keras.optimizers.Adam(lr=0.009, \
            beta_1=0.9,\
            beta_2=0.999, \
            epsilon=None, \
            decay=0.01, \
            amsgrad=False))

What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of TensorFlow?

Question: What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of TensorFlow?

What is the difference between ‘SAME’ and ‘VALID’ padding in tf.nn.max_pool of tensorflow?

In my opinion, ‘VALID’ means there will be no zero padding outside the edges when we do max pool.

According to A guide to convolution arithmetic for deep learning, it says that there will be no padding in pool operator, i.e. just use ‘VALID’ of tensorflow. But what is ‘SAME’ padding of max pool in tensorflow?


Answer 0

I’ll give an example to make it clearer:

  • x: input image of shape [2, 3], 1 channel
  • valid_pad: max pool with 2×2 kernel, stride 2 and VALID padding.
  • same_pad: max pool with 2×2 kernel, stride 2 and SAME padding (this is the classic way to go)

The output shapes are:

  • valid_pad: here, no padding so the output shape is [1, 1]
  • same_pad: here, we pad the image to the shape [2, 4] (with -inf and then apply max pool), so the output shape is [1, 2]

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])

x = tf.reshape(x, [1, 2, 3, 1])  # give a shape accepted by tf.nn.max_pool

valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')

valid_pad.get_shape() == [1, 1, 1, 1]  # valid_pad is [5.]
same_pad.get_shape() == [1, 1, 2, 1]   # same_pad is  [5., 6.]


Answer 1

If you like ascii art:

  • "VALID" = without padding:

       inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                      |________________|                dropped
                                     |_________________|
    
  • "SAME" = with zero padding:

                   pad|                                      |pad
       inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
                   |________________|
                                  |_________________|
                                                 |________________|
    

In this example:

  • Input width = 13
  • Filter width = 6
  • Stride = 5

Notes:

  • "VALID" only ever drops the right-most columns (or bottom-most rows).
  • "SAME" tries to pad evenly left and right, but if the amount of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).

Edit:

About the name:

  • With "SAME" padding, if you use a stride of 1, the layer’s outputs will have the same spatial dimensions as its inputs.
  • With "VALID" padding, there’s no “made-up” padding inputs. The layer only uses valid input data.

Answer 2

When stride is 1 (more typical with convolution than pooling), we can think of the following distinction:

  • "SAME": the output size is the same as the input size. This requires the filter window to slide outside the input map, hence the need to pad.
  • "VALID": the filter window stays at valid positions inside the input map, so the output size shrinks by filter_size - 1. No padding occurs.

Answer 3

The TensorFlow convolution example gives an overview of the difference between SAME and VALID:

  • For the SAME padding, the output height and width are computed as:

    out_height = ceil(float(in_height) / float(strides[1]))
    out_width  = ceil(float(in_width) / float(strides[2]))
    

And

  • For the VALID padding, the output height and width are computed as:

    out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
    out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
    

Answer 4

Padding is an operation to increase the size of the input data. In the case of 1-dimensional data you just append/prepend the array with a constant; in 2-dim you surround the matrix with these constants; in n-dim you surround your n-dim hypercube with the constant. In most cases this constant is zero, and it is called zero-padding.

Here is an example of zero-padding with p=1 applied to a 2-d tensor.


You can use arbitrary padding for your kernel, but some padding values are used more frequently than others:

  • VALID padding. The easiest case: no padding at all. Just leave your data the same as it was.
  • SAME padding, sometimes called HALF padding. It is called SAME because, for a convolution with stride=1 (or for pooling), it should produce output of the same size as the input. It is called HALF because, for a kernel of size k, the padding on each side is (k-1)/2.
  • FULL padding is the maximum padding that does not result in a convolution over just the padded elements. For a kernel of size k, this padding is equal to k - 1.

To use arbitrary padding in TF, you can use tf.pad()


Answer 5

Quick Explanation

VALID: Don’t apply any padding, i.e., assume that all dimensions are valid so that the input image gets fully covered by the filter and stride you specified.

SAME: Apply padding to the input (if needed) so that the input image gets fully covered by the filter and stride you specified. For stride 1, this will ensure that the output image size is the same as the input.

Notes

  • This applies to conv layers as well as max pool layers in the same way.
  • The term “valid” is a bit of a misnomer, because things don’t become “invalid” if you drop part of the image; sometimes you might even want that. This should probably have been called NO_PADDING instead.
  • The term “same” is a misnomer too, because it only makes sense for a stride of 1, when the output dimension is the same as the input dimension. For a stride of 2, output dimensions will be half, for example. This should probably have been called AUTO_PADDING instead.
  • In SAME (i.e. auto-pad mode), TensorFlow will try to spread the padding evenly between left and right.
  • In VALID (i.e. no-padding mode), TensorFlow will drop the right and/or bottom cells if your filter and stride don’t fully cover the input image.

Answer 6

I am quoting this answer from the official TensorFlow docs, https://www.tensorflow.org/api_guides/python/nn#Convolution. For the ‘SAME’ padding, the output height and width are computed as:

out_height = ceil(float(in_height) / float(strides[1]))
out_width  = ceil(float(in_width) / float(strides[2]))

and the padding on the top and left are computed as:

pad_along_height = max((out_height - 1) * strides[1] +
                    filter_height - in_height, 0)
pad_along_width = max((out_width - 1) * strides[2] +
                   filter_width - in_width, 0)
pad_top = pad_along_height // 2
pad_bottom = pad_along_height - pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width - pad_left

For the ‘VALID’ padding, the output height and width are computed as:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))

and the padding values are always zero.
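
These formulas transcribe directly into a small helper for sanity checks (a sketch; the function name output_size is mine, not part of TensorFlow):

import math

def output_size(in_size, filter_size, stride, padding):
    # mirrors the out_height/out_width formulas from the TF docs above
    if padding == 'SAME':
        return math.ceil(in_size / stride)
    elif padding == 'VALID':
        return math.ceil((in_size - filter_size + 1) / stride)

print(output_size(10, 3, 2, 'SAME'))   # 5
print(output_size(10, 3, 2, 'VALID'))  # 4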


Answer 7

There are three choices of padding: valid (no padding), same (or half), and full. You can find explanations (for Theano) here: http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html

  • Valid or no padding:

Valid padding involves no zero-padding, so it covers only the valid input, not including artificially generated zeros. The length of the output is ((length of input) - (k - 1)) for a kernel of size k, if the stride is s=1.

  • Same or half padding:

Same padding makes the size of the output the same as that of the input when s=1. If s=1, the total number of zeros padded is (k - 1).

  • Full padding:

Full padding means that the kernel runs over the whole input, so at the ends the kernel may overlap only a single input element, with zeros everywhere else. The number of zeros padded is 2(k - 1) if s=1, and the length of the output is then ((length of input) + (k - 1)).

Therefore, the amount of padding: (valid) <= (same) <= (full)
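
These three modes map one-to-one onto the mode argument of numpy.convolve, which gives a quick 1-D sanity check of the output lengths above:

import numpy as np

signal = np.ones(10)
kernel = np.ones(3)   # k = 3, s = 1
for mode in ('valid', 'same', 'full'):
    print(mode, np.convolve(signal, kernel, mode).shape)
# valid (8,)   -> 10 - (k - 1)
# same  (10,)  -> same as the input
# full  (12,)  -> 10 + (k - 1)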


Answer 8

Padding on/off. Determines the effective size of your input.

VALID: No padding. Convolution etc. ops are only performed at locations that are “valid”, i.e. not too close to the borders of your tensor.
With a kernel of 3×3 and image of 10×10, you would be performing convolution on the 8×8 area inside the borders.

SAME: Padding is provided. Whenever your operation references a neighborhood (no matter how big), zero values are provided when that neighborhood extends outside the original tensor to allow that operation to work also on border values.
With a kernel of 3×3 and image of 10×10, you would be performing convolution on the full 10×10 area.
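
The 10×10 vs. 8×8 arithmetic is easy to confirm with Keras layers (a minimal sketch, assuming TF 2.x):

import tensorflow as tf

x = tf.ones([1, 10, 10, 1])
print(tf.keras.layers.Conv2D(1, 3, padding='valid')(x).shape)  # (1, 8, 8, 1)
print(tf.keras.layers.Conv2D(1, 3, padding='same')(x).shape)   # (1, 10, 10, 1)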


Answer 9

VALID padding: no padding at all (hopefully there is no confusion).

x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
print (valid_pad.get_shape()) # output-->(1, 2, 1, 1)

SAME padding: This is kind of tricky to understand in the first place, because we have to consider two conditions separately, as mentioned in the official docs.

Let's take the input size as n, the padding as p, the stride as s, and the kernel size as k (only a single dimension is considered; these are the per-dimension versions of the formulas quoted from the TensorFlow docs in Answer 6):

Case 01, n mod s = 0: p = max(k - s, 0)

Case 02, n mod s ≠ 0: p = max(k - (n mod s), 0)

p is calculated as the minimum padding such that the output size is ceil(n / s).

Let's work out this example:

x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
print (same_pad.get_shape()) # --> output (1, 2, 2, 1)

Here the spatial dimensions of x are (4, 3). For the horizontal direction (n=3, k=2, s=2): 3 mod 2 = 1, so p = max(2 - 1, 0) = 1 and the output width is ceil(3 / 2) = 2.

For the vertical direction (n=4, k=2, s=2): 4 mod 2 = 0, so p = max(2 - 2, 0) = 0 and the output height is ceil(4 / 2) = 2.

Hope this helps to understand how SAME padding actually works in TF.
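
The two cases reduce to a one-line helper that can be checked against the example above (the function name same_pad is mine, for illustration only):

def same_pad(n, k, s):
    # total padding TF adds along one dimension in 'SAME' mode
    return max(k - s, 0) if n % s == 0 else max(k - (n % s), 0)

print(same_pad(4, 2, 2))  # 0 -> vertical direction
print(same_pad(3, 2, 2))  # 1 -> horizontal direction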


Answer 10

Based on the explanation here and following up on Tristan's answer, I usually use these quick functions for sanity checks.

import numpy as np

# a function to help us stay clean
def getPaddings(pad_along_height, pad_along_width):
    # if even.. easy..
    if pad_along_height % 2 == 0:
        pad_top = pad_along_height // 2
        pad_bottom = pad_top
    # if odd, the bottom gets the extra pixel
    else:
        pad_top = int(np.floor(pad_along_height / 2))
        pad_bottom = int(np.floor(pad_along_height / 2)) + 1
    # check if width padding is odd or even
    if pad_along_width % 2 == 0:
        pad_left = pad_along_width // 2
        pad_right = pad_left
    # if odd, the right side gets the extra pixel
    else:
        pad_left = int(np.floor(pad_along_width / 2))
        pad_right = int(np.floor(pad_along_width / 2)) + 1
    return pad_top, pad_bottom, pad_left, pad_right

# strides [image index, y, x, depth]
# padding 'SAME' or 'VALID'
# bottom and right sides always get the one additional padded pixel (if padding is odd)
def getOutputDim(inputWidth, inputHeight, filterWidth, filterHeight, strides, padding):
    if padding == 'SAME':
        out_height = int(np.ceil(float(inputHeight) / float(strides[1])))
        out_width = int(np.ceil(float(inputWidth) / float(strides[2])))
        pad_along_height = (out_height - 1) * strides[1] + filterHeight - inputHeight
        pad_along_width = (out_width - 1) * strides[2] + filterWidth - inputWidth
        # now get padding
        pad_top, pad_bottom, pad_left, pad_right = getPaddings(pad_along_height, pad_along_width)
        print('output height', out_height)
        print('output width', out_width)
        print('total pad along height', pad_along_height)
        print('total pad along width', pad_along_width)
        print('pad at top', pad_top)
        print('pad at bottom', pad_bottom)
        print('pad at left', pad_left)
        print('pad at right', pad_right)
    elif padding == 'VALID':
        out_height = int(np.ceil(float(inputHeight - filterHeight + 1) / float(strides[1])))
        out_width = int(np.ceil(float(inputWidth - filterWidth + 1) / float(strides[2])))
        print('output height', out_height)
        print('output width', out_width)
        print('no padding')


# use like so
getOutputDim(80, 80, 4, 4, [1, 1, 1, 1], 'SAME')

Answer 11

To sum up, 'valid' padding means no padding: the output of the convolutional layer shrinks, depending on the input size and kernel size.

On the contrary, 'same' padding means using padding: when the stride is set to 1, the output of the convolutional layer keeps the input size, by appending a zero border of the appropriate width around the input data when computing the convolution.

Hope this intuitive description helps.


Answer 12

General Formula

Here, W and H are the width and height of the input, F is the filter dimension, P is the padding size (i.e., the number of rows or columns to be padded on each side) and S is the stride. The general output size is (W - F + 2P)/S + 1 along the width and (H - F + 2P)/S + 1 along the height.

For SAME padding, P is chosen so that:

out_height = ceil(H / S)
out_width  = ceil(W / S)

For VALID padding, P = 0, so:

out_height = ceil((H - F + 1) / S)
out_width  = ceil((W - F + 1) / S)


Answer 13

Complementing YvesgereY's great answer, I found this visualization extremely helpful:

Padding visualization

Padding 'valid' is the first figure. The filter window stays inside the image.

Padding 'same' is the third figure. The output is the same size.

I found it in this article.


Answer 14

TensorFlow 2.0 Compatible Answer: Detailed explanations about "Valid" and "Same" padding have been provided above.

However, I will specify the different pooling functions and their respective commands in TensorFlow 2.x (>= 2.0), for the benefit of the community.

Functions in 1.x:

tf.nn.max_pool

tf.keras.layers.MaxPool2D

Average Pooling => None in tf.nn, tf.keras.layers.AveragePooling2D

Functions in 2.x:

tf.nn.max_pool if used in 2.x and tf.compat.v1.nn.max_pool_v2 or tf.compat.v2.nn.max_pool, if migrated from 1.x to 2.x.

tf.keras.layers.MaxPool2D if used in 2.x, and tf.compat.v1.keras.layers.MaxPool2D or tf.compat.v1.keras.layers.MaxPooling2D or tf.compat.v2.keras.layers.MaxPool2D or tf.compat.v2.keras.layers.MaxPooling2D, if migrated from 1.x to 2.x.

Average Pooling => tf.nn.avg_pool2d or tf.keras.layers.AveragePooling2D if used in TF 2.x, and tf.compat.v1.nn.avg_pool_v2 or tf.compat.v2.nn.avg_pool or tf.compat.v1.keras.layers.AveragePooling2D or tf.compat.v1.keras.layers.AvgPool2D or tf.compat.v2.keras.layers.AveragePooling2D or tf.compat.v2.keras.layers.AvgPool2D, if migrated from 1.x to 2.x.

For more information about Migration from Tensorflow 1.x to 2.x, please refer to this Migration Guide.
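
For instance, the two 2.x-native calls with both padding modes, applied to the (1, 4, 3, 1) tensor from Answer 9 (a sketch, assuming TF 2.x eager execution):

import tensorflow as tf

x = tf.ones([1, 4, 3, 1])
print(tf.nn.max_pool2d(x, ksize=2, strides=2, padding='VALID').shape)             # (1, 2, 1, 1)
print(tf.nn.max_pool2d(x, ksize=2, strides=2, padding='SAME').shape)              # (1, 2, 2, 1)
print(tf.keras.layers.MaxPool2D(pool_size=2, strides=2, padding='same')(x).shape) # (1, 2, 2, 1)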


Labelme - Image Polygonal Annotation with Python (polygons, rectangles, circles, lines, points and image-level flag annotation)

Labelme is a graphical image annotation tool, inspired by http://labelme.csail.mit.edu.
It is written in Python and uses Qt for its graphical interface.


Various primitives (polygons, rectangles, circles, lines, and points).

Features

Requirements

Installation

There are the following options:

Python

You need to install Anaconda, then run below:

# python2
conda create --name=labelme python=2.7
source activate labelme
# conda install -c conda-forge pyside2
conda install pyqt
pip install labelme
# if you'd like to use the latest version. run below:
# pip install git+https://github.com/wkentaro/labelme.git

# python3
conda create --name=labelme python=3.6
source activate labelme
# conda install -c conda-forge pyside2
# conda install pyqt
# pip install pyqt5  # pyqt5 can be installed via pip on python3
pip install labelme
# or you can install everything by conda command
# conda install labelme -c conda-forge

Docker

You need to install Docker, then run below:

# on macOS
socat TCP-LISTEN:6000,reuseaddr,fork UNIX-CLIENT:\"$DISPLAY\" &
docker run -it -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=docker.for.mac.host.internal:0 -v $(pwd):/root/workdir wkentaro/labelme

# on Linux
xhost +
docker run -it -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=:0 -v $(pwd):/root/workdir wkentaro/labelme

Ubuntu

# Ubuntu 14.04 / Ubuntu 16.04
# Python2
# sudo apt-get install python-qt4  # PyQt4
sudo apt-get install python-pyqt5  # PyQt5
sudo pip install labelme
# Python3
sudo apt-get install python3-pyqt5  # PyQt5
sudo pip3 install labelme

# or install standalone executable from:
# https://github.com/wkentaro/labelme/releases

Ubuntu 19.10+ / Debian (sid)

sudo apt-get install labelme

MacOS

# macOS Sierra
brew install pyqt  # maybe pyqt5
pip install labelme  # both python2/3 should work

# or install standalone executable/app from:
# https://github.com/wkentaro/labelme/releases

Windows

Install Anaconda, then run below:

# python3
conda create --name=labelme python=3.6
conda activate labelme
pip install labelme

Usage

Run labelme --help for details.
The annotations are saved as a JSON file.

labelme  # just open gui

# tutorial (single image example)
cd examples/tutorial
labelme apc2016_obj3.jpg  # specify image file
labelme apc2016_obj3.jpg -O apc2016_obj3.json  # close window after the save
labelme apc2016_obj3.jpg --nodata  # not include image data but relative image path in JSON file
labelme apc2016_obj3.jpg \
  --labels highland_6539_self_stick_notes,mead_index_cards,kong_air_dog_squeakair_tennis_ball  # specify label list

# semantic segmentation example
cd examples/semantic_segmentation
labelme data_annotated/  # Open directory to annotate all images in it
labelme data_annotated/ --labels labels.txt  # specify label list with a file

For more advanced usage, please refer to the examples:

Command line arguments

  • --output specifies the location that annotations will be written to. If the location ends with .json, a single annotation will be written to this file. Only one image can be annotated if a location is specified with .json. If the location does not end with .json, the program will assume it is a directory. Annotations will be stored in this directory with a name that corresponds to the image that the annotation was made on.
  • The first time you run labelme, it will create a config file in ~/.labelmerc. You can edit this file and the changes will be applied the next time that you launch labelme. If you would prefer to use a config file from another location, you can specify this file with the --config flag.
  • Without the --nosortlabels flag, the program will list labels in alphabetical order. When the program is run with this flag, it will display labels in the order that they are provided.
  • Flags are assigned to an entire image. Example
  • Labels are assigned to a single polygon. Example

FAQ

Testing

pip install hacking pytest pytest-qt
flake8 .
pytest -v tests

Development

git clone https://github.com/wkentaro/labelme.git
cd labelme

# Install anaconda3 and labelme
curl -L https://github.com/wkentaro/dotfiles/raw/master/local/bin/install_anaconda3.sh | bash -s .
source .anaconda3/bin/activate
pip install -e .

How to build a standalone executable

Below shows how to build the standalone executable on macOS, Linux and Windows.

# Setup conda
conda create --name labelme python==3.6.0
conda activate labelme

# Build the standalone executable
pip install .
pip install pyinstaller
pyinstaller labelme.spec
dist/labelme --version

How to contribute

Make sure the tests below pass in your environment.
See .github/workflows/ci.yml for more details.

pip install black hacking pytest pytest-qt

flake8 .
black --line-length 79 --check labelme/
MPLBACKEND='agg' pytest tests/ -m 'not gpu'

Acknowledgement

This repo is a fork of mpitid/pylabelme, whose development had stopped.

Cite This Project

If you use this project in your research or wish to refer to the baseline results published in the README, please use the following BibTeX entry.

@misc{labelme2016,
  author =       {Kentaro Wada},
  title =        {{labelme: Image Polygonal Annotation with Python}},
  howpublished = {\url{https://github.com/wkentaro/labelme}},
  year =         {2016}
}

Computervision-recipes - Best practices, code samples and documentation for Computer Vision

Computer Vision

In recent years, we've seen extraordinary growth in Computer Vision, with applications in face recognition, image understanding, search, drones, mapping, semi-autonomous and autonomous vehicles. A key part of many of these applications are visual recognition tasks such as image classification, object detection and image similarity.

This repository provides examples and best practice guidelines for building computer vision systems. The goal of this repository is to build a comprehensive set of tools and examples that leverage recent advances in Computer Vision algorithms, neural architectures, and operationalizing such systems. Rather than creating implementations from scratch, we draw from existing state-of-the-art libraries and build additional utilities around loading image data, optimizing and evaluating models, and scaling up to the cloud. In addition, having worked in this space for many years, we aim to answer common questions, point out frequently observed pitfalls, and show how to use the cloud for training and deployment.

We hope that these examples and utilities can significantly reduce the "time to market" by simplifying the experience from defining the business problem to the development of solutions by orders of magnitude. In addition, the example notebooks serve as guidelines and showcase best practices and usage of the tools in a wide variety of languages.

The examples are provided as Jupyter notebooks and common utility functions. All examples use PyTorch as the underlying deep learning library.

Target Audience

Our target audience for this repository includes data scientists and machine learning engineers with varying levels of Computer Vision knowledge, as our content is purely source code and targets custom machine learning modeling. The utilities and examples provided are intended to be solution accelerators for real-world vision problems.

Getting Started

To get started, navigate to the Setup Guide, which lists instructions on how to set up your compute environment and the dependencies needed to run the notebooks in this repo. Once your environment is set up, navigate to the Scenarios folder and start exploring the notebooks. We recommend starting with the image classification notebooks, as these introduce concepts that are also used by the other scenarios (e.g. pre-training on ImageNet).

Alternatively, we support Binder: simply click the link to try out our notebooks in a web browser. However, Binder is free and hence only provides limited CPU compute power and no GPU support. Expect the notebooks to run very slowly (this is improved somewhat by reducing image resolution to e.g. 60 pixels, at the cost of lower accuracy).

Scenarios

Below is a summary of commonly used Computer Vision scenarios covered in this repository. For each of the main scenarios ("base"), we provide the tools to effectively build your own models. This includes simple tasks such as fine-tuning your own model on your own data, as well as more complex tasks such as hard-negative mining and even model deployment.

Scenario | Support | Description
Classification | Base | Image classification is a supervised machine learning technique to learn and predict the category of a given image.
Similarity | Base | Image similarity is a way to compute a similarity score given a pair of images. Given an image, it allows you to identify the most similar image in a given dataset.
Detection | Base | Object detection is a technique that allows you to detect the bounding box of an object within an image.
Keypoints | Base | Keypoint detection can be used to detect specific points on an object. A pre-trained model is provided to detect body joints for human pose estimation.
Segmentation | Base | Image segmentation assigns a category to each pixel in an image.
Action recognition | Base | Action recognition to identify actions performed in video/webcam footage (e.g. "running", "opening a bottle") with respective start/end times. An i3D implementation of action recognition can also be found under (Contrib)[contrib].
Tracking | Base | Tracking allows detecting and tracking multiple objects in a video sequence over time.
Crowd counting | Contrib | Counting the number of people in low-crowd-density (e.g. fewer than 10 people) and high-crowd-density (e.g. thousands of people) scenarios.

We separate the supported CV scenarios into two locations: (i) base: code and notebooks within the "utils_cv" and "scenarios" folders follow strict coding guidelines and are well tested and maintained; (ii) contrib: code and other resources within the "contrib" folder, mainly covering less common CV scenarios using cutting-edge technology. The code in "contrib" is not regularly tested or maintained.

Computer Vision on Azure

Note that for certain computer vision problems, you may not need to build your own models. Instead, pre-built or easily customizable solutions exist on Azure which do not require any custom coding or machine learning expertise. We strongly recommend evaluating if these can sufficiently solve your problem. If these solutions are not applicable, or the accuracy of these solutions is not sufficient, then resorting to more complex and time-consuming custom approaches may be necessary.

The following Microsoft services offer simple solutions to address common computer vision tasks:

  • Vision Services are a set of pre-trained REST APIs which can be called for image tagging, face recognition, OCR, video analytics, and more. These APIs work out of the box and require minimal expertise in machine learning, but have limited customization capabilities. See the various demos available to get a feel for the functionality (e.g. Computer Vision). The service can be used through API calls or through SDKs (available in .NET, Python, Java, Node and Go languages).
  • Custom Vision is a SaaS service to train and deploy a model as a REST API given a user-provided training set. All steps including image upload, annotation, and model deployment can be performed using an intuitive UI or through SDKs (available in .NET, Python, Java, Node and Go languages). Training image classification or object detection models can be achieved with minimal machine learning expertise. Custom Vision offers more flexibility than using the pre-trained Cognitive Services APIs, but requires the user to bring and annotate their own data.

If you need to train your own model, the following services and links provide additional information that is likely useful:

  • Azure Machine Learning service (AzureML) is a service that helps users accelerate the training and deployment of machine learning models. While not specific to computer vision workloads, the AzureML Python SDK can be used for scalable and reliable training and deployment of machine learning solutions to the cloud. We leverage Azure Machine Learning in several of the notebooks within this repository (e.g. deployment to Azure Kubernetes Service).
  • Azure AI Reference architectures provide a set of examples (backed by code) of how to build common AI-oriented workloads that leverage multiple cloud components. While not computer vision specific, these reference architectures cover several machine learning workloads such as model deployment or batch scoring.

Build Status

AzureML Testing

Build Type | Branch | Status | Branch | Status
Linux GPU | master | Build Status | staging | Build Status
Linux CPU | master | Build Status | staging | Build Status
Notebook unit GPU | master | Build Status | staging | Build Status

Contributing

This project welcomes contributions and suggestions. Please see our contribution guidelines.

Jina - Cloud-native neural search framework for any kind of data

Jina logo: Jina is a cloud-native neural search framework

A cloud-native neural search framework for any kind of data.

Jina allows you to build deep-learning-powered search-as-a-service in just minutes.

🌌 All data types - Large-scale indexing and querying of any kind of unstructured data: video, image, long/short text, music, source code, PDF, etc.

🌩️ Fast and cloud-native - Distributed architecture from day one, scalable and cloud-native by design: enjoy containerizing, streaming, paralleling, sharding, async scheduling, HTTP/gRPC/WebSocket protocols.

⏱️ Save time - The design pattern of neural search systems: from zero to a production-ready system in just minutes.

🍱 Own your stack - Keep end-to-end stack ownership of your solution; avoid the integration pitfalls that come with fragmented, multi-vendor, generic legacy tools.

Run Quick Demo

Installation

  • via PyPI: pip install -U "jina[standard]"
  • via Docker: docker run jinaai/jina:latest
More install options (x86/64, arm64/v6/v7, on Linux/macOS with Python 3.7/3.8/3.9; the right-hand command in each row is for Docker users):

  • Minimum (no HTTP, WebSocket, Docker support): pip install jina / docker run jinaai/jina:latest
  • Daemon: pip install "jina[daemon]" / docker run --network=host jinaai/jina:latest-daemon
  • With additional extras: pip install "jina[devel]" / docker run jinaai/jina:latest-devel

Version identifiers are explained here. Jina can run on Windows Subsystem for Linux. We welcome the community to help us with native Windows support.

Get Started

Document, Executor, and Flow are the three fundamental concepts in Jina.

1️⃣ Copy-paste the minimal example below and run it:

💡 Preliminaries: character embedding, pooling, Euclidean distance

The architecture of a simple neural search system powered by Jina

import numpy as np
from jina import Document, DocumentArray, Executor, Flow, requests

class CharEmbed(Executor):  # a simple character embedding with mean-pooling
    offset = 32  # letter `a`
    dim = 127 - offset + 1  # last pos reserved for `UNK`
    char_embd = np.eye(dim) * 1  # one-hot embedding for all chars

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            r_emb = [ord(c) - self.offset if self.offset <= ord(c) <= 127 else (self.dim - 1) for c in d.text]
            d.embedding = self.char_embd[r_emb, :].mean(axis=0)  # average pooling

class Indexer(Executor):
    _docs = DocumentArray()  # for storing all documents in memory

    @requests(on='/index')
    def foo(self, docs: DocumentArray, **kwargs):
        self._docs.extend(docs)  # extend stored `docs`

    @requests(on='/search')
    def bar(self, docs: DocumentArray, **kwargs):
        q = np.stack(docs.get_attributes('embedding'))  # get all embeddings from query docs
        d = np.stack(self._docs.get_attributes('embedding'))  # get all embeddings from stored docs
        euclidean_dist = np.linalg.norm(q[:, None, :] - d[None, :, :], axis=-1)  # pairwise euclidean distance
        for dist, query in zip(euclidean_dist, docs):  # add & sort match
            query.matches = [Document(self._docs[int(idx)], copy=True, scores={'euclid': d}) for idx, d in enumerate(dist)]
            query.matches.sort(key=lambda m: m.scores['euclid'].value)  # sort matches by their values

f = Flow(port_expose=12345, protocol='http', cors=True).add(uses=CharEmbed, parallel=2).add(uses=Indexer)  # build a Flow, with 2 parallel CharEmbed, tho unnecessary
with f:
    f.post('/index', (Document(text=t.strip()) for t in open(__file__) if t.strip()))  # index all lines of _this_ file
    f.block()  # block for listening request

2️⃣ Open http://localhost:12345/docs (an extended Swagger UI) in your browser, click the /search tab and input:

{"data": [{"text": "@requests(on=something)"}]}

That is, we want to find the lines of the above code snippet that are most similar to @requests(on=something). Now click the Execute button!

Jina Swagger UI extension on visualizing neural search results

3️⃣ Not a GUI person? Let's do it in Python then! Keep the above server running and start a simple client:

from jina import Client, Document
from jina.types.request import Response


def print_matches(resp: Response):  # the callback function invoked when task is done
    for idx, d in enumerate(resp.docs[0].matches[:3]):  # print top-3 matches
        print(f'[{idx}]{d.scores["euclid"].value:2f}: "{d.text}"')


c = Client(protocol='http', port_expose=12345)  # connect to localhost:12345
c.post('/search', Document(text='request(on=something)'), on_done=print_matches)

which prints the following results:

         Client@1608[S]:connected to the gateway at localhost:12345!
[0]0.168526: "@requests(on='/index')"
[1]0.181676: "@requests(on='/search')"
[2]0.192049: "query.matches = [Document(self._docs[int(idx)], copy=True, score=d) for idx, d in enumerate(dist)]"

😔 Doesn't work? Our bad! Please report it here.

Read Tutorials

Support

Join Us

Jina is backed by Jina AI. We are actively hiring full-stack developers and solution engineers to build the next neural search ecosystem in open source.

Contributing

We welcome all kinds of contributions from the open-source community, individuals and partners. We owe our success to your active involvement.

All Contributors

Pandas-profiling - Create HTML profiling reports from pandas DataFrame objects

Pandas Profiling Logo Header

Documentation|Slack|Stack Overflow

Generates profile reports from a pandas DataFrame.

The pandas df.describe() function is great, but a little basic for serious exploratory data analysis. pandas_profiling extends the pandas DataFrame with df.profile_report() for quick data analysis.

For each column, the following statistics, if relevant for the column type, are presented in an interactive HTML report:

  • Type inference: detect the types of columns in a dataframe
  • Essentials: type, unique values, missing values
  • Quantile statistics like minimum value, Q1, median, Q3, maximum, range, interquartile range
  • Descriptive statistics like mean, mode, standard deviation, sum, median absolute deviation, coefficient of variation, kurtosis, skewness
  • Most frequent values
  • Histograms
  • Correlations highlighting highly correlated variables, with Spearman, Pearson and Kendall matrices
  • Missing values: matrix, count, heatmap and dendrogram of missing values
  • Text analysis: learn about categories (uppercase, space), scripts (Latin, Cyrillic) and blocks (ASCII) of text data
  • File and image analysis: extract file sizes, creation dates and dimensions, and scan for truncated images or those containing EXIF information

Announcements

Version v3.0.0 has been released, in which the report configuration was fully overhauled, providing a more intuitive API and fixing issues previously inherent to the global configuration.

This is the first release that adheres to the SemVer and Conventional Commits specifications.

The Spark backend is in progress: we are happy to announce that a Spark backend for generating profile reports is close to v1. Testers wanted! The Spark backend will be released as a pre-release of this package.

Support pandas-profiling

The development of pandas-profiling relies completely on contributions. If you find value in the package, we welcome you to support the project directly through GitHub Sponsors! Please help me to continue to support this package. It's extra exciting that GitHub matches your contribution for the first year.

Please find more information here:

May 9, 2021 💘


Contents: Examples | Installation | Documentation | Large datasets | Command line usage | Advanced usage | Integrations | Support | Types | How to contribute | Editor Integration | Dependencies


Examples

The following example reports give an impression of what the package can do:

Specific features:

Tutorials:

Installation

Using pip

PyPi Downloads
PyPi Monthly Downloads
PyPi Version

You can install using the pip package manager by running:

pip install pandas-profiling[notebook]

Alternatively, you could install the latest version directly from Github:

pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip

Using conda

Conda Downloads
Conda Version

You can install using the conda package manager by running:

conda install -c conda-forge pandas-profiling

From source

Download the source code by cloning the repository, or by pressing 'Download ZIP' on this page.

Install by navigating to the proper directory and running:

python setup.py install

Documentation

The documentation for pandas_profiling can be found here. Previous documentation is still available here.

Quick Start

Start by loading your pandas DataFrame, e.g. by using:

import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame(np.random.rand(100, 5), columns=["a", "b", "c", "d", "e"])

To generate the report, run:

profile = ProfileReport(df, title="Pandas Profiling Report")

Explore deeper

You can configure the profile report in any way you like. The example code below loads the explorative configuration file, which includes many features for text (length distribution, Unicode information), files (file size, creation time) and images (dimensions, EXIF information). If you are interested in the exact settings that are used, you can compare them with the default configuration file.

profile = ProfileReport(df, title="Pandas Profiling Report", explorative=True)

Learn more about configuring pandas-profiling on the Advanced usage page.

Jupyter notebook

We recommend generating reports interactively by using the Jupyter notebook. There are two interfaces (see animations below): through widgets and through an HTML report.

Notebook Widgets

This is achieved by simply displaying the report. In the Jupyter notebook, run:

profile.to_widgets()

The HTML report can be embedded in a Jupyter notebook:

HTML

Run the following code:

profile.to_notebook_iframe()

Saving the report

If you want to generate an HTML report file, save the ProfileReport to an object and use the to_file() function:

profile.to_file("your_report.html")

Alternatively, you can obtain the data as JSON:

# As a string
json_data = profile.to_json()

# As a file
profile.to_file("your_report.json")

Large datasets

Version 2.4 introduced minimal mode.

This is a default configuration that disables expensive computations (such as correlations and duplicate row detection).

Use the following syntax:

profile = ProfileReport(large_dataset, minimal=True)
profile.to_file("output.html")

Benchmarks are available here.

Command line usage

For standard formatted CSV files that can be read immediately by pandas, you can use the pandas_profiling executable.

Run the following for information about options and arguments:

pandas_profiling -h
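
For example, a typical invocation takes the input CSV and the output report path as positional arguments (the file names here are hypothetical; the same argument order is used by the PyCharm integration described below):

pandas_profiling data.csv report.html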

Advanced usage

A set of options is available in order to adapt the report generated.

  • title (str): Title for the report ('Pandas Profiling Report' by default).
  • pool_size (int): Number of workers in the thread pool. When set to zero, the number of available CPUs is used (0 by default).
  • progress_bar (bool): If True, pandas-profiling will display a progress bar.
  • infer_dtypes (bool): When True (default), the dtype of variables is inferred using visions, using the typeset logic (for instance, a column that has integers stored as strings will be analyzed as numerical).

For more settings, see the default configuration file and minimal configuration file.

You can find the configuration docs of the advanced usage page here.

Example

profile = df.profile_report(
    title="Pandas Profiling Report", plot={"histogram": {"bins": 8}}
)
profile.to_file("output.html")

Integrations

Great Expectations

Profiling your data is closely related to data validation: often validation rules are defined in terms of well-known statistics. For that purpose, pandas-profiling integrates with Great Expectations, a world-class open-source library that helps you to maintain data quality and improve communication about data between teams. Great Expectations allows you to create Expectations (which are basically unit tests for your data) and Data Docs (conveniently shareable HTML data reports). pandas-profiling features a method to create a suite of Expectations based on the results of your ProfileReport, which you can store and use to validate another (or future) dataset.

You can find more details on the Great Expectations integration here.

Supporting open source

Maintaining and developing the open-source code for pandas-profiling, with millions of downloads and thousands of users, would not be possible without the support of our gracious sponsors.

Lambda Labs: Lambda workstations, servers, laptops, and cloud services power engineers and researchers at Fortune 500 companies and 94% of the top 50 universities. Lambda Cloud offers 4- and 8-GPU instances starting at $1.50/hr. Pre-installed with TensorFlow, PyTorch, Ubuntu, CUDA, and cuDNN.

We would like to thank our generous Github sponsors and supporters who make pandas-profiling possible:

Martin Sotir, Brian Lee, Stephanie Rivera, abdulAziz, gramster

If you would like to be featured here, please check out more information on the Github Sponsor page.

Types

Types are a powerful abstraction for effective data analysis that goes beyond the logical data types (integer, float, etc.). pandas-profiling currently recognizes the following types: Boolean, Numerical, Date, Categorical, URL, Path, File and Image.

We have developed a type system for Python tailored for data analysis: visions. Choosing an appropriate typeset can both improve the overall expressiveness and reduce the complexity of your analysis/code. To learn more about pandas-profiling's type system, check out the default implementation here. In the meantime, user-customized summarizations and type definitions are now fully supported; if you have a specific use case in mind, raise an idea or a PR!

Contributing

Read on getting involved in the Contribution Guide.

A low-threshold place to ask questions or start contributing is the pandas-profiling Slack. Join the Slack community.

Editor Integration

PyCharm Integration

  1. Install pandas-profiling via the instructions above
  2. Locate your pandas-profiling executable
    • On macOS / Linux / BSD:

      $ which pandas_profiling
      (example) /usr/local/bin/pandas_profiling
    • On Windows:

      $ where pandas_profiling
      (example) C:\ProgramData\Anaconda3\Scripts\pandas_profiling.exe
  3. In PyCharm, go to Settings (or Preferences on macOS) > Tools > External Tools
  4. Click the + icon to add a new external tool
  5. Insert the following values
    • Name: Pandas Profiling
    • Program: the location obtained in step 2
    • Arguments: "$FilePath$" "$FileDir$/$FileNameWithoutAllExtensions$_report.html"
    • Working Directory: $ProjectFileDir$

PyCharm Integration

To use the PyCharm integration, right-click on any dataset file:

External Tools > Pandas Profiling

Other integrations

Other editor integrations may be contributed via pull requests.

Dependencies

The profile report is written in HTML and CSS, which means pandas-profiling requires a modern browser.

You need Python 3 to run this package. Other dependencies can be found in the requirements files:

Filename | Requirements
requirements.txt | Package requirements
requirements-dev.txt | Requirements for development
requirements-test.txt | Requirements for testing
setup.py | Requirements for widgets etc.