标签归档:keras

为什么TensorFlow 2比TensorFlow 1慢得多?

问题:为什么TensorFlow 2比TensorFlow 1慢得多?

许多用户都将其作为切换到Pytorch的原因,但是我还没有找到牺牲/最渴望的实用质量,速度和执行力的理由/解释。

以下是代码基准测试性能,即TF1与TF2的对比-TF1的运行速度提高了47%至276%

我的问题是:在图形或硬件级别上,什么导致如此显着的下降?


寻找详细的答案-已经熟悉广泛的概念。相关的Git

规格:CUDA 10.0.130,cuDNN 7.4.2,Python 3.7.4,Windows 10,GTX 1070


基准测试结果


UPDATE:禁用每下面的代码不会急于执行没有帮助。但是,该行为是不一致的:有时以图形方式运行会有所帮助,而其他时候其运行速度要比 Eager

由于TF开发人员没有出现在任何地方,因此我将自己进行调查-可以跟踪相关Github问题的进展。

更新2:分享大量实验结果,并附有解释;应该在今天完成。


基准代码

# use tensorflow.keras... to benchmark tf.keras; used GPU for all above benchmarks
from keras.layers import Input, Dense, LSTM, Bidirectional, Conv1D
from keras.layers import Flatten, Dropout
from keras.models import Model
from keras.optimizers import Adam
import keras.backend as K
import numpy as np
from time import time

batch_shape = (32, 400, 16)
X, y = make_data(batch_shape)

model_small = make_small_model(batch_shape)
model_small.train_on_batch(X, y)  # skip first iteration which builds graph
timeit(model_small.train_on_batch, 200, X, y)

K.clear_session()  # in my testing, kernel was restarted instead

model_medium = make_medium_model(batch_shape)
model_medium.train_on_batch(X, y)  # skip first iteration which builds graph
timeit(model_medium.train_on_batch, 10, X, y)

使用的功能

def timeit(func, iterations, *args):
    t0 = time()
    for _ in range(iterations):
        func(*args)
    print("Time/iter: %.4f sec" % ((time() - t0) / iterations))

def make_small_model(batch_shape):
    ipt   = Input(batch_shape=batch_shape)
    x     = Conv1D(128, 400, strides=4, padding='same')(ipt)
    x     = Flatten()(x)
    x     = Dropout(0.5)(x)
    x     = Dense(64, activation='relu')(x)
    out   = Dense(1,  activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_medium_model(batch_shape):
    ipt   = Input(batch_shape=batch_shape)
    x     = Bidirectional(LSTM(512, activation='relu', return_sequences=True))(ipt)
    x     = LSTM(512, activation='relu', return_sequences=True)(x)
    x     = Conv1D(128, 400, strides=4, padding='same')(x)
    x     = Flatten()(x)
    x     = Dense(256, activation='relu')(x)
    x     = Dropout(0.5)(x)
    x     = Dense(128, activation='relu')(x)
    x     = Dense(64,  activation='relu')(x)
    out   = Dense(1,   activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_data(batch_shape):
    return np.random.randn(*batch_shape), np.random.randint(0, 2, (batch_shape[0], 1))

It’s been cited by many users as the reason for switching to Pytorch, but I’ve yet to find a justification / explanation for sacrificing the most important practical quality, speed, for eager execution.

Below is code benchmarking performance, TF1 vs. TF2 – with TF1 running anywhere from 47% to 276% faster.

My question is: what is it, at the graph or hardware level, that yields such a significant slowdown?


Looking for a detailed answer – am already familiar with broad concepts. Relevant Git

Specs: CUDA 10.0.130, cuDNN 7.4.2, Python 3.7.4, Windows 10, GTX 1070


Benchmark results:


UPDATE: Disabling Eager Execution per below code does not help. The behavior, however, is inconsistent: sometimes running in graph mode helps considerably, other times it runs slower relative to Eager.

As TF devs don’t appear around anywhere, I’ll be investigating this matter myself – can follow progress in the linked Github issue.

UPDATE 2: tons of experimental results to share, along explanations; should be done today.


Benchmark code:

# use tensorflow.keras... to benchmark tf.keras; used GPU for all above benchmarks
from keras.layers import Input, Dense, LSTM, Bidirectional, Conv1D
from keras.layers import Flatten, Dropout
from keras.models import Model
from keras.optimizers import Adam
import keras.backend as K
import numpy as np
from time import time

batch_shape = (32, 400, 16)
X, y = make_data(batch_shape)

model_small = make_small_model(batch_shape)
model_small.train_on_batch(X, y)  # skip first iteration which builds graph
timeit(model_small.train_on_batch, 200, X, y)

K.clear_session()  # in my testing, kernel was restarted instead

model_medium = make_medium_model(batch_shape)
model_medium.train_on_batch(X, y)  # skip first iteration which builds graph
timeit(model_medium.train_on_batch, 10, X, y)

Functions used:

def timeit(func, iterations, *args):
    t0 = time()
    for _ in range(iterations):
        func(*args)
    print("Time/iter: %.4f sec" % ((time() - t0) / iterations))

def make_small_model(batch_shape):
    ipt   = Input(batch_shape=batch_shape)
    x     = Conv1D(128, 400, strides=4, padding='same')(ipt)
    x     = Flatten()(x)
    x     = Dropout(0.5)(x)
    x     = Dense(64, activation='relu')(x)
    out   = Dense(1,  activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_medium_model(batch_shape):
    ipt   = Input(batch_shape=batch_shape)
    x     = Bidirectional(LSTM(512, activation='relu', return_sequences=True))(ipt)
    x     = LSTM(512, activation='relu', return_sequences=True)(x)
    x     = Conv1D(128, 400, strides=4, padding='same')(x)
    x     = Flatten()(x)
    x     = Dense(256, activation='relu')(x)
    x     = Dropout(0.5)(x)
    x     = Dense(128, activation='relu')(x)
    x     = Dense(64,  activation='relu')(x)
    out   = Dense(1,   activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_data(batch_shape):
    return np.random.randn(*batch_shape), np.random.randint(0, 2, (batch_shape[0], 1))

回答 0

2020年2月18日更新:我每晚排练 2.1和2.1;结果好坏参半。除了一个配置(模型和数据大小)外,其他配置的运行速度都快于TF2和TF1的最佳配置。速度较慢且急剧下降的是大型-尤其是。在图形执行中(慢1.6倍至2.5倍)。

此外,对于我测试的大型模型,Graph和Eager之间存在极大的可重复性差异-无法通过随机性/计算并行性来解释这一差异。我目前无法按时间限制显示这些声明的可重现代码,因此我强烈建议您针对自己的模型进行测试。

尚未针对这些问题打开Git问题,但我确实对原始内容发表了评论-尚未回复。取得进展后,我将更新答案。


VERDICT:它是不是,如果你知道自己在做什么。但是,如果您不这样做,则可能会花费大量成本-平均而言,需要进行几次GPU升级,而在最坏的情况下,则需要多个GPU。


答案:旨在提供对该问题的高级描述,以及有关如何根据您的需求决定培训配置的指南。有关详细的低级描述(包括所有基准测试结果和所使用的代码),请参阅我的其他答案。

如果我学到更多信息,我将更新我的答案,并提供更多信息-可以为该问题添加书签/“加上星号”以供参考。


问题摘要:正如TensorFlow开发人员Q. Scott Zhu 确认的那样,TF2专注于Eager执行和带有Keras的紧密集成的开发,这涉及到TF源的全面更改-包括图形级。好处:大大扩展了处理,分发,调试和部署功能。但是,其中一些成本是速度。

但是,这个问题要复杂得多。不仅仅是TF1和TF2-导致火车速度显着差异的因素包括:

  1. TF2与TF1
  2. 渴望与图表模式
  3. kerastf.keras
  4. numpyvs. tf.data.Datasetvs ….
  5. train_on_batch()fit()
  6. GPU与CPU
  7. model(x)vs. model.predict(x)vs ….

不幸的是,以上几乎没有一个是彼此独立的,并且每个相对于另一个可以至少使执行时间加倍。幸运的是,您可以确定哪些是系统上最有效的方法,并提供一些捷径-正如我将要展示的。


我该怎么办?当前,唯一的方法是-针对您的特定模型,数据和硬件进行实验。没有任何单一的配置总是最好的工作-但也做的,并没有对简化搜索:

>>做:

  • train_on_batch()++ numpy+ tf.kerasTF1 +热切/图
  • train_on_batch()+ numpy+ tf.keras+ + TF2图
  • fit()++ numpy+ tf.kerasTF1 / TF2 +图表+大型模型和数据

>>不要:

  • fit()+ numpy+ keras用于中小型模型和数据
  • fit()++ numpy+ tf.kerasTF1 / TF2 +渴望
  • train_on_batch()+ numpy+ keras+ + TF1伊格

  • [主要] tf.python.keras;它的运行速度可以降低10到100倍,并且带有许多错误;更多信息

    • 这包括layersmodelsoptimizers,和相关的“乱用”的使用进口; ops,utils和相关的“私有”导入都可以-但可以肯定的是,请检查alt以及它们是否用于tf.keras

请参阅其他答案底部的代码,以获取基准测试设置示例。上面的列表主要基于其他答案中的“ BENCHMARKS”表。


上述注意事项的局限性

  • 这个问题的标题是“为什么TF2比TF1慢得多?”,尽管它的主体明确地涉及训练,但问题并不局限于此。即使在相同的TF版本,导入,数据格式等中,推理也将受到主要速度差异的影响-参见此答案
  • RNN在TF2中得到了改进,很可能会明显改变其他答案中的数据网格。
  • 模型主要用于Conv1DDense-不RNNs,稀疏数据/目标,4 / 5D输入,和其他CONFIGS
  • 输入数据限制为numpytf.data.Dataset,同时存在许多其他格式;查看其他答案
  • 使用了GPU;结果在CPU上有所不同。实际上,当我问这个问题时,我的CUDA配置不正确,并且某些结果是基于CPU的。

为什么TF2为急切执行而牺牲了最实用的质量,速度?显然,它还没有-图形仍然可用。但是如果问题是“为什么要渴望”:

  • 出色的调试:您可能会遇到许多问题,询问“如何获得中间层输出”或“如何检查权重”;渴望,它(几乎)很简单.__dict__。相比之下,Graph需要熟悉特殊的后端功能-极大地增加了调试和自省的整个过程。
  • 更快的原型制作:与上述类似的想法;更快的理解=剩下更多的时间用于实际DL。

如何启用/禁用EAGER?

tf.enable_eager_execution()  # TF1; must be done before any model/tensor creation
tf.compat.v1.disable_eager_execution() # TF2; above holds

附加信息

  • 仔细_on_batch()研究TF2中的方法;根据TF开发人员的说法,他们仍然使用较慢的实现方式,但不是故意的 -即必须解决。有关详细信息,请参见其他答案。

张力流需求

  1. 请修复train_on_batch(),以及fit()迭代调用的性能方面;定制火车循环对许多人尤其是我来说很重要。
  2. 添加有关这些性能差异的文档/文档字符串,以供用户了解。
  3. 提高一般执行速度,以防止窥视现象跳入Pytorch。

致谢:感谢


更新

  • 191114日 -找到了一个模型(在我的实际应用程序中),该模型在TF2上针对所有*配置(带有Numpy输入数据)的速度较慢。差异范围为13-19%,平均为17%。但是,keras和之间的tf.keras差异更为明显:平均18-40%。32%(TF1和2)。(*-渴望者(TF2 OOM’d为此)

  • 11/17/19 -devs on_batch()最近的一次提交中更新了方法,指出已提高了速度-将在TF 2.1中发布,或现在以形式提供tf-nightly。由于我无法让后者运行,因此将替补席推迟到2.1。

  • 2/20/20-预测性能也值得借鉴;例如,在TF2中,CPU预测时间可能涉及周期性的峰值

UPDATE 2/18/2020: I’ve benched 2.1 and 2.1-nightly; the results are mixed. All but one configs (model & data size) are as fast as or much faster than the best of TF2 & TF1. The one that’s slower, and slower dramatically, is Large-Large – esp. in Graph execution (1.6x to 2.5x slower).

Furthermore, there are extreme reproducibility differences between Graph and Eager for a large model I tested – one not explainable via randomness/compute-parallelism. I can’t currently present reproducible code for these claims per time constraints, so instead I strongly recommend testing this for your own models.

Haven’t opened a Git issue on these yet, but I did comment on the original – no response yet. I’ll update the answer(s) once progress is made.


VERDICT: it isn’t, IF you know what you’re doing. But if you don’t, it could cost you, lots – by a few GPU upgrades on average, and by multiple GPUs worst-case.


THIS ANSWER: aims to provide a high-level description of the issue, as well as guidelines for how to decide on the training configuration specific to your needs. For a detailed, low-level description, which includes all benchmarking results + code used, see my other answer.

I’ll be updating my answer(s) w/ more info if I learn any – can bookmark / “star” this question for reference.


ISSUE SUMMARY: as confirmed by a TensorFlow developer, Q. Scott Zhu, TF2 focused development on Eager execution & tight integration w/ Keras, which involved sweeping changes in TF source – including at graph-level. Benefits: greatly expanded processing, distribution, debug, and deployment capabilities. The cost of some of these, however, is speed.

The matter, however, is fairly more complex. It isn’t just TF1 vs. TF2 – factors yielding significant differences in train speed include:

  1. TF2 vs. TF1
  2. Eager vs. Graph mode
  3. keras vs. tf.keras
  4. numpy vs. tf.data.Dataset vs. …
  5. train_on_batch() vs. fit()
  6. GPU vs. CPU
  7. model(x) vs. model.predict(x) vs. …

Unfortunately, almost none of the above are independent of the other, and each can at least double execution time relative to another. Fortunately, you can determine what’ll work best systematically, and with a few shortcuts – as I’ll be showing.


WHAT SHOULD I DO? Currently, the only way is – experiment for your specific model, data, and hardware. No single configuration will always work best – but there are do’s and don’t’s to simplify your search:

>> DO:

  • train_on_batch() + numpy + tf.keras + TF1 + Eager/Graph
  • train_on_batch() + numpy + tf.keras + TF2 + Graph
  • fit() + numpy + tf.keras + TF1/TF2 + Graph + large model & data

>> DON’T:

  • fit() + numpy + keras for small & medium models and data
  • fit() + numpy + tf.keras + TF1/TF2 + Eager
  • train_on_batch() + numpy + keras + TF1 + Eager

  • [Major] tf.python.keras; it can run 10-100x slower, and w/ plenty of bugs; more info

    • This includes layers, models, optimizers, & related “out-of-box” usage imports; ops, utils, & related ‘private’ imports are fine – but to be sure, check for alts, & whether they’re used in tf.keras

Refer to code at bottom of my other answer for an example benchmarking setup. The list above is based mainly on the “BENCHMARKS” tables in the other answer.


LIMITATIONS of the above DO’s & DON’T’s:

  • This question’s titled “Why is TF2 much slower than TF1?”, and while its body concerns training explicitly, the matter isn’t limited to it; inference, too, is subject to major speed differences, even within the same TF version, import, data format, etc. – see this answer.
  • RNNs are likely to notably change the data grid in the other answer, as they’ve been improved in TF2
  • Models primarily used Conv1D and Dense – no RNNs, sparse data/targets, 4/5D inputs, & other configs
  • Input data limited to numpy and tf.data.Dataset, while many other formats exist; see other answer
  • GPU was used; results will differ on a CPU. In fact, when I asked the question, my CUDA wasn’t properly configured, and some of the results were CPU-based.

Why did TF2 sacrifice the most practical quality, speed, for eager execution? It hasn’t, clearly – graph is still available. But if the question is “why eager at all”:

  • Superior debugging: you’ve likely come across multitudes of questions asking “how do I get intermediate layer outputs” or “how do I inspect weights”; with eager, it’s (almost) as simple as .__dict__. Graph, in contrast, requires familiarity with special backend functions – greatly complicating the entire process of debugging & introspection.
  • Faster prototyping: per ideas similar to above; faster understanding = more time left for actual DL.

HOW TO ENABLE/DISABLE EAGER?

tf.enable_eager_execution()  # TF1; must be done before any model/tensor creation
tf.compat.v1.disable_eager_execution() # TF2; above holds

ADDITIONAL INFO:

  • Careful with _on_batch() methods in TF2; according to the TF dev, they still use a slower implementation, but not intentionally – i.e. it’s to be fixed. See other answer for details.

REQUESTS TO TENSORFLOW DEVS:

  1. Please fix train_on_batch(), and the performance aspect of calling fit() iteratively; custom train loops are important to many, especially to me.
  2. Add documentation / docstring mention of these performance differences for users’ knowledge.
  3. Improve general execution speed to keep peeps from hopping to Pytorch.

ACKNOWLEDGEMENTS: Thanks to


UPDATES:

  • 11/14/19 – found a model (in my real application) that that runs slower on TF2 for all* configurations w/ Numpy input data. Differences ranged 13-19%, averaging 17%. Differences between keras and tf.keras, however, were more dramatic: 18-40%, avg. 32% (both TF1 & 2). (* – except Eager, for which TF2 OOM’d)

  • 11/17/19 – devs updated on_batch() methods in a recent commit, stating to have improved speed – to be released in TF 2.1, or available now as tf-nightly. As I’m unable to get latter running, will delay benching until 2.1.

  • 2/20/20 – prediction performance is also worth benching; in TF2, for example, CPU prediction times can involve periodic spikes

回答 1

解答:旨在提供对该问题的详细图形/硬件级别描述-包括TF2与TF1训练循环,输入数据处理器以及Eager与Graph模式的执行。有关问题摘要和解决方案的指南,请参见我的其他答案。


性能验证:有时一个更快,有时另一个更快,具体取决于配置。就TF2与TF1而言,它们的平均水平差不多,但是确实存在基于配置的重大差异,并且TF1比TF2更为常见。请参阅下面的“标记”。


EAGER VS. GRAPH:这可以说是整个答案的关键:根据我的测试,TF2的渴望比TF1的渴望。细节进一步下降。

两者之间的根本区别是:Graph 主动设置计算网络,并在“提示”时执行-而Eager在创建时执行所有操作。但故事只从这里开始:

  • 渴望并不是没有Graph,实际上可能主要是 Graph,这与预期相反。它主要是执行图 -包括模型和优化器权重,占图的很大一部分。

  • 渴望在执行时重建自己图的一部分 ; Graph未完全构建的直接结果-请参阅分析器结果。这具有计算开销。

  • 渴望慢与脾气暴躁的输入 ; 根据此Git注释和代码,Eager中的Numpy输入包括将张量从CPU复制到GPU的开销成本。遍历源代码,数据处理差异很明显;渴望直接通过Numpy,而图则通过张量,然后求和为Numpy。不确定确切的过程,但后者应涉及GPU级别的优化

  • TF2 Eager 比TF1 Eager -这是…意外。请参阅下面的基准测试结果。差异从可以忽略不计到显着,但是是一致的。不确定为什么会这样-如果TF开发人员澄清了,将会更新答案。


TF2与TF1:引用TF开发人员Q. Scott Zhu的相关部分的回复 -附上我的强调和改写:

急切地,运行时需要执行ops并为python代码的每一行返回数值。单步执行的性质使其运行缓慢

在TF2中,Keras利用tf.function构建其图形进行训练,评估和预测。我们称它们为模型的“执行功能”。在TF1中,“执行功能”是FuncGraph,它与TF功能共享一些公共组件,但是实现方式不同。

在此过程中,我们以某种方式为train_on_batch(),test_on_batch()和预报_on_batch()留下了错误的实现。它们在数值上仍然是正确的,但是x_on_batch的执行函数是纯python函数,而不是tf.function包装的python函数。这会导致缓慢

在TF2中,我们将所有输入数据转换为tf.data.Dataset,通过它我们可以统一执行函数来处理单一类型的输入。数据集转换中可能会有一些开销,我认为这是一次性的开销,而不是每次批处理的开销

带有上一段的最后一句和下段的最后一句:

为了克服急切模式下的缓慢性,我们提供了@ tf.function,它将把python函数变成图形。当像np数组一样输入数值时,tf.function的主体将转换为静态图,进行优化,并返回最终值,该值很快,并且应具有与TF1图模式相似的性能。

我不同意-根据我的分析结果,该结果表明Eager的输入数据处理比Graph的处理要慢得多。另外,tf.data.Dataset尤其不确定,但是Eager确实反复调用了多个相同的数据转换方法-请参阅事件探查器。

最后,开发人员的链接提交:支持Keras v2循环的大量更改


训练循环:取决于(1)渴望与图表;(2)输入数据格式,训练将在一个独特的训练循环中进行-在TF2中_select_training_loop()training.py,其中之一:

training_v2.Loop()
training_distributed.DistributionMultiWorkerTrainingLoop(
              training_v2.Loop()) # multi-worker mode
# Case 1: distribution strategy
training_distributed.DistributionMultiWorkerTrainingLoop(
            training_distributed.DistributionSingleWorkerTrainingLoop())
# Case 2: generator-like. Input is Python generator, or Sequence object,
# or a non-distributed Dataset or iterator in eager execution.
training_generator.GeneratorOrSequenceTrainingLoop()
training_generator.EagerDatasetOrIteratorTrainingLoop()
# Case 3: Symbolic tensors or Numpy array-like. This includes Datasets and iterators 
# in graph mode (since they generate symbolic tensors).
training_generator.GeneratorLikeTrainingLoop() # Eager
training_arrays.ArrayLikeTrainingLoop() # Graph

每个人对资源分配的处理方式不同,并会对性能和功能造成影响。


火车循环:fitvs train_on_batchkerasvstf.keras:四个循环都使用不同的火车循环,尽管可能不是每种可能的组合。kerasfit,例如,使用的形式fit_loop,例如training_arrays.fit_loop(),其train_on_batch可以使用K.function()tf.keras具有更复杂的层次结构,在上一节中进行了部分描述。


训练循环:文档 -有关某些不同执行方法的相关源文档字符串

与其他TensorFlow操作不同,我们不会将python数值输入转换为张量。此外,将为每个不同的python数值生成一个新图

function 为每个唯一的输入形状和数据类型集实例化一个单独的图

一个tf.function对象可能需要映射到后台的多个计算图。这应该仅在性能上可见(跟踪图的计算和内存成本非零


输入数据处理器:与上述类似,根据运行时配置(执行模式,数据格式,分发策略)设置的内部标志,视情况选择处理器。最简单的情况是使用Eager,它可以直接与Numpy数组一起使用。有关某些特定示例,请参见此答案


模型大小,数据大小:

  • 是决定性的;没有任何一种配置能在所有型号和数据尺寸上脱颖而出。
  • 相对于模型大小的数据大小很重要;对于小型数据和模型,数据传输(例如,CPU至GPU)的开销可能占主导。同样,小型的开销处理器在每个数据转换时间对大型数据的运行速度上较慢(请参见 convert_to_tensor“配置文件”)
  • 速度因火车循环和输入数据处理器处理资源的方式而异。

基准:磨碎的肉。- Word文档Excel电子表格


术语

  • 减去%的数字都是
  • %计算为(1 - longer_time / shorter_time)*100; 理由:我们对哪个因素比另一个因素更快感兴趣shorter / longer实际上是非线性关系,对直接比较没有用
  • 百分号确定:
    • TF2 vs TF1:+如果TF2更快
    • GvE(图表vs.渴望):+如果图表更快
  • TF2 = TensorFlow 2.0.0 + Keras 2.3.1; TF1 = TensorFlow 1.14.0 + Keras 2.2.5

简介


PROFILER-说明:Spyder 3.3.6 IDE分析器。

  • 有些功能在其他嵌套中重复;因此,很难找到“数据处理”和“训练”功能之间的确切间隔,因此会有一些重叠-在最后一个结果中很明显。

  • 计算的wrt运行时数减去构建时间的百分比

  • 通过将所有(唯一)运行时间相加得出的构建时间来计算,这些运行时间称为1或2次
  • 通过累加所有(唯一的)运行时间(与迭代的次数和它们的嵌套的运行时间相同)计算出的训练时间
  • 不幸的是,函数是根据其原始名称进行_func = func概要分析的(即,将概要分析为func),这会混入构建时间-因此需要将其排除在外

测试环境

  • 底部执行的代码带有最少的后台任务运行
  • GPU是“热身” W /定时重复前几次反复,在提出这个帖子
  • 从源代码构建的CUDA 10.0.130,cuDNN 7.6.0,TensorFlow 1.14.0和TensorFlow 2.0.0,以及Anaconda
  • Python 3.7.4,Spyder 3.3.6 IDE
  • GTX 1070,Windows 10、24 GB DDR4 2.4 MHz RAM,i7-7700HQ 2.8 GHz CPU

方法

  • 基准“小”,“中”和“大”模型和数据大小
  • 固定每个模型大小的参数数,与输入数据大小无关
  • “较大”模型具有更多参数和层
  • “较大”的数据具有更长的序列,但相同batch_sizenum_channels
  • 模型只使用Conv1DDense“可学习”层; 每个TF版本的符号都避免了RNN。差异
  • 始终在基准循环之外运行一列火车,以省略模型和优化器图的构建
  • 不使用稀疏数据(例如layers.Embedding())或稀疏目标(例如SparseCategoricalCrossEntropy()

局限性:“完整”的答案将解释所有可能的火车循环和迭代器,但这肯定超出了我的时间能力,不存在薪水或一般必要性。结果仅与方法学一样好-用开放的心态进行解释。


代码

import numpy as np
import tensorflow as tf
import random
from termcolor import cprint
from time import time

from tensorflow.keras.layers import Input, Dense, Conv1D
from tensorflow.keras.layers import Dropout, GlobalAveragePooling1D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
import tensorflow.keras.backend as K
#from keras.layers import Input, Dense, Conv1D
#from keras.layers import Dropout, GlobalAveragePooling1D
#from keras.models import Model 
#from keras.optimizers import Adam
#import keras.backend as K

#tf.compat.v1.disable_eager_execution()
#tf.enable_eager_execution()

def reset_seeds(reset_graph_with_backend=None, verbose=1):
    if reset_graph_with_backend is not None:
        K = reset_graph_with_backend
        K.clear_session()
        tf.compat.v1.reset_default_graph()
        if verbose:
            print("KERAS AND TENSORFLOW GRAPHS RESET")

    np.random.seed(1)
    random.seed(2)
    if tf.__version__[0] == '2':
        tf.random.set_seed(3)
    else:
        tf.set_random_seed(3)
    if verbose:
        print("RANDOM SEEDS RESET")

print("TF version: {}".format(tf.__version__))
reset_seeds()

def timeit(func, iterations, *args, _verbose=0, **kwargs):
    t0 = time()
    for _ in range(iterations):
        func(*args, **kwargs)
        print(end='.'*int(_verbose))
    print("Time/iter: %.4f sec" % ((time() - t0) / iterations))

def make_model_small(batch_shape):
    ipt   = Input(batch_shape=batch_shape)
    x     = Conv1D(128, 40, strides=4, padding='same')(ipt)
    x     = GlobalAveragePooling1D()(x)
    x     = Dropout(0.5)(x)
    x     = Dense(64, activation='relu')(x)
    out   = Dense(1,  activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_model_medium(batch_shape):
    ipt = Input(batch_shape=batch_shape)
    x = ipt
    for filters in [64, 128, 256, 256, 128, 64]:
        x  = Conv1D(filters, 20, strides=1, padding='valid')(x)
    x     = GlobalAveragePooling1D()(x)
    x     = Dense(256, activation='relu')(x)
    x     = Dropout(0.5)(x)
    x     = Dense(128, activation='relu')(x)
    x     = Dense(64,  activation='relu')(x)
    out   = Dense(1,   activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_model_large(batch_shape):
    ipt   = Input(batch_shape=batch_shape)
    x     = Conv1D(64,  400, strides=4, padding='valid')(ipt)
    x     = Conv1D(128, 200, strides=1, padding='valid')(x)
    for _ in range(40):
        x = Conv1D(256,  12, strides=1, padding='same')(x)
    x     = Conv1D(512,  20, strides=2, padding='valid')(x)
    x     = Conv1D(1028, 10, strides=2, padding='valid')(x)
    x     = Conv1D(256,   1, strides=1, padding='valid')(x)
    x     = GlobalAveragePooling1D()(x)
    x     = Dense(256, activation='relu')(x)
    x     = Dropout(0.5)(x)
    x     = Dense(128, activation='relu')(x)
    x     = Dense(64,  activation='relu')(x)    
    out   = Dense(1,   activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_data(batch_shape):
    return np.random.randn(*batch_shape), \
           np.random.randint(0, 2, (batch_shape[0], 1))

def make_data_tf(batch_shape, n_batches, iters):
    data = np.random.randn(n_batches, *batch_shape),
    trgt = np.random.randint(0, 2, (n_batches, batch_shape[0], 1))
    return tf.data.Dataset.from_tensor_slices((data, trgt))#.repeat(iters)

batch_shape_small  = (32, 140,   30)
batch_shape_medium = (32, 1400,  30)
batch_shape_large  = (32, 14000, 30)

batch_shapes = batch_shape_small, batch_shape_medium, batch_shape_large
make_model_fns = make_model_small, make_model_medium, make_model_large
iterations = [200, 100, 50]
shape_names = ["Small data",  "Medium data",  "Large data"]
model_names = ["Small model", "Medium model", "Large model"]

def test_all(fit=False, tf_dataset=False):
    for model_fn, model_name, iters in zip(make_model_fns, model_names, iterations):
        for batch_shape, shape_name in zip(batch_shapes, shape_names):
            if (model_fn is make_model_large) and (batch_shape is batch_shape_small):
                continue
            reset_seeds(reset_graph_with_backend=K)
            if tf_dataset:
                data = make_data_tf(batch_shape, iters, iters)
            else:
                data = make_data(batch_shape)
            model = model_fn(batch_shape)

            if fit:
                if tf_dataset:
                    model.train_on_batch(data.take(1))
                    t0 = time()
                    model.fit(data, steps_per_epoch=iters)
                    print("Time/iter: %.4f sec" % ((time() - t0) / iters))
                else:
                    model.train_on_batch(*data)
                    timeit(model.fit, iters, *data, _verbose=1, verbose=0)
            else:
                model.train_on_batch(*data)
                timeit(model.train_on_batch, iters, *data, _verbose=1)
            cprint(">> {}, {} done <<\n".format(model_name, shape_name), 'blue')
            del model

test_all(fit=True, tf_dataset=False)

THIS ANSWER: aims to provide a detailed, graph/hardware-level description of the issue – including TF2 vs. TF1 train loops, input data processors, and Eager vs. Graph mode executions. For an issue summary & resolution guidelines, see my other answer.


PERFORMANCE VERDICT: sometimes one is faster, sometimes the other, depending on configuration. As far as TF2 vs TF1 goes, they’re about on par on average, but significant config-based differences do exist, and TF1 trumps TF2 more often than vice versa. See “BENCHMARKING” below.


EAGER VS. GRAPH: the meat of this entire answer for some: TF2’s eager is slower than TF1’s, according to my testing. Details further down.

The fundamental difference between the two is: Graph sets up a computational network proactively, and executes when ‘told to’ – whereas Eager executes everything upon creation. But the story only begins here:

  • Eager is NOT devoid of Graph, and may in fact be mostly Graph, contrary to expectation. What it largely is, is executed Graph – this includes model & optimizer weights, comprising a great portion of the graph.

  • Eager rebuilds part of own graph at execution; direct consequence of Graph not being fully built — see profiler results. This has a computational overhead.

  • Eager is slower w/ Numpy inputs; per this Git comment & code, Numpy inputs in Eager include the overhead cost of copying tensors from CPU to GPU. Stepping through source code, data handling differences are clear; Eager directly passes Numpy, while Graph passes tensors which then evaluate to Numpy; uncertain of the exact process, but latter should involve GPU-level optimizations

  • TF2 Eager is slower than TF1 Eager – this is… unexpected. See benchmarking results below. Differences span from negligible to significant, but are consistent. Unsure why it’s the case – if a TF dev clarifies, will update answer.


TF2 vs. TF1: quoting relevant portions of a TF dev’s, Q. Scott Zhu’s, response – w/ bit of my emphasis & rewording:

In eager, the runtime needs to execute the ops and return the numerical value for every line of python code. The nature of single step execution causes it to be slow.

In TF2, Keras leverages tf.function to build its graph for training, eval and prediction. We call them “execution function” for the model. In TF1, the “execution function” was a FuncGraph, which shared some common component as TF function, but has a different implementation.

During the process, we somehow left an incorrect implementation for train_on_batch(), test_on_batch() and predict_on_batch(). They are still numerically correct, but the execution function for x_on_batch is a pure python function, rather than a tf.function wrapped python function. This will cause slowness

In TF2, we convert all input data into a tf.data.Dataset, by which we can unify our execution function to handle the single type of the inputs. There might be some overhead in the dataset conversion, and I think this is a one-time only overhead, rather than a per-batch cost

With the last sentence of last paragraph above, and last clause of below paragraph:

To overcome the slowness in eager mode, we have @tf.function, which will turn a python function into a graph. When feed numerical value like np array, the body of the tf.function is converted into static graph, being optimized, and return the final value, which is fast and should have similar performance as TF1 graph mode.

I disagree – per my profiling results, which show Eager’s input data processing to be substantially slower than Graph’s. Also, unsure about tf.data.Dataset in particular, but Eager does repeatedly call multiple of the same data conversion methods – see profiler.

Lastly, dev’s linked commit: Significant number of changes to support the Keras v2 loops.


Train Loops: depending on (1) Eager vs. Graph; (2) input data format, training in will proceed with a distinct train loop – in TF2, _select_training_loop(), training.py, one of:

training_v2.Loop()
training_distributed.DistributionMultiWorkerTrainingLoop(
              training_v2.Loop()) # multi-worker mode
# Case 1: distribution strategy
training_distributed.DistributionMultiWorkerTrainingLoop(
            training_distributed.DistributionSingleWorkerTrainingLoop())
# Case 2: generator-like. Input is Python generator, or Sequence object,
# or a non-distributed Dataset or iterator in eager execution.
training_generator.GeneratorOrSequenceTrainingLoop()
training_generator.EagerDatasetOrIteratorTrainingLoop()
# Case 3: Symbolic tensors or Numpy array-like. This includes Datasets and iterators 
# in graph mode (since they generate symbolic tensors).
training_generator.GeneratorLikeTrainingLoop() # Eager
training_arrays.ArrayLikeTrainingLoop() # Graph

Each handles resource allocation differently, and bears consequences on performance & capability.


Train Loops: fit vs train_on_batch, keras vs. tf.keras: each of the four uses different train loops, though perhaps not in every possible combination. kerasfit, for example, uses a form of fit_loop, e.g. training_arrays.fit_loop(), and its train_on_batch may use K.function(). tf.keras has a more sophisticated hierarchy described in part in previous section.


Train Loops: documentation — relevant source docstring on some of the different execution methods:

Unlike other TensorFlow operations, we don’t convert python numerical inputs to tensors. Moreover, a new graph is generated for each distinct python numerical value

function instantiates a separate graph for every unique set of input shapes and datatypes.

A single tf.function object might need to map to multiple computation graphs under the hood. This should be visible only as performance (tracing graphs has a nonzero computational and memory cost)


Input data processors: similar to above, the processor is selected case-by-case, depending on internal flags set according to runtime configurations (execution mode, data format, distribution strategy). The simplest case’s with Eager, which works directly w/ Numpy arrays. For some specific examples, see this answer.


MODEL SIZE, DATA SIZE:

  • Is decisive; no single configuration crowned itself atop all model & data sizes.
  • Data size relative to model size is important; for small data & model, data transfer (e.g. CPU to GPU) overhead can dominate. Likewise, small overhead processors can run slower on large data per data conversion time dominating (see convert_to_tensor in “PROFILER”)
  • Speed differs per train loops’ and input data processors’ differing means of handling resources.

BENCHMARKS: the grinded meat. — Word DocumentExcel Spreadsheet


Terminology:

  • %-less numbers are all seconds
  • % computed as (1 - longer_time / shorter_time)*100; rationale: we’re interested by what factor one is faster than the other; shorter / longer is actually a non-linear relation, not useful for direct comparison
  • % sign determination:
    • TF2 vs TF1: + if TF2 is faster
    • GvE (Graph vs. Eager): + if Graph is faster
  • TF2 = TensorFlow 2.0.0 + Keras 2.3.1; TF1 = TensorFlow 1.14.0 + Keras 2.2.5

PROFILER:


PROFILER – Explanation: Spyder 3.3.6 IDE profiler.

  • Some functions are repeated in nests of others; hence, it’s hard to track down the exact separation between “data processing” and “training” functions, so there will be some overlap – as pronounced in the very last result.

  • % figures computed w.r.t. runtime minus build time

  • Build time computed by summing all (unique) runtimes which were called 1 or 2 times
  • Train time computed by summing all (unique) runtimes which were called the same # of times as the # of iterations, and some of their nests’ runtimes
  • Functions are profiled according to their original names, unfortunately (i.e. _func = func will profile as func), which mixes in build time – hence the need to exclude it

TESTING ENVIRONMENT:

  • Executed code at bottom w/ minimal background tasks running
  • GPU was “warmed up” w/ a few iterations before timing iterations, as suggested in this post
  • CUDA 10.0.130, cuDNN 7.6.0, TensorFlow 1.14.0, & TensorFlow 2.0.0 built from source, plus Anaconda
  • Python 3.7.4, Spyder 3.3.6 IDE
  • GTX 1070, Windows 10, 24GB DDR4 2.4-MHz RAM, i7-7700HQ 2.8-GHz CPU

METHODOLOGY:

  • Benchmark ‘small’, ‘medium’, & ‘large’ model & data sizes
  • Fix # of parameters for each model size, independent of input data size
  • “Larger” model has more parameters and layers
  • “Larger” data has a longer sequence, but same batch_size and num_channels
  • Models only use Conv1D, Dense ‘learnable’ layers; RNNs avoided per TF-version implem. differences
  • Always ran one train fit outside of benchmarking loop, to omit model & optimizer graph building
  • Not using sparse data (e.g. layers.Embedding()) or sparse targets (e.g. SparseCategoricalCrossEntropy()

LIMITATIONS: a “complete” answer would explain every possible train loop & iterator, but that’s surely beyond my time ability, nonexistent paycheck, or general necessity. The results are only as good as the methodology – interpret with an open mind.


CODE:

import numpy as np
import tensorflow as tf
import random
from termcolor import cprint
from time import time

from tensorflow.keras.layers import Input, Dense, Conv1D
from tensorflow.keras.layers import Dropout, GlobalAveragePooling1D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
import tensorflow.keras.backend as K
#from keras.layers import Input, Dense, Conv1D
#from keras.layers import Dropout, GlobalAveragePooling1D
#from keras.models import Model 
#from keras.optimizers import Adam
#import keras.backend as K

#tf.compat.v1.disable_eager_execution()
#tf.enable_eager_execution()

def reset_seeds(reset_graph_with_backend=None, verbose=1):
    if reset_graph_with_backend is not None:
        K = reset_graph_with_backend
        K.clear_session()
        tf.compat.v1.reset_default_graph()
        if verbose:
            print("KERAS AND TENSORFLOW GRAPHS RESET")

    np.random.seed(1)
    random.seed(2)
    if tf.__version__[0] == '2':
        tf.random.set_seed(3)
    else:
        tf.set_random_seed(3)
    if verbose:
        print("RANDOM SEEDS RESET")

print("TF version: {}".format(tf.__version__))
reset_seeds()

def timeit(func, iterations, *args, _verbose=0, **kwargs):
    t0 = time()
    for _ in range(iterations):
        func(*args, **kwargs)
        print(end='.'*int(_verbose))
    print("Time/iter: %.4f sec" % ((time() - t0) / iterations))

def make_model_small(batch_shape):
    ipt   = Input(batch_shape=batch_shape)
    x     = Conv1D(128, 40, strides=4, padding='same')(ipt)
    x     = GlobalAveragePooling1D()(x)
    x     = Dropout(0.5)(x)
    x     = Dense(64, activation='relu')(x)
    out   = Dense(1,  activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_model_medium(batch_shape):
    ipt = Input(batch_shape=batch_shape)
    x = ipt
    for filters in [64, 128, 256, 256, 128, 64]:
        x  = Conv1D(filters, 20, strides=1, padding='valid')(x)
    x     = GlobalAveragePooling1D()(x)
    x     = Dense(256, activation='relu')(x)
    x     = Dropout(0.5)(x)
    x     = Dense(128, activation='relu')(x)
    x     = Dense(64,  activation='relu')(x)
    out   = Dense(1,   activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_model_large(batch_shape):
    ipt   = Input(batch_shape=batch_shape)
    x     = Conv1D(64,  400, strides=4, padding='valid')(ipt)
    x     = Conv1D(128, 200, strides=1, padding='valid')(x)
    for _ in range(40):
        x = Conv1D(256,  12, strides=1, padding='same')(x)
    x     = Conv1D(512,  20, strides=2, padding='valid')(x)
    x     = Conv1D(1028, 10, strides=2, padding='valid')(x)
    x     = Conv1D(256,   1, strides=1, padding='valid')(x)
    x     = GlobalAveragePooling1D()(x)
    x     = Dense(256, activation='relu')(x)
    x     = Dropout(0.5)(x)
    x     = Dense(128, activation='relu')(x)
    x     = Dense(64,  activation='relu')(x)    
    out   = Dense(1,   activation='sigmoid')(x)
    model = Model(ipt, out)
    model.compile(Adam(lr=1e-4), 'binary_crossentropy')
    return model

def make_data(batch_shape):
    return np.random.randn(*batch_shape), \
           np.random.randint(0, 2, (batch_shape[0], 1))

def make_data_tf(batch_shape, n_batches, iters):
    data = np.random.randn(n_batches, *batch_shape),
    trgt = np.random.randint(0, 2, (n_batches, batch_shape[0], 1))
    return tf.data.Dataset.from_tensor_slices((data, trgt))#.repeat(iters)

batch_shape_small  = (32, 140,   30)
batch_shape_medium = (32, 1400,  30)
batch_shape_large  = (32, 14000, 30)

batch_shapes = batch_shape_small, batch_shape_medium, batch_shape_large
make_model_fns = make_model_small, make_model_medium, make_model_large
iterations = [200, 100, 50]
shape_names = ["Small data",  "Medium data",  "Large data"]
model_names = ["Small model", "Medium model", "Large model"]

def test_all(fit=False, tf_dataset=False):
    for model_fn, model_name, iters in zip(make_model_fns, model_names, iterations):
        for batch_shape, shape_name in zip(batch_shapes, shape_names):
            if (model_fn is make_model_large) and (batch_shape is batch_shape_small):
                continue
            reset_seeds(reset_graph_with_backend=K)
            if tf_dataset:
                data = make_data_tf(batch_shape, iters, iters)
            else:
                data = make_data(batch_shape)
            model = model_fn(batch_shape)

            if fit:
                if tf_dataset:
                    model.train_on_batch(data.take(1))
                    t0 = time()
                    model.fit(data, steps_per_epoch=iters)
                    print("Time/iter: %.4f sec" % ((time() - t0) / iters))
                else:
                    model.train_on_batch(*data)
                    timeit(model.fit, iters, *data, _verbose=1, verbose=0)
            else:
                model.train_on_batch(*data)
                timeit(model.train_on_batch, iters, *data, _verbose=1)
            cprint(">> {}, {} done <<\n".format(model_name, shape_name), 'blue')
            del model

test_all(fit=True, tf_dataset=False)

Tensorflow 2.0-AttributeError:模块’tensorflow’没有属性’Session’

问题:Tensorflow 2.0-AttributeError:模块’tensorflow’没有属性’Session’

sess = tf.Session()在Tensorflow 2.0环境中执行命令时,出现如下错误消息:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'tensorflow' has no attribute 'Session'

系统信息:

  • 操作系统平台和发行版:Windows 10
  • python版本:3.7.1
  • Tensorflow版本:2.0.0-alpha0(随pip一起安装)

重现步骤:

安装:

  1. 点安装-升级点
  2. pip install tensorflow == 2.0.0-alpha0
  3. 点安装keras
  4. 点安装numpy == 1.16.2

执行:

  1. 执行命令:将tensorflow导入为tf
  2. 执行命令:sess = tf.Session()

When I am executing the command sess = tf.Session() in Tensorflow 2.0 environment, I am getting an error message as below:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'tensorflow' has no attribute 'Session'

System Information:

  • OS Platform and Distribution: Windows 10
  • Python Version: 3.7.1
  • Tensorflow Version: 2.0.0-alpha0 (installed with pip)

Steps to reproduce:

Installation:

  1. pip install –upgrade pip
  2. pip install tensorflow==2.0.0-alpha0
  3. pip install keras
  4. pip install numpy==1.16.2

Execution:

  1. Execute command: import tensorflow as tf
  2. Execute command: sess = tf.Session()

回答 0

根据TF 1:1 Symbols Map,在TF 2.0中,您应该使用tf.compat.v1.Session()而不是tf.Session()

https://docs.google.com/spreadsheets/d/1FLFJLzg7WNP6JHODX5q8BDgptKafq_slHpnHVbJIteQ/edit#gid=0

要获得TF 2.0中类似TF 1.x的行为,可以运行

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

但后来人们无法受益于TF 2.0所做的许多改进。有关更多详细信息,请参阅迁移指南 https://www.tensorflow.org/guide/migrate

According to TF 1:1 Symbols Map, in TF 2.0 you should use tf.compat.v1.Session() instead of tf.Session()

https://docs.google.com/spreadsheets/d/1FLFJLzg7WNP6JHODX5q8BDgptKafq_slHpnHVbJIteQ/edit#gid=0

To get TF 1.x like behaviour in TF 2.0 one can run

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

but then one cannot benefit of many improvements made in TF 2.0. For more details please refer to the migration guide https://www.tensorflow.org/guide/migrate


回答 1

TF2默认情况下运行急切执行,因此无需会话。如果要运行静态图,则更正确的方法是tf.function()在TF2中使用。虽然仍然可以通过tf.compat.v1.Session()TF2访问Session ,但我不建议使用它。通过比较问候世界中的差异来证明这种差异可能会有所帮助:

TF1.x你好世界:

import tensorflow as tf
msg = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(msg))

TF2.x你好世界:

import tensorflow as tf
msg = tf.constant('Hello, TensorFlow!')
tf.print(msg)

有关更多信息,请参见Effective TensorFlow 2

TF2 runs Eager Execution by default, thus removing the need for Sessions. If you want to run static graphs, the more proper way is to use tf.function() in TF2. While Session can still be accessed via tf.compat.v1.Session() in TF2, I would discourage using it. It may be helpful to demonstrate this difference by comparing the difference in hello worlds:

TF1.x hello world:

import tensorflow as tf
msg = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(msg))

TF2.x hello world:

import tensorflow as tf
msg = tf.constant('Hello, TensorFlow!')
tf.print(msg)

For more info, see Effective TensorFlow 2


回答 2

安装后第一次尝试python时遇到了这个问题 windows10 + python3.7(64bit) + anacconda3 + jupyter notebook.

我通过参考“ https://vispud.blogspot.com/2019/05/tensorflow200a0-attributeerror-module.html ”解决了此问题

我同意

我相信TF 2.0已删除了“ Session()”。

我插入了两行。一个是tf.compat.v1.disable_eager_execution(),另一个是sess = tf.compat.v1.Session()

我的Hello.py如下:

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

hello = tf.constant('Hello, TensorFlow!')

sess = tf.compat.v1.Session()

print(sess.run(hello))

I faced this problem when I first tried python after installing windows10 + python3.7(64bit) + anacconda3 + jupyter notebook.

I solved this problem by refering to “https://vispud.blogspot.com/2019/05/tensorflow200a0-attributeerror-module.html

I agree with

I believe “Session()” has been removed with TF 2.0.

I inserted two lines. One is tf.compat.v1.disable_eager_execution() and the other is sess = tf.compat.v1.Session()

My Hello.py is as follows:

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

hello = tf.constant('Hello, TensorFlow!')

sess = tf.compat.v1.Session()

print(sess.run(hello))

回答 3

对于TF2.x,您可以这样做。

import tensorflow as tf
with tf.compat.v1.Session() as sess:
    hello = tf.constant('hello world')
    print(sess.run(hello))

>>> b'hello world

For TF2.x, you can do like this.

import tensorflow as tf
with tf.compat.v1.Session() as sess:
    hello = tf.constant('hello world')
    print(sess.run(hello))

>>> b'hello world


回答 4

尝试这个

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

hello = tf.constant('Hello, TensorFlow!')

sess = tf.compat.v1.Session()

print(sess.run(hello))

try this

import tensorflow as tf

tf.compat.v1.disable_eager_execution()

hello = tf.constant('Hello, TensorFlow!')

sess = tf.compat.v1.Session()

print(sess.run(hello))

回答 5

如果这是您的代码,则正确的解决方案是将其重写为不使用Session(),因为在TensorFlow 2中不再需要

如果这只是您正在运行的代码,则可以通过运行降级到TensorFlow 1

pip3 install --upgrade --force-reinstall tensorflow-gpu==1.15.0 

(或TensorFlow 1最新版本

If this is your code, the correct solution is to rewrite it to not use Session(), since that’s no longer necessary in TensorFlow 2

If this is just code you’re running, you can downgrade to TensorFlow 1 by running

pip3 install --upgrade --force-reinstall tensorflow-gpu==1.15.0 

(or whatever the latest version of TensorFlow 1 is)


回答 6

Tensorflow 2.x支持默认执行Eager Execution,因此不支持Session。

Tensorflow 2.x support’s Eager Execution by default hence Session is not supported.


回答 7

使用Anaconda + Spyder(Python 3.7)

[码]

import tensorflow as tf
valor1 = tf.constant(2)
valor2 = tf.constant(3)
type(valor1)
print(valor1)
soma=valor1+valor2
type(soma)
print(soma)
sess = tf.compat.v1.Session()
with sess:
    print(sess.run(soma))

[安慰]

import tensorflow as tf
valor1 = tf.constant(2)
valor2 = tf.constant(3)
type(valor1)
print(valor1)
soma=valor1+valor2
type(soma)
Tensor("Const_8:0", shape=(), dtype=int32)
Out[18]: tensorflow.python.framework.ops.Tensor

print(soma)
Tensor("add_4:0", shape=(), dtype=int32)

sess = tf.compat.v1.Session()

with sess:
    print(sess.run(soma))
5

Using Anaconda + Spyder (Python 3.7)

[code]

import tensorflow as tf
valor1 = tf.constant(2)
valor2 = tf.constant(3)
type(valor1)
print(valor1)
soma=valor1+valor2
type(soma)
print(soma)
sess = tf.compat.v1.Session()
with sess:
    print(sess.run(soma))

[console]

import tensorflow as tf
valor1 = tf.constant(2)
valor2 = tf.constant(3)
type(valor1)
print(valor1)
soma=valor1+valor2
type(soma)
Tensor("Const_8:0", shape=(), dtype=int32)
Out[18]: tensorflow.python.framework.ops.Tensor

print(soma)
Tensor("add_4:0", shape=(), dtype=int32)

sess = tf.compat.v1.Session()

with sess:
    print(sess.run(soma))
5

回答 8

TF v2.0支持Eager模式和v1.0的Graph模式。因此,v2.0不支持tf.session()。因此,建议您重写代码以在Eager模式下工作。

TF v2.0 supports Eager mode vis-a-vis Graph mode of v1.0. Hence, tf.session() is not supported on v2.0. Hence, would suggest you to rewrite your code to work in Eager mode.


回答 9

import tensorflow as tf
sess = tf.Session()

此代码将在版本2.x上显示属性错误

在版本2.x中使用版本1.x代码

尝试这个

import tensorflow.compat.v1 as tf
sess = tf.Session()
import tensorflow as tf
sess = tf.Session()

this code will show an Attribute error on version 2.x

to use version 1.x code in version 2.x

try this

import tensorflow.compat.v1 as tf
sess = tf.Session()

我可以在GPU上运行Keras模型吗?

问题:我可以在GPU上运行Keras模型吗?

我正在运行Keras模型,提交截止日期为36小时,如果我在cpu上训练我的模型大约需要50个小时,是否可以在gpu上运行Keras?

我正在使用Tensorflow后端,并在未安装anaconda的Jupyter笔记本上运行它。

I’m running a Keras model, with a submission deadline of 36 hours, if I train my model on the cpu it will take approx 50 hours, is there a way to run Keras on gpu?

I’m using Tensorflow backend and running it on my Jupyter notebook, without anaconda installed.


回答 0

是的,您可以在GPU上运行keras模型。几件事您将必须首先检查。

  1. 您的系统具有GPU(Nvidia。因为AMD尚未运行)
  2. 您已经安装了Tensorflow的GPU版本
  3. 您已安装CUDA 安装说明
  4. 验证Tensorflow是否与GPU一起运行,检查GPU是否正常工作

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

要么

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

输出将是这样的:

[
  name: "/cpu:0"device_type: "CPU",
  name: "/gpu:0"device_type: "GPU"
]

完成所有这些操作后,您的模型将在GPU上运行:

要检查keras(> = 2.1.1)是否使用GPU:

from keras import backend as K
K.tensorflow_backend._get_available_gpus()

祝一切顺利。

Yes you can run keras models on GPU. Few things you will have to check first.

  1. your system has GPU (Nvidia. As AMD doesn’t work yet)
  2. You have installed the GPU version of tensorflow
  3. You have installed CUDA installation instructions
  4. Verify that tensorflow is running with GPU check if GPU is working

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

OR

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

output will be something like this:

[
  name: "/cpu:0"device_type: "CPU",
  name: "/gpu:0"device_type: "GPU"
]

Once all this is done your model will run on GPU:

To Check if keras(>=2.1.1) is using GPU:

from keras import backend as K
K.tensorflow_backend._get_available_gpus()

All the best.


回答 1

当然。我想您已经安装了TensorFlow for GPU。

导入keras后,需要添加以下块。我正在使用具有56核心cpu和gpu的计算机。

import keras
import tensorflow as tf


config = tf.ConfigProto( device_count = {'GPU': 1 , 'CPU': 56} ) 
sess = tf.Session(config=config) 
keras.backend.set_session(sess)

当然,这种用法会强制执行我的计算机的最大限制。您可以减少cpu和gpu消耗值。

Sure. I suppose that you have already installed TensorFlow for GPU.

You need to add the following block after importing keras. I am working on a machine which have 56 core cpu, and a gpu.

import keras
import tensorflow as tf


config = tf.ConfigProto( device_count = {'GPU': 1 , 'CPU': 56} ) 
sess = tf.Session(config=config) 
keras.backend.set_session(sess)

Of course, this usage enforces my machines maximum limits. You can decrease cpu and gpu consumption values.


回答 2

2.0兼容答案:虽然上面提到的答案详细说明了如何在Keras Model上使用GPU,但我想说明如何实现Tensorflow Version 2.0

要知道有多少个GPU可用,我们可以使用以下代码:

print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

要找出您的操作和张量分配给哪些设备,请将其tf.debugging.set_log_device_placement(True)作为程序的第一条语句。

启用设备放置日志记录将导致打印任何Tensor分配或操作。例如,运行以下代码:

tf.debugging.set_log_device_placement(True)

# Create some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)

print(c)

给出如下所示的输出:

在设备/ job:localhost / replica:0 / task:0 / device:GPU:0 tf.Tensor([[22. 28.] [49. 64.]],shape =(2,2)中执行op MatMul dtype = float32)

有关更多信息,请参考此链接

2.0 Compatible Answer: While above mentioned answer explain in detail on how to use GPU on Keras Model, I want to explain how it can be done for Tensorflow Version 2.0.

To know how many GPUs are available, we can use the below code:

print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

To find out which devices your operations and tensors are assigned to, put tf.debugging.set_log_device_placement(True) as the first statement of your program.

Enabling device placement logging causes any Tensor allocations or operations to be printed. For example, running the below code:

tf.debugging.set_log_device_placement(True)

# Create some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)

print(c)

gives the Output shown below:

Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0 tf.Tensor( [[22. 28.] [49. 64.]], shape=(2, 2), dtype=float32)

For more information, refer this link


回答 3

当然。如果您在Tensorflow或CNTk后端上运行,则代码将默认在GPU设备上运行。但是,如果Theano后端,则可以使用以下代码

Theano标志:

“ THEANO_FLAGS = device = gpu,floatX = float32 python my_keras_script.py”

Of course. if you are running on Tensorflow or CNTk backends, your code will run on your GPU devices defaultly.But if Theano backends, you can use following

Theano flags:

“THEANO_FLAGS=device=gpu,floatX=float32 python my_keras_script.py”


回答 4

在任务管理器中查看脚本是否正在运行GPU。如果不是,请怀疑您的CUDA版本是您所使用的tensorflow版本的正确版本,其他答案已经建议了。

此外,需要使用适用于CUDA版本的适当CUDA DNN库,才能使用tensorflow运行GPU。从此处下载/提取它,并将DLL(例如cudnn64_7.dll)放入CUDA bin文件夹(例如C:\ Program Files \ NVIDIA GPU Computing Toolkit \ CUDA \ v10.1 \ bin)。

See if your script is running GPU in Task manager. If not, suspect your CUDA version is right one for the tensorflow version you are using, as the other answers suggested already.

Additionally, a proper CUDA DNN library for the CUDA version is required to run GPU with tensorflow. Download/extract it from here and put the DLL (e.g., cudnn64_7.dll) into CUDA bin folder (e.g., C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin).


如何修复imdb.load_data()函数的“ allow_pickle = False时无法加载对象数组”?

问题:如何修复imdb.load_data()函数的“ allow_pickle = False时无法加载对象数组”?

我正在尝试使用Google Colab中的IMDb数据集实现二进制分类示例。我以前已经实现了此模型。但是,几天后我再次尝试执行此操作时,它返回一个值错误:对于load_data()函数,当allow_pickle = False时无法加载对象数组。

我已经尝试解决此问题,请参考一个类似问题的现有答案:如何修复sketch_rnn算法中的“当allow_pickle = False时无法加载对象数组”, 但事实证明,仅添加allow_pickle参数是不够的。

我的代码:

from keras.datasets import imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

错误:

ValueError                                Traceback (most recent call last)
<ipython-input-1-2ab3902db485> in <module>()
      1 from keras.datasets import imdb
----> 2 (train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

2 frames
/usr/local/lib/python3.6/dist-packages/keras/datasets/imdb.py in load_data(path, num_words, skip_top, maxlen, seed, start_char, oov_char, index_from, **kwargs)
     57                     file_hash='599dadb1135973df5b59232a0e9a887c')
     58     with np.load(path) as f:
---> 59         x_train, labels_train = f['x_train'], f['y_train']
     60         x_test, labels_test = f['x_test'], f['y_test']
     61 

/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in __getitem__(self, key)
    260                 return format.read_array(bytes,
    261                                          allow_pickle=self.allow_pickle,
--> 262                                          pickle_kwargs=self.pickle_kwargs)
    263             else:
    264                 return self.zip.read(key)

/usr/local/lib/python3.6/dist-packages/numpy/lib/format.py in read_array(fp, allow_pickle, pickle_kwargs)
    690         # The array contained Python objects. We need to unpickle the data.
    691         if not allow_pickle:
--> 692             raise ValueError("Object arrays cannot be loaded when "
    693                              "allow_pickle=False")
    694         if pickle_kwargs is None:

ValueError: Object arrays cannot be loaded when allow_pickle=False

I’m trying to implement the binary classification example using the IMDb dataset in Google Colab. I have implemented this model before. But when I tried to do it again after a few days, it returned a value error: 'Object arrays cannot be loaded when allow_pickle=False' for the load_data() function.

I have already tried solving this, referring to an existing answer for a similar problem: How to fix ‘Object arrays cannot be loaded when allow_pickle=False’ in the sketch_rnn algorithm. But it turns out that just adding an allow_pickle argument isn’t sufficient.

My code:

from keras.datasets import imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

The error:

ValueError                                Traceback (most recent call last)
<ipython-input-1-2ab3902db485> in <module>()
      1 from keras.datasets import imdb
----> 2 (train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

2 frames
/usr/local/lib/python3.6/dist-packages/keras/datasets/imdb.py in load_data(path, num_words, skip_top, maxlen, seed, start_char, oov_char, index_from, **kwargs)
     57                     file_hash='599dadb1135973df5b59232a0e9a887c')
     58     with np.load(path) as f:
---> 59         x_train, labels_train = f['x_train'], f['y_train']
     60         x_test, labels_test = f['x_test'], f['y_test']
     61 

/usr/local/lib/python3.6/dist-packages/numpy/lib/npyio.py in __getitem__(self, key)
    260                 return format.read_array(bytes,
    261                                          allow_pickle=self.allow_pickle,
--> 262                                          pickle_kwargs=self.pickle_kwargs)
    263             else:
    264                 return self.zip.read(key)

/usr/local/lib/python3.6/dist-packages/numpy/lib/format.py in read_array(fp, allow_pickle, pickle_kwargs)
    690         # The array contained Python objects. We need to unpickle the data.
    691         if not allow_pickle:
--> 692             raise ValueError("Object arrays cannot be loaded when "
    693                              "allow_pickle=False")
    694         if pickle_kwargs is None:

ValueError: Object arrays cannot be loaded when allow_pickle=False

回答 0

这是一个强制imdb.load_data在笔记本中允许泡菜代替此行的技巧:

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

这样:

import numpy as np
# save np.load
np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old

Here’s a trick to force imdb.load_data to allow pickle by, in your notebook, replacing this line:

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

by this:

import numpy as np
# save np.load
np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old

回答 1

这个问题仍然存在于keras git上。我希望它能尽快解决。在此之前,请尝试将numpy版本降级为1.16.2。看来解决了问题。

!pip install numpy==1.16.1
import numpy as np

此版本的numpy的默认值为allow_pickleas True

This issue is still up on keras git. I hope it gets solved as soon as possible. Until then, try downgrading your numpy version to 1.16.2. It seems to solve the problem.

!pip install numpy==1.16.1
import numpy as np

This version of numpy has the default value of allow_pickle as True.


回答 2

在GitHub上发布此问题之后,官方解决方案是编辑imdb.py文件。此修复程序对我来说效果很好,无需降级numpy。在以下位置找到imdb.py文件tensorflow/python/keras/datasets/imdb.py(对我来说完整路径是:C:\Anaconda\Lib\site-packages\tensorflow\python\keras\datasets\imdb.py-其他安装将有所不同),并根据差异更改第85行:

-  with np.load(path) as f:
+  with np.load(path, allow_pickle=True) as f:

进行更改的原因是为了防止在腌制文件中使用与SQL注入等效的Python。上面的更改仅会影响imdb数据,因此您将安全性保留在其他位置(不降低numpy的级别)。

Following this issue on GitHub, the official solution is to edit the imdb.py file. This fix worked well for me without the need to downgrade numpy. Find the imdb.py file at tensorflow/python/keras/datasets/imdb.py (full path for me was: C:\Anaconda\Lib\site-packages\tensorflow\python\keras\datasets\imdb.py – other installs will be different) and change line 85 as per the diff:

-  with np.load(path) as f:
+  with np.load(path, allow_pickle=True) as f:

The reason for the change is security to prevent the Python equivalent of an SQL injection in a pickled file. The change above will ONLY effect the imdb data and you therefore retain the security elsewhere (by not downgrading numpy).


回答 3

我只是使用allow_pickle = True作为np.load()的参数,它对我有用。

I just used allow_pickle = True as an argument to np.load() and it worked for me.


回答 4

就我而言:

np.load(path, allow_pickle=True)

In my case worked with:

np.load(path, allow_pickle=True)

回答 5

我认为cheez的答案(https://stackoverflow.com/users/122933/cheez)是最简单,最有效。我将对其进行详细说明,以便在整个会话期间都不会修改numpy函数。

我的建议如下。我正在使用它从keras下载路透社数据集,该数据集显示了相同类型的错误:

old = np.load
np.load = lambda *a,**k: old(*a,**k,allow_pickle=True)

from keras.datasets import reuters
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)

np.load = old
del(old)

I think the answer from cheez (https://stackoverflow.com/users/122933/cheez) is the easiest and most effective one. I’d elaborate a little bit over it so it would not modify a numpy function for the whole session period.

My suggestion is below. I´m using it to download the reuters dataset from keras which is showing the same kind of error:

old = np.load
np.load = lambda *a,**k: old(*a,**k,allow_pickle=True)

from keras.datasets import reuters
(train_data, train_labels), (test_data, test_labels) = reuters.load_data(num_words=10000)

np.load = old
del(old)

回答 6

您可以尝试更改标志的值

np.load(training_image_names_array,allow_pickle=True)

You can try changing the flag’s value

np.load(training_image_names_array,allow_pickle=True)

回答 7

上面列出的解决方案都不对我有用:我使用python 3.7.3运行anaconda。对我有用的是

  • 从Anaconda powershell运行“ conda install numpy == 1.16.1”

  • 关闭并重新打开笔记本

none of the above listed solutions worked for me: i run anaconda with python 3.7.3. What worked for me was

  • run “conda install numpy==1.16.1” from Anaconda powershell

  • close and reopen the notebook


回答 8

在jupyter笔记本上使用

np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

工作正常,但问题是在spyder中使用此方法时出现的(您每次必须重新启动内核,否则会收到类似以下的错误:

TypeError :()为关键字参数“ allow_pickle”获得了多个值

我在这里使用解决方案解决了这个问题:

on jupyter notebook using

np_load_old = np.load

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

worked fine, but the problem appears when you use this method in spyder(you have to restart the kernel every time or you will get an error like:

TypeError : () got multiple values for keyword argument ‘allow_pickle’

I solved this issue using the solution here:


回答 9

我降落在这里,尝试了您的方式,无法解决。

我实际上是在编写给定的代码

pickle.load(path)

用过,所以我用

np.load(path, allow_pickle=True)

I landed up here, tried your ways and could not figure out.

I was actually working on a pregiven code where

pickle.load(path)

was used so i replaced it with

np.load(path, allow_pickle=True)

回答 10

是的,安装以前的numpy版本可以解决此问题。

对于使用PyCharm IDE的用户:

在我的IDE(Pycharm)中,依次单击File-> Settings-> Project Interpreter:我发现numpy为1.16.3,所以我恢复为1.16.1。单击+并在搜索中键入numpy,在“指定版本”上打勾:1.16.1并选择->安装软件包。

Yes, installing previous a version of numpy solved the problem.

For those who uses PyCharm IDE:

in my IDE (Pycharm), File->Settings->Project Interpreter: I found my numpy to be 1.16.3, so I revert back to 1.16.1. Click + and type numpy in the search, tick “specify version” : 1.16.1 and choose–> install package.


回答 11

找到imdb.py的路径,然后将标志添加到np.load(path,… flag …)

    def load_data(.......):
    .......................................
    .......................................
    - with np.load(path) as f:
    + with np.load(path,allow_pickle=True) as f:

find the path to imdb.py then just add the flag to np.load(path,…flag…)

    def load_data(.......):
    .......................................
    .......................................
    - with np.load(path) as f:
    + with np.load(path,allow_pickle=True) as f:

回答 12

它对我的工作

        np_load_old = np.load
        np.load = lambda *a: np_load_old(*a, allow_pickle=True)
        (x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=None, test_split=0.2)
        np.load = np_load_old

Its work for me

        np_load_old = np.load
        np.load = lambda *a: np_load_old(*a, allow_pickle=True)
        (x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=None, test_split=0.2)
        np.load = np_load_old

回答 13

我发现TensorFlow 2.0(我正在使用2.0.0-alpha0)与最新版本的Numpy不兼容,即v1.17.0(可能还有v1.16.5 +)。导入TF2后,它会抛出巨大的FutureWarning列表,如下所示:

FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

尝试从keras加载imdb数据集时,这还会导致allow_pickle错误

我尝试使用以下效果很好的解决方案,但是我必须在导入TF2或tf.keras的每个项目中都这样做。

np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

我找到的最简单的解决方案是全局安装numpy 1.16.1,或者在虚拟环境中使用tensorflow和numpy的兼容版本。

我的答案是指出,这不仅是imdb.load_data的问题,而且是TF2和Numpy版本不兼容所引起的更大的问题,并且可能导致许多其他隐藏的错误或问题。

What I have found is that TensorFlow 2.0 (I am using 2.0.0-alpha0) is not compatible with the latest version of Numpy i.e. v1.17.0 (and possibly v1.16.5+). As soon as TF2 is imported, it throws a huge list of FutureWarning, that looks something like this:

FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/anaconda3/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.

This also resulted in the allow_pickle error when tried to load imdb dataset from keras

I tried to use the following solution which worked just fine, but I had to do it every single project where I was importing TF2 or tf.keras.

np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

The easiest solution I found was to either install numpy 1.16.1 globally, or use compatible versions of tensorflow and numpy in a virtual environment.

My goal with this answer is to point out that its not just a problem with imdb.load_data, but a larger problem vaused by incompatibility of TF2 and Numpy versions and may result in many other hidden bugs or issues.


回答 14

Tensorflow在tf-nightly版本中有一个修复。

!pip install tf-nightly

当前版本是’2.0.0-dev20190511’。

Tensorflow has a fix in tf-nightly version.

!pip install tf-nightly

The current version is ‘2.0.0-dev20190511’.


回答 15

@cheez答案有时不起作用,并一次又一次地递归调用该函数。要解决此问题,您应该深入复制该功能。您可以使用函数来完成此操作partial,因此最终代码为:

import numpy as np
from functools import partial

# save np.load
np_load_old = partial(np.load)

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = 
imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old

The answer of @cheez sometime doesn’t work and recursively call the function again and again. To solve this problem you should copy the function deeply. You can do this by using the function partial, so the final code is:

import numpy as np
from functools import partial

# save np.load
np_load_old = partial(np.load)

# modify the default parameters of np.load
np.load = lambda *a,**k: np_load_old(*a, allow_pickle=True, **k)

# call load_data with allow_pickle implicitly set to true
(train_data, train_labels), (test_data, test_labels) = 
imdb.load_data(num_words=10000)

# restore np.load for future normal usage
np.load = np_load_old

回答 16

我通常不发布这些东西,但这太烦人了。混淆来自某些Keras imdb.py文件已经更新的事实:

with np.load(path) as f:

到的版本allow_pickle=True。确保检查imdb.py文件以查看是否已实施此更改。如果已进行调整,则可以很好地进行以下操作:

from keras.datasets import imdb
(train_text, train_labels), (test_text, test_labels) = imdb.load_data(num_words=10000)

I don’t usually post to these things but this was super annoying. The confusion comes from the fact that some of the Keras imdb.py files have already updated:

with np.load(path) as f:

to the version with allow_pickle=True. Make sure check the imdb.py file to see if this change was already implemented. If it has been adjusted, the following works fine:

from keras.datasets import imdb
(train_text, train_labels), (test_text, test_labels) = imdb.load_data(num_words=10000)

回答 17

最简单的方法是将imdb.py设置更改allow_pickle=True为引发错误np.load的行imdb.py

The easiest way is to change imdb.py setting allow_pickle=True to np.load at the line where imdb.py throws error.


哪些参数应用于提前停止?

问题:哪些参数应用于提前停止?

我正在使用Keras为我的项目训练神经网络。Keras提供了提前停止的功能。我是否知道应该观察哪些参数,以避免由于使用早期停止而使神经网络过度拟合?

I’m training a neural network for my project using Keras. Keras has provided a function for early stopping. May I know what parameters should be observed to avoid my neural network from overfitting by using early stopping?


回答 0

一旦损失开始增加(或换句话说,验证准确性开始降低),早期停止基本上就是停止训练。根据文件,其用法如下;

keras.callbacks.EarlyStopping(monitor='val_loss',
                              min_delta=0,
                              patience=0,
                              verbose=0, mode='auto')

值取决于您的实现(问题,批处理大小等),但通常是为了防止我使用过度拟合;

  1. 通过将monitor 参数设置为,监控验证损失(需要使用交叉验证或至少训练/测试集)'val_loss'
  2. min_delta是在某个时期是否将损失量化为改善的阈值。如果损失差异小于min_delta,则将其量化为无改善。最好将其保留为0,因为我们对损失越来越严重感兴趣。
  3. patience参数代表损失开始增加(停止改善)后停止之前的时期数。这取决于您的实现,如果您使用的批次非常小学习率较高,则损失呈锯齿状(准确性会更加嘈杂),因此最好设置一个较大的patience参数。如果您使用大批量学习率较低,则损失会更平稳,因此可以使用较小的patience参数。无论哪种方式,我都将其保留为2,以便为模型提供更多机会。
  4. verbose 确定要打印的内容,将其保留为默认值(0)。
  5. mode参数取决于您监视的数量的方向(应该是减少还是增加),因为我们监视损失,所以可以使用min。但是让我们留给喀拉拉邦为我们处理,并将其设置为auto

因此,我将使用类似的方法并通过绘制有无早期停止的错误损失来进行实验。

keras.callbacks.EarlyStopping(monitor='val_loss',
                              min_delta=0,
                              patience=2,
                              verbose=0, mode='auto')

为了避免对回调的工作方式产生歧义,我将尝试解释更多信息。调用fit(... callbacks=[es])模型后,Keras会调用给定的回调对象预定的函数。这些功能可以称为on_train_beginon_train_endon_epoch_beginon_epoch_endon_batch_beginon_batch_end。在每个时期结束时调用提前停止回调,将最佳监视值与当前监视值进行比较,并在条件满足时停止(自观察最佳监视值以来已经过去了多少个时期,这不仅仅是耐心参数,两者之间的差最后一个值大于min_delta等。)。

正如@BrentFaust在评论中指出的那样,模型的训练将继续进行,直到满足Early Stopping条件或满足epochsin(默认值= 10)为止fit()。设置“提早停止”回调不会使模型超出其epochs参数进行训练。因此,调用fit()较大epochs值的函数将受益于Early Stopping回调。

Early stopping is basically stopping the training once your loss starts to increase (or in other words validation accuracy starts to decrease). According to documents it is used as follows;

keras.callbacks.EarlyStopping(monitor='val_loss',
                              min_delta=0,
                              patience=0,
                              verbose=0, mode='auto')

Values depends on your implementation (problem, batch size etc…) but generally to prevent overfitting I would use;

  1. Monitor the validation loss (need to use cross validation or at least train/test sets) by setting the monitor argument to 'val_loss'.
  2. min_delta is a threshold to whether quantify a loss at some epoch as improvement or not. If the difference of loss is below min_delta, it is quantified as no improvement. Better to leave it as 0 since we’re interested in when loss becomes worse.
  3. patience argument represents the number of epochs before stopping once your loss starts to increase (stops improving). This depends on your implementation, if you use very small batches or a large learning rate your loss zig-zag (accuracy will be more noisy) so better set a large patience argument. If you use large batches and a small learning rate your loss will be smoother so you can use a smaller patience argument. Either way I’ll leave it as 2 so I would give the model more chance.
  4. verbose decides what to print, leave it at default (0).
  5. mode argument depends on what direction your monitored quantity has (is it supposed to be decreasing or increasing), since we monitor the loss, we can use min. But let’s leave keras handle that for us and set that to auto

So I would use something like this and experiment by plotting the error loss with and without early stopping.

keras.callbacks.EarlyStopping(monitor='val_loss',
                              min_delta=0,
                              patience=2,
                              verbose=0, mode='auto')

For possible ambiguity on how callbacks work, I’ll try to explain more. Once you call fit(... callbacks=[es]) on your model, Keras calls given callback objects predetermined functions. These functions can be called on_train_begin, on_train_end, on_epoch_begin, on_epoch_end and on_batch_begin, on_batch_end. Early stopping callback is called on every epoch end, compares the best monitored value with the current one and stops if conditions are met (how many epochs have past since the observation of the best monitored value and is it more than patience argument, the difference between last value is bigger than min_delta etc..).

As pointed by @BrentFaust in comments, model’s training will continue until either Early Stopping conditions are met or epochs parameter (default=10) in fit() is satisfied. Setting an Early Stopping callback will not make the model to train beyond its epochs parameter. So calling fit() function with a larger epochs value would benefit more from Early Stopping callback.


Tensorflow后端的Keras能否被迫随意使用CPU或GPU?

问题:Tensorflow后端的Keras能否被迫随意使用CPU或GPU?

我安装了Tensorflow后端和CUDA的Keras。我有时想按需强迫Keras使用CPU。不用说在虚拟环境中安装单独的仅CPU的Tensorflow就能做到吗?如果可以,怎么办?如果后端是Theano,则可以设置标志,但是我还没有听说过可以通过Keras访问Tensorflow标志。

I have Keras installed with the Tensorflow backend and CUDA. I’d like to sometimes on demand force Keras to use CPU. Can this be done without say installing a separate CPU-only Tensorflow in a virtual environment? If so how? If the backend were Theano, the flags could be set, but I have not heard of Tensorflow flags accessible via Keras.


回答 0

如果要强制Keras使用CPU

方式1

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"   # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"] = ""

在导入Keras / Tensorflow之前。

方式二

运行脚本为

$ CUDA_VISIBLE_DEVICES="" ./your_keras_code.py

也可以看看

  1. https://github.com/keras-team/keras/issues/152
  2. https://github.com/fchollet/keras/issues/4613

If you want to force Keras to use CPU

Way 1

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"   # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"] = ""

before Keras / Tensorflow is imported.

Way 2

Run your script as

$ CUDA_VISIBLE_DEVICES="" ./your_keras_code.py

See also

  1. https://github.com/keras-team/keras/issues/152
  2. https://github.com/fchollet/keras/issues/4613

回答 1

一个相当分离的方法是使用

import tensorflow as tf
from keras import backend as K

num_cores = 4

if GPU:
    num_GPU = 1
    num_CPU = 1
if CPU:
    num_CPU = 1
    num_GPU = 0

config = tf.ConfigProto(intra_op_parallelism_threads=num_cores,
                        inter_op_parallelism_threads=num_cores, 
                        allow_soft_placement=True,
                        device_count = {'CPU' : num_CPU,
                                        'GPU' : num_GPU}
                       )

session = tf.Session(config=config)
K.set_session(session)

在此处,通过booleans GPUCPU,我们通过严格定义允许Tensorflow会话访问的GPU和CPU的数量来指示我们是否要使用GPU或CPU运行代码。变量num_GPUnum_CPU定义该值。num_cores然后通过intra_op_parallelism_threads和设置可供使用的CPU内核数inter_op_parallelism_threads

intra_op_parallelism_threads变量指示在计算图中单个节点中并行操作被允许使用(内部)的线程数。虽然inter_ops_parallelism_threads变量定义了跨计算图节点(并行)进行并行操作可访问的线程数。

allow_soft_placement 如果满足以下任一条件,则允许在CPU上运行操作:

  1. 该操作没有GPU实现

  2. 没有已知或注册的GPU设备

  3. 需要与CPU的其他输入一起放置

所有这些都在任何其他操作之前在我的类的构造函数中执行,并且可以与我使用的任何模型或其他代码完全分开。

注意:这要求tensorflow-gpucuda/ cudnn要安装,因为提供了使用GPU的选项。

参考:

A rather separable way of doing this is to use

import tensorflow as tf
from keras import backend as K

num_cores = 4

if GPU:
    num_GPU = 1
    num_CPU = 1
if CPU:
    num_CPU = 1
    num_GPU = 0

config = tf.ConfigProto(intra_op_parallelism_threads=num_cores,
                        inter_op_parallelism_threads=num_cores, 
                        allow_soft_placement=True,
                        device_count = {'CPU' : num_CPU,
                                        'GPU' : num_GPU}
                       )

session = tf.Session(config=config)
K.set_session(session)

Here, with booleans GPU and CPU, we indicate whether we would like to run our code with the GPU or CPU by rigidly defining the number of GPUs and CPUs the Tensorflow session is allowed to access. The variables num_GPU and num_CPU define this value. num_cores then sets the number of CPU cores available for usage via intra_op_parallelism_threads and inter_op_parallelism_threads.

The intra_op_parallelism_threads variable dictates the number of threads a parallel operation in a single node in the computation graph is allowed to use (intra). While the inter_ops_parallelism_threads variable defines the number of threads accessible for parallel operations across the nodes of the computation graph (inter).

allow_soft_placement allows for operations to be run on the CPU if any of the following criterion are met:

  1. there is no GPU implementation for the operation

  2. there are no GPU devices known or registered

  3. there is a need to co-locate with other inputs from the CPU

All of this is executed in the constructor of my class before any other operations, and is completely separable from any model or other code I use.

Note: This requires tensorflow-gpu and cuda/cudnn to be installed because the option is given to use a GPU.

Refs:


回答 2

这对我有用(win10),在导入keras之前放置:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

This worked for me (win10), place before you import keras:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

回答 3

只需导入tensortflow并使用keras,就这么简单。

import tensorflow as tf
# your code here
with tf.device('/gpu:0'):
    model.fit(X, y, epochs=20, batch_size=128, callbacks=callbacks_list)

Just import tensortflow and use keras, it’s that easy.

import tensorflow as tf
# your code here
with tf.device('/gpu:0'):
    model.fit(X, y, epochs=20, batch_size=128, callbacks=callbacks_list)

回答 4

按照keras 教程,您可以简单地使用与tf.device常规tensorflow中相同的作用域:

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:0

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on CPU:0

As per keras tutorial, you can simply use the same tf.device scope as in regular tensorflow:

with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on GPU:0

with tf.device('/cpu:0'):
    x = tf.placeholder(tf.float32, shape=(None, 20, 64))
    y = LSTM(32)(x)  # all ops in the LSTM layer will live on CPU:0

回答 5

我只是花了一些时间弄清楚。Thoma的答案不完整。假设您的程序是test.py,您想使用gpu0来运行该程序,并使其他gpus保持空闲。

你应该写 CUDA_VISIBLE_DEVICES=0 python test.py

注意DEVICES不是DEVICE

I just spent some time figure it out. Thoma’s answer is not complete. Say your program is test.py, you want to use gpu0 to run this program, and keep other gpus free.

You should write CUDA_VISIBLE_DEVICES=0 python test.py

Notice it’s DEVICES not DEVICE


回答 6

对于从事PyCharm并强制使用CPU的人员,您可以在“运行/调试”配置的“环境变量”下添加以下行:

<OTHER_ENVIRONMENT_VARIABLES>;CUDA_VISIBLE_DEVICES=-1

For people working on PyCharm, and for forcing CPU, you can add the following line in the Run/Debug configuration, under Environment variables:

<OTHER_ENVIRONMENT_VARIABLES>;CUDA_VISIBLE_DEVICES=-1

加载经过训练的Keras模型并继续训练

问题:加载经过训练的Keras模型并继续训练

我想知道是否有可能保存经过部分训练的Keras模型并在再次加载模型后继续进行训练。

这样做的原因是,将来我将拥有更多的训练数据,并且我不想再次对整个模型进行训练。

我正在使用的功能是:

#Partly train model
model.fit(first_training, first_classes, batch_size=32, nb_epoch=20)

#Save partly trained model
model.save('partly_trained.h5')

#Load partly trained model
from keras.models import load_model
model = load_model('partly_trained.h5')

#Continue training
model.fit(second_training, second_classes, batch_size=32, nb_epoch=20)

编辑1:添加了完全正常的示例

对于第10个时期后的第一个数据集,最后一个时期的损失将为0.0748,精度为0.9863。

保存,删除和重新加载模型后,第二个数据集上训练的模型的损失和准确性分别为0.1711和0.9504。

这是由新的训练数据还是完全重新训练的模型引起的?

"""
Model by: http://machinelearningmastery.com/
"""
# load (downloaded if needed) the MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.models import load_model
numpy.random.seed(7)

def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

if __name__ == '__main__':
    # load data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # flatten 28*28 images to a 784 vector for each image
    num_pixels = X_train.shape[1] * X_train.shape[2]
    X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
    X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
    # normalize inputs from 0-255 to 0-1
    X_train = X_train / 255
    X_test = X_test / 255
    # one hot encode outputs
    y_train = np_utils.to_categorical(y_train)
    y_test = np_utils.to_categorical(y_test)
    num_classes = y_test.shape[1]

    # build the model
    model = baseline_model()

    #Partly train model
    dataset1_x = X_train[:3000]
    dataset1_y = y_train[:3000]
    model.fit(dataset1_x, dataset1_y, nb_epoch=10, batch_size=200, verbose=2)

    # Final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

    #Save partly trained model
    model.save('partly_trained.h5')
    del model

    #Reload model
    model = load_model('partly_trained.h5')

    #Continue training
    dataset2_x = X_train[3000:]
    dataset2_y = y_train[3000:]
    model.fit(dataset2_x, dataset2_y, nb_epoch=10, batch_size=200, verbose=2)
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

I was wondering if it was possible to save a partly trained Keras model and continue the training after loading the model again.

The reason for this is that I will have more training data in the future and I do not want to retrain the whole model again.

The functions which I am using are:

#Partly train model
model.fit(first_training, first_classes, batch_size=32, nb_epoch=20)

#Save partly trained model
model.save('partly_trained.h5')

#Load partly trained model
from keras.models import load_model
model = load_model('partly_trained.h5')

#Continue training
model.fit(second_training, second_classes, batch_size=32, nb_epoch=20)

Edit 1: added fully working example

With the first dataset after 10 epochs the loss of the last epoch will be 0.0748 and the accuracy 0.9863.

After saving, deleting and reloading the model the loss and accuracy of the model trained on the second dataset will be 0.1711 and 0.9504 respectively.

Is this caused by the new training data or by a completely re-trained model?

"""
Model by: http://machinelearningmastery.com/
"""
# load (downloaded if needed) the MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.models import load_model
numpy.random.seed(7)

def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

if __name__ == '__main__':
    # load data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # flatten 28*28 images to a 784 vector for each image
    num_pixels = X_train.shape[1] * X_train.shape[2]
    X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
    X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')
    # normalize inputs from 0-255 to 0-1
    X_train = X_train / 255
    X_test = X_test / 255
    # one hot encode outputs
    y_train = np_utils.to_categorical(y_train)
    y_test = np_utils.to_categorical(y_test)
    num_classes = y_test.shape[1]

    # build the model
    model = baseline_model()

    #Partly train model
    dataset1_x = X_train[:3000]
    dataset1_y = y_train[:3000]
    model.fit(dataset1_x, dataset1_y, nb_epoch=10, batch_size=200, verbose=2)

    # Final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

    #Save partly trained model
    model.save('partly_trained.h5')
    del model

    #Reload model
    model = load_model('partly_trained.h5')

    #Continue training
    dataset2_x = X_train[3000:]
    dataset2_y = y_train[3000:]
    model.fit(dataset2_x, dataset2_y, nb_epoch=10, batch_size=200, verbose=2)
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100-scores[1]*100))

回答 0

实际上- model.save根据您的情况保存重新开始培训所需的所有信息。重新加载模型可能会破坏的唯一事情是优化器状态。要进行检查-尝试save重新加载模型并根据训练数据进行训练。

Actually – model.save saves all information need for restarting training in your case. The only thing which could be spoiled by reloading model is your optimizer state. To check that – try to save and reload model and train it on training data.


回答 1

问题可能是您使用了不同的优化器-或优化器使用了不同的参数。我只是在使用自定义预训练模型时遇到了相同的问题

reduce_lr = ReduceLROnPlateau(monitor='loss', factor=lr_reduction_factor,
                              patience=patience, min_lr=min_lr, verbose=1)

对于预训练模型,其中原始学习率从0.0003开始,在预训练过程中,原始学习率降低为min_learning率,即0.000003

我只是将该行复制到使用预训练模型的脚本中,并且准确性很差。直到我注意到预训练模型的最后学习率是最小学习率,即0.000003。如果我以该学习率开始,那么我得到的精确度与预训练模型的输出完全相同-这是有道理的,因为它的学习率是预训练模型中最后一次使用的学习率的100倍该模型将导致GD严重超调,从而导致精度大大降低。

The problem might be that you use a different optimizer – or different arguments to your optimizer. I just had the same issue with a custom pretrained model, using

reduce_lr = ReduceLROnPlateau(monitor='loss', factor=lr_reduction_factor,
                              patience=patience, min_lr=min_lr, verbose=1)

for the pretrained model, whereby the original learning rate starts at 0.0003 and during pre-training it is reduced to the min_learning rate, which is 0.000003

I just copied that line over to the script which uses the pre-trained model and got really bad accuracies. Until I noticed that the last learning rate of the pretrained model was the min learning rate, i.e. 0.000003. And if I start with that learning rate, I get exactly the same accuracies to start with as the output of the pretrained model – which makes sense, as starting with a learning rate that is 100 times bigger than the last learning rate used in the pretrained model will result in a huge overshoot of GD and hence in heavily decreased accuracies.


回答 2

以上大多数答案都涵盖了重点。如果您正在使用最新的Tensorflow(TF2.1或更高版本),则以下示例将为您提供帮助。该代码的模型部分来自Tensorflow网站。

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def create_model():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),  
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])

  model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
  return model

# Create a basic model instance
model=create_model()
model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)

请以* .tf格式保存模型。根据我的经验,如果您定义了任何custom_loss,*。h5格式将不会保存优化器状态,​​因此如果您要从我们离开的地方重新训练模型,将无法达到您的目的。

# saving the model in tensorflow format
model.save('./MyModel_tf',save_format='tf')


# loading the saved model
loaded_model = tf.keras.models.load_model('./MyModel_tf')

# retraining the model
loaded_model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)

这种方法将在保存模型之前重新开始训练。正如其他人所提到的,如果你想保存最好的模型的重量或要保存模型的加权每次你需要使用keras回调函数(ModelCheckpoint)的选项,如时代save_weights_only=Truesave_freq='epoch'save_best_only

有关更多详细信息,请在此处检查在此处查看另一个示例。

Most of the above answers covered important points. If you are using recent Tensorflow (TF2.1 or above), Then the following example will help you. The model part of the code is from Tensorflow website.

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

def create_model():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),  
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])

  model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
  return model

# Create a basic model instance
model=create_model()
model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)

Please save the model in *.tf format. From my experience, if you have any custom_loss defined, *.h5 format will not save optimizer status and hence will not serve your purpose if you want to retrain the model from where we left.

# saving the model in tensorflow format
model.save('./MyModel_tf',save_format='tf')


# loading the saved model
loaded_model = tf.keras.models.load_model('./MyModel_tf')

# retraining the model
loaded_model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)

This approach will restart the training where we left before saving the model. As mentioned by others, if you want to save weights of best model or you want to save weights of model every epoch you need to use keras callbacks function (ModelCheckpoint) with options such as save_weights_only=True, save_freq='epoch', and save_best_only.

For more details, please check here and another example here.


回答 3

注意,Keras有时在加载的模型上有问题,如此处所示。这可能会解释一些情况,其中您并非从相同的训练准确性开始。

Notice that Keras sometimes has issues with loaded models, as in here. This might explain cases in which you don’t start from the same trained accuracy.


回答 4

所有上述帮助,保存模型和权重后,您必须从与LR相同的学习rate()中恢复。直接在优化器上进行设置。

请注意,由于模型可能已达到局部最小值(可能是全局最小值),因此无法保证从那里得到改善。除非您打算以受控方式提高学习率并将模型推向不远处的可能更好的最小值,否则没有必要恢复模型以搜索另一个局部最小值。

All above helps, you must resume from same learning rate() as the LR when the model and weights were saved. Set it directly on the optimizer.

Note that improvement from there is not guaranteed, because the model may have reached the local minimum, which may be global. There is no point to resume a model in order to search for another local minimum, unless you intent to increase the learning rate in a controlled fashion and nudge the model into a possibly better minimum not far away.


回答 5

您可能还遇到了概念漂移问题,请参阅在有新观测值时是否应重新训练模型。还有一些学术论文讨论的灾难性遗忘的概念。这是与MNIST一起进行的灾难性遗忘的实证研究

You might also be hitting Concept Drift, see Should you retrain a model when new observations are available. There’s also the concept of catastrophic forgetting which a bunch of academic papers discuss. Here’s one with MNIST Empirical investigation of catastrophic forgetting


我在哪里可以在Keras中调用BatchNormalization函数?

问题:我在哪里可以在Keras中调用BatchNormalization函数?

如果我想在Keras中使用BatchNormalization函数,那么是否仅需要在开始时调用一次?

我为此阅读了该文档:http : //keras.io/layers/normalization/

我看不到该怎么称呼它。下面是我尝试使用它的代码:

model = Sequential()
keras.layers.normalization.BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None)
model.add(Dense(64, input_dim=14, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16, show_accuracy=True, validation_split=0.2, verbose = 2)

我问,因为如果我用第二行(包括批处理规范化)运行代码,而如果我不使用第二行运行代码,则会得到类似的输出。因此,要么我没有在正确的位置调用该函数,要么我猜它并没有太大的区别。

If I want to use the BatchNormalization function in Keras, then do I need to call it once only at the beginning?

I read this documentation for it: http://keras.io/layers/normalization/

I don’t see where I’m supposed to call it. Below is my code attempting to use it:

model = Sequential()
keras.layers.normalization.BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None)
model.add(Dense(64, input_dim=14, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16, show_accuracy=True, validation_split=0.2, verbose = 2)

I ask because if I run the code with the second line including the batch normalization and if I run the code without the second line I get similar outputs. So either I’m not calling the function in the right place, or I guess it doesn’t make that much of a difference.


回答 0

只是为了更详细地回答这个问题,正如Pavel所说的,批处理规范化只是另一层,因此您可以使用它来创建所需的网络体系结构。

一般用例是在网络的线性层和非线性层之间使用BN,因为它可以将激活函数的输入标准化,从而使您位于激活函数(例如Sigmoid)的线性部分的中心。有一小的讨论在这里

在上述情况下,它可能类似于:


# import BatchNormalization
from keras.layers.normalization import BatchNormalization

# instantiate model
model = Sequential()

# we can think of this chunk as the input layer
model.add(Dense(64, input_dim=14, init='uniform'))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))

# we can think of this chunk as the hidden layer    
model.add(Dense(64, init='uniform'))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))

# we can think of this chunk as the output layer
model.add(Dense(2, init='uniform'))
model.add(BatchNormalization())
model.add(Activation('softmax'))

# setting up the optimization of our weights 
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)

# running the fitting
model.fit(X_train, y_train, nb_epoch=20, batch_size=16, show_accuracy=True, validation_split=0.2, verbose = 2)

希望这能使事情更清楚。

Just to answer this question in a little more detail, and as Pavel said, Batch Normalization is just another layer, so you can use it as such to create your desired network architecture.

The general use case is to use BN between the linear and non-linear layers in your network, because it normalizes the input to your activation function, so that you’re centered in the linear section of the activation function (such as Sigmoid). There’s a small discussion of it here

In your case above, this might look like:


# import BatchNormalization
from keras.layers.normalization import BatchNormalization

# instantiate model
model = Sequential()

# we can think of this chunk as the input layer
model.add(Dense(64, input_dim=14, init='uniform'))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))

# we can think of this chunk as the hidden layer    
model.add(Dense(64, init='uniform'))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.5))

# we can think of this chunk as the output layer
model.add(Dense(2, init='uniform'))
model.add(BatchNormalization())
model.add(Activation('softmax'))

# setting up the optimization of our weights 
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)

# running the fitting
model.fit(X_train, y_train, nb_epoch=20, batch_size=16, show_accuracy=True, validation_split=0.2, verbose = 2)

Hope this clarifies things a bit more.


回答 1

该线程具有误导性。试图评论卢卡斯·斋月的答案,但我没有适当的特权,所以我只想把它放在这里。

在激活功能之后,在此处此处,批标准化最有效就是的原因:它是为防止内部协变量移位而开发的。当激活分布时发生内部协变量移位在整个训练过程中,一层的变化很大。使用批处理规范化,以便由于每个批处理中的参数更新(或至少允许其更改),输入(具体来说,这些输入是激活函数的结果)到特定层的分布不会随时间变化以有利的方式)。它使用批处理统计信息进行归一化,然后使用批处理归一化参数(原始论文中的gamma和beta)“以确保插入到网络中的转换可以表示身份转换”(引自原始论文)。但是关键是我们试图将输入归一化,因此它应该总是紧接在网络的下一层之前。是否

This thread is misleading. Tried commenting on Lucas Ramadan’s answer, but I don’t have the right privileges yet, so I’ll just put this here.

Batch normalization works best after the activation function, and here or here is why: it was developed to prevent internal covariate shift. Internal covariate shift occurs when the distribution of the activations of a layer shifts significantly throughout training. Batch normalization is used so that the distribution of the inputs (and these inputs are literally the result of an activation function) to a specific layer doesn’t change over time due to parameter updates from each batch (or at least, allows it to change in an advantageous way). It uses batch statistics to do the normalizing, and then uses the batch normalization parameters (gamma and beta in the original paper) “to make sure that the transformation inserted in the network can represent the identity transform” (quote from original paper). But the point is that we’re trying to normalize the inputs to a layer, so it should always go immediately before the next layer in the network. Whether or not that’s after an activation function is dependent on the architecture in question.


回答 2

该线程对于是否应在当前层的非线性之前应用BN或对前一层的激活进行广泛的辩论。

尽管没有正确的答案,但批处理规范化的作者说, 应在当前层的非线性之前立即应用它。原因(引自原文)-

“我们通过归一化x = Wu + b来在非线性之前添加BN变换。我们也可以归一化层输入u,但是由于u可能是另一个非线性的输出,因此其分布的形状可能会在训练,并限制其第一和第二时刻并不能消除协变量偏移,相反,Wu + b更有可能具有对称的,非稀疏的分布,即“更呈高斯分布”(Hyvèarinen&Oja,2000) ;规范化它可能会产生具有稳定分布的激活。”

This thread has some considerable debate about whether BN should be applied before non-linearity of current layer or to the activations of the previous layer.

Although there is no correct answer, the authors of Batch Normalization say that It should be applied immediately before the non-linearity of the current layer. The reason ( quoted from original paper) –

“We add the BN transform immediately before the nonlinearity, by normalizing x = Wu+b. We could have also normalized the layer inputs u, but since u is likely the output of another nonlinearity, the shape of its distribution is likely to change during training, and constraining its first and second moments would not eliminate the covariate shift. In contrast, Wu + b is more likely to have a symmetric, non-sparse distribution, that is “more Gaussian” (Hyv¨arinen & Oja, 2000); normalizing it is likely to produce activations with a stable distribution.”


回答 3

Keras现在支持该use_bias=False选项,因此我们可以通过编写如下代码来节省一些计算

model.add(Dense(64, use_bias=False))
model.add(BatchNormalization(axis=bn_axis))
model.add(Activation('tanh'))

要么

model.add(Convolution2D(64, 3, 3, use_bias=False))
model.add(BatchNormalization(axis=bn_axis))
model.add(Activation('relu'))

Keras now supports the use_bias=False option, so we can save some computation by writing like

model.add(Dense(64, use_bias=False))
model.add(BatchNormalization(axis=bn_axis))
model.add(Activation('tanh'))

or

model.add(Convolution2D(64, 3, 3, use_bias=False))
model.add(BatchNormalization(axis=bn_axis))
model.add(Activation('relu'))

回答 4

Conv2D紧随ReLu其后的是BatchNormalization一层,这几乎已成为一种趋势。因此,我组成了一个小函数来一次调用所有这些函数。使模型定义看起来更整洁,更易于阅读。

def Conv2DReluBatchNorm(n_filter, w_filter, h_filter, inputs):
    return BatchNormalization()(Activation(activation='relu')(Convolution2D(n_filter, w_filter, h_filter, border_mode='same')(inputs)))

It’s almost become a trend now to have a Conv2D followed by a ReLu followed by a BatchNormalization layer. So I made up a small function to call all of them at once. Makes the model definition look a whole lot cleaner and easier to read.

def Conv2DReluBatchNorm(n_filter, w_filter, h_filter, inputs):
    return BatchNormalization()(Activation(activation='relu')(Convolution2D(n_filter, w_filter, h_filter, border_mode='same')(inputs)))

回答 5

这是另一种类型的图层,因此您应该将其作为图层添加到模型的适当位置

model.add(keras.layers.normalization.BatchNormalization())

在此处查看示例:https : //github.com/fchollet/keras/blob/master/examples/kaggle_otto_nn.py

It is another type of layer, so you should add it as a layer in an appropriate place of your model

model.add(keras.layers.normalization.BatchNormalization())

See an example here: https://github.com/fchollet/keras/blob/master/examples/kaggle_otto_nn.py


回答 6

批归一化用于通过调整激活的均值和缩放来归一化输入层和隐藏层。由于深度神经网络中具有附加层的这种归一化效果,该网络可以使用更高的学习速率而不会消失或爆炸梯度。此外,批量归一化对网络进行规范化,使其更易于泛化,因此无需使用压差来减轻过度拟合的情况。

在使用Keras中的Dense()或Conv2D()计算线性函数之后,我们立即使用BatchNormalization()来计算图层中的线性函数,然后使用Activation()将非线性添加到图层中。

from keras.layers.normalization import BatchNormalization
model = Sequential()
model.add(Dense(64, input_dim=14, init='uniform'))
model.add(BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16, show_accuracy=True, 
validation_split=0.2, verbose = 2)

批量标准化如何应用?

假设我们已将a [l-1]输入到层l。同样,我们对层l具有权重W [l]和偏置单元b [l]。令a [l]是为层l计算的激活向量(即,在添加了非线性之后),而z [l]是在添加非线性之前的向量

  1. 使用a [l-1]和W [l]我们可以计算层l的z [l]
  2. 通常,在前馈传播中,我们会在此阶段像z [l] + b [l]一样向z [l]添加偏置单元,但是在批归一化中,不需要添加b [l]的步骤,并且不需要使用b [l]参数。
  3. 计算z [l]均值并从每个元素中减去
  4. 使用标准偏差除以(z [l]-平均值)。称为Z_temp [l]
  5. 现在定义新参数γ和β,它们将改变隐藏层的比例,如下所示:

    z_norm [l] =γ.Z_temp[l] +β

在此代码摘录中,Dense()取a [l-1],使用W [l]并计算z [l]。然后立即的BatchNormalization()将执行上述步骤以给出z_norm [l]。然后立即Activation()将计算tanh(z_norm [l])得出a [l],即

a[l] = tanh(z_norm[l])

Batch Normalization is used to normalize the input layer as well as hidden layers by adjusting mean and scaling of the activations. Because of this normalizing effect with additional layer in deep neural networks, the network can use higher learning rate without vanishing or exploding gradients. Furthermore, batch normalization regularizes the network such that it is easier to generalize, and it is thus unnecessary to use dropout to mitigate overfitting.

Right after calculating the linear function using say, the Dense() or Conv2D() in Keras, we use BatchNormalization() which calculates the linear function in a layer and then we add the non-linearity to the layer using Activation().

from keras.layers.normalization import BatchNormalization
model = Sequential()
model.add(Dense(64, input_dim=14, init='uniform'))
model.add(BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None))
model.add(Activation('softmax'))

sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16, show_accuracy=True, 
validation_split=0.2, verbose = 2)

How is Batch Normalization applied?

Suppose we have input a[l-1] to a layer l. Also we have weights W[l] and bias unit b[l] for the layer l. Let a[l] be the activation vector calculated(i.e. after adding the non-linearity) for the layer l and z[l] be the vector before adding non-linearity

  1. Using a[l-1] and W[l] we can calculate z[l] for the layer l
  2. Usually in feed-forward propagation we will add bias unit to the z[l] at this stage like this z[l]+b[l], but in Batch Normalization this step of addition of b[l] is not required and no b[l] parameter is used.
  3. Calculate z[l] means and subtract it from each element
  4. Divide (z[l] – mean) using standard deviation. Call it Z_temp[l]
  5. Now define new parameters γ and β that will change the scale of the hidden layer as follows:

    z_norm[l] = γ.Z_temp[l] + β

In this code excerpt, the Dense() takes the a[l-1], uses W[l] and calculates z[l]. Then the immediate BatchNormalization() will perform the above steps to give z_norm[l]. And then the immediate Activation() will calculate tanh(z_norm[l]) to give a[l] i.e.

a[l] = tanh(z_norm[l])

Keras,如何获得每一层的输出?

问题:Keras,如何获得每一层的输出?

我已经使用CNN训练了二进制分类模型,这是我的代码

model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
# (16, 16, 32)
model.add(Convolution2D(nb_filters*2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters*2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
# (8, 8, 64) = (2048)
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2))  # define a binary classification problem
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          nb_epoch=nb_epoch,
          verbose=1,
          validation_data=(x_test, y_test))

在这里,我想像TensorFlow一样获得每一层的输出,我该怎么做?

I have trained a binary classification model with CNN, and here is my code

model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
# (16, 16, 32)
model.add(Convolution2D(nb_filters*2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters*2, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
# (8, 8, 64) = (2048)
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(2))  # define a binary classification problem
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          nb_epoch=nb_epoch,
          verbose=1,
          validation_data=(x_test, y_test))

And here, I wanna get the output of each layer just like TensorFlow, how can I do that?


回答 0

您可以使用以下命令轻松获取任何图层的输出: model.layers[index].output

对于所有图层,请使用以下命令:

from keras import backend as K

inp = model.input                                           # input placeholder
outputs = [layer.output for layer in model.layers]          # all layer outputs
functors = [K.function([inp, K.learning_phase()], [out]) for out in outputs]    # evaluation functions

# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = [func([test, 1.]) for func in functors]
print layer_outs

注:为了模拟差使用learning_phase1.layer_outs以其它方式使用0.

编辑:(基于评论)

K.function 创建theano / tensorflow张量函数,该函数随后用于从给定输入的符号图中获取输出。

现在K.learning_phase()需要作为输入,因为很多Keras层(如Dropout / Batchnomalization)都依赖于它,以在训练和测试期间更改行为。

因此,如果您删除代码中的辍学层,则可以简单地使用:

from keras import backend as K

inp = model.input                                           # input placeholder
outputs = [layer.output for layer in model.layers]          # all layer outputs
functors = [K.function([inp], [out]) for out in outputs]    # evaluation functions

# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = [func([test]) for func in functors]
print layer_outs

编辑2:更优化

我只是意识到,先前的答案并不是针对每个函数评估进行了优化,因为数据将被传输到CPU-> GPU内存中,并且还需要对低层进行n-n-over的张量计算。

相反,这是一种更好的方法,因为您不需要多个函数,而只需一个函数即可为您提供所有输出的列表:

from keras import backend as K

inp = model.input                                           # input placeholder
outputs = [layer.output for layer in model.layers]          # all layer outputs
functor = K.function([inp, K.learning_phase()], outputs )   # evaluation function

# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = functor([test, 1.])
print layer_outs

You can easily get the outputs of any layer by using: model.layers[index].output

For all layers use this:

from keras import backend as K

inp = model.input                                           # input placeholder
outputs = [layer.output for layer in model.layers]          # all layer outputs
functors = [K.function([inp, K.learning_phase()], [out]) for out in outputs]    # evaluation functions

# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = [func([test, 1.]) for func in functors]
print layer_outs

Note: To simulate Dropout use learning_phase as 1. in layer_outs otherwise use 0.

Edit: (based on comments)

K.function creates theano/tensorflow tensor functions which is later used to get the output from the symbolic graph given the input.

Now K.learning_phase() is required as an input as many Keras layers like Dropout/Batchnomalization depend on it to change behavior during training and test time.

So if you remove the dropout layer in your code you can simply use:

from keras import backend as K

inp = model.input                                           # input placeholder
outputs = [layer.output for layer in model.layers]          # all layer outputs
functors = [K.function([inp], [out]) for out in outputs]    # evaluation functions

# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = [func([test]) for func in functors]
print layer_outs

Edit 2: More optimized

I just realized that the previous answer is not that optimized as for each function evaluation the data will be transferred CPU->GPU memory and also the tensor calculations needs to be done for the lower layers over-n-over.

Instead this is a much better way as you don’t need multiple functions but a single function giving you the list of all outputs:

from keras import backend as K

inp = model.input                                           # input placeholder
outputs = [layer.output for layer in model.layers]          # all layer outputs
functor = K.function([inp, K.learning_phase()], outputs )   # evaluation function

# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = functor([test, 1.])
print layer_outs

回答 1

https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer

一种简单的方法是创建一个新模型,该模型将输出您感兴趣的图层:

from keras.models import Model

model = ...  # include here your original model

layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)

或者,您可以构建Keras函数,该函数将在给定特定输入的情况下返回特定图层的输出,例如:

from keras import backend as K

# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input],
                                  [model.layers[3].output])
layer_output = get_3rd_layer_output([x])[0]

From https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer

One simple way is to create a new Model that will output the layers that you are interested in:

from keras.models import Model

model = ...  # include here your original model

layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)

Alternatively, you can build a Keras function that will return the output of a certain layer given a certain input, for example:

from keras import backend as K

# with a Sequential model
get_3rd_layer_output = K.function([model.layers[0].input],
                                  [model.layers[3].output])
layer_output = get_3rd_layer_output([x])[0]

回答 2

基于此线程的所有良好答案,我编写了一个库来获取每一层的输出。它抽象了所有复杂性,并被设计为尽可能易于使用:

https://github.com/philipperemy/keract

它处理几乎所有边缘情况

希望能帮助到你!

Based on all the good answers of this thread, I wrote a library to fetch the output of each layer. It abstracts all the complexity and has been designed to be as user-friendly as possible:

https://github.com/philipperemy/keract

It handles almost all the edge cases

Hope it helps!


回答 3

以下对我来说看起来很简单:

model.layers[idx].output

上面是张量对象,因此您可以使用可应用于张量对象的操作对其进行修改。

例如,获得形状 model.layers[idx].output.get_shape()

idx 是图层的索引,您可以从中找到它 model.summary()

Following looks very simple to me:

model.layers[idx].output

Above is a tensor object, so you can modify it using operations that can be applied to a tensor object.

For example, to get the shape model.layers[idx].output.get_shape()

idx is the index of the layer and you can find it from model.summary()


回答 4

我为自己编写了此函数(在Jupyter中),它的灵感来自indraforyou的回答。它将自动绘制所有图层输出。您的图像必须具有(x,y,1)形状,其中1代表1个通道。您只需调用plot_layer_outputs(…)即可进行绘制。

%matplotlib inline
import matplotlib.pyplot as plt
from keras import backend as K

def get_layer_outputs():
    test_image = YOUR IMAGE GOES HERE!!!
    outputs    = [layer.output for layer in model.layers]          # all layer outputs
    comp_graph = [K.function([model.input]+ [K.learning_phase()], [output]) for output in outputs]  # evaluation functions

    # Testing
    layer_outputs_list = [op([test_image, 1.]) for op in comp_graph]
    layer_outputs = []

    for layer_output in layer_outputs_list:
        print(layer_output[0][0].shape, end='\n-------------------\n')
        layer_outputs.append(layer_output[0][0])

    return layer_outputs

def plot_layer_outputs(layer_number):    
    layer_outputs = get_layer_outputs()

    x_max = layer_outputs[layer_number].shape[0]
    y_max = layer_outputs[layer_number].shape[1]
    n     = layer_outputs[layer_number].shape[2]

    L = []
    for i in range(n):
        L.append(np.zeros((x_max, y_max)))

    for i in range(n):
        for x in range(x_max):
            for y in range(y_max):
                L[i][x][y] = layer_outputs[layer_number][x][y][i]


    for img in L:
        plt.figure()
        plt.imshow(img, interpolation='nearest')

I wrote this function for myself (in Jupyter) and it was inspired by indraforyou‘s answer. It will plot all the layer outputs automatically. Your images must have a (x, y, 1) shape where 1 stands for 1 channel. You just call plot_layer_outputs(…) to plot.

%matplotlib inline
import matplotlib.pyplot as plt
from keras import backend as K

def get_layer_outputs():
    test_image = YOUR IMAGE GOES HERE!!!
    outputs    = [layer.output for layer in model.layers]          # all layer outputs
    comp_graph = [K.function([model.input]+ [K.learning_phase()], [output]) for output in outputs]  # evaluation functions

    # Testing
    layer_outputs_list = [op([test_image, 1.]) for op in comp_graph]
    layer_outputs = []

    for layer_output in layer_outputs_list:
        print(layer_output[0][0].shape, end='\n-------------------\n')
        layer_outputs.append(layer_output[0][0])

    return layer_outputs

def plot_layer_outputs(layer_number):    
    layer_outputs = get_layer_outputs()

    x_max = layer_outputs[layer_number].shape[0]
    y_max = layer_outputs[layer_number].shape[1]
    n     = layer_outputs[layer_number].shape[2]

    L = []
    for i in range(n):
        L.append(np.zeros((x_max, y_max)))

    for i in range(n):
        for x in range(x_max):
            for y in range(y_max):
                L[i][x][y] = layer_outputs[layer_number][x][y][i]


    for img in L:
        plt.figure()
        plt.imshow(img, interpolation='nearest')

回答 5

来自:https : //github.com/philipperemy/keras-visualize-activations/blob/master/read_activations.py

import keras.backend as K

def get_activations(model, model_inputs, print_shape_only=False, layer_name=None):
    print('----- activations -----')
    activations = []
    inp = model.input

    model_multi_inputs_cond = True
    if not isinstance(inp, list):
        # only one input! let's wrap it in a list.
        inp = [inp]
        model_multi_inputs_cond = False

    outputs = [layer.output for layer in model.layers if
               layer.name == layer_name or layer_name is None]  # all layer outputs

    funcs = [K.function(inp + [K.learning_phase()], [out]) for out in outputs]  # evaluation functions

    if model_multi_inputs_cond:
        list_inputs = []
        list_inputs.extend(model_inputs)
        list_inputs.append(0.)
    else:
        list_inputs = [model_inputs, 0.]

    # Learning phase. 0 = Test mode (no dropout or batch normalization)
    # layer_outputs = [func([model_inputs, 0.])[0] for func in funcs]
    layer_outputs = [func(list_inputs)[0] for func in funcs]
    for layer_activations in layer_outputs:
        activations.append(layer_activations)
        if print_shape_only:
            print(layer_activations.shape)
        else:
            print(layer_activations)
    return activations

From: https://github.com/philipperemy/keras-visualize-activations/blob/master/read_activations.py

import keras.backend as K

def get_activations(model, model_inputs, print_shape_only=False, layer_name=None):
    print('----- activations -----')
    activations = []
    inp = model.input

    model_multi_inputs_cond = True
    if not isinstance(inp, list):
        # only one input! let's wrap it in a list.
        inp = [inp]
        model_multi_inputs_cond = False

    outputs = [layer.output for layer in model.layers if
               layer.name == layer_name or layer_name is None]  # all layer outputs

    funcs = [K.function(inp + [K.learning_phase()], [out]) for out in outputs]  # evaluation functions

    if model_multi_inputs_cond:
        list_inputs = []
        list_inputs.extend(model_inputs)
        list_inputs.append(0.)
    else:
        list_inputs = [model_inputs, 0.]

    # Learning phase. 0 = Test mode (no dropout or batch normalization)
    # layer_outputs = [func([model_inputs, 0.])[0] for func in funcs]
    layer_outputs = [func(list_inputs)[0] for func in funcs]
    for layer_activations in layer_outputs:
        activations.append(layer_activations)
        if print_shape_only:
            print(layer_activations.shape)
        else:
            print(layer_activations)
    return activations

回答 6

想要将其添加为@indraforyou的答案作为注释(但没有足够高的声望)以纠正@mathtick的注释中提到的问题。为了避免InvalidArgumentError: input_X:Y is both fed and fetched.异常,只需更换行outputs = [layer.output for layer in model.layers]outputs = [layer.output for layer in model.layers][1:],即

调整indraforyou的最小工作示例:

from keras import backend as K 
inp = model.input                                           # input placeholder
outputs = [layer.output for layer in model.layers][1:]        # all layer outputs except first (input) layer
functor = K.function([inp, K.learning_phase()], outputs )   # evaluation function

# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = functor([test, 1.])
print layer_outs

PS我尝试的东西,如尝试outputs = [layer.output for layer in model.layers[1:]]不起作用。

Wanted to add this as a comment (but don’t have high enough rep.) to @indraforyou’s answer to correct for the issue mentioned in @mathtick’s comment. To avoid the InvalidArgumentError: input_X:Y is both fed and fetched. exception, simply replace the line outputs = [layer.output for layer in model.layers] with outputs = [layer.output for layer in model.layers][1:], i.e.

adapting indraforyou’s minimal working example:

from keras import backend as K 
inp = model.input                                           # input placeholder
outputs = [layer.output for layer in model.layers][1:]        # all layer outputs except first (input) layer
functor = K.function([inp, K.learning_phase()], outputs )   # evaluation function

# Testing
test = np.random.random(input_shape)[np.newaxis,...]
layer_outs = functor([test, 1.])
print layer_outs

p.s. my attempts trying things such as outputs = [layer.output for layer in model.layers[1:]] did not work.


回答 7

假设您有:

1- Keras训练有素model

2-输入x为图像或图像集。图像的分辨率应与输入层的尺寸兼容。例如对于3通道(RGB)图像为80 * 80 * 3

3- layer要激活的输出的名称。例如,“ flatten_2”层。这应该包含在layer_names变量中,代表给定层的名称model

4- batch_size是可选参数。

然后,您可以轻松地使用get_activation函数来获得layer给定输入x和预训练的输出激活model

import six
import numpy as np
import keras.backend as k
from numpy import float32
def get_activations(x, model, layer, batch_size=128):
"""
Return the output of the specified layer for input `x`. `layer` is specified by layer index (between 0 and
`nb_layers - 1`) or by name. The number of layers can be determined by counting the results returned by
calling `layer_names`.
:param x: Input for computing the activations.
:type x: `np.ndarray`. Example: x.shape = (80, 80, 3)
:param model: pre-trained Keras model. Including weights.
:type model: keras.engine.sequential.Sequential. Example: model.input_shape = (None, 80, 80, 3)
:param layer: Layer for computing the activations
:type layer: `int` or `str`. Example: layer = 'flatten_2'
:param batch_size: Size of batches.
:type batch_size: `int`
:return: The output of `layer`, where the first dimension is the batch size corresponding to `x`.
:rtype: `np.ndarray`. Example: activations.shape = (1, 2000)
"""

    layer_names = [layer.name for layer in model.layers]
    if isinstance(layer, six.string_types):
        if layer not in layer_names:
            raise ValueError('Layer name %s is not part of the graph.' % layer)
        layer_name = layer
    elif isinstance(layer, int):
        if layer < 0 or layer >= len(layer_names):
            raise ValueError('Layer index %d is outside of range (0 to %d included).'
                             % (layer, len(layer_names) - 1))
        layer_name = layer_names[layer]
    else:
        raise TypeError('Layer must be of type `str` or `int`.')

    layer_output = model.get_layer(layer_name).output
    layer_input = model.input
    output_func = k.function([layer_input], [layer_output])

    # Apply preprocessing
    if x.shape == k.int_shape(model.input)[1:]:
        x_preproc = np.expand_dims(x, 0)
    else:
        x_preproc = x
    assert len(x_preproc.shape) == 4

    # Determine shape of expected output and prepare array
    output_shape = output_func([x_preproc[0][None, ...]])[0].shape
    activations = np.zeros((x_preproc.shape[0],) + output_shape[1:], dtype=float32)

    # Get activations with batching
    for batch_index in range(int(np.ceil(x_preproc.shape[0] / float(batch_size)))):
        begin, end = batch_index * batch_size, min((batch_index + 1) * batch_size, x_preproc.shape[0])
        activations[begin:end] = output_func([x_preproc[begin:end]])[0]

    return activations

Assuming you have:

1- Keras pre-trained model.

2- Input x as image or set of images. The resolution of image should be compatible with dimension of the input layer. For example 80*80*3 for 3-channels (RGB) image.

3- The name of the output layer to get the activation. For example, “flatten_2” layer. This should be include in the layer_names variable, represents name of layers of the given model.

4- batch_size is an optional argument.

Then you can easily use get_activation function to get the activation of the output layer for a given input x and pre-trained model:

import six
import numpy as np
import keras.backend as k
from numpy import float32
def get_activations(x, model, layer, batch_size=128):
"""
Return the output of the specified layer for input `x`. `layer` is specified by layer index (between 0 and
`nb_layers - 1`) or by name. The number of layers can be determined by counting the results returned by
calling `layer_names`.
:param x: Input for computing the activations.
:type x: `np.ndarray`. Example: x.shape = (80, 80, 3)
:param model: pre-trained Keras model. Including weights.
:type model: keras.engine.sequential.Sequential. Example: model.input_shape = (None, 80, 80, 3)
:param layer: Layer for computing the activations
:type layer: `int` or `str`. Example: layer = 'flatten_2'
:param batch_size: Size of batches.
:type batch_size: `int`
:return: The output of `layer`, where the first dimension is the batch size corresponding to `x`.
:rtype: `np.ndarray`. Example: activations.shape = (1, 2000)
"""

    layer_names = [layer.name for layer in model.layers]
    if isinstance(layer, six.string_types):
        if layer not in layer_names:
            raise ValueError('Layer name %s is not part of the graph.' % layer)
        layer_name = layer
    elif isinstance(layer, int):
        if layer < 0 or layer >= len(layer_names):
            raise ValueError('Layer index %d is outside of range (0 to %d included).'
                             % (layer, len(layer_names) - 1))
        layer_name = layer_names[layer]
    else:
        raise TypeError('Layer must be of type `str` or `int`.')

    layer_output = model.get_layer(layer_name).output
    layer_input = model.input
    output_func = k.function([layer_input], [layer_output])

    # Apply preprocessing
    if x.shape == k.int_shape(model.input)[1:]:
        x_preproc = np.expand_dims(x, 0)
    else:
        x_preproc = x
    assert len(x_preproc.shape) == 4

    # Determine shape of expected output and prepare array
    output_shape = output_func([x_preproc[0][None, ...]])[0].shape
    activations = np.zeros((x_preproc.shape[0],) + output_shape[1:], dtype=float32)

    # Get activations with batching
    for batch_index in range(int(np.ceil(x_preproc.shape[0] / float(batch_size)))):
        begin, end = batch_index * batch_size, min((batch_index + 1) * batch_size, x_preproc.shape[0])
        activations[begin:end] = output_func([x_preproc[begin:end]])[0]

    return activations

回答 8

如果您具有以下情况之一:

  • 错误: InvalidArgumentError: input_X:Y is both fed and fetched
  • 多输入的情况

您需要进行以下更改:

  • outputs变量中的输入层添加过滤器
  • 最小functors循环变化

最小示例:

from keras.engine.input_layer import InputLayer
inp = model.input
outputs = [layer.output for layer in model.layers if not isinstance(layer, InputLayer)]
functors = [K.function(inp + [K.learning_phase()], [x]) for x in outputs]
layer_outputs = [fun([x1, x2, xn, 1]) for fun in functors]

In case you have one of the following cases:

  • error: InvalidArgumentError: input_X:Y is both fed and fetched
  • case of multiple inputs

You need to do the following changes:

  • add filter out for input layers in outputs variable
  • minnor change on functors loop

Minimum example:

from keras.engine.input_layer import InputLayer
inp = model.input
outputs = [layer.output for layer in model.layers if not isinstance(layer, InputLayer)]
functors = [K.function(inp + [K.learning_phase()], [x]) for x in outputs]
layer_outputs = [fun([x1, x2, xn, 1]) for fun in functors]

回答 9

好吧,其他答案也很完整,但是有一种非常基本的方法可以“看到”形状,而不是“获得”形状。

只是做一个model.summary()。它将打印所有图层及其输出形状。“无”值将指示可变尺寸,而第一维将是批量大小。

Well, other answers are very complete, but there is a very basic way to “see”, not to “get” the shapes.

Just do a model.summary(). It will print all layers and their output shapes. “None” values will indicate variable dimensions, and the first dimension will be the batch size.


Autokeras-面向深度学习的AutoML库

官网:autokeras.com

AutoKera:一个基于KERS的AutoML系统。它是由DATA Lab在德克萨斯农工大学。AutoKera的目标是让每个人都可以使用机器学习

学习资源

  • 一个简短的例子
import autokeras as ak

clf = ak.ImageClassifier()
clf.fit(x_train, y_train)
results = clf.predict(x_test)

安装

要安装该软件包,请使用pip安装步骤如下:

pip3 install autokeras

请按照installation guide有关更多详细信息,请参阅

注:目前,AutoKera仅与Python>=3.5TensorFlow>=2.3.0

社区

随时了解最新信息

推特:你也可以在推特上关注我们@autokeras了解最新消息

电子邮件:订阅我们的email list接收通知的步骤

问题和讨论

GitHub讨论:请在我们的GitHub Discussions这是一个在GitHub上托管的论坛。我们将在那里监控并回答问题

即时通信

松弛Request an invitation使用#autokeras通信通道

QQ群:加入我们的QQ群1150366085。密码:akqqgroup

在线会议:加入online meeting Google group日历事件将出现在您的Google日历上

贡献代码

我们致力于让AutoKera的一切向公众开放。每个人都可以很容易地以开发人员的身份加入。以下是我们如何管理我们的项目

  • 对问题进行分类例如,我们从中挑选要解决的关键问题GitHub issues它们将被添加到此Project其中一些问题随后将添加到milestones,用于计划发布
  • 分配任务:我们在网上会议期间将任务分配给人们
  • 讨论:我们可以在多个地方进行讨论。代码审查在GitHub上。问题可以在Slake或在会议期间提问

请加入我们的Slack给金海峰发个口信。或顺道拜访我们的online meetings然后跟我们谈谈。我们将帮助您入门!

请参阅我们的Contributing Guide学习最佳实践

感谢所有的贡献者!

捐赠

我们接受财政上的支持Open Collective感谢每一位赞助商对我们的支持!


引用这部作品

金海峰、宋清泉、夏虎。“Auto-keras:一种高效的神经结构搜索系统。”第25届ACM SIGKDD知识发现与数据挖掘国际会议论文集。ACM,2019年。(Download)

Biblatex条目:

@inproceedings{jin2019auto,
  title={Auto-Keras: An Efficient Neural Architecture Search System},
  author={Jin, Haifeng and Song, Qingquan and Hu, Xia},
  booktitle={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  pages={1946--1956},
  year={2019},
  organization={ACM}
}

确认

作者感谢国防高级研究计划局(DARPA)通过AFRL合同FA8750-17-2-0116、德克萨斯农工学院和德克萨斯农工大学管理的D3M计划