Which parameters should be used for early stopping?

Question: Which parameters should be used for early stopping?

I'm training a neural network for my project using Keras. Keras provides a callback for early stopping. Which parameters should I monitor to keep my neural network from overfitting when using early stopping?


Answer 0

Early stopping means halting training once the monitored quantity stops improving, typically when the validation loss starts to rise (which often, though not always, goes together with falling validation accuracy). According to the documentation, the callback is used as follows:

keras.callbacks.EarlyStopping(monitor='val_loss',
                              min_delta=0,
                              patience=0,
                              verbose=0, mode='auto')

The right values depend on your setup (problem, batch size, etc.), but in general, to prevent overfitting I would:

  1. Monitor the validation loss by setting the monitor argument to 'val_loss'. This requires held-out data: cross-validation or at least a train/validation split (for example via validation_split or validation_data in fit()).
  2. min_delta is the threshold that decides whether a change in loss at some epoch counts as an improvement. If the loss improves by less than min_delta, it is counted as no improvement. It is better to leave it at 0, since we are interested in when the loss gets worse.
  3. patience is the number of epochs without improvement to wait before stopping. The right value depends on your setup: with very small batches or a large learning rate the loss zig-zags (and accuracy is noisier), so set a larger patience. With large batches and a small learning rate the loss is smoother, so a smaller patience works. Either way, I would set it to 2 to give the model more of a chance.
  4. verbose controls what gets printed; leave it at the default (0).
  5. mode depends on the direction of the monitored quantity (should it decrease or increase?). Since we monitor the loss, we could use min, but let's let Keras handle that for us and set it to auto.
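
How min_delta and patience interact in the list above can be sketched in plain Python. This is a simplified re-implementation for illustration only, not Keras's actual code: the function name stopping_epoch and the sample loss values are made up, and it assumes a monitored quantity that should decrease (as with mode 'min').

```python
def stopping_epoch(val_losses, patience=0, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified model of early stopping on a quantity that should decrease;
    not the Keras implementation.
    """
    best = float("inf")
    wait = 0  # consecutive epochs without a sufficient improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improved by more than min_delta: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None  # ran out of epochs before the stopping condition was met


# Made-up validation losses with a small bump at epochs 3 and 4.
noisy = [1.0, 0.8, 0.7, 0.72, 0.71, 0.69, 0.75, 0.76, 0.77]
for p in (0, 2, 3):
    print("patience", p, "-> stops at epoch", stopping_epoch(noisy, patience=p))

# Slowly improving losses: a nonzero min_delta treats tiny gains as no gain.
slow = [1.0, 0.99, 0.98, 0.97]
print(stopping_epoch(slow, patience=2, min_delta=0.05))
print(stopping_epoch(slow, patience=2, min_delta=0.0))
```

With these numbers a patience of 0 or 2 stops inside the bump, while a patience of 3 lets training reach the later, lower loss at epoch 5 before stopping, which is the reason for choosing a larger patience when the loss is noisy.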

So I would use something like this, and experiment by plotting the loss with and without early stopping.

import keras

# patience=2: tolerate two epochs without improvement before stopping
es = keras.callbacks.EarlyStopping(monitor='val_loss',
                                   min_delta=0,
                                   patience=2,
                                   verbose=0, mode='auto')

To clear up possible ambiguity about how callbacks work: once you call fit(..., callbacks=[es]) on your model, Keras calls predetermined methods on the given callback objects. These methods are on_train_begin, on_train_end, on_epoch_begin, on_epoch_end, on_batch_begin and on_batch_end. The early stopping callback runs at the end of every epoch: it compares the current monitored value with the best one seen so far and stops training if the conditions are met (the latest value has not improved on the best by more than min_delta, and the number of epochs since the best value has reached patience).
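
The hook mechanism described above can be illustrated with a toy loop that has no model at all. Everything here is hypothetical scaffolding (the Callback stand-in, StopWhenWorse and toy_fit are not Keras classes), and one detail is deliberately simplified: in Keras the stop flag lives on the model (model.stop_training), whereas this sketch keeps it on the callback.

```python
class Callback:
    """Toy base class: every hook is a no-op unless a subclass overrides it."""

    def on_train_begin(self, logs=None): pass
    def on_train_end(self, logs=None): pass
    def on_epoch_begin(self, epoch, logs=None): pass
    def on_epoch_end(self, epoch, logs=None): pass
    def on_batch_begin(self, batch, logs=None): pass
    def on_batch_end(self, batch, logs=None): pass


class StopWhenWorse(Callback):
    """Hypothetical callback: asks to stop once val_loss fails to beat the best."""

    def __init__(self):
        self.best = float("inf")
        self.stop_training = False
        self.seen = []  # epochs for which on_epoch_end fired

    def on_epoch_end(self, epoch, logs=None):
        self.seen.append(epoch)
        current = logs["val_loss"]
        if current < self.best:
            self.best = current
        else:
            self.stop_training = True


def toy_fit(val_losses, callbacks, batches_per_epoch=2):
    """Toy training loop: fires the hooks in order and honours stop requests."""
    for cb in callbacks:
        cb.on_train_begin()
    epochs_run = 0
    for epoch, val_loss in enumerate(val_losses):
        for cb in callbacks:
            cb.on_epoch_begin(epoch)
        for batch in range(batches_per_epoch):
            for cb in callbacks:
                cb.on_batch_begin(batch)
            # A real loop would run one optimisation step here.
            for cb in callbacks:
                cb.on_batch_end(batch)
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs={"val_loss": val_loss})
        epochs_run += 1
        # The loop only checks the flag after the epoch-end hooks have run.
        if any(getattr(cb, "stop_training", False) for cb in callbacks):
            break
    for cb in callbacks:
        cb.on_train_end()
    return epochs_run


stopper = StopWhenWorse()
print(toy_fit([0.9, 0.6, 0.65, 0.5], [stopper]))  # epochs actually run
print(stopper.seen)                               # epoch-end hooks that fired
```

The point of the sketch is that the callback never interrupts an epoch: it only sets a flag at epoch end, and the loop decides whether to start another epoch.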

As @BrentFaust pointed out in the comments, training continues until either the early stopping conditions are met or the epochs argument of fit() is reached (it defaults to 1 in Keras 2 and later; Keras 1 called it nb_epoch and defaulted to 10). An early stopping callback never makes the model train beyond epochs, so it only pays off when fit() is called with a generous epochs value.
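
As a rough illustration of that cap, the sketch below counts how many epochs would run for a given epochs limit. It is a plain-Python simulation under the same simplifying assumptions as early stopping on a decreasing quantity with min_delta of 0; the function name epochs_trained and the loss curve are invented for the example.

```python
def epochs_trained(val_losses, epochs, patience):
    """Count epochs run when training is capped by `epochs` and by early stopping.

    Simplified simulation, not Keras code: an epoch counts as an improvement
    only if its loss is strictly below the best loss seen so far.
    """
    best = float("inf")
    wait = 0
    ran = 0
    for loss in val_losses[:epochs]:  # never more than `epochs` iterations
        ran += 1
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                break  # early stopping fires before the cap
    return ran


# Made-up curve: the loss bottoms out at epoch index 5, then rises.
curve = [1.0, 0.8, 0.6, 0.5, 0.45, 0.44, 0.46, 0.47, 0.48, 0.5]
print(epochs_trained(curve, epochs=3, patience=2))   # cap reached first
print(epochs_trained(curve, epochs=10, patience=2))  # early stopping fires
```

With a small epochs value the loop ends while the loss is still falling, so the callback never gets a chance to act; with a larger value it stops shortly after the minimum.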