Ex2 - Early Stopping Midway Through Epoch #107
-
This is probably more of a general conceptual question about early stopping. From what I've read and learned so far, the idea is to check against the validation set every x epochs (an epoch being a full pass over all the training data) to make sure overfitting doesn't occur. However, my reading of the provided implementation is that it actually checks partway through an epoch and stops early if performance is degrading. Could we clarify what the effects are of stopping partway through and then starting a new epoch? Wouldn't that mean that some of the training data isn't even seen by the model, or does the batching process draw a random batch each time? If the former, what are the implications of the model not looking at some of the training data at all?
Hello,
you are right: with early stopping one evaluates against the validation set. However, one can do that not only after every x epochs, but also after a certain number of training steps.
One does not stop partway through an epoch and then start a new one; the whole training run is finished when early stopping terminates. The reasoning is that one does not train all the way to the end of the specified number of epochs; instead, training stops if the performance does not improve after n additional training steps.
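To make this concrete, here is a minimal sketch of step-based early stopping with patience. All names (`train_with_early_stopping`, `eval_every`, `patience`) are illustrative, not from any particular library: validation loss is checked every `eval_every` training steps, and the whole run terminates once the loss has failed to improve for `patience` consecutive checks.

```python
def train_with_early_stopping(step_losses, eval_every=3, patience=2):
    """Illustrative early-stopping loop (hypothetical names).

    `step_losses` stands in for the validation loss observed at each
    training step. Returns the step at which training stopped and the
    best validation loss seen at an evaluation point.
    """
    best = float("inf")
    bad_evals = 0  # consecutive evaluations without improvement
    for step, val_loss in enumerate(step_losses, start=1):
        if step % eval_every != 0:
            continue  # only evaluate every `eval_every` training steps
        if val_loss < best:
            best = val_loss
            bad_evals = 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                # Training terminates here for good; no new epoch starts.
                return step, best
    return len(step_losses), best
```

For example, with a loss curve that bottoms out and then rises, training stops a few checks after the minimum rather than running to the last scheduled epoch:

```python
losses = [0.9, 0.8, 0.7, 0.6, 0.65, 0.7, 0.72, 0.75, 0.8, 0.85, 0.9, 0.95]
stop_step, best_loss = train_with_early_stopping(losses)
# stops at step 9 with best observed validation loss 0.7
```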
Maybe this article also helps a bit in understanding how it works.
Best,
Sophia