SGDW/SGDWR and AdamW/AdamWR optimizers for Keras

Keras implementations of SGDW and AdamW (SGD and Adam with decoupled weight decay), which can be used with warm restarts to obtain SGDWR and AdamWR.
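
For context, decoupled weight decay applies the decay directly to the weights instead of folding an L2 penalty into the gradient, so the decay is not rescaled by momentum or by Adam's adaptive denominator. A minimal NumPy sketch of one simplified SGDW step (illustration only, not the code in this repository):

import numpy as np

def sgdw_step(w, grad, velocity, lr=0.01, momentum=0.9, weight_decay=0.01):
    # Momentum update uses the raw gradient only; no L2 term is added here.
    velocity = momentum * velocity - lr * grad
    # Decoupled decay: shrink the weights directly, separately from the gradient step.
    w = w + velocity - lr * weight_decay * w
    return w, velocity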

Usage

from keras_optimizers import SGDW

optimizer = SGDW(lr=0.01, weight_decay=0.01)

model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

model.fit(x_train, y_train)

For SGDWR/AdamWR, use the WRScheduler callback together with SGDW/AdamW:

from keras_optimizers import AdamW
from keras_callbacks import WRScheduler

optimizer = AdamW(lr=0.001, weight_decay=0.01)

model.compile(loss='categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

cb_wr = WRScheduler(steps_per_epoch=len(x_train)/batch_size)

model.fit(x_train, y_train, callbacks=[cb_wr])
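
Warm restarts, as described in the SGDR paper listed under References, anneal the learning rate within each cycle and reset it at the start of the next cycle. A minimal sketch of that cosine annealing schedule (illustration only, not the WRScheduler source; t_cur, t_i, lr_min and lr_max are placeholder names):

import numpy as np

def cosine_warm_restart_lr(t_cur, t_i, lr_min=0.0, lr_max=0.001):
    # t_cur: steps taken since the last restart; t_i: length of the current cycle in steps.
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + np.cos(np.pi * t_cur / t_i))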

Tested on this system

  • Python 3.6.8
  • TensorFlow 1.12.0
  • Keras 2.2.4

References

SGDR: Stochastic Gradient Descent with Warm Restarts, Ilya Loshchilov, Frank Hutter

Decoupled Weight Decay Regularization, Ilya Loshchilov, Frank Hutter
