How should Exponential Moving Average be used in a custom TF 2.4 training loop?
I have a custom training loop that can be simplified as follows:
import tensorflow as tf

# f(inputs) builds the loss tensor; batches yields the training batches
inputs = tf.keras.Input(dtype=tf.float32, shape=(None, None, 3))
model = tf.keras.Model({"inputs": inputs}, {"loss": f(inputs)})
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9, nesterov=True)

for inputs in batches:
    with tf.GradientTape() as tape:
        results = model(inputs, training=True)
    grads = tape.gradient(results["loss"], model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
The TensorFlow documentation for tf.train.ExponentialMovingAverage is not clear on how it should be used in a from-scratch training loop. Has anyone worked with this?
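My best guess from the docs is that the shadow variables are updated by calling apply after each optimizer step, something like the untested sketch below (the decay value of 0.999 and the placement of ema.apply are my own assumptions):

import tensorflow as tf

ema = tf.train.ExponentialMovingAverage(decay=0.999)  # decay chosen arbitrarily

for inputs in batches:
    with tf.GradientTape() as tape:
        results = model(inputs, training=True)
    grads = tape.gradient(results["loss"], model.trainable_weights)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    # The first call creates the shadow variables; each later call updates
    # them as shadow = decay * shadow + (1 - decay) * variable.
    ema.apply(model.trainable_weights)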
Additionally, how should the shadow variables be restored into the model if both sets are still in memory, and how can I check that the training variables were correctly updated?
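For the restore step, I would expect something like the following to work (again untested; the backup list is my own device for keeping the training values around while evaluating with the averaged weights):

# Swap the EMA values into the model, keeping the training values as a backup.
backup = [tf.identity(w) for w in model.trainable_weights]
for w in model.trainable_weights:
    w.assign(ema.average(w))  # ema.average(w) is the shadow variable for w

# ... evaluate the model with the averaged weights here ...

# Put the original training values back.
for w, value in zip(model.trainable_weights, backup):
    w.assign(value)

# Sanity check: after restoring, each weight should exactly match its backup,
# and after some training each shadow variable should differ from its weight.
for w, value in zip(model.trainable_weights, backup):
    tf.debugging.assert_equal(w, value)
for w in model.trainable_weights:
    print(w.name, tf.reduce_max(tf.abs(w - ema.average(w))).numpy())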