supervised_training_step_amp
- ignite.engine.supervised_training_step_amp(model, optimizer, loss_fn, device=None, non_blocking=False, prepare_batch=<function _prepare_batch>, model_transform=<function <lambda>>, output_transform=<function <lambda>>, scaler=None, gradient_accumulation_steps=1)
Factory function for supervised training using torch.cuda.amp.
- Parameters
model (Module) – the model to train.
optimizer (Optimizer) – the optimizer to use.
loss_fn (Union[Callable, Module]) – the loss function to use.
device (Optional[Union[str, device]]) – device type specification (default: None). Applies to batches after starting the engine. The model will not be moved. The device can be CPU or GPU.
non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.
prepare_batch (Callable) – function that receives batch, device, non_blocking and outputs tuple of tensors (batch_x, batch_y).
model_transform (Callable[[Any], Any]) – function that receives the output from the model and converts it into the form required by the loss function.
output_transform (Callable[[Any, Any, Any, Tensor], Any]) – function that receives ‘x’, ‘y’, ‘y_pred’, ‘loss’ and returns the value to be assigned to the engine’s state.output after each iteration. By default, it returns loss.item().
scaler (Optional[GradScaler]) – GradScaler instance for gradient scaling. (default: None)
gradient_accumulation_steps (int) – number of steps over which the gradients are accumulated (default: 1, meaning no gradient accumulation).
- Returns
update function
- Return type
Callable
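The effect of gradient_accumulation_steps can be illustrated with a small pure-Python sketch. It assumes the common accumulation pattern (each loss is divided by the number of accumulation steps, and the optimizer steps once per window); ignite's internal implementation may differ in detail. The helper names optimizer_step_iterations and scaled_losses are hypothetical and not part of ignite.

```python
def optimizer_step_iterations(num_iterations, gradient_accumulation_steps=1):
    """Return the 1-based iterations at which the optimizer would step.

    Assumption: a step happens whenever the iteration count is a
    multiple of gradient_accumulation_steps.
    """
    if gradient_accumulation_steps < 1:
        raise ValueError("gradient_accumulation_steps must be >= 1")
    return [
        i
        for i in range(1, num_iterations + 1)
        if i % gradient_accumulation_steps == 0
    ]


def scaled_losses(losses, gradient_accumulation_steps):
    """Scale each per-batch loss so the accumulated gradient is an average."""
    return [loss / gradient_accumulation_steps for loss in losses]


# With 8 iterations and 4 accumulation steps, the optimizer steps twice.
steps = optimizer_step_iterations(8, gradient_accumulation_steps=4)  # [4, 8]

# Over one window, the scaled losses sum to the mean of the raw losses.
window = [1.0, 2.0, 3.0, 2.0]
accumulated = sum(scaled_losses(window, 4))
```

Under this assumption, the effective batch size is the loader's batch size multiplied by gradient_accumulation_steps.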
Examples
import torch
from ignite.engine import Engine, supervised_training_step_amp

model = ...
optimizer = ...
loss_fn = ...
scaler = torch.cuda.amp.GradScaler(2**10)

update_fn = supervised_training_step_amp(model, optimizer, loss_fn, 'cuda', scaler=scaler)
trainer = Engine(update_fn)
New in version 0.4.5.
Changed in version 0.4.7: Added Gradient Accumulation.
Changed in version 0.4.11: Added model_transform to transform the model’s output.
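As a rough illustration of the model_transform and output_transform parameters described above, the sketch below defines two custom callables for a model that is assumed to return a dict. The names select_logits, loss_and_batch_size and _ScalarStandIn are hypothetical; the stand-in class only mimics the .item() method of a scalar tensor so that the snippet runs without torch.

```python
def select_logits(output):
    """model_transform: pick the part of the model output the loss expects.

    Assumes the (hypothetical) model returns a dict with a "logits" key.
    """
    return output["logits"]


def loss_and_batch_size(x, y, y_pred, loss):
    """output_transform: build the value stored in engine.state.output."""
    return {"loss": loss.item(), "batch_size": len(y)}


class _ScalarStandIn:
    """Stand-in for a scalar loss tensor; real code receives a torch.Tensor."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


# Simulate one iteration's worth of values to show the data flow.
model_output = {"logits": [[0.1, 0.9], [0.8, 0.2]], "aux": None}
y_pred = select_logits(model_output)
state_output = loss_and_batch_size(
    x=None, y=[1, 0], y_pred=y_pred, loss=_ScalarStandIn(0.25)
)
```

In real use, such callables would be passed as model_transform=select_logits and output_transform=loss_and_batch_size when calling supervised_training_step_amp.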