ignite.engine.create_supervised_trainer(model, optimizer, loss_fn, device=None, non_blocking=False, prepare_batch=<function _prepare_batch>, output_transform=<function <lambda>>, deterministic=False, amp_mode=None, scaler=False, gradient_accumulation_steps=1)

Factory function for creating a trainer for supervised models.

Parameters

  • model (torch.nn.modules.module.Module) – the model to train.

  • optimizer (torch.optim.optimizer.Optimizer) – the optimizer to use.

  • loss_fn (Union[Callable, torch.nn.modules.module.Module]) – the loss function to use.

  • device (Optional[Union[str, torch.device]]) – device type specification (default: None). Applies to batches after starting the engine. Model will not be moved. Device can be CPU, GPU or TPU.

  • non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.

  • prepare_batch (Callable) – function that receives batch, device and non_blocking, and returns a tuple of tensors (batch_x, batch_y).

  • output_transform (Callable) – function that receives ‘x’, ‘y’, ‘y_pred’, ‘loss’ and returns value to be assigned to engine’s state.output after each iteration. Default is returning loss.item().

  • deterministic (bool) – if True, returns deterministic engine of type DeterministicEngine, otherwise Engine (default: False).

  • amp_mode (Optional[str]) – can be amp or apex. The model and optimizer will be cast to float16 using torch.cuda.amp for amp and using apex for apex (default: None).

  • scaler (Union[bool, torch.cuda.amp.grad_scaler.GradScaler]) – GradScaler instance for gradient scaling if torch>=1.6.0 and amp_mode is amp. If amp_mode is apex, this argument will be ignored. If True, will create default GradScaler. If GradScaler instance is passed, it will be used instead. (default: False)

  • gradient_accumulation_steps (int) – Number of steps the gradients should be accumulated across. (default: 1 (means no gradient accumulation))


Returns

a trainer engine with a supervised update function.

Return type

ignite.engine.engine.Engine



If scaler is True, a GradScaler instance is created internally and stored on the trainer state as an attribute named scaler, where it can be used for saving and loading.


engine.state.output for this engine is defined by output_transform parameter and is the loss of the processed batch by default.


The internal use of device has changed. device will now only be used to move the input data to the correct device. The model should be moved by the user before creating an optimizer. For more information see:


If amp_mode='apex', the model(s) and optimizer(s) must be initialized beforehand since amp.initialize should be called after you have finished constructing your model(s) and optimizer(s), but before you send your model through any DistributedDataParallel wrapper.

See more:

Changed in version 0.4.5:

  • Added amp_mode argument for automatic mixed precision.

  • Added scaler argument for gradient scaling.

Changed in version 0.4.7: Added the gradient_accumulation_steps argument to all supervised training methods.