create_supervised_trainer
- ignite.engine.create_supervised_trainer(model, optimizer, loss_fn, device=None, non_blocking=False, prepare_batch=<function _prepare_batch>, model_transform=<function <lambda>>, output_transform=<function <lambda>>, deterministic=False, amp_mode=None, scaler=False, gradient_accumulation_steps=1)
Factory function for creating a trainer for supervised models.
- Parameters
model (Module) – the model to train.
optimizer (Optimizer) – the optimizer to use.
loss_fn (Union[Callable, Module]) – the loss function to use.
device (Optional[Union[str, device]]) – device type specification (default: None). Applies to batches after starting the engine. Model will not be moved. Device can be CPU, GPU or TPU.
non_blocking (bool) – if True and this copy is between CPU and GPU, the copy may occur asynchronously with respect to the host. For other cases, this argument has no effect.
prepare_batch (Callable) – function that receives batch, device, non_blocking and returns a tuple of tensors (batch_x, batch_y).
model_transform (Callable[[Any], Any]) – function that receives the output from the model and converts it into the form required by the loss function.
output_transform (Callable[[Any, Any, Any, Tensor], Any]) – function that receives ‘x’, ‘y’, ‘y_pred’, ‘loss’ and returns value to be assigned to engine’s state.output after each iteration. Default is returning loss.item().
deterministic (bool) – if True, returns a deterministic engine of type DeterministicEngine, otherwise Engine (default: False).
amp_mode (Optional[str]) – can be amp or apex. The model and optimizer will be cast to float16 using torch.cuda.amp for amp and using apex for apex (default: None).
scaler (Union[bool, GradScaler]) – GradScaler instance for gradient scaling if torch>=1.6.0 and amp_mode is amp. If amp_mode is apex, this argument will be ignored. If True, a default GradScaler will be created. If a GradScaler instance is passed, it will be used instead (default: False).
gradient_accumulation_steps (int) – number of steps over which the gradients should be accumulated (default: 1, meaning no gradient accumulation).
- Returns
a trainer engine with supervised update function.
- Return type
Engine
Examples
Create a trainer
from ignite.engine import create_supervised_trainer
from ignite.utils import convert_tensor
from ignite.contrib.handlers.tqdm_logger import ProgressBar

model = ...
loss = ...
optimizer = ...
dataloader = ...

def prepare_batch_fn(batch, device, non_blocking):
    x = ...  # get x from batch
    y = ...  # get y from batch

    # return a tuple of (x, y) that can be directly run as
    # `loss_fn(model(x), y)`
    return (
        convert_tensor(x, device, non_blocking),
        convert_tensor(y, device, non_blocking)
    )

def output_transform_fn(x, y, y_pred, loss):
    # returning only the loss is actually the default behavior for
    # the trainer engine, but you can return anything you want
    return loss.item()

trainer = create_supervised_trainer(
    model,
    optimizer,
    loss,
    prepare_batch=prepare_batch_fn,
    output_transform=output_transform_fn
)

pbar = ProgressBar()
pbar.attach(trainer, output_transform=lambda x: {"loss": x})

trainer.run(dataloader, max_epochs=5)
Note
If scaler is True, a GradScaler instance will be created internally and the trainer state will have an attribute named scaler for that instance, which can be used for saving and loading.
Note
engine.state.output for this engine is defined by output_transform parameter and is the loss of the processed batch by default.
Warning
The internal use of device has changed. device will now only be used to move the input data to the correct device. The model should be moved by the user before creating an optimizer. For more information see:
Warning
If amp_mode='apex', the model(s) and optimizer(s) must be initialized beforehand, since amp.initialize should be called after you have finished constructing your model(s) and optimizer(s), but before you send your model through any DistributedDataParallel wrapper.
See more: https://nvidia.github.io/apex/amp.html#module-apex.amp
Changed in version 0.4.5: Added amp_mode argument for automatic mixed precision. Added scaler argument for gradient scaling.
Changed in version 0.4.7: Added Gradient Accumulation argument for all supervised training methods.
Changed in version 0.4.11: Added model_transform to transform the model’s output.