Shortcuts

ignite.handlers#

Complete list of handlers#

class ignite.handlers.Checkpoint(to_save, save_handler, filename_prefix='', score_function=None, score_name=None, n_saved=1, global_step_transform=None, archived=False)[source]#

Checkpoint handler can be used to periodically save and load objects which have attribute state_dict/load_state_dict. This class can use specific save handlers to store on the disk or a cloud storage, etc.

Parameters
  • to_save (dict) – Dictionary with the objects to save. Objects should have implemented state_dict and ` load_state_dict` methods.

  • save_handler (callable) – Method to use to save engine and other provided objects. Function receives a checkpoint as a dictionary to save. In case if user needs to save engine’s checkpoint on a disk, save_handler can be defined with DiskSaver.

  • filename_prefix (str, optional) – Prefix for the filename to which objects will be saved. See Note for details.

  • score_function (callable, optional) – If not None, it should be a function taking a single argument, Engine object, and returning a score (float). Objects with highest scores will be retained.

  • score_name (str, optional) – If score_function not None, it is possible to store its absolute value using score_name. See Notes for more details.

  • n_saved (int, optional) – Number of objects that should be kept on disk. Older files will be removed. If set to None, all objects are kept.

  • global_step_transform (callable, optional) – global step transform function to output a desired global step. Input of the function is (engine, event_name). Output of function should be an integer. Default is None, global_step based on attached engine. If provided, uses function output as global_step. To setup global step from another engine, please use global_step_from_engine().

  • archived (bool, optional) – It True, saved checkpoint extension will be .pth.tar, Default value is False.

Note

This class stores a single file as a dictionary of provided objects to save. The filename has the following structure: {filename_prefix}_{name}_{suffix}.{ext} where

  • filename_prefix is the argument passed to the constructor,

  • name is the key in to_save if a single object is to store, otherwise name is “checkpoint”.

  • ext is .pth.tar if archived=True otherwise .pth.

  • suffix is composed as following {global_step}_{score_name}={score}.

Above global_step defined by the output of global_step_transform and score defined by the output of score_function.

By default, none of score_function, score_name, global_step_transform is defined, then suffix is setup by attached engine’s current iteration. The filename will be {filename_prefix}_{name}_{engine.state.iteration}.{ext}.

If defined a score_function, but without score_name, then suffix is defined by provided score. The filename will be {filename_prefix}_{name}_{global_step}_{score}.pth.

If defined score_function and score_name, then the filename will be {filename_prefix}_{name}_{score_name}={abs(score)}.{ext}. If global_step_transform is provided, then the filename will be {filename_prefix}_{name}_{global_step}_{score_name}={abs(score)}.{ext}

For example, score_name=”val_loss” and score_function that returns -loss (as objects with highest scores will be retained), then saved filename will be {filename_prefix}_{name}_val_loss=0.1234.pth.

To get the last stored filename, handler exposes attribute last_checkpoint:

handler = Checkpoint(...)
...
print(handler.last_checkpoint)
> checkpoint_12345.pth

Examples

Attach the handler to make checkpoints during training:

from ignite.engine import Engine, Events
from ignite.handlers import Checkpoint, DiskSaver

trainer = ...
model = ...
optimizer = ...
lr_scheduler = ...

to_save = {'model': model, 'optimizer': optimizer, 'lr_scheduler': lr_scheduler, 'trainer': trainer}
handler = Checkpoint(to_save, DiskSaver('/tmp/models', create_dir=True), n_saved=2)
trainer.add_event_handler(Events.ITERATION_COMPLETED(every=1000), handler)
trainer.run(data_loader, max_epochs=6)
> ["checkpoint_7000.pth", "checkpoint_8000.pth", ]

Attach the handler to an evaluator to save best model during the training according to computed validation metric:

from ignite.engine import Engine, Events
from ignite.handlers import Checkpoint, DiskSaver, global_step_from_engine

trainer = ...
evaluator = ...

def score_function(engine):
    engine.state.metrics['accuracy']

to_save = {'model': model}
handler = Checkpoint(to_save, DiskSaver('/tmp/models', create_dir=True), n_saved=2,
                     filename_prefix='best', score_function=score_function, score_name="val_acc",
                     global_step_transform=global_step_from_engine(trainer))

evaluator.add_event_handler(Events.COMPLETED, handler)
trainer.run(data_loader, max_epochs=10)
> ["best_model_9_val_acc=0.77.pth", "best_model_10_val_acc=0.78.pth", ]
class ignite.handlers.DiskSaver(dirname, atomic=True, create_dir=True, require_empty=True)[source]#

Handler that saves input checkpoint on a disk.

Parameters
  • dirname (str) – Directory path where the checkpoint will be saved

  • atomic (bool, optional) – if True, checkpoint is serialized to a temporary file, and then moved to final destination, so that files are guaranteed to not be damaged (for example if exception occures during saving).

  • create_dir (bool, optional) – if True, will create directory ‘dirname’ if it doesnt exist.

  • require_empty (bool, optional) – If True, will raise exception if there are any files in the directory ‘dirname’.

class ignite.handlers.ModelCheckpoint(dirname, filename_prefix, save_interval=None, score_function=None, score_name=None, n_saved=1, atomic=True, require_empty=True, create_dir=True, save_as_state_dict=True, global_step_transform=None, archived=False)[source]#

ModelCheckpoint handler can be used to periodically save objects to disk only. If needed to store checkpoints to another storage type, please consider Checkpoint.

This handler expects two arguments:

  • an Engine object

  • a dict mapping names (str) to objects that should be saved to disk.

See Examples for further details.

Warning

Behaviour of this class has been changed since v0.3.0.

Argument save_as_state_dict is deprecated and should not be used. It is considered as True.

Argument save_interval is deprecated and should not be used. Please, use events filtering instead, e.g. ITERATION_STARTED(every=1000)

There is no more internal counter that has been used to indicate the number of save actions. User could see its value step_number in the filename, e.g. {filename_prefix}_{name}_{step_number}.pth. Actually, step_number is replaced by current engine’s epoch if score_function is specified and current iteration otherwise.

A single pth file is created instead of multiple files.

Parameters
  • dirname (str) – Directory path where objects will be saved.

  • filename_prefix (str) – Prefix for the filenames to which objects will be saved. See Notes of Checkpoint for more details.

  • score_function (callable, optional) – if not None, it should be a function taking a single argument, an Engine object, and return a score (float). Objects with highest scores will be retained.

  • score_name (str, optional) – if score_function not None, it is possible to store its absolute value using score_name. See Notes for more details.

  • n_saved (int, optional) – Number of objects that should be kept on disk. Older files will be removed. If set to None, all objects are kept.

  • atomic (bool, optional) – If True, objects are serialized to a temporary file, and then moved to final destination, so that files are guaranteed to not be damaged (for example if exception occurs during saving).

  • require_empty (bool, optional) – If True, will raise exception if there are any files starting with filename_prefix in the directory ‘dirname’.

  • create_dir (bool, optional) – If True, will create directory ‘dirname’ if it doesnt exist.

  • global_step_transform (callable, optional) – global step transform function to output a desired global step. Input of the function is (engine, event_name). Output of function should be an integer. Default is None, global_step based on attached engine. If provided, uses function output as global_step. To setup global step from another engine, please use global_step_from_engine().

  • archived (bool, optional) – It True, saved checkpoint extension will be .pth.tar, Default value is False.

Examples

>>> import os
>>> from ignite.engine import Engine, Events
>>> from ignite.handlers import ModelCheckpoint
>>> from torch import nn
>>> trainer = Engine(lambda batch: None)
>>> handler = ModelCheckpoint('/tmp/models', 'myprefix', n_saved=2, create_dir=True)
>>> model = nn.Linear(3, 3)
>>> trainer.add_event_handler(Events.EPOCH_COMPLETED(every=2), handler, {'mymodel': model})
>>> trainer.run([0], max_epochs=6)
>>> os.listdir('/tmp/models')
['myprefix_mymodel_4.pth', 'myprefix_mymodel_6.pth']
>>> handler.last_checkpoint
['/tmp/models/myprefix_mymodel_6.pth']
class ignite.handlers.EarlyStopping(patience, score_function, trainer, min_delta=0.0, cumulative_delta=False)[source]#

EarlyStopping handler can be used to stop the training if no improvement after a given number of events.

Parameters
  • patience (int) – Number of events to wait if no improvement and then stop the training.

  • score_function (callable) – It should be a function taking a single argument, an Engine object, and return a score float. An improvement is considered if the score is higher.

  • trainer (Engine) – trainer engine to stop the run if no improvement.

  • min_delta (float, optional) – A minimum increase in the score to qualify as an improvement, i.e. an increase of less than or equal to min_delta, will count as no improvement.

  • cumulative_delta (bool, optional) – It True, min_delta defines an increase since the last patience reset, otherwise, it defines an increase after the last event. Default value is False.

Examples:

from ignite.engine import Engine, Events
from ignite.handlers import EarlyStopping

def score_function(engine):
    val_loss = engine.state.metrics['nll']
    return -val_loss

handler = EarlyStopping(patience=10, score_function=score_function, trainer=trainer)
# Note: the handler is attached to an *Evaluator* (runs one epoch on validation dataset).
evaluator.add_event_handler(Events.COMPLETED, handler)
class ignite.handlers.Timer(average=False)[source]#

Timer object can be used to measure (average) time between events.

Parameters

average (bool, optional) – if True, then when .value() method is called, the returned value will be equal to total time measured, divided by the value of internal counter.

total#

total time elapsed when the Timer was running (in seconds).

Type

float

step_count#

internal counter, usefull to measure average time, e.g. of processing a single batch. Incremented with the .step() method.

Type

int

running#

flag indicating if timer is measuring time.

Type

bool

Note

When using Timer(average=True) do not forget to call timer.step() every time an event occurs. See the examples below.

Examples

Measuring total time of the epoch:

>>> from ignite.handlers import Timer
>>> import time
>>> work = lambda : time.sleep(0.1)
>>> idle = lambda : time.sleep(0.1)
>>> t = Timer(average=False)
>>> for _ in range(10):
...    work()
...    idle()
...
>>> t.value()
2.003073937026784

Measuring average time of the epoch:

>>> t = Timer(average=True)
>>> for _ in range(10):
...    work()
...    idle()
...    t.step()
...
>>> t.value()
0.2003182829997968

Measuring average time it takes to execute a single work() call:

>>> t = Timer(average=True)
>>> for _ in range(10):
...    t.resume()
...    work()
...    t.pause()
...    idle()
...    t.step()
...
>>> t.value()
0.10016545779653825

Using the Timer to measure average time it takes to process a single batch of examples:

>>> from ignite.engine import Engine, Events
>>> from ignite.handlers import Timer
>>> trainer = Engine(training_update_function)
>>> timer = Timer(average=True)
>>> timer.attach(trainer,
...              start=Events.EPOCH_STARTED,
...              resume=Events.ITERATION_STARTED,
...              pause=Events.ITERATION_COMPLETED,
...              step=Events.ITERATION_COMPLETED)
attach(engine, start=Events.STARTED, pause=Events.COMPLETED, resume=None, step=None)[source]#

Register callbacks to control the timer.

Parameters
  • engine (Engine) – Engine that this timer will be attached to.

  • start (Events) – Event which should start (reset) the timer.

  • pause (Events) – Event which should pause the timer.

  • resume (Events, optional) – Event which should resume the timer.

  • step (Events, optional) – Event which should call the step method of the counter.

Returns

self (Timer)

class ignite.handlers.TerminateOnNan(output_transform=<function TerminateOnNan.<lambda>>)[source]#

TerminateOnNan handler can be used to stop the training if the process_function’s output contains a NaN or infinite number or torch.tensor. The output can be of type: number, tensor or collection of them. The training is stopped if there is at least a single number/tensor have NaN or Infinite value. For example, if the output is [1.23, torch.tensor(…), torch.tensor(float(‘nan’))] the handler will stop the training.

Parameters

output_transform (callable, optional) – a callable that is used to transform the Engine’s process_function’s output into a number or torch.tensor or collection of them. This can be useful if, for example, you have a multi-output model and you want to check one or multiple values of the output.

Examples:

trainer.add_event_handler(Events.ITERATION_COMPLETED, TerminateOnNan())