ignite.handlers
Complete list of handlers
- class ignite.handlers.Checkpoint(to_save, save_handler, filename_prefix='', score_function=None, score_name=None, n_saved=1, global_step_transform=None, archived=False)
Checkpoint handler can be used to periodically save and load objects that implement the state_dict and load_state_dict methods. This class can use specific save handlers to store checkpoints on disk, in cloud storage, etc.
- Parameters
to_save (dict) – Dictionary with the objects to save. Objects should implement the state_dict and load_state_dict methods.
save_handler (callable) – Method to use to save the engine and other provided objects. The function receives a checkpoint as a dictionary to save. If the user needs to save the engine's checkpoint on disk, save_handler can be defined with DiskSaver.
filename_prefix (str, optional) – Prefix for the filename to which objects will be saved. See Note for details.
score_function (callable, optional) – If not None, it should be a function taking a single argument, an Engine object, and returning a score (float). Objects with the highest scores will be retained.
score_name (str, optional) – If score_function is not None, it is possible to store its absolute value using score_name. See Note for more details.
n_saved (int, optional) – Number of objects that should be kept on disk. Older files will be removed. If set to None, all objects are kept.
global_step_transform (callable, optional) – Global step transform function to output a desired global step. The input of the function is (engine, event_name). The output of the function should be an integer. Default is None, in which case global_step is based on the attached engine. If provided, the function output is used as global_step. To set up the global step from another engine, please use global_step_from_engine().
archived (bool, optional) – If True, the saved checkpoint extension will be .pth.tar. Default value is False.
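The way score_function and n_saved interact can be pictured with a small stand-alone sketch. It is not the library's implementation: the class name TopNKeeper and its offer method are hypothetical, and the sketch only tracks filenames in memory instead of removing files from disk.

```python
from typing import List, Optional, Tuple


class TopNKeeper:
    """Hypothetical helper: keeps the n_saved entries with the highest scores."""

    def __init__(self, n_saved: Optional[int] = 1) -> None:
        self.n_saved = n_saved
        self.kept: List[Tuple[float, str]] = []  # (score, filename)

    def offer(self, score: float, filename: str) -> List[str]:
        """Register a candidate and return the filenames that were dropped."""
        self.kept.append((score, filename))
        if self.n_saved is None:
            return []  # n_saved=None: everything is kept
        # Highest score first; the sort is stable, so older entries win ties.
        self.kept.sort(key=lambda item: item[0], reverse=True)
        dropped = self.kept[self.n_saved:]
        self.kept = self.kept[:self.n_saved]
        return [name for _, name in dropped]


keeper = TopNKeeper(n_saved=2)
for step, score in enumerate([0.70, 0.78, 0.65, 0.77], start=1):
    keeper.offer(score, f"best_model_{step}.pth")

kept_names = sorted(name for _, name in keeper.kept)
print(kept_names)
```

With n_saved=2 only the two best-scoring entries survive, which is the behaviour the parameter descriptions suggest for a handler configured with a score_function.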
Note
This class stores a single file as a dictionary of provided objects to save. The filename has the following structure: {filename_prefix}_{name}_{suffix}.{ext} where
filename_prefix is the argument passed to the constructor,
name is the key in to_save if a single object is to be stored, otherwise name is "checkpoint".
ext is .pth.tar if archived=True, otherwise .pth.
suffix is composed as follows: {global_step}_{score_name}={score}.
Above, global_step is defined by the output of global_step_transform and score is defined by the output of score_function.
By default, none of score_function, score_name and global_step_transform is defined, and the suffix is set from the attached engine's current iteration. The filename will be {filename_prefix}_{name}_{engine.state.iteration}.{ext}.
If a score_function is defined without score_name, the suffix is defined by the provided score. The filename will be {filename_prefix}_{name}_{global_step}_{score}.pth.
If both score_function and score_name are defined, the filename will be {filename_prefix}_{name}_{score_name}={abs(score)}.{ext}. If global_step_transform is also provided, the filename will be {filename_prefix}_{name}_{global_step}_{score_name}={abs(score)}.{ext}.
For example, with score_name="val_loss" and a score_function that returns -loss (as objects with the highest scores will be retained), the saved filename will be {filename_prefix}_{name}_val_loss=0.1234.pth.
To get the last stored filename, handler exposes attribute last_checkpoint:
    handler = Checkpoint(...)
    ...
    print(handler.last_checkpoint)
    > checkpoint_12345.pth
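The naming rules in the note above can be summarised in a short sketch. The function sketch_filename is hypothetical and is not part of ignite; the four-decimal score formatting and the dropping of empty parts are assumptions made for illustration.

```python
from typing import List, Optional


def sketch_filename(
    name: str,
    filename_prefix: str = "",
    global_step: Optional[int] = None,
    score: Optional[float] = None,
    score_name: Optional[str] = None,
    archived: bool = False,
) -> str:
    """Hypothetical helper mirroring the naming rules described in the note."""
    ext = ".pth.tar" if archived else ".pth"
    parts: List[str] = []
    if global_step is not None:
        parts.append(str(global_step))
    if score is not None:
        if score_name is not None:
            # With a score_name, the absolute value of the score is stored.
            parts.append(f"{score_name}={abs(score):.4f}")
        else:
            parts.append(f"{score:.4f}")
    suffix = "_".join(parts)
    # Assumption: empty pieces (e.g. an empty prefix) are skipped.
    pieces = [p for p in (filename_prefix, name, suffix) if p]
    return "_".join(pieces) + ext


print(sketch_filename("checkpoint", global_step=12345))
print(sketch_filename("model", "best", score=-0.1234, score_name="val_loss"))
```

The second call corresponds to the val_loss example in the note: the score function returns -loss, and the absolute value ends up in the filename.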
Examples
Attach the handler to make checkpoints during training:
    from ignite.engine import Engine, Events
    from ignite.handlers import Checkpoint, DiskSaver

    trainer = ...
    model = ...
    optimizer = ...
    lr_scheduler = ...

    to_save = {'model': model, 'optimizer': optimizer, 'lr_scheduler': lr_scheduler, 'trainer': trainer}
    handler = Checkpoint(to_save, DiskSaver('/tmp/models', create_dir=True), n_saved=2)
    trainer.add_event_handler(Events.ITERATION_COMPLETED(every=1000), handler)
    trainer.run(data_loader, max_epochs=6)
    > ["checkpoint_7000.pth", "checkpoint_8000.pth", ]
Attach the handler to an evaluator to save best model during the training according to computed validation metric:
    from ignite.engine import Engine, Events
    from ignite.handlers import Checkpoint, DiskSaver, global_step_from_engine

    trainer = ...
    evaluator = ...

    def score_function(engine):
        # Return the metric so the handler can rank checkpoints by it.
        return engine.state.metrics['accuracy']

    to_save = {'model': model}
    handler = Checkpoint(to_save, DiskSaver('/tmp/models', create_dir=True), n_saved=2,
                         filename_prefix='best', score_function=score_function, score_name="val_acc",
                         global_step_transform=global_step_from_engine(trainer))
    evaluator.add_event_handler(Events.COMPLETED, handler)
    trainer.run(data_loader, max_epochs=10)
    > ["best_model_9_val_acc=0.77.pth", "best_model_10_val_acc=0.78.pth", ]
- class ignite.handlers.DiskSaver(dirname, atomic=True, create_dir=True, require_empty=True)
Handler that saves the input checkpoint to disk.
- Parameters
dirname (str) – Directory path where the checkpoint will be saved
atomic (bool, optional) – If True, the checkpoint is serialized to a temporary file and then moved to its final destination, so that files are guaranteed not to be damaged (for example if an exception occurs during saving).
create_dir (bool, optional) – If True, will create the directory 'dirname' if it does not exist.
require_empty (bool, optional) – If True, will raise exception if there are any files in the directory ‘dirname’.
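The idea behind atomic=True (write to a temporary file, then move it into place) can be sketched with the standard library alone. This is an illustration of the general technique, not DiskSaver's code: atomic_save is a hypothetical function, and pickle stands in for whatever serializer the real handler uses.

```python
import os
import pickle
import tempfile


def atomic_save(obj, dirname: str, filename: str) -> str:
    """Hypothetical helper: write to a temp file, then move it into place."""
    os.makedirs(dirname, exist_ok=True)
    final_path = os.path.join(dirname, filename)
    # Create the temp file in the target directory so the final move
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(obj, f)  # stand-in for the real serializer
        os.replace(tmp_path, final_path)
    except BaseException:
        # A failed write leaves no partial file at final_path.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


with tempfile.TemporaryDirectory() as workdir:
    path = atomic_save({"weights": [1, 2, 3]}, workdir, "checkpoint_1.pth")
    with open(path, "rb") as f:
        restored = pickle.load(f)
    leftovers = [n for n in os.listdir(workdir) if n.endswith(".tmp")]

print(restored, leftovers)
```

Because the destination only ever receives a fully written file, a crash during serialization cannot leave a truncated checkpoint under the final name.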
- class ignite.handlers.ModelCheckpoint(dirname, filename_prefix, save_interval=None, score_function=None, score_name=None, n_saved=1, atomic=True, require_empty=True, create_dir=True, save_as_state_dict=True, global_step_transform=None, archived=False)
ModelCheckpoint handler can be used to periodically save objects to disk only. To store checkpoints in another type of storage, please consider Checkpoint.
This handler expects two arguments:
an Engine object
a dict mapping names (str) to objects that should be saved to disk.
See Examples for further details.
Warning
Behaviour of this class has been changed since v0.3.0.
Argument save_as_state_dict is deprecated and should not be used. It is considered as True.
Argument save_interval is deprecated and should not be used. Please use event filtering instead, e.g. ITERATION_STARTED(every=1000).
The internal counter that tracked the number of save actions has been removed. Previously its value appeared as step_number in the filename, e.g. {filename_prefix}_{name}_{step_number}.pth. Now step_number is replaced by the current engine's epoch if score_function is specified, and by the current iteration otherwise.
A single pth file is created instead of multiple files.
- Parameters
dirname (str) – Directory path where objects will be saved.
filename_prefix (str) – Prefix for the filenames to which objects will be saved. See the Note of Checkpoint for more details.
score_function (callable, optional) – If not None, it should be a function taking a single argument, an Engine object, and returning a score (float). Objects with the highest scores will be retained.
score_name (str, optional) – If score_function is not None, it is possible to store its absolute value using score_name. See the Note of Checkpoint for more details.
n_saved (int, optional) – Number of objects that should be kept on disk. Older files will be removed. If set to None, all objects are kept.
atomic (bool, optional) – If True, objects are serialized to a temporary file, and then moved to final destination, so that files are guaranteed to not be damaged (for example if exception occurs during saving).
require_empty (bool, optional) – If True, will raise exception if there are any files starting with filename_prefix in the directory ‘dirname’.
create_dir (bool, optional) – If True, will create the directory 'dirname' if it does not exist.
global_step_transform (callable, optional) – Global step transform function to output a desired global step. The input of the function is (engine, event_name). The output of the function should be an integer. Default is None, in which case global_step is based on the attached engine. If provided, the function output is used as global_step. To set up the global step from another engine, please use global_step_from_engine().
archived (bool, optional) – If True, the saved checkpoint extension will be .pth.tar. Default value is False.
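The (engine, event_name) contract of global_step_transform can be illustrated without ignite installed. In the sketch below, epoch_of is a hypothetical factory and the SimpleNamespace objects are stand-ins that only model a state.epoch attribute; real engines carry much more state.

```python
from types import SimpleNamespace


def epoch_of(other_engine):
    """Hypothetical factory: report other_engine's epoch as the global step."""

    def transform(engine, event_name):
        # `engine` is the engine the handler is attached to; it is ignored
        # here because the step is read from `other_engine` instead.
        return int(other_engine.state.epoch)

    return transform


# Stand-ins for real engines: only `state.epoch` and `state.iteration` exist.
trainer = SimpleNamespace(state=SimpleNamespace(epoch=7, iteration=700))
evaluator = SimpleNamespace(state=SimpleNamespace(epoch=1, iteration=50))

transform = epoch_of(trainer)
global_step = transform(evaluator, "completed")
print(global_step)
```

The handler attached to the evaluator would then label its files with the trainer's epoch rather than the evaluator's own, which is the use case the parameter description points to.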
Examples
    >>> import os
    >>> from ignite.engine import Engine, Events
    >>> from ignite.handlers import ModelCheckpoint
    >>> from torch import nn
    >>> trainer = Engine(lambda batch: None)
    >>> handler = ModelCheckpoint('/tmp/models', 'myprefix', n_saved=2, create_dir=True)
    >>> model = nn.Linear(3, 3)
    >>> trainer.add_event_handler(Events.EPOCH_COMPLETED(every=2), handler, {'mymodel': model})
    >>> trainer.run([0], max_epochs=6)
    >>> os.listdir('/tmp/models')
    ['myprefix_mymodel_4.pth', 'myprefix_mymodel_6.pth']
    >>> handler.last_checkpoint
    ['/tmp/models/myprefix_mymodel_6.pth']
- class ignite.handlers.EarlyStopping(patience, score_function, trainer, min_delta=0.0, cumulative_delta=False)
EarlyStopping handler can be used to stop the training if there is no improvement after a given number of events.
- Parameters
patience (int) – Number of events to wait without improvement before stopping the training.
score_function (callable) – It should be a function taking a single argument, an Engine object, and returning a score (float). A higher score counts as an improvement.
trainer (Engine) – Trainer engine whose run is stopped if there is no improvement.
min_delta (float, optional) – A minimum increase in the score to qualify as an improvement, i.e. an increase of less than or equal to min_delta will count as no improvement.
cumulative_delta (bool, optional) – If True, min_delta defines an increase since the last patience reset; otherwise, it defines an increase after the last event. Default value is False.
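How patience, min_delta and cumulative_delta combine can be followed in a plain-Python sketch. EarlyStoppingSketch is a hypothetical class written from the parameter descriptions above; it takes scores directly instead of an engine and sets a flag instead of terminating a trainer, so it should be read as an approximation rather than the handler's source.

```python
class EarlyStoppingSketch:
    """Hypothetical stand-in for the patience / min_delta bookkeeping."""

    def __init__(self, patience, min_delta=0.0, cumulative_delta=False):
        self.patience = patience
        self.min_delta = min_delta
        self.cumulative_delta = cumulative_delta
        self.best_score = None
        self.counter = 0
        self.should_stop = False

    def update(self, score):
        if self.best_score is None:
            self.best_score = score
        elif score <= self.best_score + self.min_delta:
            # The increase is too small: one more event without improvement.
            if not self.cumulative_delta and score > self.best_score:
                # Non-cumulative mode: small increases still move the reference.
                self.best_score = score
            self.counter += 1
            if self.counter >= self.patience:
                self.should_stop = True
        else:
            # A real improvement resets the patience counter.
            self.best_score = score
            self.counter = 0
        return self.should_stop


stopper = EarlyStoppingSketch(patience=3)
stopped_at = None
for event_index, score in enumerate([0.50, 0.60, 0.60, 0.59, 0.58, 0.70]):
    if stopper.update(score):
        stopped_at = event_index
        break

print(stopped_at)
```

In this run the score stops improving after the second event, so the third consecutive event without improvement (index 4) triggers the stop and the final 0.70 is never seen.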
Examples:
    from ignite.engine import Engine, Events
    from ignite.handlers import EarlyStopping

    def score_function(engine):
        val_loss = engine.state.metrics['nll']
        return -val_loss

    handler = EarlyStopping(patience=10, score_function=score_function, trainer=trainer)
    # Note: the handler is attached to an *Evaluator* (runs one epoch on validation dataset).
    evaluator.add_event_handler(Events.COMPLETED, handler)
- class ignite.handlers.Timer(average=False)
Timer object can be used to measure (average) time between events.
- Parameters
average (bool, optional) – If True, then when the .value() method is called, the returned value will be equal to the total time measured, divided by the value of the internal counter.
- step_count
Internal counter, useful to measure average time, e.g. of processing a single batch. Incremented with the .step() method.
- Type
int
Note
When using Timer(average=True) do not forget to call timer.step() every time an event occurs. See the examples below.
Examples
Measuring total time of the epoch:
    >>> from ignite.handlers import Timer
    >>> import time
    >>> work = lambda : time.sleep(0.1)
    >>> idle = lambda : time.sleep(0.1)
    >>> t = Timer(average=False)
    >>> for _ in range(10):
    ...     work()
    ...     idle()
    ...
    >>> t.value()
    2.003073937026784
Measuring average time of the epoch:
    >>> t = Timer(average=True)
    >>> for _ in range(10):
    ...     work()
    ...     idle()
    ...     t.step()
    ...
    >>> t.value()
    0.2003182829997968
Measuring average time it takes to execute a single work() call:
    >>> t = Timer(average=True)
    >>> for _ in range(10):
    ...     t.resume()
    ...     work()
    ...     t.pause()
    ...     idle()
    ...     t.step()
    ...
    >>> t.value()
    0.10016545779653825
Using the Timer to measure average time it takes to process a single batch of examples:
    >>> from ignite.engine import Engine, Events
    >>> from ignite.handlers import Timer
    >>> trainer = Engine(training_update_function)
    >>> timer = Timer(average=True)
    >>> timer.attach(trainer,
    ...              start=Events.EPOCH_STARTED,
    ...              resume=Events.ITERATION_STARTED,
    ...              pause=Events.ITERATION_COMPLETED,
    ...              step=Events.ITERATION_COMPLETED)
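The pause, resume and step bookkeeping used in these examples can be reproduced deterministically with a fake clock. TimerSketch and FakeClock are hypothetical classes written for illustration; the real Timer's internals may differ, and the injectable clock argument is an assumption of this sketch only.

```python
import time


class TimerSketch:
    """Hypothetical stand-in for the pause/resume/step bookkeeping."""

    def __init__(self, average=False, clock=time.perf_counter):
        self._average = average
        self._clock = clock
        self.reset()

    def reset(self):
        self._t0 = self._clock()
        self.total = 0.0
        self.step_count = 0
        self.running = True

    def pause(self):
        if self.running:
            self.total += self._clock() - self._t0
            self.running = False

    def resume(self):
        if not self.running:
            self._t0 = self._clock()
            self.running = True

    def step(self):
        self.step_count += 1

    def value(self):
        total = self.total
        if self.running:
            total += self._clock() - self._t0
        # With average=True the total is divided by the step counter.
        denominator = max(self.step_count, 1) if self._average else 1
        return total / denominator


class FakeClock:
    """Manually advanced clock so the result does not depend on real time."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
t = TimerSketch(average=True, clock=clock)
for _ in range(10):
    t.resume()
    clock.advance(0.5)   # "work": counted
    t.pause()
    clock.advance(0.25)  # "idle": not counted while paused
    t.step()

avg = t.value()
print(avg)
```

Only the time between resume and pause is accumulated, so the average reflects the work portion of each iteration and ignores the idle portion.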
- attach(engine, start=Events.STARTED, pause=Events.COMPLETED, resume=None, step=None)
Register callbacks to control the timer.
- Parameters
engine (Engine) – Engine that this timer will be attached to.
start (Events) – Event which should start (reset) the timer.
pause (Events) – Event which should pause the timer.
resume (Events, optional) – Event which should resume the timer.
step (Events, optional) – Event which should call the step method of the counter.
- Returns
self (Timer)
- class ignite.handlers.TerminateOnNan(output_transform=<function TerminateOnNan.<lambda>>)
TerminateOnNan handler can be used to stop the training if the process_function's output contains a NaN or infinite number or torch.tensor. The output can be of type: number, tensor or collection of them. The training is stopped if at least one number or tensor has a NaN or infinite value. For example, if the output is [1.23, torch.tensor(...), torch.tensor(float('nan'))] the handler will stop the training.
- Parameters
output_transform (callable, optional) – A callable that is used to transform the Engine's process_function's output into a number or torch.tensor or collection of them. This can be useful if, for example, you have a multi-output model and you want to check one or multiple values of the output.
Examples:
trainer.add_event_handler(Events.ITERATION_COMPLETED, TerminateOnNan())
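The kind of check described above, a search for NaN or infinite values inside a possibly nested output, can be sketched for plain Python numbers. The function contains_non_finite is hypothetical and deliberately ignores tensors, which the real handler also covers.

```python
import math
from collections.abc import Mapping, Sequence
from numbers import Real


def contains_non_finite(output) -> bool:
    """Hypothetical helper: True if any real number in `output` is NaN or infinite."""
    if isinstance(output, Real):
        return not math.isfinite(output)
    if isinstance(output, Mapping):
        return any(contains_non_finite(v) for v in output.values())
    if isinstance(output, Sequence) and not isinstance(output, (str, bytes)):
        return any(contains_non_finite(v) for v in output)
    return False  # other types are ignored in this sketch


print(contains_non_finite([1.23, 2.0, float("nan")]))
print(contains_non_finite((0.1, [0.2, 0.3])))
```

A single offending value anywhere in the collection is enough for the check to report a problem, matching the "at least one" rule in the description.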