SystemResourcesMonitor

class torchtnt.framework.callbacks.SystemResourcesMonitor(loggers: Union[MetricLogger, List[MetricLogger]], *, logging_interval: Literal['epoch', 'step'] = 'epoch')

A callback which logs system stats, including:
  • CPU usage
  • resident set size
  • GPU usage
  • CUDA memory stats

Parameters:
  • loggers – Either a torchtnt.loggers.logger.MetricLogger or a list of torchtnt.loggers.logger.MetricLogger
  • logging_interval – whether to log system stats every step or every epoch. Defaults to every epoch.
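
The effect of logging_interval can be pictured with a small standalone sketch. This is not torchtnt's implementation: RecordingLogger, IntervalGatedMonitor, the log_dict method, the collect argument, and the fake_stats values are all hypothetical stand-ins, chosen only to show how an 'epoch' or 'step' setting might decide which hooks emit stats.

```python
from typing import Callable, Dict, List, Tuple, Union


class RecordingLogger:
    """Hypothetical stand-in for a metric logger; it only stores what it receives."""

    def __init__(self) -> None:
        self.records: List[Tuple[int, Dict[str, float]]] = []

    def log_dict(self, payload: Dict[str, float], step: int) -> None:
        # Assumed method name for this sketch, not taken from the source page.
        self.records.append((step, dict(payload)))


class IntervalGatedMonitor:
    """Illustrative only: gates stat logging on a logging interval."""

    def __init__(
        self,
        loggers: Union[RecordingLogger, List[RecordingLogger]],
        *,
        logging_interval: str = "epoch",
        collect: Callable[[], Dict[str, float]],
    ) -> None:
        if logging_interval not in ("epoch", "step"):
            raise ValueError("logging_interval must be 'epoch' or 'step'")
        # Accept either a single logger or a list, as the parameter list describes.
        self.loggers = loggers if isinstance(loggers, list) else [loggers]
        self.logging_interval = logging_interval
        self.collect = collect

    def _emit(self, step: int) -> None:
        # Collect once, then send the same snapshot to every logger.
        stats = self.collect()
        for logger in self.loggers:
            logger.log_dict(stats, step)

    def on_epoch_start(self, step: int) -> None:
        if self.logging_interval == "epoch":
            self._emit(step)

    def on_step_start(self, step: int) -> None:
        if self.logging_interval == "step":
            self._emit(step)


def fake_stats() -> Dict[str, float]:
    # Made-up numbers; a real monitor would query the OS and the GPU.
    return {"cpu_percent": 12.5}


logger = RecordingLogger()
monitor = IntervalGatedMonitor(logger, logging_interval="step", collect=fake_stats)
monitor.on_epoch_start(step=0)  # ignored because the interval is 'step'
monitor.on_step_start(step=0)
monitor.on_step_start(step=1)
print(len(logger.records))  # 2
```

With logging_interval='epoch' the same calls would instead produce a single record from on_epoch_start, which is the lower-volume default the parameter description mentions.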

on_eval_epoch_start(state: State, unit: EvalUnit[TEvalData]) → None

Hook called before a new eval epoch starts.

on_eval_step_start(state: State, unit: EvalUnit[TEvalData]) → None

Hook called before a new eval step starts.

on_predict_epoch_start(state: State, unit: PredictUnit[TPredictData]) → None

Hook called before a new predict epoch starts.

on_predict_step_start(state: State, unit: PredictUnit[TPredictData]) → None

Hook called before a new predict step starts.

on_train_epoch_start(state: State, unit: TrainUnit[TTrainData]) → None

Hook called before a new train epoch starts.

on_train_step_start(state: State, unit: TrainUnit[TTrainData]) → None

Hook called before a new train step starts.
