Metric Toolkit

torcheval.metrics.toolkit.classwise_converter(input: Tensor, name: str, labels: List[str] | None = None) → Dict[str, Tensor]

Converts an unaveraged metric result tensor into a dictionary in which each key has the form ‘metricname_classlabel’ and each value is the data associated with that class.

Parameters:
  • input (torch.Tensor) – The tensor to be split along its first dimension.

  • name (str) – Name of the metric.

  • labels (List[str], Optional) – Optional list of strings indicating the different classes.

Raises:

ValueError – When the length of labels is not equal to the number of classes.
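
Example (an illustrative sketch: the key names follow the documented ‘metricname_classlabel’ pattern, and the exact tensor formatting may differ):

>>> import torch
>>> from torcheval.metrics.toolkit import classwise_converter
>>> # Hypothetical unaveraged per-class recall for a 3-class problem.
>>> per_class_recall = torch.tensor([0.8, 0.4, 0.6])
>>> classwise_converter(per_class_recall, "recall", labels=["cat", "dog", "bird"])
{'recall_cat': tensor(0.8000), 'recall_dog': tensor(0.4000), 'recall_bird': tensor(0.6000)}
>>> # Without labels, class indices are used in the keys, e.g. 'recall_0', 'recall_1', 'recall_2'.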

torcheval.metrics.toolkit.clone_metric(metric: Metric) → Metric

Return a new metric instance which is cloned from the input metric.

Parameters:

metric – The metric object to clone

Returns:

A new metric instance from cloning
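
Example (a minimal sketch, assuming the clone carries an independent copy of the input metric's state):

>>> import torch
>>> from torcheval.metrics import Max
>>> from torcheval.metrics.toolkit import clone_metric
>>> max = Max()
>>> max.update(torch.tensor(3.0))
>>> cloned = clone_metric(max)
>>> cloned.update(torch.tensor(5.0)).compute()  # updating the clone leaves the original untouched
tensor(5.)
>>> max.compute()
tensor(3.)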

torcheval.metrics.toolkit.clone_metrics(metrics: _TMetrics) → List[Metric]

Return a list of new metric instances which are cloned from the input metrics.

Parameters:

metrics – The metric objects to clone

Returns:

A list of metric instances from cloning
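
Example (a minimal sketch, assuming the same copy semantics as clone_metric applied to each element):

>>> from torcheval.metrics import Max, Min
>>> from torcheval.metrics.toolkit import clone_metrics
>>> originals = [Max(), Min()]
>>> clones = clone_metrics(originals)
>>> len(clones)
2
>>> clones[0] is originals[0]  # clones are new instances, not references to the originals
False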

torcheval.metrics.toolkit.get_synced_metric(metric: Metric, process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) → Metric | None

Returns a metric object on recipient_rank whose internal state variables are synced across processes in the process_group. Returns None on non-recipient ranks.

If “all” is passed as recipient_rank, all ranks in the process_group are considered recipient ranks.

Parameters:
  • metric – The metric object to sync.

  • process_group – The process group on which the metric states are gathered. default: None (the entire world)

  • recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.

Raises:

ValueError – when recipient_rank is not an integer or string “all”.

Examples:

>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max
>>> max = Max()
>>> max.update(torch.tensor(dist.get_rank())).compute()
tensor(0.) # Rank 0
tensor(1.) # Rank 1
tensor(2.) # Rank 2
>>> synced_metric = get_synced_metric(max)  # by default sync metric states to Rank 0
>>> synced_metric.compute() if synced_metric else None
tensor(2.)     # Rank 0
None # Rank 1 -- synced_metric is None
None # Rank 2 -- synced_metric is None
>>> synced_metric = get_synced_metric(max, recipient_rank=1)
>>> synced_metric.compute() if synced_metric else None
None # Rank 0 -- synced_metric is None
tensor(2.)     # Rank 1
None # Rank 2 -- synced_metric is None
>>> get_synced_metric(max, recipient_rank="all").compute()
tensor(2.) # Rank 0
tensor(2.) # Rank 1
tensor(2.) # Rank 2

torcheval.metrics.toolkit.get_synced_metric_collection(metric_collection: MutableMapping[str, Metric], process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) → Dict[str, Metric] | None | MutableMapping[str, Metric]

Returns a dict of metric objects on recipient_rank whose internal state variables are synced across processes in the process_group. Returns None on non-recipient ranks.

The data transfer is batched to maximize efficiency.

If “all” is passed as recipient_rank, all ranks in the process_group are considered recipient ranks.

Parameters:
  • metric_collection (Dict[str, Metric]) – The dict of metric objects to sync.

  • process_group (ProcessGroup, optional) – The process group on which the metric states are gathered. default: None (the entire world)

  • recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.

Raises:

ValueError – when recipient_rank is not an integer or string “all”.

Examples:

>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max, Min
>>> metrics = {"max" : Max(), "min": Min()}
>>> metrics["max"].update(torch.tensor(dist.get_rank()))
>>> metrics["min"].update(torch.tensor(dist.get_rank()))
>>> synced_metrics = get_synced_metric_collection(metrics)

>>> # By default, metric states sync to rank 0.
>>> synced_metrics["max"].compute() if synced_metrics else None
tensor(2.) # Rank 0
None       # Rank 1 -- synced_metrics is None
None       # Rank 2 -- synced_metrics is None
>>> synced_metrics["min"].compute() if synced_metrics else None
tensor(0.) # Rank 0
None       # Rank 1 -- synced_metrics is None
None       # Rank 2 -- synced_metrics is None

>>> # You can also sync to all ranks, or choose a specific recipient rank.
>>> synced_metrics = get_synced_metric_collection(metrics, recipient_rank="all")
>>> synced_metrics["max"].compute()
tensor(2.) # Rank 0
tensor(2.) # Rank 1
tensor(2.) # Rank 2
>>> synced_metrics["min"].compute()
tensor(0.) # Rank 0
tensor(0.) # Rank 1
tensor(0.) # Rank 2

torcheval.metrics.toolkit.get_synced_state_dict(metric: Metric, process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) → Dict[str, Any]

Return the state dict of a metric after syncing on recipient_rank. Return an empty dict on other ranks.

Parameters:
  • metric – The metric object to sync and get state_dict()

  • process_group – The process group on which the metric states are gathered. default: None (the entire world)

  • recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.

Returns:

state dict of synced metric

Examples:

>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max
>>> max = Max()
>>> max.update(torch.tensor(dist.get_rank()))
>>> get_synced_state_dict(max)
{"max": tensor(2.)} # Rank 0
{} # Rank 1
{} # Rank 2
>>> get_synced_state_dict(max, recipient_rank="all")
{"max": tensor(2.)} # Rank 0
{"max": tensor(2.)} # Rank 1
{"max": tensor(2.)} # Rank 2

torcheval.metrics.toolkit.get_synced_state_dict_collection(metric_collection: MutableMapping[str, Metric], process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) → Dict[str, Dict[str, Any]] | None

Returns the state dicts of a collection of metrics after syncing on recipient_rank. Returns None on other ranks.

Parameters:
  • metric_collection (Dict[str, Metric]) – The metric objects to sync and get state_dict()

  • process_group – The process group on which the metric states are gathered. default: None (the entire world)

  • recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.

Returns:

A bundle of state dicts for the synced metrics

Examples:

>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max, Min
>>> maximum = Max()
>>> maximum.update(torch.tensor(dist.get_rank()))
>>> minimum = Min()
>>> minimum.update(torch.tensor(dist.get_rank()))
>>> get_synced_state_dict_collection({"max rank": maximum, "min rank": minimum})
{"max rank": {"max": tensor(2.)}, "min rank": {"min": tensor(0.)}} # Rank 0
None # Rank 1
None # Rank 2
>>> get_synced_state_dict_collection({"max rank": maximum, "min rank": minimum}, recipient_rank="all")
{"max rank": {"max": tensor(2.)}, "min rank": {"min": tensor(0.)}} # Rank 0
{"max rank": {"max": tensor(2.)}, "min rank": {"min": tensor(0.)}} # Rank 1
{"max rank": {"max": tensor(2.)}, "min rank": {"min": tensor(0.)}} # Rank 2

torcheval.metrics.toolkit.reset_metrics(metrics: _TMetrics) → _TMetrics

Resets the input metrics and returns the reset collection to the caller.

Parameters:

metrics – The metrics to be reset

Examples:

>>> import torch
>>> from torcheval.metrics import Max, Min
>>> max = Max()
>>> min = Min()
>>> max.update(torch.tensor(1)).compute()
>>> min.update(torch.tensor(2)).compute()
>>> max, min = reset_metrics((max, min))
>>> max.compute()
tensor(0.)
>>> min.compute()
tensor(0.)

torcheval.metrics.toolkit.sync_and_compute(metric: Metric[TComputeReturn], process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) → TComputeReturn | None

Syncs metric states and returns the metric.compute() result of the synced metric on the recipient rank. Returns None on other ranks.

Parameters:
  • metric – The metric object to be synced and computed.

  • process_group – The process group on which the metric states are gathered. default: None (the entire world)

  • recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.

Examples:

>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max
>>> max = Max()
>>> max.update(torch.tensor(dist.get_rank())).compute()
tensor(0.) # Rank 0
tensor(1.) # Rank 1
tensor(2.) # Rank 2
>>> sync_and_compute(max)
tensor(2.) # Rank 0
None # Rank 1
None # Rank 2
>>> sync_and_compute(max, recipient_rank="all")
tensor(2.) # Rank 0
tensor(2.) # Rank 1
tensor(2.) # Rank 2

torcheval.metrics.toolkit.sync_and_compute_collection(metrics: MutableMapping[str, Metric], process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) → Dict[str, Any] | None

Syncs metric states across a dict of metrics and returns the metric.compute() results of the synced metrics on the recipient rank. Returns None on other ranks.

Parameters:
  • metrics – The dict of metric objects to be synced and computed.

  • process_group – The process group on which the metric states are gathered. default: None (the entire world)

  • recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.

Examples:

>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max, Min
>>> metrics = {"max" : Max(), "min": Min()}
>>> metrics["max"].update(torch.tensor(dist.get_rank())).compute()
tensor(0.) # Rank 0
tensor(1.) # Rank 1
tensor(2.) # Rank 2
>>> metrics["min"].update(torch.tensor(dist.get_rank())).compute()
tensor(0.) # Rank 0
tensor(1.) # Rank 1
tensor(2.) # Rank 2
>>> sync_and_compute_collection(metrics)
{"max" : tensor(2.), "min": tensor(0.)} # Rank 0
None # Rank 1
None # Rank 2
>>> sync_and_compute_collection(metrics, recipient_rank="all")
{"max" : tensor(2.), "min": tensor(0.)} # Rank 0
{"max" : tensor(2.), "min": tensor(0.)} # Rank 1
{"max" : tensor(2.), "min": tensor(0.)} # Rank 2

torcheval.metrics.toolkit.to_device(metrics: _TMetrics, device: device, *args: Any, **kwargs: Any) → _TMetrics

Moves input metrics to the target device and returns the moved metrics back to users.

Parameters:
  • metrics – The metrics to be moved to the device

  • device – The device to move the metrics to

  • *args – Variadic arguments forwarded to Metric.to

  • **kwargs – Named arguments forwarded to Metric.to

Examples:

>>> import torch
>>> from torcheval.metrics import Max, Min
>>> max = Max()
>>> min = Min()
>>> max, min = to_device((max, min), torch.device("cuda"))
>>> max.device
torch.device("cuda")
>>> min.device
torch.device("cuda")
