Metric Toolkit¶
- torcheval.metrics.toolkit.classwise_converter(input: Tensor, name: str, labels: List[str] | None = None) Dict[str, Tensor] ¶
Converts an unaveraged metric result tensor into a dictionary in which each key is ‘metricname_classlabel’ and each value is the data associated with that class.
- Parameters:
input (torch.Tensor) – The tensor to be split along its first dimension.
name (str) – Name of the metric.
labels (List[str], Optional) – Optional list of strings indicating the different classes.
- Raises:
ValueError – When the length of labels is not equal to the number of classes.
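Example (a minimal sketch, not taken from the library's test suite; the input values are made up and the printed dictionary formatting is illustrative):
>>> import torch
>>> from torcheval.metrics.toolkit import classwise_converter
>>> unaveraged_recall = torch.tensor([0.8, 0.4, 0.6])  # one value per class
>>> classwise_converter(unaveraged_recall, "recall", labels=["cat", "dog", "bird"])
{'recall_cat': tensor(0.8000), 'recall_dog': tensor(0.4000), 'recall_bird': tensor(0.6000)}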
- torcheval.metrics.toolkit.clone_metric(metric: Metric) Metric ¶
Return a new metric instance which is cloned from the input metric.
- Parameters:
metric – The metric object to clone
- Returns:
A new metric instance from cloning
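Example (a minimal sketch; it assumes the clone is a deep copy that carries over the source metric's current state):
>>> import torch
>>> from torcheval.metrics import Max
>>> from torcheval.metrics.toolkit import clone_metric
>>> max = Max()
>>> max.update(torch.tensor(5))
>>> cloned_max = clone_metric(max)  # independent copy of the metric
>>> cloned_max.compute()
tensor(5.)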
- torcheval.metrics.toolkit.clone_metrics(metrics: _TMetrics) List[Metric] ¶
Return a list of new metric instances which are cloned from the input metrics.
- Parameters:
metrics – The metric objects to clone
- Returns:
A list of metric instances from cloning
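Example (a minimal sketch along the same lines as clone_metric):
>>> from torcheval.metrics import Max, Min
>>> from torcheval.metrics.toolkit import clone_metrics
>>> cloned = clone_metrics([Max(), Min()])  # a list of independent copies
>>> len(cloned)
2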
- torcheval.metrics.toolkit.get_synced_metric(metric: Metric, process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) Metric | None ¶
Returns a metric object on recipient_rank whose internal state variables are synced across processes in the process_group. Returns None on non-recipient ranks. If “all” is passed as recipient_rank, all ranks in the process_group are considered recipient ranks.
- Parameters:
metric – The metric object to sync.
process_group – The process group on which the metric states are gathered. Default: None (the entire world).
recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.
- Raises:
ValueError – when recipient_rank is not an integer or string “all”.
Examples:
>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max
>>> max = Max()
>>> max.update(torch.tensor(dist.get_rank())).compute()
tensor(0.) # Rank 0
tensor(1.) # Rank 1
tensor(2.) # Rank 2
>>> synced_metric = get_synced_metric(max)  # by default, sync metric states to rank 0
>>> synced_metric.compute() if synced_metric else None
tensor(2.) # Rank 0
None # Rank 1 -- synced_metric is None
None # Rank 2 -- synced_metric is None
>>> synced_metric = get_synced_metric(max, recipient_rank=1)
>>> synced_metric.compute() if synced_metric else None
None # Rank 0 -- synced_metric is None
tensor(2.) # Rank 1
None # Rank 2 -- synced_metric is None
>>> get_synced_metric(max, recipient_rank="all").compute()
tensor(2.) # Rank 0
tensor(2.) # Rank 1
tensor(2.) # Rank 2
- torcheval.metrics.toolkit.get_synced_metric_collection(metric_collection: MutableMapping[str, Metric], process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) Dict[str, Metric] | None | MutableMapping[str, Metric] ¶
Returns a dict of metric objects on the recipient_rank whose internal state variables are synced across processes in the process_group. Returns None on non-recipient ranks. The data transfer is batched to maximize efficiency. If “all” is passed as recipient_rank, all ranks in the process_group are considered recipient ranks.
- Parameters:
metric_collection (Dict[str, Metric]) – The dict of metric objects to sync.
process_group (ProcessGroup, optional) – The process group on which the metric states are gathered. Default: None (the entire world).
recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.
- Raises:
ValueError – when recipient_rank is not an integer or string “all”.
Examples:
>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max, Min
>>> metrics = {"max": Max(), "min": Min()}
>>> metrics["max"].update(torch.tensor(dist.get_rank()))
>>> metrics["min"].update(torch.tensor(dist.get_rank()))
>>> # By default, metric states sync to rank 0.
>>> synced_metrics = get_synced_metric_collection(metrics)
>>> synced_metrics["max"].compute() if synced_metrics else None
tensor(2.) # Rank 0
None # Rank 1 -- synced_metrics is None
None # Rank 2 -- synced_metrics is None
>>> synced_metrics["min"].compute() if synced_metrics else None
tensor(0.) # Rank 0
None # Rank 1 -- synced_metrics is None
None # Rank 2 -- synced_metrics is None
>>> # You can also sync to all ranks, or choose a specific recipient rank.
>>> synced_metrics = get_synced_metric_collection(metrics, recipient_rank="all")
>>> synced_metrics["max"].compute()
tensor(2.) # Rank 0
tensor(2.) # Rank 1
tensor(2.) # Rank 2
>>> synced_metrics["min"].compute()
tensor(0.) # Rank 0
tensor(0.) # Rank 1
tensor(0.) # Rank 2
- torcheval.metrics.toolkit.get_synced_state_dict(metric: Metric, process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) Dict[str, Any] ¶
Return the state dict of a metric after syncing on recipient_rank. Return an empty dict on other ranks.
- Parameters:
metric – The metric object to sync and get state_dict() from.
process_group – The process group on which the metric states are gathered. Default: None (the entire world).
recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.
- Returns:
state dict of synced metric
Examples:
>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max
>>> max = Max()
>>> max.update(torch.tensor(dist.get_rank()))
>>> get_synced_state_dict(max)
{"max": tensor(2.)} # Rank 0
{} # Rank 1
{} # Rank 2
>>> get_synced_state_dict(max, recipient_rank="all")
{"max": tensor(2.)} # Rank 0
{"max": tensor(2.)} # Rank 1
{"max": tensor(2.)} # Rank 2
- torcheval.metrics.toolkit.get_synced_state_dict_collection(metric_collection: MutableMapping[str, Metric], process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) Dict[str, Dict[str, Any]] | None ¶
Return the state dict of a collection of metrics after syncing on recipient_rank. Return None on other ranks.
- Parameters:
metric_collection (Dict[str, Metric]) – The metric objects to sync and get state_dict() from.
process_group – The process group on which the metric states are gathered. Default: None (the entire world).
recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.
- Returns:
Bundle of state dicts for the synced metrics
Examples:
>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max, Min
>>> maximum = Max()
>>> maximum.update(torch.tensor(dist.get_rank()))
>>> minimum = Min()
>>> minimum.update(torch.tensor(dist.get_rank()))
>>> get_synced_state_dict_collection({"max rank": maximum, "min rank": minimum})
{"max rank": {"max": tensor(2.)}, "min rank": {"min": tensor(0.)}} # Rank 0
None # Rank 1
None # Rank 2
>>> get_synced_state_dict_collection({"max rank": maximum, "min rank": minimum}, recipient_rank="all")
{"max rank": {"max": tensor(2.)}, "min rank": {"min": tensor(0.)}} # Rank 0
{"max rank": {"max": tensor(2.)}, "min rank": {"min": tensor(0.)}} # Rank 1
{"max rank": {"max": tensor(2.)}, "min rank": {"min": tensor(0.)}} # Rank 2
- torcheval.metrics.toolkit.reset_metrics(metrics: _TMetrics) _TMetrics ¶
Resets input metrics and returns the reset collection back to users.
- Parameters:
metrics – The metrics to be reset
Examples:
>>> import torch
>>> from torcheval.metrics import Max, Min
>>> max = Max()
>>> min = Min()
>>> max.update(torch.tensor(1)).compute()
>>> min.update(torch.tensor(2)).compute()
>>> max, min = reset_metrics((max, min))
>>> max.compute()
tensor(0.)
>>> min.compute()
tensor(0.)
- torcheval.metrics.toolkit.sync_and_compute(metric: Metric[TComputeReturn], process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) TComputeReturn | None ¶
Syncs metric states and returns the metric.compute() result of the synced metric on the recipient rank. Returns None on other ranks.
- Parameters:
metric – The metric object to be synced and computed.
process_group – The process group on which the metric states are gathered. Default: None (the entire world).
recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.
Examples:
>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max
>>> max = Max()
>>> max.update(torch.tensor(dist.get_rank())).compute()
tensor(0.) # Rank 0
tensor(1.) # Rank 1
tensor(2.) # Rank 2
>>> sync_and_compute(max)
tensor(2.) # Rank 0
None # Rank 1
None # Rank 2
>>> sync_and_compute(max, recipient_rank="all")
tensor(2.) # Rank 0
tensor(2.) # Rank 1
tensor(2.) # Rank 2
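A common usage pattern (a hedged sketch with placeholder names, not part of the API reference) is to update the metric locally on every rank during evaluation and call sync_and_compute once at the end, so only a single collective communication is paid per evaluation pass:
>>> # `model` and `eval_loader` are hypothetical placeholders for your own objects.
>>> from torcheval.metrics import Max
>>> from torcheval.metrics.toolkit import sync_and_compute
>>> metric = Max()
>>> for batch in eval_loader:
...     metric.update(model(batch))  # local update, no communication
>>> result = sync_and_compute(metric)  # single sync; non-recipient ranks get None
>>> if result is not None:  # true only on the recipient rank (rank 0 by default)
...     print(f"global max: {result}")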
- torcheval.metrics.toolkit.sync_and_compute_collection(metrics: MutableMapping[str, Metric], process_group: ProcessGroup | None = None, recipient_rank: int | Literal['all'] = 0) Dict[str, Any] | None ¶
Syncs metric states across a dict of metrics and returns the metric.compute() results of the synced metrics on the recipient rank. Returns None on other ranks.
- Parameters:
metrics – The dict of metric objects to be synced and computed.
process_group – The process group on which the metric states are gathered. Default: None (the entire world).
recipient_rank – The destination rank. If string “all” is passed in, then all ranks are the destination ranks.
Examples:
>>> # Assumes world_size of 3.
>>> # Process group initialization omitted on each rank.
>>> import torch
>>> import torch.distributed as dist
>>> from torcheval.metrics import Max, Min
>>> metrics = {"max": Max(), "min": Min()}
>>> metrics["max"].update(torch.tensor(dist.get_rank())).compute()
tensor(0.) # Rank 0
tensor(1.) # Rank 1
tensor(2.) # Rank 2
>>> metrics["min"].update(torch.tensor(dist.get_rank())).compute()
tensor(0.) # Rank 0
tensor(1.) # Rank 1
tensor(2.) # Rank 2
>>> sync_and_compute_collection(metrics)
{"max": tensor(2.), "min": tensor(0.)} # Rank 0
None # Rank 1
None # Rank 2
>>> sync_and_compute_collection(metrics, recipient_rank="all")
{"max": tensor(2.), "min": tensor(0.)} # Rank 0
{"max": tensor(2.), "min": tensor(0.)} # Rank 1
{"max": tensor(2.), "min": tensor(0.)} # Rank 2
- torcheval.metrics.toolkit.to_device(metrics: _TMetrics, device: device, *args: Any, **kwargs: Any) _TMetrics ¶
Moves input metrics to the target device and returns the moved metrics back to users.
- Parameters:
metrics – The metrics to be moved to the device
device – The device to move the metrics to.
*args – Variadic arguments forwarded to Metric.to.
**kwargs – Named arguments forwarded to Metric.to.
Examples:
>>> import torch
>>> from torcheval.metrics import Max, Min
>>> max = Max()
>>> min = Min()
>>> max, min = to_device((max, min), torch.device("cuda"))
>>> max.device
torch.device("cuda")
>>> min.device
torch.device("cuda")