class ignite.metrics.RougeL(multiref='average', alpha=0, output_transform=<function RougeL.<lambda>>, device=device(type='cpu'))

Calculates the Rouge-L score.

The Rouge-L is based on the length of the longest common subsequence of candidates and references.

More details can be found in Lin 2004.
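As an illustration of the underlying idea (a plain-Python sketch, not Ignite's actual implementation), Rouge-L precision and recall can be derived from the length of the longest common subsequence (LCS) between a candidate and a reference:

```python
def lcs_len(a, b):
    """Length of the longest common subsequence, via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

candidate = "the cat is not there".split()
reference = "the cat is on the mat".split()

# LCS is "the cat is", length 3
lcs = lcs_len(candidate, reference)
precision = lcs / len(candidate)  # 3 / 5 = 0.6
recall = lcs / len(reference)     # 3 / 6 = 0.5
```

These are the per-pair scores that the metric accumulates over candidate/reference pairs passed to `update`.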

  • update must receive output of the form (y_pred, y) or {'y_pred': y_pred, 'y': y}.

  • y_pred (list(list(str))) must be a sequence of tokens.

  • y (list(list(list(str)))) must be a list of sequences of tokens.

  • multiref (str) – how scores are reduced over multiple references. Valid values are “best” and “average” (default: “average”).

  • alpha (float) – controls the relative importance of recall and precision (alpha -> 0: recall is more important; alpha -> 1: precision is more important).

  • output_transform (Callable) – a callable that is used to transform the Engine’s process_function’s output into the form expected by the metric. This can be useful if, for example, you have a multi-output model and you want to compute the metric with respect to one of the outputs.

  • device (Union[str, torch.device]) – specifies which device updates are accumulated on. Setting the metric’s device to be the same as your update arguments ensures the update method is non-blocking. By default, CPU.
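To make the role of alpha concrete, a sketch of how precision and recall could be blended into the reported F-score, assuming the weighted harmonic mean F = P·R / ((1 - alpha)·P + alpha·R) (this formula is an assumption consistent with the example output below, not quoted from the library source):

```python
def rouge_f(precision, recall, alpha):
    """Weighted harmonic mean of precision and recall (assumed form)."""
    if precision == 0.0 or recall == 0.0:
        return 0.0
    return precision * recall / ((1 - alpha) * precision + alpha * recall)

# alpha -> 0 recovers recall (the default), alpha -> 1 recovers precision:
rouge_f(0.6, 0.5, 0.0)  # 0.5, pure recall
rouge_f(0.6, 0.5, 1.0)  # 0.6, pure precision
```

With the default alpha=0, the reported Rouge-L-F equals Rouge-L-R, which matches the example below.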


from ignite.metrics import RougeL

m = RougeL(multiref="best")

candidate = "the cat is not there".split()
references = [
    "the cat is on the mat".split(),
    "there is a cat on the mat".split(),
]

m.update(([candidate], [references]))

m.compute()

{'Rouge-L-P': 0.6, 'Rouge-L-R': 0.5, 'Rouge-L-F': 0.5}

New in version 0.4.5.