Bleu

class ignite.metrics.Bleu(ngram=4, smooth='no_smooth', output_transform=<function Bleu.<lambda>>, device=device(type='cpu'), average='macro')

Calculates the BLEU score.

$$\text{BLEU} = b_{p} \cdot \exp \left( \sum_{n=1}^{N} w_{n} \: \log p_{n} \right)$$

where $N$ is the order of n-grams, $b_{p}$ is a sentence brevity penalty, $w_{n}$ are positive weights summing to one, and $p_{n}$ are modified n-gram precisions.

More details can be found in Papineni et al. 2002.
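As an illustration of the formula, the following is a minimal pure-Python sketch of unsmoothed sentence-level BLEU with uniform weights $w_{n} = 1/N$. It is not the library's implementation; the helper names (`ngrams`, `modified_precision`, `brevity_penalty`, `sentence_bleu`) are hypothetical, and the closest-reference-length rule for the brevity penalty is assumed to follow the convention used by nltk.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(hypothesis, references, n):
    """Return (clipped matches, total hypothesis n-grams) for order n."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    # Each n-gram is clipped at its maximum count in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram]) for gram, count in hyp_counts.items())
    return clipped, sum(hyp_counts.values())


def brevity_penalty(hyp_len, ref_lens):
    """Penalize hypotheses shorter than the closest reference length."""
    if hyp_len == 0:
        return 0.0
    # Assumption: ties between equally close references go to the shorter one.
    closest = min(ref_lens, key=lambda r: (abs(r - hyp_len), r))
    return 1.0 if hyp_len >= closest else math.exp(1 - closest / hyp_len)


def sentence_bleu(hypothesis, references, max_n=4):
    """Unsmoothed BLEU with uniform weights w_n = 1 / max_n."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        num, den = modified_precision(hypothesis, references, n)
        if num == 0:
            return 0.0  # log(0) is undefined, so the unsmoothed score collapses to 0
        log_sum += math.log(num / den) / max_n
    bp = brevity_penalty(len(hypothesis), [len(r) for r in references])
    return bp * math.exp(log_sum)


refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
hyp = "the cat is on the mat".split()
score = sentence_bleu(hyp, refs)  # hypothesis equals one reference
```

A hypothesis that shares no bigram with any reference scores 0 under this sketch, which is the situation smoothing is meant to address.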

In addition, a review of smoothing techniques can be found in Chen et al. 2014.
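To make the effect of smoothing concrete, here is a small sketch of one common scheme: replacing a zero match count with a small epsilon before dividing. The value epsilon = 0.1 and the zero-count rule follow nltk's first smoothing method and are an assumption about what `smooth1` does in this class; the function names are hypothetical. The counts are those of the hypothesis and references used in the Examples section.

```python
import math


def smooth_epsilon(numerators, denominators, epsilon=0.1):
    """Replace zero match counts with a small epsilon before dividing."""
    return [
        (num if num != 0 else epsilon) / den
        for num, den in zip(numerators, denominators)
    ]


def geometric_mean(precisions):
    """Uniformly weighted geometric mean: exp(sum(w_n * log p_n)), w_n = 1/N."""
    return math.exp(sum(math.log(p) for p in precisions) / len(precisions))


# Clipped matches and n-gram totals (n = 1..4) for the hypothesis
# "the the the the the the the" against the two references
# "the cat is on the mat" and "there is a cat on the mat".
numerators = [2, 0, 0, 0]
denominators = [7, 6, 5, 4]

precisions = smooth_epsilon(numerators, denominators)
# The 7-token hypothesis is as long as the closest reference,
# so the brevity penalty is 1.
bleu = 1.0 * geometric_mean(precisions)
print(round(bleu, 4))  # 0.0393
```

Under these assumptions the result agrees with the value printed in the Examples section for `smooth="smooth1"`.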

update must receive output of the form (y_pred, y) or {'y_pred': y_pred, 'y': y}.
y_pred (list(list(str))) - a list of hypothesis sentences, each given as a list of tokens.
y (list(list(list(str)))) - for each hypothesis, a list of reference sentences, each given as a list of tokens.
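The nesting described above can be sketched as follows. `to_bleu_batch` is a hypothetical helper, not part of the library, and whitespace tokenization is an assumption made for brevity.

```python
def to_bleu_batch(hypotheses, references):
    """Tokenize on whitespace and nest the inputs as (y_pred, y).

    hypotheses: list of strings, one per sample.
    references: list of lists of strings, one list of references per sample.
    """
    if len(hypotheses) != len(references):
        raise ValueError("each hypothesis needs its own list of references")
    y_pred = [hyp.split() for hyp in hypotheses]
    y = [[ref.split() for ref in refs] for refs in references]
    return y_pred, y


y_pred, y = to_bleu_batch(
    ["the cat sat on the mat", "a dog barked"],
    [["the cat is on the mat", "there is a cat on the mat"], ["the dog barked"]],
)
# y_pred[i] is one tokenized hypothesis; y[i] holds every tokenized
# reference for that hypothesis. The pair (y_pred, y) has the shape
# that update expects as its output argument.
```

The number of references may differ from one hypothesis to the next, but `y_pred` and `y` must have the same length.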

Remark:

This implementation is inspired by nltk.

Parameters

ngram (int) – order of n-grams.
smooth (str) – smoothing method to apply. Valid values are no_smooth, smooth1, nltk_smooth2, or smooth2. Default: no_smooth.
output_transform (Callable) – a callable that is used to transform the Engine’s process_function’s output into the form expected by the metric. This can be useful if, for example, you have a multi-output model and you want to compute the metric with respect to one of the outputs. By default, metrics require the output as (y_pred, y) or {'y_pred': y_pred, 'y': y}.
device (Union[str, device]) – specifies which device updates are accumulated on. Setting the metric’s device to be the same as your update arguments ensures the update method is non-blocking. By default, CPU.
average (str) – specifies which type of averaging to use (macro or micro). For more details, refer to https://www.nltk.org/_modules/nltk/translate/bleu_score.html. Default: macro.
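The difference between the two averaging modes can be sketched with plain counts. This assumes the usual reading, in which macro averages per-sentence scores and micro pools n-gram counts over the whole corpus before scoring once; the counts below are made up for illustration, and the brevity penalty is left out to keep the sketch short.

```python
import math


def geometric_mean_precision(numerators, denominators):
    """exp(mean(log p_n)) over the n-gram orders, p_n = numerator / denominator."""
    logs = [math.log(num / den) for num, den in zip(numerators, denominators)]
    return math.exp(sum(logs) / len(logs))


# Hypothetical clipped-match counts for two sentences, n-gram orders 1 and 2.
# Each entry is (numerators, denominators).
sentences = [
    ([3, 1], [4, 3]),
    ([8, 6], [8, 7]),
]

# Macro: score each sentence on its own, then average the scores.
macro = sum(geometric_mean_precision(n, d) for n, d in sentences) / len(sentences)

# Micro: pool the counts across sentences first, then score once.
pooled_num = [sum(n[i] for n, _ in sentences) for i in range(2)]
pooled_den = [sum(d[i] for _, d in sentences) for i in range(2)]
micro = geometric_mean_precision(pooled_num, pooled_den)
```

Because longer sentences contribute more n-grams to the pooled counts, micro averaging weights them more heavily, while macro averaging gives every sentence the same weight.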

Examples

For more information on how metrics work with Engine, see Attach Engine API.

from ignite.metrics.nlp import Bleu

m = Bleu(ngram=4, smooth="smooth1")

y_pred = "the the the the the the the"
y = ["the cat is on the mat", "there is a cat on the mat"]

m.update(([y_pred.split()], [[_y.split() for _y in y]]))

print(m.compute())

tensor(0.0393, dtype=torch.float64)

New in version 0.4.5.

Changed in version 0.4.7:

update method has changed and now works on batch of inputs.
added average option to handle micro and macro averaging modes.

Methods

`compute`	Computes the metric based on its accumulated state.
`reset`	Resets the metric to its initial state.
`update`	Updates the metric's state using the passed batch output.

compute()

Computes the metric based on its accumulated state.

By default, this is called at the end of each epoch.

Returns

the actual quantity of interest. However, if a Mapping is returned, it will be (shallow) flattened into engine.state.metrics when completed() is called.

Return type

Any

Raises

NotComputableError – raised when the metric cannot be computed.

reset()

Resets the metric to its initial state.

By default, this is called at the start of each epoch.

Return type: None

update(output)

Updates the metric’s state using the passed batch output.

By default, this is called once for each batch.

Parameters: output (Tuple[Sequence[Sequence[Any]], Sequence[Sequence[Sequence[Any]]]]) – this is the output from the engine’s process function.
Return type: None
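The calling order of the three methods can be illustrated with a toy stand-in. `ToyMeanMetric` is hypothetical and only mimics the reset / update / compute cycle described above; it does not compute BLEU, and it raises a plain `RuntimeError` where the real metric would raise NotComputableError.

```python
class ToyMeanMetric:
    """Hypothetical stand-in that mimics the reset / update / compute cycle."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Start of an epoch: clear the accumulated state.
        self._total = 0.0
        self._count = 0

    def update(self, output):
        # Once per batch: fold the per-sample scores into the state.
        for score in output:
            self._total += score
            self._count += 1

    def compute(self):
        # End of an epoch: turn the accumulated state into the final value.
        if self._count == 0:
            raise RuntimeError("compute called before any update")
        return self._total / self._count


metric = ToyMeanMetric()
results = []
# Two epochs: the first has two batches, the second has one.
for epoch_batches in ([[0.2, 0.4], [0.6]], [[1.0]]):
    metric.reset()
    for batch in epoch_batches:
        metric.update(batch)
    results.append(metric.compute())
```

Each epoch's result depends only on that epoch's batches, because the state is cleared by `reset` before the first `update`.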