torchaudio.functional.merge_tokens

torchaudio.functional.merge_tokens(tokens: Tensor, scores: Tensor, blank: int = 0) → List[TokenSpan]

Removes repeated tokens and blank tokens from the given CTC token sequence.

Parameters:
  • tokens (Tensor) – Alignment tokens (unbatched) returned from forced_align(). Shape: (time, ).

  • scores (Tensor) – Alignment scores (unbatched) returned from forced_align(). Shape: (time, ). The score of each merged token span is the average of the given frame-level scores over the corresponding time span.

  • blank (int, optional) – Index of the blank token to remove. Default: 0.

Returns:

list of TokenSpan
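
The merging rule described above can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: `Span` and `merge_tokens_sketch` are hypothetical names, plain lists stand in for tensors, and the `start`/`end` frame fields (with `end` exclusive) are assumptions about what a span records. Consecutive frames carrying the same token collapse into one span, blank spans are dropped, and each span's score is the mean of its frame scores.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Sequence


@dataclass
class Span:
    """Hypothetical stand-in for TokenSpan; the field names are assumptions."""
    token: int
    start: int    # first frame of the span (inclusive)
    end: int      # one past the last frame of the span (exclusive)
    score: float  # mean of the frame-level scores inside the span


def merge_tokens_sketch(
    tokens: Sequence[int], scores: Sequence[float], blank: int = 0
) -> List[Span]:
    """Collapse runs of identical tokens, drop blank runs, average the scores."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    spans = []
    start = 0
    # groupby yields one group per run of consecutive equal tokens.
    for token, run in groupby(tokens):
        end = start + len(list(run))
        if token != blank:
            span_scores = scores[start:end]
            spans.append(Span(token, start, end, sum(span_scores) / len(span_scores)))
        start = end
    return spans


# Frame-level alignment: 0 is the blank token.
tokens = [0, 3, 3, 0, 0, 5, 5, 5, 3]
scores = [1.0, 0.5, 0.25, 1.0, 1.0, 0.5, 0.75, 0.25, 1.0]
for span in merge_tokens_sketch(tokens, scores):
    print(span)
```

Note that the final token 3 forms its own span even though token 3 appeared earlier: only consecutive repeats are merged.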

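Example

>>> from torchaudio.functional import forced_align, merge_tokens
>>> aligned_tokens, scores = forced_align(emission, targets, input_lengths, target_lengths)
>>> # forced_align() returns batched outputs; take the first item to get 1D tensors
>>> token_spans = merge_tokens(aligned_tokens[0], scores[0])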

Tutorials using merge_tokens:
  • CTC forced alignment API tutorial
  • Forced alignment for multilingual data
