torchaudio.functional.merge_tokens
- torchaudio.functional.merge_tokens(tokens: Tensor, scores: Tensor, blank: int = 0) -> List[TokenSpan]
Removes repeated tokens and blank tokens from the given CTC token sequence.
- Parameters:
tokens (Tensor) – Alignment tokens (unbatched) returned from forced_align(). Shape: (time, ).
scores (Tensor) – Alignment scores (unbatched) returned from forced_align(). Shape: (time, ). The score of each merged token span is the average of the given scores over that span's time frames.
blank (int, optional) – Index of the blank token. Default: 0.
- Returns:
list of TokenSpan
Example
>>> aligned_tokens, scores = forced_align(emission, targets, input_lengths, target_lengths)
>>> token_spans = merge_tokens(aligned_tokens[0], scores[0])
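The merging behavior described above can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: `Span` and `merge_tokens_sketch` are hypothetical names introduced here, plain lists stand in for tensors, and treating `end` as an exclusive frame index is an assumption of the sketch. It collapses each run of identical frame-level tokens into one span, drops runs of the blank token, and averages the per-frame scores over each kept span.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Span:
    """Hypothetical stand-in for TokenSpan, used only in this sketch."""
    token: int
    start: int    # first frame of the span (inclusive)
    end: int      # one past the last frame of the span (exclusive, assumed)
    score: float  # mean of the per-frame scores over [start, end)


def merge_tokens_sketch(
    tokens: Sequence[int], scores: Sequence[float], blank: int = 0
) -> List[Span]:
    """Collapse runs of repeated tokens, skip blank runs, average scores."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    spans: List[Span] = []
    start = 0
    # Walk one step past the end so the final run is also closed.
    for t in range(1, len(tokens) + 1):
        if t == len(tokens) or tokens[t] != tokens[start]:
            if tokens[start] != blank:
                run = scores[start:t]
                spans.append(Span(tokens[start], start, t, sum(run) / len(run)))
            start = t
    return spans


# Frame-level alignment: blank (0), token 3 twice, blank, token 5 three times, blank.
frame_tokens = [0, 3, 3, 0, 5, 5, 5, 0]
frame_scores = [1.0, 0.5, 0.75, 1.0, 0.25, 0.5, 0.75, 1.0]
for span in merge_tokens_sketch(frame_tokens, frame_scores):
    print(span)
# Span(token=3, start=1, end=3, score=0.625)
# Span(token=5, start=4, end=7, score=0.5)
```

In this sketch, two occurrences of the same token separated by a blank frame produce two separate spans, while adjacent identical frames are merged into one, which matches the usual CTC collapsing rule.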
- Tutorials using merge_tokens:
CTC forced alignment API tutorial
Forced alignment for multilingual data