cuda_ctc_decoder

torchaudio.models.decoder.cuda_ctc_decoder(tokens: Union[str, List[str]], nbest: int = 1, beam_size: int = 10, blank_skip_threshold: float = 0.95) → CUCTCDecoder

Builds an instance of CUCTCDecoder.

Parameters:

tokens (str or List[str]) – File or list containing valid tokens. If using a file, tokens that map to the same index are expected to be on the same line.
beam_size (int, optional) – The maximum number of hypotheses to hold after each decode step (Default: 10)
nbest (int, optional) – The number of best decodings to return (Default: 1)
blank_id (int) – The token ID corresponding to the blank symbol.
blank_skip_threshold (float, optional) – Skip frames if log_prob(blank) > log(blank_skip_threshold), to speed up decoding (Default: 0.95)
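
The frame-skipping rule for blank_skip_threshold can be illustrated in plain Python. This is only a sketch of the comparison as stated above, not the decoder's CUDA implementation; the helper name frames_kept and the sample probabilities are hypothetical.

```python
import math


def frames_kept(blank_log_probs, blank_skip_threshold=0.95):
    """Return indices of frames that the stated rule would NOT skip.

    A frame is skipped when log_prob(blank) > log(blank_skip_threshold).
    """
    cutoff = math.log(blank_skip_threshold)
    return [t for t, lp in enumerate(blank_log_probs) if lp <= cutoff]


# Hypothetical blank probabilities for five frames, converted to log space.
blank_probs = [0.99, 0.50, 0.97, 0.10, 0.96]
blank_log_probs = [math.log(p) for p in blank_probs]

kept = frames_kept(blank_log_probs)  # frames 0, 2 and 4 are blank-dominated
print(kept)
```

Under this rule, a threshold of 1.0 skips nothing, because a log-probability never exceeds log(1.0) = 0. Lowering the threshold skips more frames.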

Returns:

decoder

Return type:

CUCTCDecoder

Example

>>> decoder = cuda_ctc_decoder(
...     tokens="tokens.txt",
...     blank_skip_threshold=0.95,
... )
>>> results = decoder(log_probs, encoder_out_lens)  # List of shape (B, nbest) of hypotheses
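
The example above assumes a tokens file already exists. A minimal sketch of preparing one follows, assuming one token per line with the 0-based line number serving as the token index, and assuming the blank symbol sits at index 0. The vocabulary, the "<blk>" spelling, and the temporary path are hypothetical.

```python
import os
import tempfile

# Hypothetical vocabulary; index 0 is assumed to be the blank symbol.
tokens = ["<blk>", "a", "b", "c"]

# Write one token per line so that the line number is the token index.
path = os.path.join(tempfile.mkdtemp(), "tokens.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("\n".join(tokens) + "\n")

# Read the file back to confirm the index-to-token mapping.
with open(path, encoding="utf-8") as f:
    index_to_token = [line.rstrip("\n") for line in f]

print(index_to_token)
```

The resulting path could then be passed as the tokens argument. Alternatively, the list itself can be passed, since tokens accepts either a file or a list.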

Tutorials using cuda_ctc_decoder:

ASR Inference with CUDA CTC Decoder
