padded_collate¶
- torchtune.utils.padded_collate(batch: List[Tuple[List[int], List[int]]], padding_idx: int = 0, ignore_idx: int = - 100) Tuple[Tensor, Tensor] [source]¶
Pad a batch of sequences to the longest sequence length in the batch, and convert integer lists to tensors.
- Parameters:
- Returns:
Collated input and label tensors.
Example
>>> token_pairs = [ >>> ([1, 2, 3], [4, 5, 6]), >>> ([7,], [10,],), >>> ] >>> inputs, labels = padded_collate( >>> batch=token_pairs, >>> padding_idx=padding_idx, >>> ignore_idx=ignore_idx, >>> ) >>> inputs >>> tensor([[1, 2, 3], [7, 0, 0]]) >>> labels >>> tensor([[4,5,6], [10,-100,-100]])