torchao.dtypes
Layouts and Tensor Subclasses
NF4Tensor class for converting a weight to the QLoRA NF4 format.
Affine quantized tensor subclass.
The Layout class serves as a base class for defining different data layouts for tensors.
PlainLayout is the most basic layout class, inheriting from the Layout base class.
SemiSparseLayout is a layout class for handling semi-structured sparse matrices in affine quantized tensors.
TensorCoreTiledLayout is a layout class for handling tensor core tiled layouts in affine quantized tensors.
Represents the layout configuration for Float8 affine quantized tensors.
MarlinSparseLayout is a layout class for handling sparse tensor formats specifically designed for the Marlin sparse kernel.
BlockSparseLayout is a data class that represents the layout of a block sparse matrix.
A layout class for Uintx tensors, which are tensors with elements packed into smaller bit-widths than the standard 8-bit byte.
MarlinQQQ quantized tensor subclass which inherits AffineQuantizedTensor class.
MarlinQQQLayout is a layout class for Marlin QQQ quantization.
Layout class for int4 CPU layout for affine quantized tensor, used by tinygemm kernels _weight_int4pack_mm_for_cpu.
Layout class for int4 packed layout for affine quantized tensor, for cutlass kernel.
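The Uintx and int4 layout entries above describe storing elements in fewer bits than a full byte. As a rough illustration of that idea only, the following pure-Python sketch packs 4-bit unsigned values two per byte. The names `pack_uint4` and `unpack_uint4` are hypothetical, the low-nibble-first ordering is an assumption, and none of this is meant to reflect the packing scheme torchao actually uses.

```python
def pack_uint4(values):
    """Pack 4-bit unsigned ints (0..15) two per byte, low nibble first.

    Hypothetical helper for illustration; not part of torchao.
    """
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of values")
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("each value must fit in 4 bits (0..15)")
    packed = bytearray()
    # Pair up consecutive values: the first goes in the low nibble,
    # the second in the high nibble of the same byte.
    for lo, hi in zip(values[0::2], values[1::2]):
        packed.append(lo | (hi << 4))
    return bytes(packed)


def unpack_uint4(packed):
    """Inverse of pack_uint4: recover the list of 4-bit values."""
    values = []
    for byte in packed:
        values.append(byte & 0x0F)  # low nibble
        values.append(byte >> 4)    # high nibble
    return values
```

Packing halves the storage for 4-bit data, at the cost of an unpack step (or a kernel that reads the packed format directly) before the values can be used.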
Quantization techniques
Convert a high precision tensor to an integer affine quantized tensor.
Create an integer AffineQuantizedTensor from a high precision tensor using static parameters.
Create a floatx AffineQuantizedTensor from a high precision tensor.
Convert a high precision tensor to a float8 quantized tensor.
Create a float8 AffineQuantizedTensor from a high precision tensor using static parameters.
Converts a floating point tensor to a Marlin QQQ quantized tensor.
Convert a given tensor to normalized float 4-bit tensor.
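Several entries above distinguish between computing quantization parameters from the data and supplying them as static parameters. The following pure-Python sketch shows the general integer affine scheme (a scale and a zero point mapping floats to integers) under assumed int8 bounds. The function names `choose_qparams`, `quantize`, and `dequantize` are hypothetical here, and the code is a conceptual illustration rather than torchao's implementation or API.

```python
def choose_qparams(values, qmin=-128, qmax=127):
    """Derive a scale and zero point from the data (hypothetical helper)."""
    # Include 0.0 in the range so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # all-zero input: any positive scale works
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers: q = clamp(round(x / scale) + zero_point)."""
    return [
        max(qmin, min(qmax, int(round(v / scale)) + zero_point))
        for v in values
    ]


def dequantize(q_values, scale, zero_point):
    """Map integers back to floats: x ~ (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]


# Parameters computed from the data.
weights = [-1.0, 0.0, 0.5, 1.0]
scale, zero_point = choose_qparams(weights)
restored = dequantize(quantize(weights, scale, zero_point), scale, zero_point)

# Static parameters supplied by the caller (for example, from calibration).
static_q = quantize([0.0, 0.5, 1.0, -1.5], scale=0.5, zero_point=10)
```

In the static case the caller fixes `scale` and `zero_point` ahead of time, so no pass over the data is needed at quantization time; values outside the representable range are simply clamped.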