class torch.ao.nn.quantized.Linear(in_features, out_features, bias_=True, dtype=torch.qint8)[source]

A quantized linear module with quantized tensors as inputs and outputs. We adopt the same interface as torch.nn.Linear; please see the torch.nn.Linear documentation for details.

Similar to Linear, attributes will be randomly initialized at module creation time and will be overwritten later:

  • weight (Tensor) – the non-learnable quantized weights of the module of shape (out_features, in_features).

  • bias (Tensor) – the non-learnable bias of the module of shape (out_features). If bias is True, the values are initialized to zero.

  • scale (double) – scale parameter of the output quantized tensor

  • zero_point (long) – zero_point parameter of the output quantized tensor


>>> m = nn.quantized.Linear(20, 30)
>>> input = torch.randn(128, 20)
>>> input = torch.quantize_per_tensor(input, 1.0, 0, torch.quint8)
>>> output = m(input)
>>> print(output.size())
torch.Size([128, 30])
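As a complement to the attribute list above, the following sketch inspects those attributes on an instance. Note that on the quantized module the weight and bias are read through the weight()/bias() accessor methods rather than as plain attributes:

```python
import torch
from torch import nn

m = nn.quantized.Linear(20, 30)

# Output quantization parameters (see the attribute list above).
print(m.scale)       # a Python float (double)
print(m.zero_point)  # a Python int (long)

# Weight and bias are exposed through accessor methods.
w = m.weight()  # quantized tensor of shape (30, 20)
b = m.bias()    # float tensor of shape (30,)
print(w.shape, b.shape)
```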
classmethod from_float(mod)[source]

Create a quantized module from an observed float module


mod (Module) – a float module, either produced by utilities or provided by the user
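A minimal eager-mode sketch of producing such an observed float module and converting it, assuming the fbgemm backend is available on your build. prepare() attaches the activation observer that from_float reads to compute the output scale and zero_point:

```python
import torch
import torch.ao.quantization as tq

# Wrap the float Linear so prepare() can attach observers to it as a child.
model = torch.nn.Sequential(torch.nn.Linear(20, 30)).eval()
model.qconfig = tq.get_default_qconfig("fbgemm")

# Attach observers, then run a calibration pass to record activation ranges.
tq.prepare(model, inplace=True)
model(torch.randn(16, 20))

# Build the quantized module from the observed float module.
qlinear = torch.nn.quantized.Linear.from_float(model[0])

qx = torch.quantize_per_tensor(torch.randn(16, 20), 0.1, 0, torch.quint8)
print(qlinear(qx).size())  # torch.Size([16, 30])
```

In practice this conversion is usually driven by torch.ao.quantization.convert rather than by calling from_float directly.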

classmethod from_reference(ref_qlinear, output_scale, output_zero_point)[source]

Create a (fbgemm/qnnpack) quantized module from a reference quantized module

  • ref_qlinear (Module) – a reference quantized linear module, either produced by utilities or provided by the user

  • output_scale (float) – scale for output Tensor

  • output_zero_point (int) – zero point for output Tensor
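A sketch of this lowering path, assuming the reference module class is importable as torch.ao.nn.quantized.reference.Linear and that its from_float accepts a weight_qparams dict (the exact location and signature vary across PyTorch versions; check yours):

```python
import torch
from torch.ao.nn.quantized.reference import Linear as RefLinear

float_linear = torch.nn.Linear(20, 30)

# weight_qparams describes how the reference module quantizes its weight
# (illustrative per-tensor values, not calibrated).
weight_qparams = {
    "qscheme": torch.per_tensor_affine,
    "dtype": torch.qint8,
    "scale": 0.05,
    "zero_point": 0,
}
ref_qlinear = RefLinear.from_float(float_linear, weight_qparams)

# Lower the reference module to an fbgemm/qnnpack quantized module,
# supplying the output tensor's scale and zero_point.
qlinear = torch.nn.quantized.Linear.from_reference(ref_qlinear, 0.1, 0)

qx = torch.quantize_per_tensor(torch.randn(4, 20), 0.1, 0, torch.quint8)
print(qlinear(qx).size())  # torch.Size([4, 30])
```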

