class, dtype=torch.quint8, qscheme=torch.per_channel_affine, reduce_range=False, quant_min=None, quant_max=None, factory_kwargs=None, memoryless=False)[source]

Observer module for computing the quantization parameters based on the running per channel min and max values.

This observer uses the tensor min/max statistics to compute the per channel quantization parameters. The module records the running minimum and maximum of incoming tensors, and uses this statistic to compute the quantization parameters.

  • ch_axis – Channel axis

  • dtype – Quantized data type

  • qscheme – Quantization scheme to be used

  • reduce_range – Reduces the range of the quantized data type by 1 bit

  • quant_min – Minimum quantization value. If unspecified, it will follow the 8-bit setup.

  • quant_max – Maximum quantization value. If unspecified, it will follow the 8-bit setup.

  • memoryless – Boolean that controls whether observer removes old data when a new input is seen. This is most useful for simulating dynamic quantization, especially during QAT.

The quantization parameters are computed the same way as in MinMaxObserver, with the difference that the running min/max values are stored per channel. Scales and zero points are thus computed per channel as well.


If the running minimum equals to the running maximum, the scales and zero_points are set to 1.0 and 0.


Resets the min/max values.


Access comprehensive developer documentation for PyTorch

View Docs


Get in-depth tutorials for beginners and advanced developers

View Tutorials


Find development resources and get your questions answered

View Resources