BackendPatternConfig¶
- class torch.ao.quantization.backend_config.BackendPatternConfig(pattern=None)[source][source]¶
Config object that specifies quantization behavior for a given operator pattern. For a detailed example usage, see
BackendConfig
.- add_dtype_config(dtype_config)[source][source]¶
Add a set of supported data types passed as arguments to quantize ops in the reference model spec.
- Return type
- classmethod from_dict(backend_pattern_config_dict)[source][source]¶
Create a
BackendPatternConfig
from a dictionary with the following items:“pattern”: the pattern being configured “observation_type”: the
ObservationType
that specifies how observers should be inserted for this pattern “dtype_configs”: a list of dictionaries that representsDTypeConfig
s “root_module”: atorch.nn.Module
that represents the root for this pattern “qat_module”: atorch.nn.Module
that represents the QAT implementation for this pattern “reference_quantized_module”: atorch.nn.Module
that represents the reference quantized implementation for this pattern’s root module. “fused_module”: atorch.nn.Module
that represents the fused implementation for this pattern “fuser_method”: a function that specifies how to fuse the pattern for this pattern “pattern_complex_format”: the pattern specified in the reversed nested tuple format (deprecated)- Return type
- set_dtype_configs(dtype_configs)[source][source]¶
Set the supported data types passed as arguments to quantize ops in the reference model spec, overriding all previously registered data types.
- Return type
- set_fused_module(fused_module)[source][source]¶
Set the module that represents the fused implementation for this pattern.
- Return type
- set_fuser_method(fuser_method)[source][source]¶
Set the function that specifies how to fuse this BackendPatternConfig’s pattern.
The first argument of this function should be is_qat, and the rest of the arguments should be the items in the tuple pattern. The return value of this function should be the resulting fused module.
For example, the fuser method for the pattern (torch.nn.Linear, torch.nn.ReLU) can be:
- def fuse_linear_relu(is_qat, linear, relu):
return torch.ao.nn.intrinsic.LinearReLU(linear, relu)
For a more complicated example, see https://gist.github.com/jerryzh168/8bea7180a8ba3c279f2c9b050f2a69a6.
- Return type
- set_observation_type(observation_type)[source][source]¶
Set how observers should be inserted in the graph for this pattern.
Observation type here refers to how observers (or quant-dequant ops) will be placed in the graph. This is used to produce the desired reference patterns understood by the backend. Weighted ops such as linear and conv require different observers (or quantization parameters passed to quantize ops in the reference model) for the input and the output.
There are two observation types:
OUTPUT_USE_DIFFERENT_OBSERVER_AS_INPUT (default): the output observer instance will be different from the input. This is the most common observation type.
OUTPUT_SHARE_OBSERVER_WITH_INPUT: the output observer instance will be the same as the input. This is useful for operators like cat.
Note: This will be renamed in the near future, since we will soon insert QuantDeQuantStubs with observers (and fake quantizes) attached instead of observers themselves.
- Return type
- set_pattern(pattern)[source][source]¶
Set the pattern to configure.
The pattern can be a float module, functional operator, pytorch operator, or a tuple combination of the above. Tuple patterns are treated as sequential patterns, and currently only tuples of 2 or 3 elements are supported.
- Return type
- set_qat_module(qat_module)[source][source]¶
Set the module that represents the QAT implementation for this pattern.
- Return type
- set_reference_quantized_module(reference_quantized_module)[source][source]¶
Set the module that represents the reference quantized implementation for this pattern’s root module.
For more detail, see
set_root_module()
.- Return type
- set_root_module(root_module)[source][source]¶
Set the module that represents the root for this pattern.
When we construct the reference quantized model during the convert phase, the root modules (e.g. torch.nn.Linear for torch.ao.nn.intrinsic.LinearReLU) will be swapped to the corresponding reference quantized modules (e.g. torch.ao.nn.reference.quantized.Linear). This allows custom backends to specify custom reference quantized module implementations to match the numerics of their lowered operators. Since this is a one-to-one mapping, both the root module and the reference quantized module must be specified in the same BackendPatternConfig in order for the conversion to take place.
- Return type