Template Function torch_tensorrt::ptq::make_int8_cache_calibrator

Defined in File ptq.h

Function Documentation

template<typename Algorithm = nvinfer1::IInt8EntropyCalibrator2> inline Int8CacheCalibrator<Algorithm> torch_tensorrt::ptq::make_int8_cache_calibrator(const std::string &cache_file_path)

A factory to build a post-training quantization calibrator that uses only a calibration cache.

Creates a calibrator for post-training quantization that reads from a previously created calibration cache. This lets you split the workflow across two programs: one program, which requires a dataloader and a dataset, generates the calibration cache and saves it; a different program then loads the saved cache instead of calibrating from scratch, and so has no dataset dependency. However, the network should be recalibrated if its structure changes or the input data set changes, and it is the responsibility of the application to ensure this.
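To make the cache-only idea concrete, the following is a minimal, self-contained sketch. It is not the Torch-TensorRT implementation: the class CacheOnlyCalibratorSketch and its method signatures are hypothetical stand-ins that only loosely mirror calibrator-style method names. The assumption illustrated is that a cache-only calibrator never supplies data batches and simply hands back the bytes stored in the cache file.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cache-only calibrator; NOT the Torch-TensorRT class.
class CacheOnlyCalibratorSketch {
 public:
  explicit CacheOnlyCalibratorSketch(std::string cache_file_path)
      : cache_file_path_(std::move(cache_file_path)) {}

  // No dataset is attached, so there is never a batch to hand out.
  bool getBatch() const { return false; }

  // Load the saved cache bytes. Returns nullptr and sets length to 0
  // if the file is missing or empty.
  const void* readCalibrationCache(std::size_t& length) {
    cache_.clear();
    std::ifstream input(cache_file_path_, std::ios::binary);
    if (input.good()) {
      cache_.assign(std::istreambuf_iterator<char>(input),
                    std::istreambuf_iterator<char>());
    }
    length = cache_.size();
    return cache_.empty() ? nullptr : cache_.data();
  }

  // Persist cache bytes, as a cache-generating program would do.
  void writeCalibrationCache(const void* data, std::size_t length) const {
    assert(data != nullptr || length == 0);
    std::ofstream output(cache_file_path_, std::ios::binary);
    output.write(static_cast<const char*>(data),
                 static_cast<std::streamsize>(length));
  }

 private:
  std::string cache_file_path_;
  std::vector<char> cache_;
};
```

In this sketch, the program that owns the dataset would call writeCalibrationCache once, and a later program would construct the object with the same path and call only readCalibrationCache.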

By default the returned calibrator uses the TensorRT Entropy v2 algorithm to perform calibration. This is recommended for feed-forward networks. You can override the algorithm selection (for example, to use the MinMax calibrator recommended for NLP tasks) by calling make_int8_cache_calibrator with the calibrator class as a template parameter.

e.g. torch_tensorrt::ptq::make_int8_cache_calibrator<nvinfer1::IInt8MinMaxCalibrator>(calibration_cache_file);
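The default-versus-override behaviour comes from the defaulted template parameter in the signature. The sketch below reproduces only that pattern with hypothetical stand-in types (EntropyV2Sketch, MinMaxSketch, CacheCalibratorHandle, make_cache_calibrator_sketch); none of these names exist in Torch-TensorRT or TensorRT, and the real factory returns an Int8CacheCalibrator<Algorithm> rather than this simplified handle.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag types standing in for calibration algorithm classes.
struct EntropyV2Sketch {
  static const char* name() { return "entropy_v2"; }
};
struct MinMaxSketch {
  static const char* name() { return "min_max"; }
};

// Hypothetical simplified handle; records the path and the chosen algorithm.
template <typename Algorithm>
struct CacheCalibratorHandle {
  std::string cache_file_path;
  std::string algorithm() const { return Algorithm::name(); }
};

// Factory with a defaulted template parameter, mirroring the shape of the
// documented signature: omit the argument to get the default algorithm.
template <typename Algorithm = EntropyV2Sketch>
inline CacheCalibratorHandle<Algorithm> make_cache_calibrator_sketch(
    const std::string& cache_file_path) {
  return CacheCalibratorHandle<Algorithm>{cache_file_path};
}

// Usage: default selection versus an explicit override.
inline void demo_algorithm_selection() {
  auto by_default = make_cache_calibrator_sketch("calibration.cache");
  auto min_max = make_cache_calibrator_sketch<MinMaxSketch>("calibration.cache");
  assert(by_default.algorithm() == "entropy_v2");
  assert(min_max.algorithm() == "min_max");
}
```

Because the algorithm is fixed at compile time, the two calls in demo_algorithm_selection return different types, just as the documented factory returns a different Int8CacheCalibrator<Algorithm> for each choice.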

Template Parameters: Algorithm – class nvinfer1::IInt8Calibrator (Default: nvinfer1::IInt8EntropyCalibrator2) - Algorithm to use
Parameters: cache_file_path – const std::string& - Path to read/write calibration cache
Returns: Int8CacheCalibrator<Algorithm>
