Struct CompileSpec¶
Defined in File torch_tensorrt.h
Struct Documentation¶
struct CompileSpec¶
Settings data structure for Torch-TensorRT TorchScript compilation
Public Functions
TORCHTRT_API CompileSpec(std::vector<std::vector<int64_t>> fixed_sizes)¶
Construct a new Compile Spec object. Convenience constructor to set fixed input sizes from vectors describing the sizes of the input tensors. Each entry in the vector represents an input and should be provided in call order.
This constructor should be used as a convenience when all inputs are statically sized and you are okay with the default input dtype and format (FP32 for FP32 and INT8 weights, FP16 for FP16 weights, contiguous).
- Parameters
fixed_sizes –
TORCHTRT_API CompileSpec(std::vector<c10::ArrayRef<int64_t>> fixed_sizes)¶
Construct a new Compile Spec object. Convenience constructor to set fixed input sizes from c10::ArrayRef’s (the output of tensor.sizes()) describing the sizes of the input tensors. Each entry in the vector represents an input and should be provided in call order.
This constructor should be used as a convenience when all inputs are statically sized and you are okay with the default input dtype and format (FP32 for FP32 and INT8 weights, FP16 for FP16 weights, contiguous).
- Parameters
fixed_sizes –
TORCHTRT_API CompileSpec(std::vector<Input> inputs)¶
Construct a new Compile Spec object from input ranges. Each entry in the vector represents an input and should be provided in call order.
Use this constructor to define inputs with dynamic shapes, specific input types, or specific tensor formats.
- Parameters
inputs –
TORCHTRT_API CompileSpec(torch::jit::IValue input_signature)¶
Construct a new Compile Spec object from IValue which represents the nesting of input tensors for a module.
- Parameters
input_signature –
Public Members
GraphInputs graph_inputs¶
Specifications for inputs to the engine. Can store either an IValue holding complex Input or a flattened Input.
std::set<DataType> enabled_precisions = {DataType::kFloat}¶
The set of precisions TensorRT is allowed to use for kernels during compilation.
bool disable_tf32 = false¶
Prevent Float32 layers from using TF32 data format
TF32 computes inner products by rounding the inputs to 10-bit mantissas before multiplying, but accumulates the sum using 23-bit mantissas. FP32 layers use this behavior by default; set this flag to true to prevent it.
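The rounding described above can be emulated on the CPU to get a feel for the precision that is given up. The following is a minimal sketch, not TensorRT code: the helper names `round_to_tf32` and `tf32_dot` are hypothetical, round-to-nearest-even is an assumed rounding mode, and NaN/Inf inputs are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: keep only the top 10 of the 23 FP32 mantissa bits.
// Rounding mode (nearest, ties to even) is an assumption for illustration.
inline float round_to_tf32(float x) {
  std::uint32_t bits = 0;
  std::memcpy(&bits, &x, sizeof(bits));
  const std::uint32_t lsb = (bits >> 13) & 1u;  // lowest mantissa bit that is kept
  bits += 0x0FFFu + lsb;                        // round half to even on the 13 dropped bits
  bits &= ~0x1FFFu;                             // clear the 13 dropped bits
  float out = 0.0f;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

// Hypothetical helper: inputs are rounded before multiplying,
// while the running sum is kept in full FP32 (23-bit mantissa).
inline float tf32_dot(const std::vector<float>& a, const std::vector<float>& b) {
  assert(a.size() == b.size());
  float acc = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    acc += round_to_tf32(a[i]) * round_to_tf32(b[i]);
  }
  return acc;
}
```

Comparing `tf32_dot` against a plain FP32 dot product on representative data gives a rough idea of whether setting disable_tf32 to true is worth the lost throughput for a given model.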
bool sparse_weights = false¶
Enable sparsity for weights of conv and FC layers
bool refit = false¶
Build a refittable engine
bool debug = false¶
Build a debuggable engine
bool truncate_long_and_double = false¶
Truncate long/double type to int/float type
bool allow_shape_tensors = false¶
Allow shape tensors (from IShape layer) in the graph
EngineCapability capability = EngineCapability::kSTANDARD¶
Sets the restrictions for the engine (CUDA Safety)
uint64_t num_avg_timing_iters = 1¶
Number of averaging timing iterations used to select kernels
uint64_t workspace_size = 0¶
Maximum size of workspace given to TensorRT
uint64_t dla_sram_size = 1048576¶
Fast software-managed RAM used by DLA to communicate within a layer.
uint64_t dla_local_dram_size = 1073741824¶
Host RAM used by DLA to share intermediate tensor data across operations
uint64_t dla_global_dram_size = 536870912¶
Host RAM used by DLA to store weights and metadata for execution
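The three DLA defaults above are byte counts. As a small sketch (the constant names below are hypothetical and not part of the library), they correspond to 1 MiB, 1 GiB and 512 MiB:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical unit constants, defined here only to make the defaults readable.
constexpr std::uint64_t kKiB = 1024;
constexpr std::uint64_t kMiB = 1024 * kKiB;
constexpr std::uint64_t kGiB = 1024 * kMiB;

// Documented defaults, expressed in binary units.
constexpr std::uint64_t kDlaSramDefault = 1 * kMiB;          // dla_sram_size
constexpr std::uint64_t kDlaLocalDramDefault = 1 * kGiB;     // dla_local_dram_size
constexpr std::uint64_t kDlaGlobalDramDefault = 512 * kMiB;  // dla_global_dram_size

static_assert(kDlaSramDefault == 1048576ULL, "1 MiB");
static_assert(kDlaLocalDramDefault == 1073741824ULL, "1 GiB");
static_assert(kDlaGlobalDramDefault == 536870912ULL, "512 MiB");
```

Writing overrides in the same style (for example `2 * kMiB`) avoids miscounting digits in the raw byte values.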
nvinfer1::IInt8Calibrator *ptq_calibrator = nullptr¶
Calibration dataloaders for each input for post training quantization
bool require_full_compilation = false¶
Require the full module be compiled to TensorRT instead of potentially running unsupported operations in PyTorch
uint64_t min_block_size = 3¶
Minimum number of contiguous supported operators to compile a subgraph to TensorRT
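To illustrate what a minimum block size means, here is a deliberately simplified sketch. It treats the model as a flat sequence of operators, each flagged as supported or not, and keeps only the runs of supported operators that are long enough. The function name `trt_blocks` is hypothetical, and the real partitioner works on a graph and is more involved than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper: returns half-open [begin, end) index ranges of
// contiguous supported operators whose length is at least min_block_size.
// Shorter runs would stay in PyTorch under this simplified model.
inline std::vector<std::pair<std::size_t, std::size_t>> trt_blocks(
    const std::vector<bool>& supported, std::uint64_t min_block_size) {
  std::vector<std::pair<std::size_t, std::size_t>> blocks;
  std::size_t run_start = 0;
  bool in_run = false;
  // Iterate one past the end so a run that reaches the last op is closed too.
  for (std::size_t i = 0; i <= supported.size(); ++i) {
    const bool ok = i < supported.size() && supported[i];
    if (ok && !in_run) {
      run_start = i;
      in_run = true;
    } else if (!ok && in_run) {
      if (i - run_start >= min_block_size) {
        blocks.emplace_back(run_start, i);
      }
      in_run = false;
    }
  }
  return blocks;
}
```

With the default of 3, a run of two supported operators between unsupported ones is not turned into its own engine, which avoids paying the PyTorch/TensorRT transition cost for very small subgraphs.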
std::vector<std::string> torch_executed_ops¶
List of aten operators that must be run in PyTorch. An error will be thrown if this list is not empty but require_full_compilation is True
std::vector<std::string> torch_executed_modules¶
List of modules that must be run in PyTorch. An error will be thrown if this list is not empty but require_full_compilation is True
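The constraint on the two fallback lists can be checked before compilation is attempted. The sketch below is not the library's own validation: `FallbackSettingsSketch` and `validate_fallback` are hypothetical stand-ins that mirror only the three relevant members, and it assumes the flag the descriptions refer to is require_full_compilation.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in that mirrors three CompileSpec members by name.
struct FallbackSettingsSketch {
  bool require_full_compilation = false;
  std::vector<std::string> torch_executed_ops;
  std::vector<std::string> torch_executed_modules;
};

// Hypothetical check: forcing ops or modules to run in PyTorch contradicts
// a request for full compilation (assumed to be the flag in question).
inline void validate_fallback(const FallbackSettingsSketch& s) {
  const bool has_fallback =
      !s.torch_executed_ops.empty() || !s.torch_executed_modules.empty();
  if (s.require_full_compilation && has_fallback) {
    throw std::invalid_argument(
        "torch_executed_ops/torch_executed_modules must be empty when "
        "require_full_compilation is true");
  }
}
```

Running a check like this early turns a late compile-time error into an immediate, descriptive one.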