TorchScript for Deployment
Note
TorchScript is no longer in active development.
In this recipe, you will learn:
What TorchScript is
How to export your trained model in TorchScript format
How to load your TorchScript model in C++ and do inference
Requirements
PyTorch 1.5
TorchVision 0.6.0
libtorch 1.5
C++ compiler
The instructions for installing the three PyTorch components are available at pytorch.org. The C++ compiler will depend on your platform.
What is TorchScript?
TorchScript is an intermediate representation of a PyTorch model
(a subclass of nn.Module) that can then be run in a high-performance
environment like C++. It’s a high-performance subset of Python that is
meant to be consumed by the PyTorch JIT Compiler, which performs
run-time optimization on your model’s computation. TorchScript is the
recommended model format for doing scaled inference with PyTorch models.
For more information, see the PyTorch Introduction to TorchScript
tutorial, the Loading A TorchScript Model in C++ tutorial, and the
full TorchScript documentation, all of which are available on
pytorch.org.
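To make the idea of compiling a Python subset more concrete, here is a minimal sketch. The Gate module is a hypothetical toy, not part of this recipe; it only illustrates that torch.jit.script keeps data-dependent control flow and that the compiled module exposes its recovered source through the code attribute.

```python
import torch
import torch.nn as nn


class Gate(nn.Module):
    """Hypothetical toy module with data-dependent control flow."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The branch depends on the input values, so the compiler must keep it.
        if x.sum() > 0:
            return x * 2
        else:
            return -x


scripted_gate = torch.jit.script(Gate())

# The compiled module is still callable from Python.
positive = scripted_gate(torch.ones(3))   # takes the first branch
negative = scripted_gate(-torch.ones(3))  # takes the second branch

# .code shows the TorchScript source recovered by the compiler.
print(scripted_gate.code)
```

Because torch.jit.script reads the Python source of the module, run this from a script file rather than pasting it into an environment that cannot provide source code.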
How to Export Your Model
As an example, let’s take a pretrained vision model. All of the pretrained models in TorchVision are compatible with TorchScript.
Run the following Python 3 code, either in a script or from the REPL:
import torch
import torch.nn.functional as F
import torchvision.models as models
r18 = models.resnet18(pretrained=True) # We now have an instance of the pretrained model
r18.eval() # Switch to inference mode so BatchNorm uses its stored statistics
r18_scripted = torch.jit.script(r18) # *** This is the TorchScript export
dummy_input = torch.rand(1, 3, 224, 224) # We should run a quick test
Let’s do a sanity check on the equivalence of the two models:
unscripted_output = r18(dummy_input) # Get the unscripted model's prediction...
scripted_output = r18_scripted(dummy_input) # ...and do the same for the scripted version
unscripted_top5 = F.softmax(unscripted_output, dim=1).topk(5).indices
scripted_top5 = F.softmax(scripted_output, dim=1).topk(5).indices
print('Python model top 5 results:\n {}'.format(unscripted_top5))
print('TorchScript model top 5 results:\n {}'.format(scripted_top5))
You should see that both versions of the model give the same results (the exact indices vary from run to run because the input is random):
Python model top 5 results:
tensor([[463, 600, 731, 899, 898]])
TorchScript model top 5 results:
tensor([[463, 600, 731, 899, 898]])
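Comparing top-5 indices is a coarse check. A stricter option is to compare the raw outputs numerically. The sketch below assumes a small stand-in network (a hypothetical substitute for r18, so that it runs without downloading weights) and a tolerance of 1e-5, which is an illustrative choice rather than a recommended value.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for r18 so the sketch runs without pretrained weights.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
model.eval()  # use stored BatchNorm statistics instead of batch statistics
scripted = torch.jit.script(model)

dummy_input = torch.rand(1, 3, 32, 32)
with torch.no_grad():  # no gradients are needed for a comparison
    eager_out = model(dummy_input)
    scripted_out = scripted(dummy_input)

# Element-wise comparison of the raw outputs within a small tolerance.
outputs_match = torch.allclose(eager_out, scripted_out, atol=1e-5)
print("Outputs match:", outputs_match)
```

The same comparison should apply to r18 and r18_scripted by substituting them for the stand-in model and using the 224 x 224 dummy input from above.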
With that check confirmed, go ahead and save the model:
r18_scripted.save('r18_scripted.pt')
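Before moving to C++, it can be useful to confirm that the saved file round-trips in Python. The following sketch uses a tiny hypothetical model and a temporary file name (tiny_scripted.pt) chosen only for illustration. It saves the scripted module, reloads it with torch.jit.load, and checks that both produce the same output.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in model; the recipe itself saves r18_scripted.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU()).eval()
scripted = torch.jit.script(model)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tiny_scripted.pt")
    scripted.save(path)

    # Loading needs only the file: no model class definition is required.
    reloaded = torch.jit.load(path, map_location="cpu")
    reloaded.eval()

    x = torch.rand(2, 4)
    with torch.no_grad():
        round_trip_ok = torch.allclose(scripted(x), reloaded(x))

print("Round trip OK:", round_trip_ok)
```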
Loading TorchScript Models in C++
Create the following C++ file and name it ts-infer.cpp:
#include <torch/script.h>
#include <torch/nn/functional/activation.h>

#include <iostream>
#include <tuple>
#include <vector>

int main(int argc, const char* argv[]) {
  if (argc != 2) {
    std::cerr << "usage: ts-infer <path-to-exported-model>\n";
    return -1;
  }

  std::cout << "Loading model...\n";

  // deserialize ScriptModule
  torch::jit::script::Module module;
  try {
    module = torch::jit::load(argv[1]);
  } catch (const c10::Error& e) {
    std::cerr << "Error loading model\n";
    std::cerr << e.msg_without_backtrace() << "\n";
    return -1;
  }

  std::cout << "Model loaded successfully\n";

  torch::NoGradGuard no_grad; // ensures that autograd is off
  module.eval(); // turn off dropout and other training-time layers/functions

  // create an input "image"
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(torch::rand({1, 3, 224, 224}));

  // execute model and package output as tensor
  at::Tensor output = module.forward(inputs).toTensor();

  // convert the logits to probabilities and take the five largest
  namespace F = torch::nn::functional;
  at::Tensor output_sm = F::softmax(output, F::SoftmaxFuncOptions(1));
  std::tuple<at::Tensor, at::Tensor> top5_tensor = output_sm.topk(5);
  at::Tensor top5 = std::get<1>(top5_tensor); // element 1 holds the indices

  std::cout << top5[0] << "\n";

  std::cout << "\nDONE\n";
  return 0;
}
This program:
Loads the model you specify on the command line
Creates a dummy “image” input tensor
Performs inference on the input
Also, notice that there is no dependency on TorchVision in this code. The saved version of your TorchScript model contains your learned weights and your computation graph; nothing else is needed.
Building and Running Your C++ Inference Engine
Create the following CMakeLists.txt file:
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(custom_ops)
find_package(Torch REQUIRED)
add_executable(ts-infer ts-infer.cpp)
target_link_libraries(ts-infer "${TORCH_LIBRARIES}")
set_property(TARGET ts-infer PROPERTY CXX_STANDARD 14)
Make the program:
cmake -DCMAKE_PREFIX_PATH=<path to your libtorch installation> .
make
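Now we can run inference in C++ and verify that we get a result. The indices will differ from the Python output above because each program generates its own random input: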
$ ./ts-infer r18_scripted.pt
Loading model...
Model loaded successfully
418
845
111
892
644
[ CPULongType{5} ]
DONE
Important Resources
pytorch.org for installation instructions, and more documentation and tutorials.
Introduction to TorchScript tutorial for a deeper initial exposition of TorchScript
Full TorchScript documentation for complete TorchScript language and API reference