TorchServe gRPC API
Note: TorchServe gRPC does not currently support workflows.
TorchServe also supports gRPC APIs for both inference and management calls.
TorchServe provides the following gRPC APIs:
- Inference API
  - Ping: Gets the health status of the running server
  - Predictions: Gets predictions from the served model
- Management API
  - RegisterModel: Serve a model or model version on TorchServe
  - UnregisterModel: Free up system resources by unregistering a specific version of a model from TorchServe
  - ScaleWorker: Dynamically adjust the number of workers for any version of a model to better serve different inference request loads
  - ListModels: Query the default versions of the currently registered models
  - DescribeModel: Get the detailed runtime status of the default version of a model
  - SetDefault: Set any registered version of a model as the default version
By default, TorchServe listens on port 7070 for the gRPC Inference API and on port 7071 for the gRPC Management API. To configure the gRPC APIs on different ports, refer to the configuration documentation.
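Once the server is running, a quick way to confirm that both default ports accept connections is a plain TCP probe. The following is a minimal sketch using only the Python standard library; the helper names `is_port_open` and `check_default_grpc_ports` are hypothetical, and the probe only shows that something is listening on the port, not that the gRPC service behind it is healthy (the Ping call is the proper health check).

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def check_default_grpc_ports(host: str = "localhost") -> dict:
    """Probe the default gRPC ports named in this document (7070 and 7071)."""
    ports = {"inference": 7070, "management": 7071}
    return {name: is_port_open(host, port) for name, port in ports.items()}
```

Calling `check_default_grpc_ports()` returns a dictionary mapping `"inference"` and `"management"` to a boolean. If the ports have been changed in the configuration, pass the new values to `is_port_open` directly.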
Python client example for gRPC APIs
Run the following commands to register the densenet161 model from the TorchServe model zoo, run inference on it, and unregister it, using the gRPC Python client.
Clone the serve repo to run this example
git clone https://github.com/pytorch/serve
cd serve
Install the gRPC Python dependencies
pip install -U grpcio protobuf grpcio-tools
Start TorchServe
mkdir models
torchserve --start --model-store models/
Generate the Python gRPC client stubs from the proto files
python -m grpc_tools.protoc --proto_path=frontend/server/src/main/resources/proto/ --python_out=ts_scripts --grpc_python_out=ts_scripts frontend/server/src/main/resources/proto/inference.proto frontend/server/src/main/resources/proto/management.proto
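If stub generation needs to be scripted (for example, as part of a setup step), the same command can be assembled and run from Python. This is a minimal sketch that rebuilds the command line shown above and runs it with `subprocess`; the helper names `build_protoc_command` and `generate_stubs` are hypothetical, and it assumes it is run from the root of the cloned serve repo with `grpcio-tools` installed.

```python
import subprocess
import sys
from pathlib import Path

# Paths taken from the protoc command in this document.
PROTO_DIR = Path("frontend/server/src/main/resources/proto")
OUT_DIR = Path("ts_scripts")
PROTO_FILES = ("inference.proto", "management.proto")


def build_protoc_command(proto_dir=PROTO_DIR, out_dir=OUT_DIR, proto_files=PROTO_FILES):
    """Return an argv list equivalent to the documented protoc invocation."""
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"--proto_path={proto_dir.as_posix()}",
        f"--python_out={out_dir.as_posix()}",
        f"--grpc_python_out={out_dir.as_posix()}",
        *[(proto_dir / name).as_posix() for name in proto_files],
    ]


def generate_stubs():
    """Run protoc; raises subprocess.CalledProcessError if generation fails."""
    subprocess.run(build_protoc_command(), check=True)
```

Calling `generate_stubs()` from the repo root should leave the generated `*_pb2.py` and `*_pb2_grpc.py` modules in `ts_scripts`, the same result as running the command by hand.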
Register the densenet161 model
python ts_scripts/torchserve_grpc_client.py register densenet161
Run inference
python ts_scripts/torchserve_grpc_client.py infer densenet161 examples/image_classifier/kitten.jpg
Unregister the densenet161 model
python ts_scripts/torchserve_grpc_client.py unregister densenet161
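The three client commands above map onto the RegisterModel, Predictions, and UnregisterModel calls. The following is a minimal sketch of how such calls might look with the generated stubs, not the contents of `torchserve_grpc_client.py` itself. The stub classes (`InferenceAPIsServiceStub`, `ManagementAPIsServiceStub`), the request messages, and the response fields (`msg`, `prediction`) are assumptions based on the proto files; verify them against your generated `*_pb2` modules. The generated modules are passed in as arguments so the helpers stay importable before the stubs exist.

```python
def make_stubs(host="localhost", inference_port=7070, management_port=7071):
    """Open insecure channels and return (inference_stub, management_stub)."""
    # Lazy imports: grpc and the generated modules are only needed here.
    import grpc
    import inference_pb2_grpc   # assumed: generated from inference.proto
    import management_pb2_grpc  # assumed: generated from management.proto

    inference = inference_pb2_grpc.InferenceAPIsServiceStub(
        grpc.insecure_channel(f"{host}:{inference_port}"))
    management = management_pb2_grpc.ManagementAPIsServiceStub(
        grpc.insecure_channel(f"{host}:{management_port}"))
    return inference, management


def register(stub, management_pb2, model_name, url):
    """Register a model archive; url is supplied by the caller."""
    request = management_pb2.RegisterModelRequest(
        url=url, model_name=model_name, initial_workers=1, synchronous=True)
    return stub.RegisterModel(request).msg


def predict(stub, inference_pb2, model_name, payload):
    """Send raw bytes (e.g. an image file's contents) and return the raw prediction."""
    request = inference_pb2.PredictionsRequest(
        model_name=model_name, input={"data": payload})
    return stub.Predictions(request).prediction


def unregister(stub, management_pb2, model_name):
    """Unregister a model and return the server's status message."""
    request = management_pb2.UnregisterModelRequest(model_name=model_name)
    return stub.UnregisterModel(request).msg
```

With the stubs generated into `ts_scripts` and that directory on `sys.path`, a session would import `inference_pb2` and `management_pb2`, call `make_stubs()`, and then call the three helpers in order, reading `examples/image_classifier/kitten.jpg` in binary mode for the payload.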