I want to use TF-TRT to optimize a TF2 model and then serve it with Triton, but I fail to serve the optimized TF-TRT model. Here is the process:
I use the image nvcr.io/nvidia/tensorflow:22.07-tf2-py3 to run the conversion code and successfully create both the native model and the TF-TRT converted model.
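Roughly, the conversion step looks like this (the paths and the FP16 precision mode below are placeholders, not the exact script):

```python
import numpy as np
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Placeholder paths for the native and converted SavedModels.
NATIVE_SAVEDMODEL_DIR = "mnist_native_savedmodel"
TRT_SAVEDMODEL_DIR = "mnist_trt_savedmodel"

# Convert the native TF2 SavedModel with TF-TRT (FP16 as an example).
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=NATIVE_SAVEDMODEL_DIR,
    precision_mode=trt.TrtPrecisionMode.FP16,
)
converter.convert()

# Optionally pre-build TensorRT engines for a representative input shape.
def input_fn():
    yield (np.zeros((1, 28, 28, 1), dtype=np.float32),)

converter.build(input_fn=input_fn)
converter.save(TRT_SAVEDMODEL_DIR)
```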
The native model is copied under mnist/1/model.savedmodel along with a config.pbtxt.
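The config looks roughly like the following (the tensor names, data types, and dims are placeholders for an MNIST classifier; with the name field omitted, Triton takes the model name from the directory):

```
platform: "tensorflow_savedmodel"
max_batch_size: 8
input [
  {
    name: "input_1"          # placeholder input tensor name
    data_type: TYPE_FP32
    dims: [ 28, 28, 1 ]
  }
]
output [
  {
    name: "dense_1"          # placeholder output tensor name
    data_type: TYPE_FP32
    dims: [ 10 ]
  }
]
```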
The converted model is copied under mnist_trt/1/model.savedmodel, with the same config.pbtxt as above.
I start the Triton server inside the container nvcr.io/nvidia/tritonserver:22.07-py3, and the log shows that both models are loaded successfully.
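The server is launched roughly like this (the host model-repository path is a placeholder; it contains the mnist/ and mnist_trt/ directories):

```shell
docker run --gpus all --rm \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:22.07-py3 \
  tritonserver --model-repository=/models
```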
I then try to run inference. The client code looks roughly like this:
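(The tensor names, shapes, and server URL below are placeholders consistent with the example config above.)

```python
import numpy as np
import tritonclient.http as httpclient

# Placeholder URL for the Triton HTTP endpoint.
client = httpclient.InferenceServerClient(url="localhost:8000")

# One dummy MNIST image; names and shapes must match the model signature.
image = np.random.rand(1, 28, 28, 1).astype(np.float32)
infer_input = httpclient.InferInput("input_1", list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)

requested_output = httpclient.InferRequestedOutput("dense_1")

# Switching model_name between "mnist" and "mnist_trt" is the only change.
result = client.infer(
    model_name="mnist",
    inputs=[infer_input],
    outputs=[requested_output],
)
print(result.as_numpy("dense_1"))
```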
If model_name is mnist, the inference succeeds and prints the prediction result.
However, after changing model_name to mnist_trt, the call fails with this error message:
I guess maybe it's a version issue?