Triton Model Analyzer's `profile` subcommand supports four different launch modes for Triton Inference Server. In the `local` and `docker` modes, Triton Inference Server is launched by Model Analyzer. In the `c_api` mode, Triton Inference Server is launched locally via the C API. In the `remote` mode, it is assumed that there is an already running instance of Triton Inference Server.
| CLI Option | `--triton-launch-mode=local` |
| --- | --- |
In this mode, Model Analyzer will launch Triton Server using the local binary specified via `--triton-server-path`, or, if none is supplied, the `tritonserver` binary found in `$PATH`.
Local mode is the recommended method of getting started with Model Analyzer. There are detailed instructions about using this mode in the Quick Start Guide.
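For reference, a minimal `local`-mode invocation might look like the following sketch; the repository path, model name, and server binary location are placeholders, and `--triton-server-path` can be omitted if `tritonserver` is already in `$PATH`.

```bash
# Minimal sketch of a local-mode run (paths and model name are placeholders)
model-analyzer profile \
    --model-repository /path/to/model_repository \
    --profile-models my_model \
    --triton-launch-mode=local \
    --triton-server-path /opt/tritonserver/bin/tritonserver
```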
| CLI Option | `--triton-launch-mode=docker` |
| --- | --- |
In this mode, Model Analyzer uses the Python Docker API to launch the Triton Inference Server container. If you are running Model Analyzer inside a Docker container, make sure that the container is launched with the appropriate flags. The following flags are mandatory for correct behavior:

```
--gpus <gpus> -v /var/run/docker.sock:/var/run/docker.sock --net host
```
Additionally, Model Analyzer uses the `output_model_repository_path` to manipulate and store model config variants. When Model Analyzer launches the Triton container, it does so as a sibling container, so the launched Triton container only has access to the host filesystem. As a result, in the docker launch mode, the output model directory needs to be mounted into the Model Analyzer container at the same absolute path it has outside the container. You must therefore add the following flag when you launch the Model Analyzer container as well:

```
-v <path-to-output-model-repository>:<path-to-output-model-repository>
```
Finally, when launching Model Analyzer, the `--output-model-repository-path` argument must be provided as a directory inside `<path-to-output-model-repository>`. This directory need not already exist:

```
--output-model-repository-path=<path-to-output-model-repository>/output
```
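Putting these requirements together, a docker-mode session might look like the following sketch. The SDK image tag, host paths, and model name are placeholders, and the model repository mount assumes your models live at that same path on the host.

```bash
# Sketch: launch the Model Analyzer (Triton SDK) container with the mandatory flags.
# Image tag, host paths, and model name are placeholders.
docker run -it --rm \
    --gpus all \
    -v /var/run/docker.sock:/var/run/docker.sock \
    --net host \
    -v /home/user/model_repository:/home/user/model_repository \
    -v /home/user/output_model_repository:/home/user/output_model_repository \
    nvcr.io/nvidia/tritonserver:<xx.yy>-py3-sdk

# Inside the container, profile in docker launch mode
model-analyzer profile \
    --model-repository /home/user/model_repository \
    --profile-models my_model \
    --triton-launch-mode=docker \
    --output-model-repository-path /home/user/output_model_repository/output
```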
This mode is useful if you want to use the Model Analyzer installed in the Triton SDK Container. You will need Docker installed, though.
| CLI Option | `--triton-launch-mode=c_api` |
| --- | --- |
In this mode, Triton Server is launched locally via the C API by the perf_analyzer instances that Model Analyzer launches.
This mode is useful if you want to run with a locally installed Triton Server and want the increased performance of the C API. As with the local mode, Triton Server must be installed in the environment in which Model Analyzer is being used.
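A c_api-mode invocation might look like the following sketch; the paths and model name are placeholders, and the `--triton-install-path` flag pointing at the local Triton installation is an assumption here, so check `model-analyzer profile --help` for the exact option in your version.

```bash
# Sketch of a c_api-mode run (paths and model name are placeholders;
# --triton-install-path is assumed to point at the local Triton install)
model-analyzer profile \
    --model-repository /path/to/model_repository \
    --profile-models my_model \
    --triton-launch-mode=c_api \
    --triton-install-path /opt/tritonserver
```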
| CLI Option | `--triton-launch-mode=remote` |
| --- | --- |
This mode is beneficial when you want to use an already running Triton Inference Server. Depending on your chosen client protocol, provide the URL for the Triton instance's HTTP or GRPC endpoint using the `--triton-http-endpoint` or `--triton-grpc-endpoint` flag. You should also make sure that the same GPUs are available to both the Inference Server and Model Analyzer, and that they are on the same machine; Model Analyzer does not currently support profiling remote GPUs. In this mode, Triton Server needs to be launched with the `--model-control-mode=explicit` flag to support loading and unloading of models. The model parameters cannot be changed in remote mode, though.
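For reference, a remote-mode setup might look like the following sketch; the repository path, model name, and endpoint addresses are placeholders (ports 8000 and 8001 are Triton's default HTTP and GRPC ports).

```bash
# 1. On the same machine, start Triton with explicit model control
#    (repository path is a placeholder)
tritonserver \
    --model-repository /path/to/model_repository \
    --model-control-mode explicit

# 2. Point Model Analyzer at the running server
#    (use --triton-grpc-endpoint localhost:8001 instead for a GRPC client protocol)
model-analyzer profile \
    --model-repository /path/to/model_repository \
    --profile-models my_model \
    --triton-launch-mode=remote \
    --triton-http-endpoint localhost:8000
```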