Installation

There are 4 ways to use Triton Model Analyzer:

Building the Dockerfile

The recommended way to use Model Analyzer is to build the Model Analyzer Docker image yourself. This installation method uses --triton-launch-mode=local by default, as all the dependencies will be available inside the container. First, clone the Model Analyzer git repository, then build the Docker image:

$ git clone https://github.com/triton-inference-server/model_analyzer.git -b <rXX.YY>

$ cd ./model_analyzer

$ docker build --pull -t model-analyzer .

The above command will pull all the containers that Model Analyzer needs to run. The Model Analyzer Dockerfile bases the image on the latest tritonserver container from NGC. Now you can run the container with:

$ docker run -it --rm --gpus all \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v <path-to-triton-model-repository>:<path-to-triton-model-repository> \
      -v <path-to-output-model-repo>:<path-to-output-model-repo> \
      --net=host model-analyzer

root@hostname:/opt/triton-model-analyzer# 

If you want to build Model Analyzer from the main branch, you also need to build the Triton SDK container. To build the SDK container, refer to the Build SDK Image instructions (a sketch is also shown after the commands below). After you have built the SDK container, you can build the Model Analyzer Docker image with:

$ git clone https://github.com/triton-inference-server/model_analyzer.git -b main

$ cd ./model_analyzer

$ docker build -t model-analyzer --build-arg TRITONSDK_BASE_IMAGE=<name of the built SDK image> .
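The SDK image passed in via TRITONSDK_BASE_IMAGE can be built from the Triton server repository. The following is only a sketch; the Dockerfile.sdk location and the tritonserver_sdk tag are examples, so refer to the Build SDK Image instructions for the authoritative steps:

$ git clone https://github.com/triton-inference-server/server.git

$ cd ./server

$ docker build -t tritonserver_sdk -f Dockerfile.sdk .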

Triton SDK Container

You can also use the Triton SDK docker container available on the NVIDIA GPU Cloud Catalog. Pull the SDK container with the following command:

$ docker pull nvcr.io/nvidia/tritonserver:22.04-py3-sdk

If you are not planning to run Model Analyzer with --triton-launch-mode=docker, you can run the SDK container with the following command:

$ docker run -it --gpus all --net=host nvcr.io/nvidia/tritonserver:22.04-py3-sdk

You will need to build and install the Triton server binary inside the SDK container if you want to use the local mode. See the Triton Installation docs for more details.
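Once a tritonserver binary is available inside the SDK container, a local-mode profiling run looks roughly like the sketch below, where the model repository path and model name are placeholders:

$ model-analyzer profile \
      --model-repository <path-to-triton-model-repository> \
      --profile-models <model-name> \
      --triton-launch-mode=local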

If you intend to use --triton-launch-mode=docker, which is the recommended mode when using the SDK container, you will need to mount the following:

  • -v /var/run/docker.sock:/var/run/docker.sock allows running docker containers as sibling containers from inside the Triton SDK container. Model Analyzer will require this if run with --triton-launch-mode=docker.
  • -v <path-to-output-model-repo>:<path-to-output-model-repo> The absolute path to the directory where the output model repository will be located (i.e. parent directory of the output model repository). This is so that the launched Triton container has access to the model config variants that Model Analyzer creates.
$ docker run -it --gpus all \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v <path-to-output-model-repo>:<path-to-output-model-repo> \
      --net=host nvcr.io/nvidia/tritonserver:22.04-py3-sdk
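With those mounts in place, a docker-launch-mode run from inside the SDK container follows the same pattern. This is a sketch with placeholder paths; --output-model-repository-path should point under the mounted output directory so that the launched Triton container can see the generated model config variants:

$ model-analyzer profile \
      --model-repository <path-to-triton-model-repository> \
      --profile-models <model-name> \
      --triton-launch-mode=docker \
      --output-model-repository-path <path-to-output-model-repo>/output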

Model Analyzer uses pdfkit for report generation. If you are running Model Analyzer inside the Triton SDK container, you will need to install wkhtmltopdf:

$ sudo apt-get update && sudo apt-get install wkhtmltopdf

Once you do this, Model Analyzer will be able to use pdfkit to generate reports.
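As an example, generating detailed reports for specific model config variants might look like the following sketch; the config names and export path are placeholders, and the exact options are described in the report subcommand documentation:

$ model-analyzer report \
      --report-model-configs <model-config-names> \
      --export-path <path-to-export-directory>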

Using pip3

You can install pip using:

$ sudo apt-get update && sudo apt-get install python3-pip

Model Analyzer can be installed with:

$ pip3 install triton-model-analyzer

If you encounter any errors installing dependencies like numba, make sure that you have the latest version of pip using:

$ pip3 install --upgrade pip

You can then try installing Model Analyzer again.

If you are using this approach, you need to install DCGM on your machine.

For installing DCGM on Ubuntu 20.04 you can use the following commands:

$ export DCGM_VERSION=2.0.13
$ wget -q https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb && \
   dpkg -i datacenter-gpu-manager_${DCGM_VERSION}_amd64.deb
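As a quick sanity check, you can confirm DCGM can see your GPUs with the dcgmi tool that ships with the package (starting the host engine may require root):

$ sudo nv-hostengine
$ dcgmi discovery -l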

Building from source

To build Model Analyzer from source, you'll need to install the same dependencies (tritonclient and DCGM) mentioned in the "Using pip3" section. After that, you can use the following commands:

$ git clone https://github.com/triton-inference-server/model_analyzer
$ cd model_analyzer
$ ./build_wheel.sh <path to perf_analyzer> true

In the final command above, we are building the triton-model-analyzer wheel. You will need to provide the build_wheel.sh script with two arguments. The first is the path to the perf_analyzer binary that you would like Model Analyzer to use. The second is whether you want this wheel to be Linux specific. Currently, this argument must be set to true, as perf_analyzer is supported only on Linux. This will create a wheel file in the wheels directory named triton-model-analyzer-<version>-py3-none-manylinux1_x86_64.whl. You can then install it with:

$ pip3 install wheels/triton-model-analyzer-*.whl

After these steps, the model-analyzer executable should be available in $PATH.
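A quick way to verify the installation is to print the CLI help:

$ model-analyzer --help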

Notes:

  • Triton Model Analyzer supports all the GPUs supported by the DCGM library. See DCGM Supported GPUs for more information.