I have converted the Matterport implementation of Mask R-CNN (as a SavedModel) to a 16-bit TRT-optimized SavedModel (roughly as sketched below). I can see a ~100 ms improvement in inference time; however, I do not see any reduction in GPU memory consumption. Given that the original model is a 32-bit model and the optimized model is a 16-bit model, I was expecting some reduction in GPU memory consumption during inference.
I used:
TensorFlow 2.10.0
TensorRT 7.2.2.1
Colab Pro+
No one seems to discuss GPU memory consumption after optimization. Is inference time the only thing TF-TRT improves?
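For reference, here is a minimal sketch of the conversion I performed, assuming the model is already exported as a TF SavedModel; the paths and the input shape are illustrative:

```python
# Minimal TF-TRT FP16 conversion sketch (paths and input shape are assumptions).
import numpy as np
from tensorflow.python.compiler.tensorrt import trt_convert as trt

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="mask_rcnn_saved_model",  # hypothetical input path
    precision_mode="FP16",
)
converter.convert()

# Optionally pre-build engines for a representative input shape
# so they are not built lazily on the first inference call.
def input_fn():
    yield (np.zeros((1, 1024, 1024, 3), dtype=np.float32),)

converter.build(input_fn=input_fn)
converter.save("mask_rcnn_trt_fp16")  # hypothetical output path
```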
In general, TF-TRT focuses on inference performance, and unfortunately memory consumption is rarely improved. TensorRT itself does a much better job at memory reduction than TF-TRT if memory footprint is critical for your application.
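For reference, the native TensorRT path usually means exporting the model to ONNX (e.g. with tf2onnx) and building an FP16 engine directly. A rough sketch with the TensorRT Python API follows; the file names are illustrative, and a model as complex as Mask R-CNN may need extra plugins or graph surgery to parse:

```python
# Rough sketch: build a standalone FP16 TensorRT engine from an ONNX export.
# File names are illustrative; Mask R-CNN's custom ops may require plugins.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("mask_rcnn.onnx", "rb") as f:  # produced e.g. with tf2onnx
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)     # request FP16 kernels/weights
config.max_workspace_size = 1 << 30       # 1 GiB workspace (TRT 7.x API)

engine = builder.build_engine(network, config)
with open("mask_rcnn_fp16.plan", "wb") as f:
    f.write(engine.serialize())
```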
@ncomly-nvidia @pjannaty Understood. However, if I am optimizing a 32-bit model and passing precision_mode='FP16' as one of the conversion parameters, my understanding is that the weights of the converted/optimized model should be FP16, in which case the model should now take roughly half the memory during inference. Am I missing something?
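One way to check this empirically is to measure peak device memory around a single inference for each SavedModel. A minimal sketch, assuming TF >= 2.5 and an illustrative path, signature key, and input shape; run it once per model in a fresh process so the two measurements don't interfere:

```python
# Sketch: report peak GPU memory for one inference on a SavedModel.
# Path, signature key, and input shape are illustrative assumptions.
import sys
import numpy as np
import tensorflow as tf

def peak_memory_mb(saved_model_dir, input_shape=(1, 1024, 1024, 3)):
    tf.config.experimental.reset_memory_stats("GPU:0")
    model = tf.saved_model.load(saved_model_dir)
    infer = model.signatures["serving_default"]
    infer(tf.constant(np.zeros(input_shape, dtype=np.float32)))
    return tf.config.experimental.get_memory_info("GPU:0")["peak"] / 1e6

if __name__ == "__main__":
    # e.g. python measure_mem.py mask_rcnn_trt_fp16
    print(f"peak GPU memory: {peak_memory_mb(sys.argv[1]):.1f} MB")
```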
It's hard to tell why TRT does not show memory usage reduction here.
We do have an experimental PR that you may want to try at your discretion to see whether it helps with this issue: tensorflow/tensorflow#55959