I am trying to run a transformer model on TensorRT. More specifically, my model is based on vitb_256_mae, and I have already converted it to ONNX. To run it with the TensorRT runtime, I wrote a Python script that builds an engine with FP32-to-FP16 conversion, and that works fine. To convert FP32 to INT8, I then used your snippet, but I got stuck on CUDA driver errors.
My model takes two inputs, template and search, and produces 5 outputs.
Important changes
To create optimization profiles for multiple inputs, I tweaked your onnx_to_tensorrt.py a little:
def create_optimization_profiles(builder, inputs, batch_sizes=[1,8,16,32,64]):
    # Check if all inputs are fixed explicit batch to create a single profile and avoid duplicates
    if all([inp.shape[0] > -1 for inp in inputs]):
        profile = builder.create_optimization_profile()
        for inp in inputs:
            fbs, shape = inp.shape[0], inp.shape[1:]
            profile.set_shape(inp.name, min=(fbs, *shape), opt=(fbs, *shape), max=(fbs, *shape))
        #print(profile.get_shape("template"))
        #print(profile.get_shape("search"))
        return [profile]
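Note that the function above only returns a profile when every input has a fixed batch dimension; with a dynamic batch (`-1`) it falls through and returns `None`. Below is a minimal sketch of the shape bookkeeping in plain Python, so it can be checked without a GPU. `profile_shapes` and the `(name, shape)` pairs are hypothetical stand-ins for the network inputs, and using one fixed min/opt/max triple per batch size is an assumption about how the function could be extended, not something taken from the original script:

```python
def profile_shapes(inputs, batch_sizes=(1, 8, 16, 32, 64)):
    """Return one {name: (min, opt, max)} dict per optimization profile.

    inputs: list of (name, shape) pairs; shape[0] is the batch dimension,
    with -1 meaning "dynamic".
    """
    if all(shape[0] > -1 for _, shape in inputs):
        # Fixed batch everywhere: a single profile with min == opt == max.
        return [{name: (tuple(shape),) * 3 for name, shape in inputs}]

    profiles = []
    for bs in batch_sizes:
        profile = {}
        for name, shape in inputs:
            # Replace only a dynamic batch dimension; keep a fixed one as-is.
            batch = bs if shape[0] == -1 else shape[0]
            full = (batch, *shape[1:])
            profile[name] = (full, full, full)
        profiles.append(profile)
    return profiles


fixed = profile_shapes([("template", (1, 3, 128, 128)), ("search", (1, 3, 256, 256))])
print(fixed[0]["search"])
```

Each triple would then be passed to `profile.set_shape(name, min=..., opt=..., max=...)` for the matching input.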
Environment
TensorRT Version: 8.6.1 (installed from pip, not built from source)
GPU Type: RTX 3090 Ti
Nvidia Driver Version: nvidia-driver-530
CUDA Version: 12.1
CUDNN Version: not installed
Operating System + Version: Ubuntu 22.04.2 LTS
Python Version (if applicable): 3.10.12
More info
I tried INT8 conversion with one dynamic input using this link: NVIDIA/TensorRT#289
The code worked for AlexNet. That is why I think the error might be caused by the number of inputs, or I might have to make some changes in the ImagenetCalibrator file.
Thanks for reading, and have a good day!
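To illustrate the suspicion about the number of inputs: as far as I understand, TensorRT calls the calibrator's `get_batch(names)` with all network input names and expects one device pointer per name, so a calibrator that fills a buffer for a single image input would leave the second input without memory. Whether that is the actual cause here is an assumption. Below is a minimal sketch of that contract in plain Python. `TwoInputBatchFeeder`, the batch dict format, and the `upload` callable are hypothetical; in a real calibrator the class would subclass a TensorRT calibrator such as `trt.IInt8EntropyCalibrator2`, and `upload` would copy the host data into a preallocated GPU buffer:

```python
class TwoInputBatchFeeder:
    """Hands out one device pointer per network input, in the order requested."""

    def __init__(self, batches, upload):
        # batches: iterable of {input_name: host_data} dicts (hypothetical format).
        # upload: callable(name, host_data) -> int device pointer.
        self._batches = iter(batches)
        self._upload = upload

    def get_batch(self, names):
        batch = next(self._batches, None)
        if batch is None:
            return None  # signals that the calibration data is exhausted
        missing = [n for n in names if n not in batch]
        if missing:
            raise KeyError(f"no calibration data for inputs: {missing}")
        # One pointer per name, in the same order as `names`.
        return [self._upload(n, batch[n]) for n in names]


# Fake pointers stand in for real GPU allocations so the sketch runs anywhere.
fake_pointers = {"template": 0x1000, "search": 0x2000}
feeder = TwoInputBatchFeeder(
    batches=[{"template": b"t0", "search": b"s0"}],
    upload=lambda name, data: fake_pointers[name],
)
print(feeder.get_batch(["template", "search"]))  # one pointer per input
print(feeder.get_batch(["template", "search"]))  # no batches left
```

Raising on a missing input name makes the mismatch visible in Python instead of surfacing later as an illegal memory access.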
Output errors
2023-08-10 11:27:20 - main - INFO - TRT_LOGGER Verbosity: Severity.ERROR
[08/10/2023-11:27:23] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
/home/pc-3730/fatihcan/OSTRACK_TENSORRT/tensorrt-utils-20.01/classification/imagenet/onnx_to_tensorrt.py:177: DeprecationWarning: Use set_memory_pool_limit instead.
config.max_workspace_size = 4**30 # 1GiB
2023-08-10 11:27:23 - main - INFO - Setting BuilderFlag.FP16
2023-08-10 11:27:23 - main - INFO - Setting BuilderFlag.INT8
2023-08-10 11:27:23 - ImagenetCalibrator - INFO - Collecting calibration files from: /home/pc-3730/fatihcan/OSTRACK_TENSORRT/tensorrt-utils-20.01/classification/imagenet/imagenet/val
2023-08-10 11:27:23 - ImagenetCalibrator - INFO - Number of Calibration Files found: 10869
2023-08-10 11:27:23 - ImagenetCalibrator - WARNING - Capping number of calibration images to max_calibration_size: 512
[08/10/2023-11:27:23] [TRT] [W] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[08/10/2023-11:27:23] [TRT] [W] onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
[08/10/2023-11:27:23] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
2023-08-10 11:27:23 - main - DEBUG - === Network Description ===
2023-08-10 11:27:23 - main - DEBUG - Input 0 | Name: template | Shape: (1, 3, 128, 128)
2023-08-10 11:27:23 - main - DEBUG - Input 1 | Name: search | Shape: (1, 3, 256, 256)
2023-08-10 11:27:23 - main - DEBUG - Output 0 | Name: pred_boxes | Shape: (1, 1, 4)
2023-08-10 11:27:23 - main - DEBUG - Output 1 | Name: score_map | Shape: (1, 1, 16, 16)
2023-08-10 11:27:23 - main - DEBUG - Output 2 | Name: size_map | Shape: (1, 2, 16, 16)
2023-08-10 11:27:23 - main - DEBUG - Output 3 | Name: offset_map | Shape: (1, 2, 16, 16)
2023-08-10 11:27:23 - main - DEBUG - Output 4 | Name: backbone_feat | Shape: (1, 320, 768)
2023-08-10 11:27:23 - main - DEBUG - === Optimization Profiles ===
2023-08-10 11:27:23 - main - DEBUG - template - OptProfile 0 - Min (1, 3, 128, 128) Opt (1, 3, 128, 128) Max (1, 3, 128, 128)
2023-08-10 11:27:23 - main - DEBUG - search - OptProfile 0 - Min (1, 3, 256, 256) Opt (1, 3, 256, 256) Max (1, 3, 256, 256)
2023-08-10 11:27:23 - main - INFO - Building Engine...
/home/pc-3730/fatihcan/OSTRACK_TENSORRT/tensorrt-utils-20.01/classification/imagenet/onnx_to_tensorrt.py:222: DeprecationWarning: Use build_serialized_network instead.
with builder.build_engine(network, config) as engine, open(args.output, "wb") as f:
[08/10/2023-11:27:26] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
2023-08-10 11:27:27 - ImagenetCalibrator - INFO - Calibration images pre-processed: 32/512
[08/10/2023-11:27:27] [TRT] [E] 2: [calibrator.cu::absTensorMax::141] Error Code 2: Internal Error (Assertion memory != nullptr failed. memory must be valid if nbElem != 0)
[08/10/2023-11:27:27] [TRT] [E] 1: [executionContext.cpp::executeInternal::1177] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [resizingAllocator.cpp::deallocate::105] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 3: [engine.cpp::~Engine::298] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/engine.cpp::~Engine::298, condition: mExecutionContextCounter.use_count() == 1. Destroying an engine object before destroying the IExecutionContext objects it created leads to undefined behavior.
)
[08/10/2023-11:27:27] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [cudaDriverHelpers.cpp::operator()::94] Error Code 1: Cuda Driver (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 1: [cudaResources.cpp::~ScopedCudaStream::47] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
[08/10/2023-11:27:27] [TRT] [E] 2: [calibrator.cpp::calibrateEngine::1181] Error Code 2: Internal Error (Assertion context->executeV2(&bindings[0]) failed. )
Traceback (most recent call last):
File "/home/pc-3730/fatihcan/OSTRACK_TENSORRT/tensorrt-utils-20.01/classification/imagenet/onnx_to_tensorrt.py", line 227, in <module>
main()
File "/home/pc-3730/fatihcan/OSTRACK_TENSORRT/tensorrt-utils-20.01/classification/imagenet/onnx_to_tensorrt.py", line 222, in main
with builder.build_engine(network, config) as engine, open(args.output, "wb") as f:
AttributeError: __enter__
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuMemFree failed: an illegal memory access was encountered
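The final `AttributeError` at the end of the log looks like a secondary failure: presumably `build_engine` returned `None` after calibration failed, and `None` cannot be used in a `with` statement. Below is a minimal sketch of a guard for that case. `save_engine` is a hypothetical helper; in the real script the `serialized` argument would come from `builder.build_serialized_network(network, config)`, which the deprecation warning recommends. The sketch also shows that the logged workspace value `4**30` equals `2**60` bytes, not 1 GiB, whereas `1 << 30` is 1 GiB:

```python
import os
import tempfile

GIB = 1 << 30  # 1 GiB in bytes; note that 4**30 is 2**60, far more than 1 GiB


def save_engine(serialized, path):
    """Write a serialized engine to disk, failing loudly if the build failed."""
    if serialized is None:
        # Assumption: the builder signals failure by returning None.
        raise RuntimeError("engine build failed; see the TensorRT log above")
    with open(path, "wb") as f:
        f.write(serialized)
    return os.path.getsize(path)


# Stand-in bytes instead of a real engine, so the sketch runs without a GPU.
with tempfile.TemporaryDirectory() as tmp:
    size = save_engine(b"\x00" * 16, os.path.join(tmp, "model.engine"))
    print(size)
```

With this guard the script stops at the real build error instead of the misleading `with` failure.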