A core dump occurs when multiple threads call CombinedNonMaxSuppression on GPU #67

Open
cyfwry opened this issue Aug 24, 2022 · 0 comments


cyfwry commented Aug 24, 2022

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS Linux release 7.6.1810 (Core)
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): nv20.12
  • Python version: 2.7
  • Bazel version (if compiling from source): 0.24.1
  • GCC/Compiler version (if compiling from source): GCC 8.3.1
  • CUDA/cuDNN version: CUDA 11.4.2, cuDNN 8.2.4.15
  • GPU model and memory: A30

Describe the current behavior

A core dump occurs when multiple threads call CombinedNonMaxSuppression on GPU:

Error detected in GPU stream: Error detected in GPU stream: Error detected in GPU stream: an illegal memory access was encounteredan illegal memory access was encounteredan illegal memory access was encountered

2022-08-24 14:56:40.254500: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_event.cc:29] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
2022-08-24 14:56:40.254538: F external/org_tensorflow/tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:273] Unexpected Event status: 1
2022-08-24 14:56:40.254546: F external/org_tensorflow/tensorflow/core/kernels/batched_non_max_suppression_op.cu.cc:825] Non-OK-status: GpuLaunchKernel(SetZero, config.block_count, config.thread_per_block, 0, device.stream(), config.virtual_thread_count, (*output_indices)->flat().data()) status: Internal: an illegal memory access was encountered
run_docker_bash.sh: line 108: 47384 Aborted (core dumped)

When one thread calls CombinedNonMaxSuppression on GPU or multiple threads call CombinedNonMaxSuppression on CPU, no error occurs.
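The condition described above (several threads issuing the op at once, on GPU versus CPU) could be exercised with a small harness like the one below. This is only a sketch under several assumptions, not the reporter's actual setup: `run_concurrently` and `make_nms_call` are hypothetical helper names, the tensor shapes and thread counts are arbitrary, and it assumes an eager-mode TensorFlow build that provides a GPU kernel for `tf.image.combined_non_max_suppression`. A TF 1.x graph-mode build would need the call wrapped in a session instead.

```python
import threading


def run_concurrently(fn, num_threads=4, iterations=100):
    """Call fn() repeatedly from several threads; return Python exceptions raised.

    A hard GPU fault (as in the log above) aborts the whole process, so it
    will not show up in the returned list.
    """
    errors = []
    lock = threading.Lock()

    def worker():
        for _ in range(iterations):
            try:
                fn()
            except Exception as exc:
                with lock:
                    errors.append(exc)
                return  # stop this worker after its first failure

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def make_nms_call(device="/GPU:0", batch=1, num_boxes=1000, num_classes=4):
    """Build a zero-argument callable that runs combined NMS once on `device`.

    Hypothetical helper: shapes and sizes are illustrative, not from the report.
    """
    import numpy as np
    import tensorflow as tf  # assumed: eager mode, GPU kernel available for this op

    rng = np.random.RandomState(0)
    # boxes: [batch, num_boxes, 1, 4] (boxes shared across classes)
    boxes = rng.rand(batch, num_boxes, 1, 4).astype(np.float32)
    # scores: [batch, num_boxes, num_classes]
    scores = rng.rand(batch, num_boxes, num_classes).astype(np.float32)

    def call():
        with tf.device(device):
            result = tf.image.combined_non_max_suppression(
                boxes, scores,
                max_output_size_per_class=100,
                max_total_size=100)
        # Copy a result to the host so the device work has completed.
        return result.valid_detections.numpy()

    return call
```

Comparing `run_concurrently(make_nms_call("/GPU:0"), num_threads=4)` against the same call with `num_threads=1`, or with `"/CPU:0"`, would mirror the three cases reported here. Whether this reproduces the crash depends on the build in use.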
