
Segmentation Fault #1090

Open
axsk opened this issue Jan 24, 2025 · 2 comments
axsk commented Jan 24, 2025

I keep running into segmentation faults, most probably during calls to the OpenMM Python API. Here is the error output:

```
[13736] signal 11 (1): Segmentation fault
in expression starting at none:0
_PyInterpreterState_GET at /usr/local/src/conda/python-3.12.8/Include/internal/pycore_pystate.h:133 [inlined]
get_state at /usr/local/src/conda/python-3.12.8/Objects/obmalloc.c:866 [inlined]
_PyObject_Free at /usr/local/src/conda/python-3.12.8/Objects/obmalloc.c:1850 [inlined]
PyObject_Free at /usr/local/src/conda/python-3.12.8/Objects/obmalloc.c:830
_buffer_info_free at /data/numerik/people/bzfsikor/conda/envs/conda_jl/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-x86_64-linux-gnu.so (unknown line)
array_dealloc at /data/numerik/people/bzfsikor/conda/envs/conda_jl/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-x86_64-linux-gnu.so (unknown line)
pydecref_ at /data/numerik/people/bzfsikor/software/julia_depot/packages/PyCall/1gn3u/src/PyCall.jl:118
pydecref at /data/numerik/people/bzfsikor/software/julia_depot/packages/PyCall/1gn3u/src/PyCall.jl:123
jfptr_pydecref_4550 at /data/numerik/people/bzfsikor/software/julia_depot/compiled/v1.11/PyCall/GkzkC_ddiUX.so (unknown line)
run_finalizer at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:299
jl_gc_run_finalizers_in_list at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:389
run_finalizers at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:435
enable_finalizers at ./gcutils.jl:161 [inlined]
unlock at ./lock.jl:178 [inlined]
macro expansion at ./lock.jl:275 [inlined]
#282 at /home/htc/bzfsikor/.julia/juliaup/julia-1.11.3+0.x64.linux.gnu/share/julia/stdlib/v1.11/REPL/src/LineEdit.jl:2851
jfptr_YY.282_9263 at /data/numerik/people/bzfsikor/software/julia_depot/compiled/v1.11/REPL/u0gqU_dovaC.so (unknown line)
jl_apply at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
start_task at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/task.c:1202
Allocations: 775171170 (Pool: 775164164; Big: 7006); GC: 5349
fish: Job 1, 'env JULIA_HISTORY=./.history.jl…' terminated by signal SIGSEGV (Address boundary error)
```

I have no idea how to investigate this further.
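One observation from the trace: the crash happens inside `pydecref`, which PyCall registers as a GC finalizer on wrapped Python objects, so Julia's garbage collector is calling into Python's C API at finalization time. A hypothetical stress test (the module choice and loop counts are my assumptions, not from the report) would be to force frequent GC so those finalizers run deterministically; if the finalizer path is the trigger, something like this should reproduce the fault much sooner than normal use:

```julia
# Sketch: allocate Python objects via PyCall, drop the Julia reference,
# and force GC so that the pydecref finalizers run immediately.
using PyCall

np = pyimport("numpy")  # any Python module that hands back objects works

for i in 1:10_000
    o = pycall(np.zeros, PyObject, 100)  # keep the raw PyObject wrapper
    o = nothing                          # drop the last Julia reference
    GC.gc()                              # run finalizers -> pydecref -> Python C API
end
```

If this loop crashes only with `julia -t N` for `N > 1` but not single-threaded, that would point at finalizers touching the Python interpreter from the wrong thread.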

axsk changed the title from "SegmentationFault" to "Segmentation Fault" on Jan 24, 2025

axsk commented Jan 24, 2025

And here is another one, which I run into more often. It especially puzzles me since it somehow involves CUDA as well:

```
in expression starting at REPL[98]:1
_PyInterpreterState_GET at /usr/local/src/conda/python-3.12.8/Include/internal/pycore_pystate.h:133 [inlined]
get_gc_state at /usr/local/src/conda/python-3.12.8/Modules/gcmodule.c:134 [inlined]
PyObject_GC_Del at /usr/local/src/conda/python-3.12.8/Modules/gcmodule.c:2421
pydecref_ at /data/numerik/people/bzfsikor/software/julia_depot/packages/PyCall/1gn3u/src/PyCall.jl:118
pydecref at /data/numerik/people/bzfsikor/software/julia_depot/packages/PyCall/1gn3u/src/PyCall.jl:123
jfptr_pydecref_4550 at /data/numerik/people/bzfsikor/software/julia_depot/compiled/v1.11/PyCall/GkzkC_ddiUX.so (unknown line)
run_finalizer at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:299
jl_gc_run_finalizers_in_list at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:389
run_finalizers at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:435
enable_finalizers at ./gcutils.jl:161 [inlined]
unlock at ./locks-mt.jl:68 [inlined]
popfirst! at ./task.jl:751
trypoptask at ./task.jl:992
jfptr_trypoptask_66779.1 at /home/htc/bzfsikor/.julia/juliaup/julia-1.11.3+0.x64.linux.gnu/lib/julia/sys.so (unknown line)
get_next_task at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/scheduler.c:377 [inlined]
ijl_task_get_next at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/scheduler.c:438
poptask at ./task.jl:1012
wait at ./task.jl:1021
#wait#731 at ./condition.jl:130
wait at ./condition.jl:125 [inlined]
take! at /data/numerik/people/bzfsikor/software/julia_depot/packages/CUDA/1kIOw/lib/cudadrv/synchronization.jl:53
synchronization_worker at /data/numerik/people/bzfsikor/software/julia_depot/packages/CUDA/1kIOw/lib/cudadrv/synchronization.jl:119
unknown function (ip: 0x7f8cd97c33b5)
jlcapi_synchronization_worker_13623 at /data/numerik/people/bzfsikor/software/julia_depot/compiled/v1.11/CUDA/oWw5k_ddiUX.so (unknown line)
unknown function (ip: 0x7f8f3d3961c3)
unknown function (ip: 0x7f8f3d41685b)
Allocations: 1455439833 (Pool: 1455431030; Big: 8803); GC: 4054
fish: Job 1, 'env JULIA_HISTORY=./.history.jl…' terminated by signal SIGSEGV (Address boundary error)
```


axsk commented Jan 24, 2025

I switched to a single-threaded instance and have not observed the issue since. However, I don't make any (explicit) use of multi-threading anywhere in my code.
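Since going single-threaded avoids the crash, a plausible workaround (an assumption, not a confirmed fix) is to pin Julia to one thread so that GC finalizers calling into Python can never race with other threads. The script name below is a placeholder:

```shell
# Workaround sketch: JULIA_NUM_THREADS is the standard Julia environment
# variable controlling the thread count; with one thread, PyCall's
# pydecref finalizers can only run on the thread that owns the Python
# interpreter. "my_script.jl" is a hypothetical entry point.
JULIA_NUM_THREADS=1 julia my_script.jl
```

Note that some packages (e.g. CUDA.jl's synchronization worker, visible in the second trace) still spawn their own threads, so this narrows the problem rather than ruling threading out entirely.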
