PyTorch and MPI-enabled AMReX don't get along in load_state_dict
#322
Labels
bug: affects latest release
Bug also exists in latest release version
bug
Something isn't working
component: MPI
Domain decomposition and communication
component: third party
Changes in ImpactX that reflect a change in a third-party library
On my local machine, PyTorch's internal multithreading doesn't get along with MPI-enabled AMReX. Unless I call torch.set_num_threads with a value of 1 or 2, the attached script hangs when the neural network tries to set its initial parameters.
This script downloads some neural network parameters from a Zenodo archive and then loads them; the load_state_dict call is the specific point of failure.
Attachment: pytorch_amrex_hang_reproducer_v2.py.txt
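For readers without the attachment, here is a minimal sketch of the workaround described above, not the attached reproducer. It assumes only that PyTorch is installed; the small network and the zero-filled parameters are hypothetical stand-ins for the surrogate model and the Zenodo download, and AMReX/MPI initialization is left out, so this snippet by itself does not reproduce the hang.

```python
import torch

# Workaround from the report: cap PyTorch's intra-op thread count
# before the model parameters are loaded.
torch.set_num_threads(1)

# Hypothetical stand-in for the neural network in the reproducer.
model = torch.nn.Sequential(
    torch.nn.Linear(6, 32),
    torch.nn.ReLU(),
    torch.nn.Linear(32, 6),
)

# Hypothetical stand-in for the parameters downloaded from Zenodo:
# same keys and shapes as the model, filled with zeros.
state = {name: torch.zeros_like(t) for name, t in model.state_dict().items()}

# This is the call that hangs in the report when the thread cap is not set
# and MPI-enabled AMReX is active in the same process.
result = model.load_state_dict(state)
print("threads:", torch.get_num_threads(), "missing keys:", result.missing_keys)
```

In the real script the call to torch.set_num_threads would go before the model is constructed and loaded, which is where the cap is placed here.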