You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
std::thread::hardware_concurrency() returns, when possible, the underlying hardware capability to run threads, which might not corresponds to the actual number of cores available to the process (through the use of taskset, batch system like slurm, etc...). The consequence is that kmc might run in a non optimal way. For example, I've got a user that has submitted a kmc job on a 96 cores HPC nodes, in a single core slurm allocation: more than 100 threads are now fighting for the usage of this core.
Hello,
std::thread::hardware_concurrency()
returns, when possible, the underlying hardware capability to run threads, which might not corresponds to the actual number of cores available to the process (through the use of taskset, batch system like slurm, etc...). The consequence is that kmc might run in a non optimal way. For example, I've got a user that has submitted a kmc job on a 96 cores HPC nodes, in a single core slurm allocation: more than 100 threads are now fighting for the usage of this core.The immediate workaround is to use the -t option to set the number of threads, but I think a more sane behavior would be to use sched_getaffinity as in opencv/opencv#16268. Gromacs did something similar (https://github.com/gromacs/gromacs/blob/1e6873fadf16d5f5be861e6f9ef5f9923a12e540/src/gromacs/hardware/hardwaretopology.cpp#L1221).
What do you think ?
Thank you.
The text was updated successfully, but these errors were encountered: