slurm arrays #23

Open
mikygit opened this issue Mar 4, 2025 · 2 comments

mikygit commented Mar 4, 2025

Hello,
I have problems with some of my Slurm jobs whose GPU information is not exported correctly, and I wonder whether this might be related to Slurm job arrays.
Have you ever tested jobstats under such conditions?
I mean with an sbatch script containing a parameter like this:
#SBATCH --array=0-4

This creates job IDs that are no longer plain integers but look something like this: $JOB_ID_0, $JOB_ID_1, $JOB_ID_2, and so on, one per array index.
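As a way to check what each array task actually sees, here is a minimal sketch of a batch script that prints the identifiers Slurm sets in the task's environment. `SLURM_JOB_ID`, `SLURM_ARRAY_JOB_ID` and `SLURM_ARRAY_TASK_ID` are standard Slurm variables; the function name `print_task_ids` and the `--gres=gpu:1` request are assumptions for illustration and may need adjusting for a given cluster.

```shell
#!/bin/bash
#SBATCH --array=0-4
#SBATCH --gres=gpu:1   # assumption: one GPU per task; adjust to your cluster

# Print the identifiers Slurm exposes to one array task.
# The ":-unset" defaults let the function run outside Slurm as a dry run.
print_task_ids() {
    echo "raw job id:    ${SLURM_JOB_ID:-unset}"
    echo "array job id:  ${SLURM_ARRAY_JOB_ID:-unset}"
    echo "array task id: ${SLURM_ARRAY_TASK_ID:-unset}"
    # The "<array job id>_<task id>" form is how the task is displayed.
    echo "display id:    ${SLURM_ARRAY_JOB_ID:-unset}_${SLURM_ARRAY_TASK_ID:-unset}"
}

print_task_ids
```

Each task should still report an integer raw job ID, so comparing that line across tasks is a quick way to see which ID a monitoring tool would need to match.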

Any ideas?

Thx,
--Mike

Collaborator

plazonic commented Mar 4, 2025

Hello,

We use jobstats with GPU array jobs regularly. Stats are collected under $SLURM_JOBID, which is the same as jobidraw in sacct. If you have it set up the way we recommend, you can verify this by logging in to a node where a GPU array job is running and checking /run/gpustat/X, where X is the GPU ordinal number (or GPU-UUID).
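A rough sketch of that check, assuming the recommended setup is in place. The helper name `show_gpustat` is hypothetical, and the file contents are printed as-is because their exact format is not described in this thread. The job ID `1234` in the `sacct` line is a placeholder.

```shell
#!/bin/bash
# Print whatever is recorded for each GPU under the gpustat directory.
# The default path is the one mentioned above; pass another directory to override.
show_gpustat() {
    local dir="${1:-/run/gpustat}"
    local f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# On a compute node that is running one of the array tasks:
#   show_gpustat
# From a login node, compare the displayed and raw job IDs (1234 is a placeholder):
#   sacct -j 1234 --format=JobID,JobIDRaw,State
```

If the job ID found on the node matches the `JobIDRaw` column for that array task, the collection side is using the expected identifier.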

Does all of this work for non-array jobs?

Author

mikygit commented Mar 4, 2025

Thank you for your response.
Yes, it works very well with non-array jobs.
Please let me debug a bit more deeply and come back with new input.
