[Documentation] Call shell scripts with extra resources inside the submitted Python function #567

jan-janssen · 2025-02-10T19:49:58Z

With SLURM, each call to srun creates a new job step, so there is no need to assign resources to the Python function itself. Assign the resources directly to the external shell script in the srun call instead.
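
A minimal sketch of that idea, assuming the function runs inside an existing SLURM allocation. The helper names `build_srun_command` and `run_script` and the script path `./my_script.sh` are hypothetical and not part of executorlib; `--ntasks` and `--gpus` are standard srun options.

```python
import subprocess


def build_srun_command(script, ntasks=1, gpus=0):
    """Build the srun command line for an external script (hypothetical helper)."""
    cmd = ["srun", f"--ntasks={ntasks}"]
    if gpus:
        # Only request GPUs when the script actually needs them.
        cmd.append(f"--gpus={gpus}")
    cmd.append(script)
    return cmd


def run_script(script, ntasks=1, gpus=0):
    """Run the script as its own SLURM job step and return its standard output."""
    result = subprocess.run(
        build_srun_command(script, ntasks=ntasks, gpus=gpus),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the job step fails
    )
    return result.stdout


if __name__ == "__main__":
    # Prints the command only, so this also works outside a SLURM allocation.
    print(build_srun_command("./my_script.sh", ntasks=2, gpus=1))
```

`run_script` could then be submitted like any other Python function, without a resource assignment for the function itself, because the resources are requested by srun for the script.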

For Flux there is the option to use nested executors:
https://executorlib.readthedocs.io/en/latest/3-hpc-job.html#nested-executors

This option can also be set at submission time through resource_dict, just like all the other options:

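from executorlib import FluxJobExecutor

def get_available_gpus(lst):
    # Import inside the function so the imports happen in the worker process.
    import socket
    from tensorflow.python.client import device_lib
    local_device_protos = device_lib.list_local_devices()
    # Collect (name, description, hostname) for every GPU visible to this worker
    # and append the entries passed in from the previous call.
    return [
        (x.name, x.physical_device_desc, socket.gethostname())
        for x in local_device_protos if x.device_type == "GPU"
    ] + lst

with FluxJobExecutor() as exe:
    fs = []
    for i in range(1, 4):
        # Each submission receives the previous future as input, so the calls are chained.
        fs = exe.submit(
            get_available_gpus,
            lst=fs,
            # flux_executor_nesting sets the nested-executor option for this submission.
            resource_dict={"cores": 1, "gpus_per_core": 1, "flux_executor_nesting": True},
        )
    print(fs.result())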
