-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add FLUX_ENCLOSING_ID
to initial program environment for instances with a jobid
broker attribute
#6558
Conversation
Problem: It is inconvenient to get the jobid of the closest enclosing instance (e.g. the ID of a batch or alloc job in the parent instance) because the broker drops FLUX_JOB_ID in favor of a jobid attribute. However, a case has been made that an environment variable would be more convenient since it can be accessed without use of system(3) or the Flux API. Introduce FLUX_ENCLOSING_ID, which is set by the broker whenever the jobid attribute is set (i.e. when the broker is started as a job in a Flux instance). This will be available in the initial program as well as being inherited by jobs run within the instance. Add the variable to the env_blocklist so that it is unset when the current instance is not a job. Fixes flux-framework#6474
Problem: No tests in the testsuite ensure FLUX_ENCLOSING_ID is set (and unset) where applicable. Add a couple tests to t0014-runlevel.t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just a suggestion on expanding the description a bit.
doc/man7/flux-environment.rst
Outdated
.. envvar:: FLUX_ENCLOSING_ID | ||
|
||
The jobid of the closest enclosing Flux instance if it was run as a | ||
job. This is set as an alternative to :envvar:`FLUX_JOB_ID` because | ||
it applies only to the parent instance, not the current instance. | ||
|
||
See :ref:`INITIAL PROGRAM ENVIRONMENT <initial_program_environment>` below | ||
for more details. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be helpful to spell this out a bit more for casual Flux user. I took a stab below but feel free to ignore or edit.
The job ID of the enclosing Flux instance, if it has one. The enclosing Flux instance is the one that ran FLUX_JOB_ID. Depending on how the enclosing Flux instance was started, it may or may not have a job ID. If it was not launched by Flux, FLUX_ENCLOSING_ID is not set.
Example 1: a batch job that runs one MPI job is submitted to a Flux system instance. In the environment of the MPI job, FLUX_ENCLOSING_ID refers to the batch job ID in the system instance.
Example 2: an MPI job is submitted directly to a Flux system instance. Since the Flux system instance was not launched by Flux, FLUX_ENCLOSING_ID is not set in the environment of the MPI job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jim beat me to it, I agree fleshing this area out a bit more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestions. I'll take @garlick's text as is besides fixing a few nits (the rest of the document uses "jobid" not "job ID", and style guide says the word after a colon should be capitalized if it is the beginning of a sentence.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, pushed the update. I also dropped the reference to the INITIAL PROGRAM section as now it seemed superfluous since most if the detail in the JOB ENVIRONMENT section instead.
doc/man7/flux-environment.rst
Outdated
Additionally, :envvar:`FLUX_ENCLOSING_ID` is set by the broker if it was | ||
launched as part of a Flux job, and contains the jobid of the instance | ||
in F58 form. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just "launched as a Flux job", not sure "part of" is necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, a single broker is not "a Flux job" unless the job has only one node.. but either way gets the point across if you feel strongly about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohhh i see your context on it now. Perhaps just "is set", no need to say "set by the broker"? Dunno if the latter is a detail the user needs to know. but no biggie either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, that will probably make it clearer to the intended audience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified to duplicate @garlick's text from above:
Additionally, :envvar:
FLUX_ENCLOSING_ID
is set to the jobid of the enclosing instance, if it has one.
How's that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup i think that works. I guess you could add back the F58 form
, but I don't think that matters.
Problem: The FLUX_ENCLOSING_ID environment variable is not documented. Document it in flux-environment(7).
Thanks! Will set MWP now. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6558 +/- ##
==========================================
- Coverage 79.45% 79.44% -0.02%
==========================================
Files 531 531
Lines 88240 88247 +7
==========================================
- Hits 70109 70104 -5
- Misses 18131 18143 +12
|
This PR adds a
FLUX_ENCLOSING_ID
environment variable as an alternative to the brokerjobid
attribute. If an instance does not have ajobid
attribute, thenFLUX_ENCLOSING_ID
is unset.This environment variable can be used to refer to the jobid of the enclosing batch or alloc instance from the initial program or a job within that instance (since it will be inherited by jobs).