Debug aid #823
Labels
Deployability
Make TES easy to deploy for end users
enhancement
New feature or request
Robustness
Enable users to run tasks without bugs, or with mitigations for known bugs
TES Priority: P1
Groomed to a Priority 1 issue
Troubleshooting
Enable users to identify and debug errors
Problem:
The hardest problems to troubleshoot are the ones where the only information generated is an exit code that may as well be meaningless.
Solution:
Provide some means of accessing any information produced on the compute node before the task ends.
Describe alternatives you've considered
A full solution for #555 is larger in scope, and something is needed sooner.
Code dependencies
Will this require code changes in:
CoA, for new and/or existing deployments? No
TES standalone, for new and/or existing deployments? No
Terra, for new and/or existing deployments? No
Build pipeline? No
Integration tests? No
Additional context
As envisioned, this is to cover one specific scenario: a very repeatable failure on the compute nodes where no logs of any kind are generated by the task runner, and the batch task ends with a non-zero exit code (usually 10, at the time of this writing).
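To make the scenario concrete, here is a minimal sketch of one way such a debug aid could behave. It is not how TES or the task runner is implemented. It assumes a hypothetical wrapper, `run_and_capture`, that launches the command, streams its output to files as it runs, and bundles the working directory into an archive when the exit code is non-zero. The function name, the file names, and the archive layout are all illustrative, and where the archive would be uploaded is left open.

```python
import subprocess
import sys
import tarfile
import tempfile
from pathlib import Path


def run_and_capture(cmd, work_dir, archive_path):
    """Run cmd in work_dir; on a non-zero exit, archive work_dir for inspection.

    Hypothetical helper for illustration only. Returns the exit code of cmd.
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Write stdout/stderr straight to files so partial output survives a failure.
    with open(work_dir / "stdout.txt", "wb") as out, \
            open(work_dir / "stderr.txt", "wb") as err:
        exit_code = subprocess.run(
            cmd, cwd=work_dir, stdout=out, stderr=err
        ).returncode

    if exit_code != 0:
        # Record the exit code next to the output, then bundle everything
        # that was produced on the node under a single top-level folder.
        (work_dir / "exit_code.txt").write_text(str(exit_code))
        with tarfile.open(archive_path, "w:gz") as tar:
            tar.add(work_dir, arcname="diagnostics")

    return exit_code


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "diagnostics.tar.gz"
        # Simulate a task that prints something and then fails with exit code 10.
        code = run_and_capture(
            [sys.executable, "-c",
             "import sys; print('partial output'); sys.exit(10)"],
            Path(tmp) / "work",
            archive,
        )
        print(code, archive.exists())
```

The archive is written only on failure, so successful tasks pay no extra cost beyond redirecting their output to files.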