-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ah_var_store] Retry with more memory VCF extract [VS-1351] #8812
[ah_var_store] Retry with more memory VCF extract [VS-1351] #8812
Conversation
b076447
to
d7bc9f2
Compare
@@ -368,7 +368,22 @@ task ExtractTask { | |||
--filter-set-name ~{filter_set_name}' | |||
fi | |||
|
|||
gatk --java-options "-Xmx~{memory_gib - 3}g" \ | |||
# This tool may get invoked with "Retry with more memory" with a different amount of memory than specified in | |||
# the input `memory_gib`, so use the memory-related environment variables rather than the `memory_gib` input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there a way to set the memory in the run-time block from the memory-related environment variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I know of, and the way "retry with more memory" works I don't think it would matter. Cromwell basically remembers how much memory the task was last run with, and if it fails with looks like an out of memory error, Cromwell retries the task with an overridden, "multiplied" memory value: https://support.terra.bio/hc/en-us/articles/4403215299355-Out-of-Memory-Retry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. But that would defeat the whole point. So it just allocates to the VM the same amount of memory (here memory_gib) each time it retries??
Line 1.2 Some commands/tools might be able to pick up and use the additional memory provided automatically without customization.
seems to imply that the VM will come up with more memory and tools like python would then recognize that they're on a VM with more memory than before?.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a run where I added in some fake "out of memory" code that shows the task being retried with increasing amounts of memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see that you had it OOM before hitting the actual Java command, and from try 4: MEM_SIZE: 69.12, MEM_UNIT: GB, memory_mb: 66120, memory_gib: 40
.
So, what I'm still confused about is how would cromwell up the memory needed for the VM for each retry? At this point the memory that Java will ask for is more than the VM has. The VM's memory is always set to memory_gib which is 40. How is that even supposed to work? Is it possible that if the runtime memory was hard-coded to 40, then cromwell might be smart enough to up that as it's upping MEM_SIZE?
Basically, I don't understand the use of retrying with more memory if there's no way that that additional memory isn't requested on the VM we are spinning up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically Cromwell doesn't honor the original memory
specification on retry-with-more-memory retries, it uses the scaled values instead.
788b0d1
to
733163a
Compare
Tested and working, taking out of Draft status. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Also wondering if we want to add in documentation on the quota increases that we need for extract so that those can be requested well before extract is kicked off?
I will be more than happy to document how to have quotas increased as soon as I figure out how to actually accomplish that, which so far I have not. 😞 |
Currently in draft status, I still need to test this the way I tested the PGEN version.Tested, works, ready for review!