Rework update submissions to allow re-submissions #256
+304
−126
Description
This reworks the update_submissions module so that it is used for re-submitting updates to BioSample, etc. The --update_submissions flag is now used to submit an updated metadata file. It requires the submission_report.csv to be present in its expected location (with valid accession IDs) and will upload new submission/xml files with the updated metadata and the accession IDs.
There is a new fetch_submissions module that runs after an initial submission or an update submission. It can also be run independently with fetch_reports_only = true.
The subworkflow was re-worked a bit to allow these three independent submission-related functionalities:
Checklist
Go Through Checklist Below and Place A ✔️ (X Inside the Box) if Completed
General Checks
[] Have you run appropriate tests (unit/integration/end-to-end) to check logic across run environments (Conda/Docker/Singularity on Scicomp/AWS/NF Tower/Local)?
Checked test profile on Scicomp, with Singularity
For each relevant configuration:
Have you conducted proper linting procedures?
[] Have you updated existing documentation (README.md, etc.) or created new ones within docs?
I have not done this yet. Should be completed before merging the PR.
We need to document the fetch_reports_only flag and explain that it only fetches the report.xml from the NCBI server, for instance when it was not fetched after the initial submission ran. We also need to clarify that update_submissions is for updating a submission you have already made: it requires the submission_report.csv (with valid accession IDs) and will upload new submission/xml files with the updated metadata and the accession IDs. NCBI needs the IDs to link the record to the original one and update it; otherwise it is submitted as a new record.
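For the docs, it may help to show users how to confirm that their submission_report.csv actually carries accession IDs before running an update. The snippet below is only a rough sketch and not code from this PR: the function name rows_missing_accessions and the column name biosample_accession are hypothetical, and the real report columns may differ.

```python
import csv
import io


def rows_missing_accessions(report_csv_text, accession_columns):
    """Return (row_number, column) pairs where an accession ID is blank.

    Row numbers count data rows only, starting at 1 (the header is skipped).
    """
    missing = []
    reader = csv.DictReader(io.StringIO(report_csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column in accession_columns:
            # A missing column and an empty cell are both treated as blank.
            value = (row.get(column) or "").strip()
            if not value:
                missing.append((row_number, column))
    return missing


# Hypothetical report contents; the column names are assumptions.
example_report = (
    "sample_name,biosample_accession\n"
    "sample_1,SAMN00000001\n"
    "sample_2,\n"
)

print(rows_missing_accessions(example_report, ["biosample_accession"]))
# → [(2, 'biosample_accession')]
```

Any row flagged here would presumably be submitted as a new record rather than an update, since NCBI would have no ID to link it to.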
CDC Checks
Are additional approvals needed for this change? If so, please mention them below:
Are there potential vulnerabilities or licensing issues with any new dependencies introduced? If so, please mention them below: