Classification Import
A variant is a change on a locus, and thus is tied to a genome build.
A classification is initially created for a specific build, but it needs to be viewed, exported and checked for discordance in all genome builds used by the system.
This is done using an Allele (a change independent of a genome build, see Variants and Alleles), and ensuring that:
- An Allele exists for the initial variant
- A variant for each genome build is linked to that allele
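The two conditions above can be pictured with a small sketch. This is not VariantGrid's model code: the Allele dataclass, the builds_missing_variant helper and the coordinate string are hypothetical, used only to illustrate an allele that must have one variant per genome build.

```python
from dataclasses import dataclass, field


@dataclass
class Allele:
    """Hypothetical build-independent change: maps genome build name -> variant."""
    variants_by_build: dict = field(default_factory=dict)


def builds_missing_variant(allele, system_builds):
    """Return the builds that still need a variant linked to this allele."""
    return [build for build in system_builds if build not in allele.variants_by_build]


# A classification created against GRCh37 only (illustrative coordinate string).
allele = Allele({"GRCh37": "1:12345 A>G"})
missing = builds_missing_variant(allele, ["GRCh37", "GRCh38"])
print(missing)
```

Any build returned here is one where a variant still has to be linked before the classification can be viewed or exported in that build.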
If a classification is created inside VariantGrid, we already know the variant so it can be linked immediately.
When submitted via the API, we require a genome build and either VCF-style coordinates or HGVS, which must be resolved to a variant (which may need to be created).
There can be multiple classifications in an API POST. BulkInserter creates a new ClassificationImport for each genome build used in the import, and links it to a classification. After processing and saving the classification records, process_classification_import will be called for each build so the records can be processed in batches.
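As a rough illustration of the per-build batching, the sketch below groups submitted records by genome build. The record layout (dicts with "id" and "genome_build" keys) and the group_by_genome_build helper are assumptions made for this example, not BulkInserter's actual interface.

```python
from collections import defaultdict


def group_by_genome_build(records):
    """Group record ids by genome build, one batch per build (hypothetical helper)."""
    batches = defaultdict(list)
    for record in records:
        batches[record["genome_build"]].append(record["id"])
    return dict(batches)


# Three records from a single POST, spread over two builds.
posted = [
    {"id": 1, "genome_build": "GRCh37"},
    {"id": 2, "genome_build": "GRCh38"},
    {"id": 3, "genome_build": "GRCh37"},
]
batches = group_by_genome_build(posted)
print(batches)
```

Each key corresponds to one import batch, which is why the processing step runs once per build rather than once per record.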
The API could be performing an upsert on a record already linked to a variant, so the remaining steps apply only to classifications without variants. By this point classifications have been validated, and errors raised if different EKeys resolved to different variant coordinates.
We call get_variant_coordinates_from_evidence() on the classification, which uses the first of c.HGVS, g.HGVS or VCF coordinates to resolve to a variant.
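The "first of" rule can be sketched as a simple priority lookup. The evidence keys used here (c_hgvs, g_hgvs, vcf_coordinates) and the helper name are hypothetical stand-ins; the real EKeys and the body of get_variant_coordinates_from_evidence() are not shown on this page.

```python
def first_available_coordinate_source(evidence):
    """Return (key, value) for the highest-priority populated field, or None.

    Priority order follows the text: c.HGVS, then g.HGVS, then VCF coordinates.
    The key names are assumptions for illustration only.
    """
    for key in ("c_hgvs", "g_hgvs", "vcf_coordinates"):
        value = evidence.get(key)
        if value:
            return key, value
    return None


# c.HGVS is empty here, so g.HGVS is used even though VCF coordinates exist.
evidence = {"c_hgvs": "", "g_hgvs": "NC_000001.10:g.12345A>G", "vcf_coordinates": "1:12345 A>G"}
source = first_available_coordinate_source(evidence)
print(source)
```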
We query Redis with hashes of the variant coordinates, which returns the variant_id of an existing record or None. Known variants are linked; unknown variants are written to a VCF, and a VCF_INSERT_VARIANTS_ONLY UploadPipeline is created and run with a post-insert task of ClassificationImportProcessVariantsTask.
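The known/unknown split can be sketched as follows. A plain dict stands in for Redis, and the hashing scheme (SHA-256 over an underscore-joined key) is an assumption chosen for the example; the hash VariantGrid actually uses is not described here.

```python
import hashlib


def variant_hash(chrom, position, ref, alt):
    """Hash one set of variant coordinates (assumed scheme, for illustration)."""
    key = f"{chrom}_{position}_{ref}_{alt}"
    return hashlib.sha256(key.encode()).hexdigest()


def partition_by_known(coordinates, hash_to_variant_id):
    """Split coordinates into known (coordinate -> variant_id) and unknown (list)."""
    known, unknown = {}, []
    for coord in coordinates:
        variant_id = hash_to_variant_id.get(variant_hash(*coord))
        if variant_id is None:
            unknown.append(coord)  # would be written to the VCF for insertion
        else:
            known[coord] = variant_id  # can be linked straight away
    return known, unknown


# The dict plays the role of the Redis lookup; only the first variant exists.
lookup = {variant_hash("1", 100, "A", "G"): 42}
known, unknown = partition_by_known([("1", 100, "A", "G"), ("2", 200, "C", "T")], lookup)
print(known, unknown)
```

Only the coordinates in the unknown list need to go through the upload pipeline, which keeps the insert step small when most variants already exist.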
The VCF_INSERT_VARIANTS_ONLY UploadPipeline handles normalising, and inserting/updating Redis without duplicates or race conditions (see VCF_Import).
ClassificationImportProcessVariantsTask then queries Redis for the newly inserted variant_ids, loads them and calls classification.set_variant(). It also adds a warning if the variant was normalized by VT.
ClassificationImportProcessVariantsTask calls populate_clingen_alleles_for_variants with variants from the import, and then creates Liftover pipelines for all other builds on the system.
See Variants and Alleles and Liftover