-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NEVER MERGE: Test PR for adding synonyms - qc-related-exact-synonym
#8562
Conversation
- Updated wget URL for files using a mondo-ingest feature branch
- Reinstated the mondo-edit.obo version from the commit before running these tests. Ran synonym pipeline and updated mondo-edit.obo
a62ec96
to
237a38c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Results
Look at: CI run
There are now just 7 failures:
FAIL Rule ../sparql/qc/general/qc-related-exact-synonym.sparql: 7 violation(s)
entity,property,value
http://purl.obolibrary.org/obo/MONDO_0007039,oboInOwl:hasExactSynonym,ACN
http://purl.obolibrary.org/obo/MONDO_0007039,oboInOwl:hasExactSynonym,BANF
http://purl.obolibrary.org/obo/MONDO_0010753,oboInOwl:hasExactSynonym,EDS5
http://purl.obolibrary.org/obo/MONDO_0011811,oboInOwl:hasExactSynonym,SCA24
http://purl.obolibrary.org/obo/MONDO_0013426,oboInOwl:hasExactSynonym,LDS1C
http://purl.obolibrary.org/obo/MONDO_0014167,oboInOwl:hasExactSynonym,FCMTE5
http://purl.obolibrary.org/obo/MONDO_0033005,oboInOwl:hasExactSynonym,SCAR5
Basically, monarch-initiative/mondo-ingest#747 addresses all qc-related-exact-synonym.sparql
failures but these 7.
These 7 are cases in which we have skos:exactMatch
to terms from 2 different sources, and the scope from those 2 sources are different.
Example: MONDO:0007039
"ACN"
mondo-edit.obo
:
synonym: "ACN" EXACT ABBREVIATION [DOID:0111252]
synonym: "ACN" RELATED ABBREVIATION [OMIM:101000]
components/doid.owl
:
<owl:Class rdf:about="http://purl.obolibrary.org/obo/DOID_0111252">
<oboInOwl:hasExactSynonym xml:lang="en">ACN</oboInOwl:hasExactSynonym>
</owl:Class>
components/omim.owl
:
<owl:Class rdf:about="https://omim.org/entry/101000">
<oboInOwl:hasRelatedSynonym>ACN</oboInOwl:hasRelatedSynonym>
</owl:Class>
Proposed solution for these remaining:
a. Some kind of manual curation before running pipeline in mondo
b. Update synonym sync: Filter out any cases where sources disagree
I would recommend (b). And instead of only filtering them out, I think we should generate another output that shows all of these conflicts, so that we have the opportunity to curate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joeflack4
(1) Since these are the results from your latest updates to the Synonym Sync why would another output be needed that shows all of these conflicts, in other words why would this list of these qc-related-exact-synonym.sparql
qc check errors not be complete?
(2) In all of these 7 cases the issue is not between any generic combination of sources, it is between DOID and OMIM and specifically an OMIM abbreviation from an alternative symbol marked as formerly. In all of these cases the scope of the OMIM symbol marked as formerly should take precedence and the DOID synonym abbreviation should be filtered from the synonym sync output file. I will create a new ticket for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(1) My recommendation for filtering these into an output ahead instead of waiting for QC failure is for ease. If we filter out problematic cases before they hit the QC, then this allows us to seamlessly run the synonym sync monthly without the stress / hold up of handling the QC failures a time where we may struggle to get everything we want done before a monthly build deadline. Having mondo-ingest
put them in a special curation output instead will allow us to curate at a time that best suits us.
But it looks like (2) will be sufficient to handle this! I'll get started.
qc-related-exact-synonym
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this PR comparing the previous Synonym Sync code to the update in monarch-initiative/mondo-ingest#747 generally looks good. For the "added" case, the EXCLUDE
annotation is now being taken into account properly by the code and not adding new synonyms where the existing synonym has the same string value, but has the EXCLUDE
annotation.
Confirms bug fix for
qc-related-exact-synonym
failures in the following PR is working:Context: