Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NEVER MERGE: Test PR for adding synonyms - qc-related-exact-synonym #8562

Closed
wants to merge 2 commits into from

Conversation

joeflack4
Copy link
Collaborator

@joeflack4 joeflack4 commented Jan 12, 2025

Confirms bug fix for qc-related-exact-synonym failures in the following PR is working:

Context:

- Updated wget URL for files using a mondo-ingest feature branch
- Reinstated the mondo-edit.obo version from the commit before running these tests. Ran synonym pipeline and updated mondo-edit.obo
@joeflack4 joeflack4 force-pushed the test-PR-adding-syns-v2 branch from a62ec96 to 237a38c Compare January 12, 2025 03:36
Copy link
Collaborator Author

@joeflack4 joeflack4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

@joeflack4 joeflack4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment thread.

Copy link
Collaborator Author

@joeflack4 joeflack4 Jan 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Results

Look at: CI run

There are now just 7 failures:

FAIL Rule ../sparql/qc/general/qc-related-exact-synonym.sparql: 7 violation(s)
entity,property,value
http://purl.obolibrary.org/obo/MONDO_0007039,oboInOwl:hasExactSynonym,ACN
http://purl.obolibrary.org/obo/MONDO_0007039,oboInOwl:hasExactSynonym,BANF
http://purl.obolibrary.org/obo/MONDO_0010753,oboInOwl:hasExactSynonym,EDS5
http://purl.obolibrary.org/obo/MONDO_0011811,oboInOwl:hasExactSynonym,SCA24
http://purl.obolibrary.org/obo/MONDO_0013426,oboInOwl:hasExactSynonym,LDS1C
http://purl.obolibrary.org/obo/MONDO_0014167,oboInOwl:hasExactSynonym,FCMTE5
http://purl.obolibrary.org/obo/MONDO_0033005,oboInOwl:hasExactSynonym,SCAR5

Basically, monarch-initiative/mondo-ingest#747 addresses all qc-related-exact-synonym.sparql failures but these 7.

These 7 are cases in which we have skos:exactMatch to terms from 2 different sources, and the scope from those 2 sources are different.

Example: MONDO:0007039 "ACN"
mondo-edit.obo:

synonym: "ACN" EXACT ABBREVIATION [DOID:0111252]
synonym: "ACN" RELATED ABBREVIATION [OMIM:101000]

components/doid.owl:

    <owl:Class rdf:about="http://purl.obolibrary.org/obo/DOID_0111252">
        <oboInOwl:hasExactSynonym xml:lang="en">ACN</oboInOwl:hasExactSynonym>
    </owl:Class>

components/omim.owl:

    <owl:Class rdf:about="https://omim.org/entry/101000">
        <oboInOwl:hasRelatedSynonym>ACN</oboInOwl:hasRelatedSynonym>
    </owl:Class>

Proposed solution for these remaining:
a. Some kind of manual curation before running pipeline in mondo
b. Update synonym sync: Filter out any cases where sources disagree

I would recommend (b). And instead of only filtering them out, I think we should generate another output that shows all of these conflicts, so that we have the opportunity to curate.

Copy link
Collaborator

@twhetzel twhetzel Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joeflack4
(1) Since these are the results from your latest updates to the Synonym Sync why would another output be needed that shows all of these conflicts, in other words why would this list of these qc-related-exact-synonym.sparql qc check errors not be complete?

(2) In all of these 7 cases the issue is not between any generic combination of sources, it is between DOID and OMIM and specifically an OMIM abbreviation from an alternative symbol marked as formerly. In all of these cases the scope of the OMIM symbol marked as formerly should take precedence and the DOID synonym abbreviation should be filtered from the synonym sync output file. I will create a new ticket for this.

Copy link
Collaborator Author

@joeflack4 joeflack4 Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) My recommendation for filtering these into an output ahead instead of waiting for QC failure is for ease. If we filter out problematic cases before they hit the QC, then this allows us to seamlessly run the synonym sync monthly without the stress / hold up of handling the QC failures a time where we may struggle to get everything we want done before a monthly build deadline. Having mondo-ingest put them in a special curation output instead will allow us to curate at a time that best suits us.

But it looks like (2) will be sufficient to handle this! I'll get started.

@joeflack4 joeflack4 changed the title NEVER MERGE: Test PR for adding synonyms - v2 NEVER MERGE: Test PR for adding synonyms - qc-related-exact-synonym Jan 15, 2025
Copy link
Collaborator

@twhetzel twhetzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this PR comparing the previous Synonym Sync code to the update in monarch-initiative/mondo-ingest#747 generally looks good. For the "added" case, the EXCLUDE annotation is now being taken into account properly by the code and not adding new synonyms where the existing synonym has the same string value, but has the EXCLUDE annotation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants