Update Labels #26

Merged
merged 2 commits into from
Jun 14, 2024

jamesaoverton (Collaborator) commented May 21, 2024

This PR loads ChEBI and RxNorm, then uses them to update the DrOn templates with the latest labels from those sources.

  • add scripts to load ChEBI
  • add scripts to load RxNorm
  • update labels for ChEBI terms from the latest ChEBI release
  • update labels for RxNorm terms from the RxNorm_full_12042023 release
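
As a rough illustration of the label-update step in the list above, the following sketch refreshes template labels from a lookup keyed by source identifier. The column names (`ID`, `LABEL`, `RXCUI`), the function name `update_labels`, the placeholder rows, and the in-memory lookup are all assumptions made for illustration; the scripts added in this PR may be organized quite differently.

```python
import csv
import io


def update_labels(template_rows, latest_labels, key="RXCUI"):
    """Return (updated_rows, changes) with labels refreshed from latest_labels.

    template_rows: list of dicts with hypothetical columns ID, LABEL and `key`.
    latest_labels: dict mapping a source identifier to its current label.
    """
    updated, changes = [], []
    for row in template_rows:
        new_row = dict(row)
        new_label = latest_labels.get(row.get(key, ""))
        # Only touch rows whose source term was found and whose label differs.
        if new_label is not None and new_label != row["LABEL"]:
            changes.append((row["ID"], row["LABEL"], new_label))
            new_row["LABEL"] = new_label
        updated.append(new_row)
    return updated, changes


# Placeholder template content (hypothetical IDs, labels, and identifiers).
TEMPLATE_TSV = (
    "ID\tLABEL\tRXCUI\n"
    "EX:0000001\told label\t111\n"
    "EX:0000002\tunchanged label\t222\n"
)
rows = list(csv.DictReader(io.StringIO(TEMPLATE_TSV), delimiter="\t"))
latest = {"111": "new label", "222": "unchanged label"}

updated_rows, label_changes = update_labels(rows, latest)
for term_id, old, new in label_changes:
    print(f"{term_id}: {old!r} -> {new!r}")
```

Returning the list of changes alongside the updated rows makes it easier to spot-check the result, which fits the review approach suggested in this PR.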

Please test a full build with:

cd src/ontology
sh run.sh make all_components -B
sh run.sh make prepare_release -B COMP=false MIR=false IMP=false

There's too much to review completely, so I suggest checking the new scripts, then spot-checking the updated templates.

@jamesaoverton jamesaoverton marked this pull request as draft May 21, 2024 19:28
@jamesaoverton jamesaoverton requested a review from hoganwr May 21, 2024 19:30
hoganwr (Collaborator) commented May 22, 2024

Here's an interesting, non-synonymous, yet apparently correct label change in RxNorm:
DRON:00841115 Gibberella fujikuroi allergenic extract 2101340
DRON:00841115 Fusarium moniliforme antigen 2101340

It turns out the Gibberella extract now has RxCUI 2043450.

I am still reviewing but so far the ingredients are checking out.

hoganwr (Collaborator) commented May 25, 2024

Some of the diffs show a large addition to the label. I wonder whether the old MySQL database limited these fields to 255 characters, so that the diff now shows a huge addition of everything that was previously cut off. I also wonder whether this was the origin of many of the duplicate labels.
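
One way to test the truncation hypothesis above is to check whether each old label is exactly the first 255 characters of its new label. This is only a sketch: the function name `looks_truncated` is hypothetical, and it assumes the limit was counted in characters and that no trailing whitespace was stripped after truncation.

```python
def looks_truncated(old_label, new_label, limit=255):
    """True if old_label is exactly the first `limit` characters of a longer new_label."""
    return (
        len(old_label) == limit
        and len(new_label) > limit
        and new_label.startswith(old_label)
    )


# Synthetic example: a 300-character label cut off at 255 characters.
full_label = "a" * 300
print(looks_truncated(full_label[:255], full_label))
print(looks_truncated("short label", "short label, extended"))
```

If most of the large additions in the diff pass this check, that would support the 255-character explanation; labels that fail it would need a different explanation.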

@hoganwr hoganwr marked this pull request as ready for review May 25, 2024 19:39
hoganwr (Collaborator) left a comment

I reviewed the diffs and everything looks good.

hoganwr (Collaborator) commented May 25, 2024

I was reviewing duplicate labels and I see that we also have duplicate classes. I wonder whether the pattern I noticed for ingredients, where the old Scala code created a new ingredient whenever the label changed (unless the change was purely in letter case), also holds for clinical drugs, clinical drug forms, etc.

For example, look at http://purl.obolibrary.org/obo/DRON_00733166 and http://purl.obolibrary.org/obo/DRON_00058661. Both classes have the same rdfs:label and the same RxCUI annotations.

hoganwr (Collaborator) commented May 25, 2024

They had different labels before we fixed them based on RxCUI match. Here's what I see in Ontobee:

DRON_00733166 Zolpidem tartrate 6.25 MG Extended Release Oral Tablet
DRON_00058661 Zolpidem tartrate 6.25 MG Extended Release Tablet

Now they both have the label: zolpidem tartrate 6.25 MG Extended Release Oral Tablet
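
Duplicate classes like the pair above could be found by grouping classes on their label and RxCUI together. The sketch below assumes the classes are available as (ID, label, RxCUI) tuples; the function name `find_duplicate_classes`, the RxCUI placeholder values, and the third example row are hypothetical, since the actual RxCUI for these classes is not given in this thread.

```python
from collections import defaultdict


def find_duplicate_classes(classes):
    """Group class IDs that share both a case-insensitive label and an RxCUI."""
    groups = defaultdict(list)
    for term_id, label, rxcui in classes:
        groups[(label.casefold(), rxcui)].append(term_id)
    # Keep only the groups that contain more than one class.
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


LABEL = "zolpidem tartrate 6.25 MG Extended Release Oral Tablet"
classes = [
    ("DRON_00733166", LABEL, "RXCUI_PLACEHOLDER"),  # RxCUI value is a placeholder
    ("DRON_00058661", LABEL, "RXCUI_PLACEHOLDER"),
    ("EX_0000001", "some other drug label", "OTHER_PLACEHOLDER"),  # hypothetical row
]

duplicates = find_duplicate_classes(classes)
for (label, rxcui), ids in duplicates.items():
    print(rxcui, label, ids)
```

Comparing labels with `casefold()` treats labels that differ only in letter case as the same, in line with the case-only exception mentioned earlier in this thread.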

@jamesaoverton jamesaoverton merged commit 655f8ed into main Jun 14, 2024
1 check passed