Previously, the source version of CTKP was hard coded, but now the pa… #271

Open
wants to merge 1 commit into
base: master

beasleyjonm
Contributor

…rser collects the latest version from the CTKP manifest on GitHub.
@EvanDietzMorris
Contributor

I had this hard-coded because the version in the manifest was older than the one they wanted deployed at the time, but this is really the way to go. There's still some risk that the properties or structure will change and break things, given how rapidly this data source is evolving, but it would be nice to have it update automatically, so it's a trade-off. I'll try to see what's changed since this was first written and build the latest, and then we can get this in.

@EvanDietzMorris
Contributor

There have indeed been many changes since this parser was first written, and updating to the latest version without making those changes doesn't work. You also deleted all the comments I had about how I thought this should be done. They were there for a reason: the way you implemented it means we'll retrieve the manifest multiple times, sometimes unnecessarily. That's not a huge deal, though. We need to make the content updates before we can update this.
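
For the repeated manifest retrieval, one option is to fetch it once and reuse the parsed result. This is only a rough sketch, not the parser's actual code: the `ManifestCache` class, the injected `fetch` callable, and the `version` key are all hypothetical names used for illustration.

```python
import json
from typing import Callable, Optional


class ManifestCache:
    """Fetches a manifest at most once and reuses the parsed result."""

    def __init__(self, fetch: Callable[[], str]):
        # `fetch` is a hypothetical callable returning the manifest as JSON text,
        # e.g. a thin wrapper around an HTTP GET.
        self._fetch = fetch
        self._manifest: Optional[dict] = None

    def get(self) -> dict:
        # Only hit the network the first time the manifest is needed.
        if self._manifest is None:
            self._manifest = json.loads(self._fetch())
        return self._manifest

    def latest_version(self) -> str:
        # Assumes the manifest has a top-level "version" key (not verified).
        return self.get()["version"]
```

Both the version lookup and the later data download could then share one `ManifestCache` instance, so the manifest is requested a single time per run.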

Here are some things from Slack that we still need to update or verify:
"version 2.6.0, as it introduces an important format change: we now use a pipe | (instead of a comma ,) as delimiter for the trial metadata in the TSV files.

...

And to clarify, by 'trial metadata' I mean the edges supported by more than one trial. So for example if an edge was previously supported by three trials, they were represented in the relevant TSV field as NCT00893555,NCT04752163,NCT06582186 and their corresponding phases as 3,1.5,4. Now we represent this as NCT00893555|NCT04752163|NCT06582186 and 3|1.5|4.
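
As a rough illustration of the delimiter change, the two fields could be split and paired like this. The function name `split_trial_metadata` is hypothetical, and phases are kept as strings since the sketch makes no assumption about how they should be typed downstream.

```python
from typing import List, Tuple


def split_trial_metadata(
    trial_ids: str, phases: str, delimiter: str = "|"
) -> List[Tuple[str, str]]:
    """Pair each supporting trial ID with its phase from two delimited fields."""
    ids = trial_ids.split(delimiter) if trial_ids else []
    phase_values = phases.split(delimiter) if phases else []
    # The two fields are expected to line up one-to-one.
    if len(ids) != len(phase_values):
        raise ValueError(
            f"{len(ids)} trial IDs but {len(phase_values)} phases"
        )
    return list(zip(ids, phase_values))


pairs = split_trial_metadata("NCT00893555|NCT04752163|NCT06582186", "3|1.5|4")
# pairs == [("NCT00893555", "3"), ("NCT04752163", "1.5"), ("NCT06582186", "4")]
```

Passing `delimiter=","` would handle files produced before 2.6.0, which may be useful while verifying the update.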

Complete changelog from 2.3.0 to 2.6.0:
Changed: content update to 20241108
Changed: value_type_id of max_research_phase and clinical_trial_phase is now biolink:ResearchPhaseEnum
Changed: supporting_study_metadata back to supporting_study
Changed: dropped ‘biolink:’ prefix from the attribute_type_id of nested attributes
Changed: subject_boxed_warning renamed to intervention_boxed_warning
Removed: non-biolink “mentioned_in_trials_for” predicate
Added: “tested” attribute per trial indicates whether the trial evaluates the intervention (‘yes’) or if it’s not clear (‘unsure’)
Changed: improved computation of max_research_phase to refer to testing confidence level of supporting trials
Changed: dropped obsolete trial identifiers
Added: "brief_title" metadata on supporting clinical trials
Changed: modified internal delimiter (in edges tsv file) from comma to pipe, for joining metadata on multiple supporting trials for an edge"
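
To make the renames in that changelog concrete, here is a small sketch of normalizing property names to the 2.6.0 spelling. The helper names and the idea of a rename table are assumptions for illustration, not how the parser currently handles it; only the old and new property names come from the changelog above.

```python
# Old name -> new name, taken from the 2.3.0 to 2.6.0 changelog.
RENAMED_PROPERTIES = {
    "supporting_study_metadata": "supporting_study",
    "subject_boxed_warning": "intervention_boxed_warning",
}

BIOLINK_PREFIX = "biolink:"


def rename_properties(props: dict) -> dict:
    """Return a copy of props with renamed keys mapped to their new names."""
    return {RENAMED_PROPERTIES.get(key, key): value for key, value in props.items()}


def strip_biolink_prefix(attribute_type_id: str) -> str:
    """Drop a leading 'biolink:' from a nested attribute_type_id, if present."""
    if attribute_type_id.startswith(BIOLINK_PREFIX):
        return attribute_type_id[len(BIOLINK_PREFIX):]
    return attribute_type_id
```

A table like this keeps the version-specific names in one place, which could make the next upstream rename easier to absorb.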
