Simplify the ingest process and make source of monarch mp-hp-etc mappings more transparent #61

Open
cmungall opened this issue Sep 6, 2024 · 3 comments

@cmungall

cmungall commented Sep 6, 2024

Currently we have this Makefile process:

$(TMP_DIR)/upheno/%:
	mkdir -p $(TMP_DIR)/upheno/
	wget -q https://bbop-ontologies.s3.amazonaws.com/upheno/current/upheno-release/all/$* -O $@

$(MAPPING_DIR)/upheno_custom.sssom.tsv: $(patsubst %, $(TMP_DIR)/upheno/%, upheno_species_lexical.csv upheno_mapping_logical.csv upheno_all_with_relations.owl)
	mkdir -p $(MAPPING_DIR) $(TMP_DIR)
	$(RUN) phenio-toolkit lexical-mapping --species-lexical $(TMP_DIR)/upheno/upheno_species_lexical.csv -m $(TMP_DIR)/upheno/upheno_mapping_logical.csv -o $(TMP_DIR)
	$(RUN) sssom parse $(TMP_DIR)/upheno_custom_mapping.sssom.tsv --metadata $(METADATA_DIR)/upheno_custom_mapping.sssom.yml -C merged -o $@

This is not documented, but it seems to grab a CSV of unknown provenance from https://bbop-ontologies.s3.amazonaws.com/upheno/current/upheno-release/all/upheno_species_lexical.csv and run a bespoke mapping process.

This file doesn't seem to have been updated since 2022:

curl -I  https://bbop-ontologies.s3.amazonaws.com/upheno/current/upheno-release/all/upheno_species_lexical.csv

HTTP/1.1 200 OK
x-amz-id-2: JvpiirNXlumH4JRMltKcm8FrdDzmON8hMwyFsBofTV0QpujZHle2RH9gz31PuHSex/DjRtGq0C6aG4WPEc4qCQ==
x-amz-request-id: DXVS5HMZHFRAK1BA
Date: Fri, 06 Sep 2024 22:24:27 GMT
Last-Modified: Sun, 19 Jun 2022 18:33:20 GMT
ETag: "8433a4c273d87bd4f5d268989dd5d264-4"
Accept-Ranges: bytes
Content-Type: text/csv
Server: AmazonS3
Content-Length: 29130561

Same for
https://bbop-ontologies.s3.amazonaws.com/upheno/current/upheno-release/all/upheno_mapping_logical.csv
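
The same staleness check could be scripted for both files. The following is only a rough sketch, not part of the existing pipeline: the helper names (`age_in_days`, `last_modified_header`, `report`) are made up for illustration, and it assumes S3 keeps returning a `Last-Modified` header for these URLs as in the `curl -I` output above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen

# Base URL and file names are taken from the Makefile rule quoted above.
BASE = "https://bbop-ontologies.s3.amazonaws.com/upheno/current/upheno-release/all/"
FILES = ["upheno_species_lexical.csv", "upheno_mapping_logical.csv"]


def age_in_days(last_modified: str, now: datetime) -> int:
    """Whole days between an HTTP Last-Modified header value and `now`."""
    return (now - parsedate_to_datetime(last_modified)).days


def last_modified_header(url: str) -> str:
    """Send a HEAD request (like `curl -I`) and return the Last-Modified header."""
    with urlopen(Request(url, method="HEAD"), timeout=30) as response:
        return response.headers["Last-Modified"]


def report() -> None:
    """Print how old each upstream file is (hypothetical helper, needs network)."""
    now = datetime.now(timezone.utc)
    for name in FILES:
        stamp = last_modified_header(BASE + name)
        print(f"{name}: {stamp} ({age_in_days(stamp, now)} days old)")
```

Calling `report()` performs the two HEAD requests. `age_in_days` can also be applied directly to the `Last-Modified` and `Date` values shown in the header dump above, without any network access.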

After this, a bespoke mapping process using https://github.com/souzadevinicius/phenio-toolkit produces the SSSOM file used in Monarch (see also obophenotype/upheno-dev#53).

See also obophenotype/upheno#962

@matentzn

matentzn commented Nov 1, 2024

Before making things simpler, I documented the full complexity of the current process here:

obophenotype/upheno#963

Working on simplifying HP MP mappings there.

@matentzn

matentzn commented Nov 5, 2024

I have addressed the complexity of the pipeline in #63, but this does not automatically increase HP-MP mapping transparency!

@matentzn

matentzn commented Nov 8, 2024

The pipeline is simplified, but I will leave this open a bit longer until my big refactoring in #63 is finished.
