This repository contains code, intended to run on the N3C Enclave, for measuring semantic similarity between patients based on their phenotype data, represented as Human Phenotype Ontology (HPO) terms.
It also contains tools for applying statistical tests (the chi-squared test and Fisher's exact test) to determine whether HPO terms are overrepresented in clustered data.
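
To illustrate what an overrepresentation test on a single HPO term might look like, here is a minimal, standard-library-only sketch of a one-sided Fisher exact test on a 2x2 table (patients inside or outside a cluster, with or without the term). The function name `fisher_exact_greater` and the counts are hypothetical and are not taken from this repository's API; the repository's own tools may be organized differently.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout, assumed for illustration only:
      a: patients in the cluster with the HPO term
      b: patients in the cluster without the term
      c: patients outside the cluster with the term
      d: patients outside the cluster without the term
    """
    n_cluster = a + b
    n_with_term = a + c
    n_total = a + b + c + d

    # Number of ways to choose the cluster members from all patients.
    denominator = comb(n_total, n_cluster)

    # Sum the hypergeometric probabilities of observing `a` or more
    # term carriers in the cluster (the "overrepresented" tail).
    upper = min(n_cluster, n_with_term)
    numerator = sum(
        comb(n_with_term, k) * comb(n_total - n_with_term, n_cluster - k)
        for k in range(a, upper + 1)
    )
    return numerator / denominator


# Made-up counts: 3 of 4 cluster patients carry the term,
# versus 1 of 4 patients outside the cluster.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(f"{p_value:.4f}")  # prints 0.2429
```

A small p-value would suggest that the term occurs in the cluster more often than expected by chance. When many HPO terms are tested, the p-values would normally also need a multiple-testing correction, which this sketch omits.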
For further documentation of the code, see:
https://national-covid-cohort-collaborative.github.io/semanticsimilarity/index.html