[Tufts] Containerization of all relevant parts of OHDSI GIS workflow #375

Open
Tracked by #376
kzollove opened this issue Jan 24, 2025 · 0 comments
kzollove commented Jan 24, 2025

What parts of the workflow can be containerized:

  • gaiaDB (source data information)
    • A containerized Postgres database that holds source data information and accepts function calls from gaiaCore to "instantiate" those sources in a harmonized format
  • gaiaCore (R Package)
    • Exposes the "loadVariable" operation
    • Could potentially expose more functionality to give insight into gaiaDB (list available sources and types, list instantiated sources, visualize coverage of sources, etc.)
  • gaiaOHDSI
    • Given a list of locations with dates of validity (essentially, a slice of the LOCATION_HISTORY table with coordinates), performs a spatiotemporal join against the harmonized geospatial source data. This generates the external_exposure table and a delta vocabulary (the relevant "slice" of the OMOP GIS vocabulary), which are then output as CSV or SQL INSERT statements that can be applied back to the OMOP CDM.
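
As a rough illustration of the gaiaOHDSI step described above, here is a minimal Python sketch of a point-in-polygon spatiotemporal join. It is not the gaiaOHDSI implementation (which is planned as an R package). The class names, field names, output column names, and the ray-casting test are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LocationRecord:
    # Hypothetical shape of one LOCATION_HISTORY row with coordinates attached.
    location_id: int
    lon: float
    lat: float
    start_date: date
    end_date: date


@dataclass(frozen=True)
class SourcePolygon:
    # Hypothetical harmonized source feature: one polygon ring, one value,
    # and the period during which that value is valid.
    variable: str
    value: float
    ring: tuple  # ((lon, lat), ...), first vertex not repeated at the end
    valid_from: date
    valid_to: date


def point_in_polygon(lon, lat, ring):
    """Ray-casting test: count edge crossings of a ray going in +x direction."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the point's latitude can be crossed.
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def spatiotemporal_join(locations, polygons):
    """Return one exposure row per (location, polygon) pair that overlaps
    in both space and time. Column names are placeholders."""
    rows = []
    for loc in locations:
        for poly in polygons:
            # Temporal part: intersect the two validity windows.
            start = max(loc.start_date, poly.valid_from)
            end = min(loc.end_date, poly.valid_to)
            if start > end:
                continue
            # Spatial part: the location must fall inside the polygon.
            if point_in_polygon(loc.lon, loc.lat, poly.ring):
                rows.append({
                    "location_id": loc.location_id,
                    "variable": poly.variable,
                    "value": poly.value,
                    "exposure_start_date": start,
                    "exposure_end_date": end,
                })
    return rows
```

The nested loop compares every location with every polygon, which is fine for a toy example; a real implementation would likely push the join into the database or use a spatial index.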

What we will not containerize (though there may be separate containerized solutions):

  • geocoding

The containerized product can be thought of as a black box:

  • LOCATION_HISTORY slice goes in
  • EXTERNAL_EXPOSURE + Delta Vocabulary come out
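
To make the output side of the black box concrete, here is a small Python sketch that serializes result rows as either CSV or SQL INSERT statements. The function names and the example column names are hypothetical and do not reflect the actual EXTERNAL_EXPOSURE schema. Building SQL text by string formatting is done here only because the goal is a file of statements to hand back; code that talks to a live database should use parameterized queries instead.

```python
import csv
import io
from datetime import date


def sql_literal(value):
    """Render a Python value as a SQL literal (sketch, limited type support)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, date):
        return "'" + value.isoformat() + "'"
    # Strings: double any embedded single quotes.
    return "'" + str(value).replace("'", "''") + "'"


def to_insert_statements(table, rows):
    """Turn a list of dict rows into one INSERT statement per row."""
    statements = []
    for row in rows:
        columns = ", ".join(row.keys())
        values = ", ".join(sql_literal(v) for v in row.values())
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return statements


def to_csv(rows):
    """Turn a list of dict rows into CSV text with a header line."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The delta vocabulary rows could be written out the same way, so the container would only ever exchange flat files with the site that holds the OMOP CDM.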

The containerized product does not need direct access to an OMOP CDM, though the input you feed it is still PHI.

Concerns:

  • Does a containerized PG database make sense? It seems easy to me in a toy setting, but does it still make sense in a cloud setting? Will this work in the Tufts cloud environment, where we are typically bound to "container apps" solutions and a standalone Postgres server?
  • How would a catalog integrate with this? Would the catalog "populate" the containerized PG database?
  • gaiaOHDSI is not yet a realized R package. It should not be too difficult to build out the spatiotemporal join for a single "class" of relationships (e.g., point in polygon).
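
On the first concern, one possible way to keep both options open is for the other containers to read the database location from the environment, so the same image could point at a containerized gaiaDB or at a standalone Postgres server. The sketch below is only an assumption about how that might look: `PGHOST`, `PGPORT`, `PGDATABASE`, and `PGUSER` are the standard libpq environment variables, while the function name and the default values (such as a service called `gaiadb`) are hypothetical.

```python
import os


def pg_dsn_from_env(env=None):
    """Build a libpq-style keyword/value connection string from the environment.

    Defaults are placeholders for a local, containerized setup. The password
    is deliberately left out; libpq can pick it up from PGPASSWORD or a
    password file.
    """
    env = os.environ if env is None else env
    host = env.get("PGHOST", "gaiadb")        # hypothetical service name
    port = env.get("PGPORT", "5432")          # Postgres default port
    dbname = env.get("PGDATABASE", "gaiadb")  # hypothetical database name
    user = env.get("PGUSER", "postgres")
    return f"host={host} port={port} dbname={dbname} user={user}"
```

With no variables set, this resolves to the sibling-container defaults; in a cloud deployment only the environment would change, not the image.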