Kaiming Tao edited this page Jun 17, 2022 · 15 revisions

We're curating a database of SARS-CoV-2 drug resistance data. The data are extracted from publications (mostly papers) and cover monoclonal antibodies (mAbs), plasma (convalescent and vaccinee plasma), and other antiviral drugs.

This document explains how to contribute to the database, how to enter data, and how we manage the database.

The data curation pipeline

  1. Create a new issue titled with the author name and DOI of the paper, then read the paper. Discuss any questions or comments in that issue.
  2. Extract the key data points and organize them in Excel spreadsheets, using the corresponding table headers.
  3. Fork the covid-drdb-payload repository, convert the Excel spreadsheets to CSV files named according to the corresponding naming schemas, and save the CSV files in the corresponding folders.
  4. Create a pull request so that data consistency is checked automatically and the data can be reviewed.
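
As a rough illustration of steps 2 and 3, the sketch below writes extracted data points to CSV text and checks that the header row matches an expected table header. The column names, the sample row, and both helper functions are hypothetical and chosen only for this example; the real table headers and file naming schemas are the ones defined for the database, and the automated checks run on pull requests may work differently.

```python
import csv
import io

# Hypothetical header for illustration only; the real table headers are
# defined by the database, not by this sketch.
EXPECTED_HEADER = ["reference", "treatment", "variant", "fold_change"]


def rows_to_csv(rows, header=EXPECTED_HEADER):
    """Serialize a list of dicts to CSV text, rejecting unexpected columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # DictWriter raises ValueError if a row has a key outside `header`.
        writer.writerow(row)
    return buffer.getvalue()


def check_header(csv_text, header=EXPECTED_HEADER):
    """Return True if the first CSV row matches the expected header exactly."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader, None) == header


if __name__ == "__main__":
    # Made-up sample row, not real data from any publication.
    sample = [
        {
            "reference": "Doe21",
            "treatment": "mAb-X",
            "variant": "B.1.351",
            "fold_change": "3.2",
        }
    ]
    text = rows_to_csv(sample)
    print(text, end="")
    print("header ok:", check_header(text))
```

Running a local check like this before opening the pull request can catch a misspelled or missing column early, although the repository's own automated checks remain the authoritative test.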

Because entering data is complex, we divide the process into two main parts. The biomedical steps cover extracting data from papers. The database steps cover converting the data into a database-friendly format and using software to check the data before it is merged into the main repository.

Biomedical

Because the data are heterogeneous, we explain each kind in a separate document.

Database