Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor paper ranking workflow #1358

Merged
merged 11 commits into from
Jan 15, 2025
Merged

Refactor paper ranking workflow #1358

merged 11 commits into from
Jan 15, 2025

Conversation

cthoyt
Copy link
Member

@cthoyt cthoyt commented Jan 13, 2025

This PR refactors the paper ranking workflow with the following:

  1. Remove implementation from inside CLI definitions. This makes it easier to test (rather than having to test that a click command works, you run the function directly)
  2. Add type annotations
  3. Reduce duplicated code
  4. Reuse itertools when appropriate
  5. Better function names
  6. Hide INDRA imports inside relevant functions so it can be directly mocked
  7. Add several TODOs and FIXMEs, but these are for future work. Right now, I just want to check that this first refactor still works the same way.

It also does:

  1. Define optional requirements for the paper tranking workflow in pyproject.toml instead of in an ad-hoc requirements.txt file
  2. Redefine the paper ranking workflow to install with the optional from pyproject.toml
  3. Add a tox environment for running the paper ranking workflow, so it's reproducible by anyone
  4. Reuse builtin paths in the Bioregistry module during tests

@cthoyt cthoyt requested a review from nagutm January 13, 2025 09:51
@cthoyt cthoyt changed the title Update paper ranking workflow Refactor paper ranking workflow Jan 13, 2025
@bgyori
Copy link
Contributor

bgyori commented Jan 14, 2025

I think restructuring the code is okay, without going into discussions about detailed choices, perhaps @nagutm could check to see if there are any unintended changes to the overall behavior we need to address. I think we should look into the reldate / explicit start and end date discrepancy on a separate PR to not mix refactoring with that (That issue happens to not make a practical difference in the monthly runs but is generally something we should fix.)

Copy link
Collaborator

@nagutm nagutm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested the basic functionality with the refactored code and there does not appear to be any unexpected changes to the general workflow. I think we can merge this PR in order to address the issue of the start and end dates.

@cthoyt cthoyt merged commit 2ced69e into main Jan 15, 2025
12 checks passed
@cthoyt cthoyt deleted the update-paper-ranking branch January 15, 2025 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants