Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SHPC New Feature - Upgrade Functionality #673

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Ausbeth
Copy link

@Ausbeth Ausbeth commented Sep 30, 2024

Hello @vsoch,

I am Ausbeth Aguguo, a collaborator with the Wellcome Sanger Institute. I contributed to this amazing project under the guidance of Matthieu Muffato, Guoying Qi and the Tree of Life Informatics Infrastructure Team.

In this pull request, I am introducing a robust functionality for SHPC, which uses shpc upgrade command to implement an intuitive upgrade logic that helps users manage installed and available software versions within SHPC. Here’s a breakdown:

shpc list

quay.io/biocontainers/samtools:1.20--h50ea8bc_0
quay.io/biocontainers/biocode:0.10.0--pyhdfd78af_0

User performs shpc upgrade quay.io/biocontainers/samtools

It checks the registry for the latest version of samtools available. If the user has the latest version installed among all the versions of samtools in the user’s list, shpc will just give the user a message that samtools is up to date. Else, it will perform an upgrade by installing the latest version.

User performs shpc upgrade -- all

It will run a check through the user’s list and perform an upgrade on all the user’s outdated software if any.

User performs shpc upgrade quay.io/biocontainers/samtools -- dry-run

It will inform the user if the latest version of samtools is installed or available to install, without performing an upgrade

User performs shpc upgrade -- all -- dry-run

This will display a list of installed software the user has. From the list, it indicates each outdated software and updated software, without upgrading the outdated ones.

Additional Information:

  • shpc upgrade cannot be performed when a user’s software list is empty. However, it will prompt the user to install the software the user tried to upgrade.

  • shpc upgrade gives the user the option to either uninstall or preserve previous versions of software during the upgrade process.

  • shpc upgrade gives the user the option to either install or not install the latest version of software to the views the previous versions were installed in

  • shpc upgrade does not allow the user to include the version of a software as a recipe, this is because the goal is to check the software itself and install the latest version available. It does not upgrade the specific version of the software. Therefore, this command is invalid: 'shpc upgrade quay.io/biocontainers/samtools:1.20--h50ea8bc_0'

  • shpc upgrade does not allow these argument combinations, to maintain clarity in command execution:

      shpc upgrade quay.io/biocontainers/samtools -- all
      shpc upgrade quay.io/biocontainers/samtools --all --dry-run

Finally, this upgrade functionality has been tested locally, including both manual tests and unit tests.

* Create upgrade.py

* Milestone 1.1

* Fixed a bug

* fixing bug

* fixing bug

* another bug fix

* Milestone 1.1.2

* fixing bug 1.1.2

* fixing bugs 1.1.2

* fixing bugs

* Trying capture stdout

* fixing bugs

* fixing bugs

* milestone 1.1.3

* fix

* fixx

* another fix

* and another fix

* Milestone 1.2

* Milestone 1.3

* minor changes

* Ensuring proper control flow of upgrade

* Made uninstall function to return boolean values inorder to fix a control flow bug in upgrade.py

* Using functions for cleaner code

* Using functions for cleaner code 2

* upgrade all feature

* fixing bug

* fixing bug

* upgrade preview

* enhancing upgrade all

* allowing user to upgrade without uninstalling old versions

* typo

* changing the logic of checking for the latest version since user can now keep older versions

* Adding logic to check if user has modules installed first before upgrade

* Replacing most print statements with logger

* Making sure user does not include a version when doing shpc upgrade

* fixing bug

* Handling argument combinations

* user-friendly message for preview

* changing a preview message

* Modfied warning message in main

* Renaming refactoring the source code

* refactoring invalid argument combination error messages

* Allowing shpc upgrade software --dry-run

* making logger info message clearer

* Allowing shpc upgrade --all --dry-run

* removing wrong comment

* making logger info clearer

* fixing typo in help

* fixing a comment

* rename refactoring

* fixing unknown recipe error

* changing error message for unknown recipe

* fixing error message bug and cleaning code

* Making shpc upgrade software -d output message clearer

* removing redudant code after changing the above output message

* Removing redundant arguements, adding installation of latest version to views, fixing print messages

* fixing error message

* removing shpc upgrade --dry-run since its similar to shpc upgrade --all --dry-run

* making help message clearer

* Creating test files for upgrade

* refactoring inner functions within upgrade to become outer functions. For reusability outside upgrade.py

* Tests for upgrading single software

* fixing bug

* fixing bug

* fixing bug by adding force as a parameter to upgrade function

* debugging

* Testing

* fixing bug

* fixing bug

* testing

* debugging

* debugging

* testing

* testing

* debug statements

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* testing

* tests for upgrade

* fixing bug

* removing temporary file

* removing commented out code

* Removing Upgrade's --dry-run and adding it to the shpc command loop for --dry-run

* Fixing issue to allow it successfully run the test

* changing dry_run variable to dryrun. To keep it consistent with the rest of the code

* Making incomplete command error message consistent even when user has no software installed

* removing redundant code

* Implementing Matthieu's suggested changes

* fixing minor bug

* fixing parameter and argument name (dryrun)

* removing redundant code from the test

* Making function "return" changes for consistency

* fixing shpc linting issue

* Apply automatic formatting with black and fixing SHPC linting issue

* fixing lintint issue

* adding upgrade tests to main test file and removing temporary test files
Changing back to the default header for new file
@vsoch
Copy link
Member

vsoch commented Sep 30, 2024

Thanks @Ausbeth ! This introduces a confusing point for the user - having "update" and "upgrade" (that have different interactions). What exactly is upgrade doing that update is not, and did you think about a way to consolidate the need?

It might help to start at the beginning and tell me the problem you needed to solve.

@Ausbeth
Copy link
Author

Ausbeth commented Oct 1, 2024

Hi @vsoch, currently in SHPC, a user who wants to install the latest version of one of their installed software must manually perform the entire version-checking and installation process. This is rather time consuming as the user may potentially have to perform three commands to achieve their aim: shpc show quay.io/biocontainers/samtools – to view the latest tag, shpc list quay.io/biocontainers/samtools – to ensure the latest is installed, shpc install quay.io/biocontainers/samtools:latest – to install the latest. Incidentally, the user could attempt to install the latest version by doing shpc install quay.io/biocontainers/samtools, however, this could potentially reinstall the software version because the latest may already be installed.

The new upgrade feature aimed to solve this problem by automating the process. It checks the container.yaml recipe for the latest version of software the user has installed. If there is a new “latest” and the user does not have it installed, shpc upgrade will automatically install that version for the user, else, it just gives a helpful message to the user to reassure them that their software is up to date and no upgrade is required. Additionally, if an actual upgrade occurs, the user is given the option to either uninstall or preserve older versions of the upgraded software. shpc upgrade adds a layer of user convenience and ensures users seamlessly install the latest versions of software with minimal effort.

This is different from shpc update. “Update” is concerned with updating container recipes by fetching new metadata for container images from the registry and updating the configuration files accordingly. It does not interact with the software the user has installed. “Upgrade”, on the other hand, complements “Update” by directly interacting with the installed software on the user’s system, ensuring they have the latest versions, based on their latest tags in the container.yaml, installed.

@vsoch
Copy link
Member

vsoch commented Oct 1, 2024

I don't think I'm convinced by needing this addition, especially if (as you say) the same can be accomplished with an shpc install. This would add a lot of confusion for not a lot of new functionality. If this is a problem:

however, this could potentially reinstall the software version because the latest may already be installed.

That is probably what you want to work on - if install is doing a reinstall when latest is already installed, then it should not.

@muffato
Copy link
Contributor

muffato commented Oct 1, 2024

Hi @vsoch . The use case is still the same as in #501 (comment) . Following up on

I figured most folks would install the latest and call it a day, and pull once in a while for new containers or versions.

For people using the remote registry, the container.yaml files are automatically updated in the background (yeah github actions !). shpc upgrade can replace software with the latest version. Users don't need to think about versions and cleaning up the old installations. It just keeps everything up to date. Also, for convenience, new versions are directly added onto the views of the versions they replace.

In our case, we keep a local copy of the registry with only the software and versions we want installed, but the principle remains the same. At regular intervals we update the "latest" tags in all container.yaml and we'd use shpc upgrade to replace software with the latest versions.

Under the hood, yes, shpc/client/upgrade.py is just wrapping the existing functions shpc list, shpc view list, shpc uninstall, shpc install, shpc view install. I would rather see the functionality properly, safely, implemented and tested within shpc rather than letting users / admins write that up in a shell script.

@vsoch
Copy link
Member

vsoch commented Oct 1, 2024

Thanks for the reference! I think that discussion was still oriented around the (current) update comment. For this bit:

For people using the remote registry, the container.yaml files are automatically updated in the background (yeah github actions !). shpc upgrade can replace software with the latest version. Users don't need to think about versions and cleaning up the old installations. It just keeps everything up to date. Also, for convenience, new versions are directly added onto the views of the versions they replace.

I like the idea, but what I don't like is having update and upgrade. Ubuntu / debian has that with apt and it's generally confusing. I'd be open to other proposals to get that same functionality. This would warrant something more simple like:

shpc install --upgrade

Which scopes it under install (where I think it should be, since it's essentially doing install for new versions?) and explicitly says "install the latest (upgrade) for my modules. I'm open to other ideas too - I just don't like having update and upgrade that makes the library confusing.

@muffato
Copy link
Contributor

muffato commented Oct 2, 2024

Is this the interface you're proposing ?

shpc install software[:version]              # Install the software, either the latest version or the one specified
shpc install software --upgrade [--dry-run]  # if the latest version is not installed, install it and uninstall all previous ones - preserving the views
shpc install --all    --upgrade [--dry-run]  # Like a loop on `shpc install software --upgrade` for all installed software

Since you're suggesting expanding the install function, we should add that @Ausbeth is also finalising a reinstall command that reinstalls all installed software. This is useful when having to regenerate modules/wrapper scripts because the template changed. The current interface is:

shpc reinstall software:version
shpc reinstall software
shpc reinstall --all

@vsoch
Copy link
Member

vsoch commented Oct 2, 2024

Let's start with just the install set - I'm not sure about the latter. For this one:

shpc install software --upgrade [--dry-run] 

Why would we need to have --upgrade if the install (without it) would install the latest? I understand the need to have concise commands to do multi-step actions, but I don't want to add new functionality without thinking it through first.

@muffato
Copy link
Contributor

muffato commented Oct 2, 2024

shpc install software 1) doesn't uninstall older versions, 2) doesn't replicate the views of previously installed versions

@muffato
Copy link
Contributor

muffato commented Oct 2, 2024

The analogy is homebrew, which offers:

  brew list [FORMULA|CASK...]
  brew install FORMULA|CASK...
  brew reinstall FORMULA|CASK...
  brew uninstall FORMULA|CASK...
  brew update
  brew upgrade [FORMULA|CASK...]

shpc already has list [software], install software, uninstall software and update. We're seeking to add upgrade [software] and reinstall [software].

@vsoch
Copy link
Member

vsoch commented Oct 2, 2024

hmm :/ @marcodelapierre what do you think?

@marcodelapierre
Copy link
Contributor

Hi there :)

I have read the thread, and I like this landing point:

shpc install software --upgrade [--dry-run]  # if the latest version is not installed, install it and uninstall all previous ones - preserving the views
shpc install --all    --upgrade [--dry-run]  # Like a loop on `shpc install software --upgrade` for all installed software

I see the value in providing an automated way to check for SHASUM updates of given versions and install them, for one or better many installed images.
Side note: in general, this may also be useful also for non-latest versions, in those (rare) cases where the SHASUM is updated for the same tag, due to critical issues with the original build. Right?

I like how --upgrade it is just a flag to install, rather than a new sub-command, to keep the key interface cleaner.

Questions:

  1. If the old image is uninstalled, then views should always be updated right? So no dedicated flag for this option is needed. (if I understand correctly this detail of the functionality)
  2. Is there a simple way to undo this operation? Suppose the commands is run by accident, or the outcome is undesired. This can be particularly critical for the --all case. Or how can we safeguard this scenario?

Finally, I would consider also having --reinstall (maybe --rerun, or --force/-f?) as a flag for install. It's verbose and repetitive, but I am not sure we need an extra subcommand for this operation.

@marcodelapierre
Copy link
Contributor

One more thought.

If the registry in use is already updated (as it should -- better to have the registry sync separate, right?), how are upgrade and reinstall differentiated?

Ultimately, isn't it the same functionality?

There could be a "module only"-like flag to just re-generate module and wrapper scripts.

@muffato
Copy link
Contributor

muffato commented Oct 8, 2024

If the old image is uninstalled, then views should always be updated right? So no dedicated flag for this option is needed.

The new image will be added to the same views from which the old images were uninstalled. Indeed, this is transparent for the user and shouldn't need any flag.

Is there a simple way to undo this operation? Suppose the commands is run by accident, or the outcome is undesired. This can be particularly critical for the --all case. Or how can we safeguard this scenario?

First, there's the --dry-run option to show what the command would do. Then, unless --force is given, shpc still prompts the user for all the actions, just like uninstall and install do.

If the registry in use is already updated (as it should -- better to have the registry sync separate, right?), how are upgrade and reinstall differentiated?
Ultimately, isn't it the same functionality?

They're different in our current implementations:

  • If the software is already the latest version: reinstall uninstalls+installs it again, upgrade does nothing.
  • If the software is in a previous version: reinstall uninstalls+installs it again (still the old version), upgrade uninstalls it and installs the latest version instead.

@vsoch
Copy link
Member

vsoch commented Oct 8, 2024

Sounds like we have a path forward! Let's start with:

shpc install software --upgrade [--dry-run]

@Ausbeth
Copy link
Author

Ausbeth commented Oct 9, 2024

Thanks for your suggestions, @vsoch, @muffato, and @marcodelapierre. Before going ahead, I have some undying questions I would like to address.

shpc install software --upgrade [--dry-run] # if the latest version is not installed, install it and uninstall all previous ones - preserving the views

  • The uninstallation of previous versions and the replication of the views (adding the latest version to the views of the previous versions) are still actions that prompt the user before taking effect right? Which is necessary for safety reasons, but assuming the user doesn't want the prompt and just wants to force the actions, the user includes the --force tag. However, shpc install software --force already exists, and from its description, it "replaces existing symlinks". But this actually does nothing because force is not implemented in the install function. So, do we ignore the initial intentions for shpc install software --force and make shpc install software --upgrade --force focus solely on ignoring the prompts for uninstalling previous versions and replicating the views? And if so, what should actually happen when --force is used without --upgrade ?

  • shpc install software:version installs that specific version, but what happens if the user does shpc install software:version --upgrade. Should this raise an error? For "upgrade", shpc upgrade software:version raises an error because the goal was to check the software itself and install the latest version available, rather than upgrading a specific version, which the user is made aware of in the help description. My concern is if the user wonders why they are able to use shpc install software:version as a recipe for install but suddenly can't use it when they include the --upgrade flag.

@vsoch
Copy link
Member

vsoch commented Oct 9, 2024

  • If force is not used in install (outside of the view), I would do the implementation without force for now, and then, come back to do work just on force - adding/remove where we think makes sense.
  • Upgrade should not be allowed with a version.

@Ausbeth
Copy link
Author

Ausbeth commented Oct 23, 2024

@vsoch @muffato @marcodelapierre My apologies for the slow progress, I had a hectic schedule these past two weeks. However, I have completed:

  shpc install software --upgrade [--dry-run]

)


def upgrade(name, cli, args, dryrun=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (to me) in how it's designed here looks like it better belongs in an associated helper script. It's not a core part of shpc, it's written as a kind of helper function that uses the client.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vsoch, thank you for the response. Could you please shed more light on this? I assume you're saying upgrade() should be defined in a different script rather than in install.py?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't think I'm convinced this belongs in core shpc. The implementation here reflects that - it's a big block of code that uses shpc and would better serve as a supplementary helper script.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'm not disagreeing with your suggestion, I just wanted to understand your thought process more in order to know the next steps.
From your suggestion, where should the upgrade function be placed, just so we can all be on the same page @vsoch

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

@Ausbeth Ausbeth Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thanks, will work on that.

Should I also add def get_latest_version(name, config), which I added to install.py for this upgrade feature in the new subdirectory? @vsoch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @vsoch . If the functionality moves to a separate script outside of shpc install, then no need to name the option "install --upgrade", right ? We can change the name back to "shpc-upgrade.py".

@Ausbeth : compared to where things were in sanger-tol#3, you would need to:

Probably easier to start from 081dc2d rather than the current state of the branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants