Customize NCCL for base container #1123

Open
wants to merge 1 commit into
base: main


DwarKapex
Contributor

Add the ability to build a base container with a custom NCCL version.
Two main scenarios are considered:

  • Default build (JAX_NCCL_VERSION is not defined): JAX_NCCL_VERSION is set to the value of the base image's existing NCCL_VERSION env var. If NCCL_VERSION is not defined in the base image, JAX_NCCL_VERSION is set to the version of the libnccl-dev package if it is installed, or left empty otherwise.
  • Custom build with a user-defined JAX_NCCL_VERSION: libnccl2 and libnccl-dev are installed at the requested version if NCCL is not already present in the base image, or updated to that version if it is.

NCCL is installed if and only if JAX_NCCL_VERSION or NCCL_VERSION is defined.
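
The resolution order described above can be sketched as a small shell helper. This is only an illustration of the stated rules, not the code in this PR: the function names `resolve_nccl_version` and `should_install_nccl` are hypothetical, and the three values are passed in as arguments so the logic can be read in isolation.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the rules in the description; not taken from the PR.

# Pick the effective JAX_NCCL_VERSION.
#   $1: user-supplied JAX_NCCL_VERSION (may be empty)
#   $2: NCCL_VERSION from the base image (may be empty)
#   $3: version of an installed libnccl-dev package (may be empty)
resolve_nccl_version() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"   # custom build: the user's value wins
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"   # default build: inherit the base image's NCCL_VERSION
  else
    printf '%s\n' "$3"   # fall back to libnccl-dev, or stay empty
  fi
}

# Succeed only when JAX_NCCL_VERSION ($1) or NCCL_VERSION ($2) is defined.
should_install_nccl() {
  [ -n "$1" ] || [ -n "$2" ]
}
```

In a Dockerfile `RUN` step, the third argument could plausibly come from something like `dpkg-query -W -f='${Version}' libnccl-dev 2>/dev/null || true`, assuming a Debian-based base image.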

@@ -29,6 +29,18 @@ FROM ${BASE_IMAGE}
ARG GIT_USER_EMAIL
ARG GIT_USER_NAME
ARG CLANG_VERSION
ARG JAX_NCCL_VERSION
Collaborator

Where would a default be specified? I think the situation today is that we want the default nightly/CI behaviour to be "12.6.1 base image with NCCL 2.23 installed on top", so it should be possible to express that.

I know that for the base image, there is special treatment of latest to enable this.

Collaborator

So the "special treatment" is to embed a default in the Dockerfile, and then conditionally override it via an Actions conditional expression:

${{ inputs.BASE_IMAGE != 'latest' && format('BASE_IMAGE={0}', inputs.BASE_IMAGE) || '' }}
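
For readers less familiar with the `cond && a || b` idiom in Actions expressions, the behaviour of the line above can be mirrored in shell. This is a rough sketch: `build_arg_unless_latest` is a hypothetical name, and applying the same pattern to JAX_NCCL_VERSION is an assumption, not something this PR is confirmed to do.

```shell
#!/bin/sh
# Hypothetical helper mirroring the Actions expression above.
#   $1: build-arg name (e.g. BASE_IMAGE)
#   $2: workflow input value
# Prints NAME=VALUE unless the value is the sentinel "latest"; in that case
# it prints nothing, so the default embedded in the Dockerfile's ARG applies.
build_arg_unless_latest() {
  if [ "$2" != "latest" ]; then
    printf '%s=%s\n' "$1" "$2"
  fi
}
```

Calling `build_arg_unless_latest BASE_IMAGE latest` emits nothing, while any other value yields a `BASE_IMAGE=...` line suitable for the build-args list. The same shape would let a default NCCL version live in the Dockerfile and be overridden only when the workflow input is set to something other than the sentinel.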

3 participants