Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Liger-kernel] Add an option to use _apply_liger_kernel_to_instance() to load model #133

Merged
merged 9 commits into from
Jan 30, 2025

Conversation

hongpeng-guo
Copy link
Contributor

@hongpeng-guo hongpeng-guo commented Jan 26, 2025

Summary

This PR enables to use Liger Kernel's _apply_liger_kernel_to_instance to init a fsdp worker model.

Main Changes

  1. Adding an option of using liger_kernel.transformers.AutoLigerKernelForCausalLM to load a model from pretained, instead of the default transformers.AutoModelForCausalLM
  2. Added a test case using configuration file tests/e2e/run_qwen_gsm8k_model_rm_liger_kernel.sh

Related Issue

#96

TODO

#97 optimize the memory usage when computing entropy & log_probs

output = self.actor_module(input_ids=input_ids_rmpad,
attention_mask=None,
position_ids=position_ids_rmpad,
use_cache=False) # prevent model thinks we are generating
logits_rmpad = output.logits.squeeze(0) # (total_nnz, vocab_size)
logits_rmpad.div_(temperature)
# compute entropy
entropy_rmpad = self.compute_entropy_from_logits(logits_rmpad) # ((total_nnz / sp) + pad)
# if use_sp: ((total_nnz / sp) + pad) ; if not use_sp: (batch, seqlen)
log_probs = logprobs_from_logits(logits=logits_rmpad, labels=input_ids_rmpad_rolled)

@hongpeng-guo hongpeng-guo marked this pull request as draft January 26, 2025 02:14
@hongpeng-guo hongpeng-guo marked this pull request as ready for review January 26, 2025 02:22
Signed-off-by: Hongpeng Guo <[email protected]>
Signed-off-by: Hongpeng Guo <[email protected]>
Signed-off-by: Hongpeng Guo <[email protected]>
Signed-off-by: Hongpeng Guo <[email protected]>
@hongpeng-guo hongpeng-guo changed the title [Liger-kernel] Add an option to use AutoLigerKernelForCausalLM to load model [Liger-kernel] Add an option to use _apply_liger_kernel_to_instance() to load model Jan 27, 2025
_apply_liger_kernel_to_instance(model=actor_module)
except ImportError:
# Fallback to use AutoModelForCausalLM and print warning message
logger.warning("Liger kernel was requested but not installed - falling back to AutoModelForCausalLM")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I would prefer for the job to fail outright if liger is requested but not installed. Just printing a warning is too easy to miss in all the job outputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your proposal. I think we can remove the try ... catch ... logic and add liger_kernel a required dependency of verl. There are still a few extra PRs in Liger kernel targeting for the full integration. Once those are done, I think it makes sense to add liger-kernel to the requirements.txt cc @vermouth1992 @eric-haibin-lin

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's a good idea to make liger-kernel a required dependency and remove try except.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove try except and run the CI again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

@@ -0,0 +1,52 @@
set -x
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add this test to CI and make sure liger-kernel is toggled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, added this test to .github/workflows/e2e_gsm8k.yml

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess liger-kernel has to be added to setup.py and project.toml

actor_rollout_ref.model.path=Qwen/Qwen2.5-0.5B \
actor_rollout_ref.actor.optim.lr=1e-6 \
actor_rollout_ref.model.use_remove_padding=True \
+actor_rollout_ref.model.use_liger=True \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flag for use_liger is here.

Signed-off-by: Hongpeng Guo <[email protected]>
Copy link
Contributor Author

@hongpeng-guo hongpeng-guo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A relevant PR for the fused kernel of ce_loss and entropy as described in TODO can be referenced here: linkedin/Liger-Kernel#551

Signed-off-by: Hongpeng Guo <[email protected]>
@vermouth1992
Copy link
Collaborator

Please run the formatter bash script/format.sh

@hongpeng-guo
Copy link
Contributor Author

Please run the formatter bash script/format.sh

Got it! Will handle the comments soon. One suggestion is to have a Contributor's Guid section on the readme, or in the doc, i.e., https://docs.sglang.ai/references/contribution_guide.html The contributors would know the checks to run before pushing a PR. Setting up some pre-commit hooks will be better :)

@vermouth1992
Copy link
Collaborator

Please run the formatter bash script/format.sh

Got it! Will handle the comments soon. One suggestion is to have a Contributor's Guid section on the readme, or in the doc, i.e., https://docs.sglang.ai/references/contribution_guide.html The contributors would know the checks to run before pushing a PR. Setting up some pre-commit hooks will be better :)

Added a code formatting instruction in the README. Will add more later

@vermouth1992 vermouth1992 merged commit dd41877 into volcengine:main Jan 30, 2025
10 checks passed
@hongpeng-guo hongpeng-guo deleted the hpguo/add_liger_kernel branch January 30, 2025 09:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants