validate use_remove_padding when applying sequence parallelism #153

Merged
merged 7 commits into from
Jan 31, 2025
Changes from 2 commits
9 changes: 9 additions & 0 deletions verl/utils/config.py
@@ -86,4 +86,13 @@ def check_mutually_exclusive(mbs, mbs_per_gpu, name: str):
assert config.critic.ppo_mini_batch_size % config.critic.ppo_micro_batch_size == 0
assert config.critic.ppo_micro_batch_size * sp_size >= n_gpus

# Check if use_remove_padding is enabled when using sequence parallelism
if config.actor_rollout_ref.actor.ulysses_sequence_parallel_size > 1:
Collaborator

It seems that the correct keys are config.actor_rollout_ref.model.use_remove_padding, critic.model.use_remove_padding, and reward_model.model.use_remove_padding.

Contributor Author

Fixed now!

Collaborator

Unfortunately, the key is still not correct :(

Collaborator

@chujiezheng The key should be actor_rollout_ref.model.use_remove_padding. Separate actor and ref keys are not necessary, as they have the same model type.

Contributor Author

@vermouth1992 @PeterSH6 Fixed now 🥹

    assert config.actor_rollout_ref.actor.use_remove_padding, \
        "When using sequence parallelism for actor, you must enable `use_remove_padding`."

if config.actor_rollout_ref.ref.ulysses_sequence_parallel_size > 1:
    assert config.actor_rollout_ref.ref.use_remove_padding, \
        "When using sequence parallelism for ref policy, you must enable `use_remove_padding`."

print("[validate_config] All configuration checks passed successfully!")
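
For reference, here is a minimal sketch of how the check might look with the key the reviewers point to in the thread above (`actor_rollout_ref.model.use_remove_padding`). It is not the merged code: `check_remove_padding_for_sp` and `make_config` are hypothetical names, and `SimpleNamespace` is only a stand-in for the project's real config object.

```python
from types import SimpleNamespace


def check_remove_padding_for_sp(config):
    """Hypothetical helper: require use_remove_padding when Ulysses SP is enabled."""
    arr = config.actor_rollout_ref
    # Per the review thread, actor and ref share one model section,
    # so a single flag under `model` covers both of them.
    sp_enabled = (arr.actor.ulysses_sequence_parallel_size > 1
                  or arr.ref.ulysses_sequence_parallel_size > 1)
    if sp_enabled:
        assert arr.model.use_remove_padding, \
            "When using sequence parallelism for actor/ref policy, you must enable `use_remove_padding`."


def make_config(actor_sp, ref_sp, use_remove_padding):
    """Build a stand-in config for illustration (assumed layout, not the real loader)."""
    return SimpleNamespace(
        actor_rollout_ref=SimpleNamespace(
            model=SimpleNamespace(use_remove_padding=use_remove_padding),
            actor=SimpleNamespace(ulysses_sequence_parallel_size=actor_sp),
            ref=SimpleNamespace(ulysses_sequence_parallel_size=ref_sp),
        )
    )
```

With this layout, `check_remove_padding_for_sp(make_config(2, 1, False))` raises an `AssertionError`, while the same call with `use_remove_padding=True`, or with both sizes set to 1, passes silently.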