TG Llama CCL Functional Burndown #16632

Open
10 tasks
SeanNijjar opened this issue Jan 10, 2025 · 0 comments

Comments

SeanNijjar (Contributor) commented Jan 10, 2025

Prerequisites

Integration: Stage 1

The current plan is to have these mostly handled by @caixunshiren; the work breakdown is TBD pending discussion between @avoraTT, @kpaigwar, and @caixunshiren.

  • Integrate async CCL into TG Llama with reduce-scatter and all-gather. Where all-reduce is expected to be called, implement it with back-to-back calls to reduce-scatter + all-gather (see the sketch after this list)
  • Implement the all-reduce-async op interface (reuse the implementation from the older all-reduce V1, but replace the underlying operations with async variants)
    • not a blocker for functional bringup, but a blocker for how the model would ideally be implemented; it keeps the Llama codebase unified
    • All-reduce-async from composite ops
      • reduce_scatter + all-gather
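
For reference, a minimal sketch of the Stage 1 composite: all-reduce expressed as a reduce-scatter followed by an all-gather. The parameter names (`dim`, `math_op`) and the `ttnn.ReduceType.Sum` enum used below are assumptions for illustration and may not match the actual async CCL signatures.

```python
import ttnn


def composite_all_reduce(input_tensor, dim):
    """All-reduce built from reduce-scatter + all-gather (Stage 1 plan).

    NOTE: argument names (dim, math_op) and the ReduceType enum are
    assumptions, not confirmed ttnn async CCL signatures.
    """
    # Reduce-scatter: each device ends up holding one reduced shard along `dim`.
    scattered = ttnn.reduce_scatter(input_tensor, dim=dim, math_op=ttnn.ReduceType.Sum)
    # All-gather: every device reassembles the full reduced tensor from the shards.
    return ttnn.all_gather(scattered, dim=dim)
```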

Integration: Stage 2

  • Implement all-reduce
    • TBD
  • Migrate from the many all-reduce implementations to ttnn.experimental.all_reduce_async (see the sketch below)
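
A hedged sketch of the Stage 2 migration: the composite reduce-scatter + all-gather pair is replaced by the fused op named in this issue. The call signature of ttnn.experimental.all_reduce_async shown here is assumed, not confirmed.

```python
import ttnn


def all_reduce(x, dim):
    # Stage 1 (composite): reduce-scatter followed by all-gather.
    # return ttnn.all_gather(ttnn.reduce_scatter(x, dim=dim, math_op=ttnn.ReduceType.Sum), dim=dim)

    # Stage 2 (fused): single async op; argument names below are assumptions.
    return ttnn.experimental.all_reduce_async(x, dim=dim, math_op=ttnn.ReduceType.Sum)
```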