2024 Year End TODOS (Dec 20 - Dec 31) #62

Open · 12 of 20 tasks
alexzhang13 (Collaborator) opened this issue Dec 21, 2024 · 0 comments

I'm consolidating TODOs for current members and new contributors to get a sense of what needs to be finished for this project. This is based on our last weekly meeting -- if people have time, please add your name to issues and own them! It'll help things move a lot more quickly.

Old discussion about leaderboard schema to catch people up to speed: #48

At a high level, we have the GitHub runners working for leaderboard submission / creation for PyTorch / Triton kernels. There are definitely some rough edges that need to be patched up and things I've missed, so please feel free to add anything else you think is important. Our goal is to have an entire working leaderboard submission / creation pipeline by the end of the year, which means we can create a softmax leaderboard with specified GPUs and have Popcorn contributors test it out and give feedback.

Urgent TODOs

  • Implement leaderboard submission for Modal runners @msaroufim
  • Update the eval code / reference code to support our new system per the last meeting. This means the eval code on our side controls the warmups and the runtime timer, while the reference code should only contain the data generator (returning a list of Tensors) and the reference kernel; see the sketches after this list. @alexzhang13
  • Update leaderboard schema so each leaderboard has scores for every GPU type. @b9r5
  • Update submission and creation logic and slash commands to support the new schema above. @alexzhang13
  • Update the README with examples for leaderboard creation / submission with the slash commands. @alexzhang13
  • Add support for .cu / non-.py submissions @S1ro1 @msaroufim @alexzhang13
  • Make sure we all agree on the parameters / evaluation scripts chosen for evaluating code (e.g. number of warmup steps, number of timed runs, etc.); a possible timing loop is sketched below.
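
For context, here is a rough sketch of what the reference code could look like under this split. Only generate_input and its List[torch.Tensor] return type come from this issue; the file layout, the ref_kernel name, and the tensor shapes are illustrative assumptions.

```python
# Hypothetical reference-code file for a softmax leaderboard.
# generate_input and its List[torch.Tensor] return type come from this issue;
# everything else here is illustrative.
from typing import List

import torch


def generate_input() -> List[torch.Tensor]:
    """Produce the inputs that both the reference and submitted kernels receive."""
    return [torch.randn(1024, 1024, device="cuda")]


def ref_kernel(inputs: List[torch.Tensor]) -> torch.Tensor:
    """Reference implementation the submission is checked against."""
    return torch.softmax(inputs[0], dim=-1)
```

And a minimal sketch of the eval-side timing loop that owns the warmups and the runtime timer; the warmup / timed-run counts and the use of CUDA events are assumptions, not the project's chosen settings.

```python
# Hypothetical eval-harness timing loop; parameter values are placeholders.
import torch


def time_kernel(kernel, inputs, warmup_steps: int = 10, timed_runs: int = 100) -> float:
    # Warmup iterations are not timed.
    for _ in range(warmup_steps):
        kernel(inputs)
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(timed_runs):
        kernel(inputs)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / timed_runs  # mean runtime in milliseconds
```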

These would be useful to add in parallel. I'm not sure when / how they will be used, but they may prove useful in the future. Some of these are worth discussing; others are worth implementing on a branch.

  • Support peak-memory profiling (?) EDIT: Do not implement for now, S1ro1
  • Allow the leaderboard schema to support arbitrary "scores" (e.g. runtime, peak-memory usage, etc.) that the problem creator can eventually define; see the sketch after this list.
  • Support use of ncu or other profilers for users to test / profile their code.
  • Add support for the PyTorch memory profiler.
  • UI: after /leaderboard submit github train.py, a popup comes up with SELECT LEADERBOARD, which runs a SQL query and gives a dropdown (with a search feature) of the active leaderboards; then another popup comes up with SELECT GPU TYPE, with a similar feature but scoped to the selected leaderboard. This is similar to the popup we have for leaderboard creation. @alexzhang13
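
As a concrete illustration of the flexible-score idea above, a submission record could carry a creator-defined set of named metrics per GPU type. The field names and values below are assumptions, not the actual schema from #48.

```python
# Illustrative per-GPU, multi-metric score record; field names are assumptions.
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SubmissionScore:
    leaderboard: str                      # e.g. "softmax"
    gpu_type: str                         # e.g. "A100"
    user: str
    scores: Dict[str, float] = field(default_factory=dict)  # creator-defined metrics


entry = SubmissionScore(
    leaderboard="softmax",
    gpu_type="A100",
    user="someuser",
    scores={"runtime_ms": 0.42, "peak_memory_mb": 128.0},
)
```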

Beginner-friendly TODOs / warmups / still helpful

  • Change leaderboard name from "problem" to "leaderboard" @b9r5
  • Try to break the leaderboard creation / submission commands and check that the error messages make sense. Some examples I know of are below: @RizzwareEngineer
    • In the eval code, add type verifiers to ensure that the function signatures in both the reference code and the user-submitted code are as expected (e.g. generate_input should return a List[torch.Tensor]); see the sketch after this list.
    • If the leaderboard creator creates a problematic script (e.g. imports are wrong in the eval code), throw an error and do not create the leaderboard.
    • If the leaderboard submission immediately throws an error (e.g. imports are wrong), report it back to the submitter with a clear error message.
  • Change /leaderboard list to include what GPU types are available for each specific leaderboard.
  • Change /leaderboard show {name} to include what GPU types are available for each specific leaderboard.
  • Provide any feedback on the submission (especially) or creation slash-command workflows! e.g. more args, fewer args, too slow, etc. Please play with the bot in the #bot-test channel!
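
A minimal sketch of what the signature check in the first nested bullet above could look like, assuming the harness imports the reference / submission file as a module; the helper name and error messages are illustrative.

```python
# Hypothetical signature check for generate_input; assumes the harness can
# import the submitted file as a module.
import typing
from typing import List

import torch


def check_generate_input(module) -> None:
    fn = getattr(module, "generate_input", None)
    if fn is None:
        raise ValueError("file must define a generate_input() function")
    # Check the annotation if one is present (List[torch.Tensor] per this issue).
    ret = typing.get_type_hints(fn).get("return")
    if ret is not None and typing.get_origin(ret) is not list:
        raise TypeError("generate_input should be annotated to return List[torch.Tensor]")
    # Check the runtime value as well.
    out = fn()
    if not isinstance(out, list) or not all(isinstance(t, torch.Tensor) for t in out):
        raise TypeError("generate_input() must return a List[torch.Tensor]")
```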