Pull requests: sgl-project/sglang
[ROCm] Enable Fused MLA Triton kernel for DeepSeekV3
#3237
opened Jan 31, 2025 by
lcskrishna
Draft
docs(references/deepseek): current section links
#3225
opened Jan 31, 2025 by
guspan-tanadi
4 tasks
Add support for nvidia modelopt fp8 kv cache
#3223
opened Jan 30, 2025 by
Edwardf0t1
1 of 4 tasks
Online serving benchmarks of real datasets for hierarchical KV caching
#3211
opened Jan 30, 2025 by
PanJason
5 tasks
Fix min_p sampling crash when using flashinfer backend
flashinfer
#3207
opened Jan 29, 2025 by
zifeitong
3 of 4 tasks
Add a Doc about guide on nvidia jetson #3182
documentation
#3205
opened Jan 29, 2025 by
lycanlancelot
2 of 4 tasks
[Feature] Define backends and add Triton backend for Lora
#3161
opened Jan 27, 2025 by
Fridge003
4 tasks done
[MOE] Try to optimize moe align block size multiblocks cuda kernel
#3137
opened Jan 26, 2025 by
yiakwy-xpu-ml-framework-team
Draft
8 tasks
fix: Fix deprecated max_tokens param in openai ChatCompletionRequest
#3122
opened Jan 25, 2025 by
mickqian
3 of 4 tasks
Split communication logic from computation logic into orchestrator
#3118
opened Jan 25, 2025 by
fzyzcjy
4 tasks
Let DetokenizerManager use TypeBasedDispatcher
#3117
opened Jan 25, 2025 by
fzyzcjy
4 tasks
Extract generation_manager from tokenizer_manager
#3115
opened Jan 25, 2025 by
fzyzcjy
4 tasks