
How much energy do GenAI models consume?

Python 41 4 Updated Oct 16, 2024

Asterinas is a secure, fast, and general-purpose OS kernel, written in Rust and providing Linux-compatible ABI.

Rust 2,729 160 Updated Mar 3, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,975 231 Updated Mar 4, 2025

Large Language Model (LLM) Systems Paper List

794 30 Updated Feb 27, 2025

Reverse Engineering: Decompiling Binary Code with Large Language Models

Python 5,200 351 Updated Oct 28, 2024

CPU inference for the DeepSeek family of large language models in pure C++

C++ 263 23 Updated Feb 11, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 2,269 236 Updated Mar 4, 2025

Curated collection of papers in MoE model inference

85 4 Updated Feb 19, 2025

JAX bindings for the flash-attention3 kernels

C++ 11 1 Updated Aug 6, 2024

Kernel-Bypass LibOS Architecture

Rust 1,097 125 Updated Mar 4, 2025

Fast and memory-efficient exact attention

Python 16,078 1,522 Updated Mar 4, 2025

Custom Linux scheduler for concurrency fuzzing written in Java with hello-ebpf

Java 24 1 Updated Feb 13, 2025

FlagGems is an operator library for large language models implemented in Triton Language.

Python 434 69 Updated Mar 4, 2025

Fused SwiGLU Triton kernels

Python 4 Updated Jan 25, 2024

My learning notes/codes for ML SYS.

Python 1,243 64 Updated Mar 4, 2025

Perceptual video quality assessment based on multi-method fusion.

Python 4,811 768 Updated Feb 12, 2025

FFMPEG Assembly Language Lessons

2,818 77 Updated Mar 3, 2025

How to become a self-consistent programmer

Shell 1,964 91 Updated Feb 28, 2025

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 755 47 Updated Mar 4, 2025

Low-bit LLM inference on CPU with lookup table

C++ 690 53 Updated Jan 9, 2025

Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing

Jupyter Notebook 31 2 Updated Jan 8, 2025

GitHub page for "Large Language Model-Brained GUI Agents: A Survey"

CSS 122 6 Updated Mar 1, 2025

Python 14 2 Updated Jan 13, 2025

Include binary files in C/C++

C 1,028 98 Updated Jul 12, 2024

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 41,370 5,587 Updated Mar 2, 2025

Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O

C++ 265 27 Updated Jan 15, 2025

Port of OpenAI's Whisper model in C/C++

C++ 38,231 3,981 Updated Mar 4, 2025

AlphaFold 3 inference pipeline.

Python 6,159 761 Updated Mar 4, 2025

Building blocks for foundation models.

455 19 Updated Jan 3, 2024

A visualized debugging framework to aid in understanding the Linux kernel.

C 108 7 Updated Mar 4, 2025