rand-fly

Highlights

  • Pro

Organizations

@Infinideastudio


An alternative Windows context menu.

C++ 1,490 23 Updated Mar 7, 2025

Inference for RWKV v5, v6, and v7 with the Qualcomm AI Engine Direct SDK

C++ 54 4 Updated Mar 7, 2025

💥 Blazing fast terminal file manager written in Rust, based on async I/O.

Rust 22,850 500 Updated Mar 7, 2025

Fast OS-level support for GPU checkpoint and restore

C++ 165 15 Updated Mar 4, 2025

Open-source Framework for HPCA2024 paper: Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators

C++ 72 12 Updated Feb 1, 2025

How to optimize some algorithms in CUDA.

Cuda 1,956 174 Updated Mar 5, 2025

Fast Multimodal LLM on Mobile Devices

C++ 731 87 Updated Mar 3, 2025

Puzzles for learning Triton, play it with minimal environment configuration!

Python 253 24 Updated Dec 3, 2024

Puzzles for learning Triton

Jupyter Notebook 1,466 108 Updated Nov 18, 2024

A Chinese (Simplified) Translation Project for the Create: Astral modpack.

JavaScript 31 11 Updated Dec 21, 2024

General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for…

C++ 2,104 161 Updated Mar 6, 2025

A tool for bandwidth measurements on NVIDIA GPUs.

C++ 383 32 Updated Feb 7, 2025

A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.

Python 27,609 2,125 Updated Mar 7, 2025
JavaScript 1 Updated Feb 16, 2025

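A pure C++ cross-platform LLM acceleration library, callable from Python. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models, and runs smoothly on mobile phones.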

C++ 3,410 349 Updated Mar 7, 2025

Low-bit LLM inference on CPU with lookup table

C++ 691 54 Updated Jan 9, 2025

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

C++ 1,482 101 Updated Feb 20, 2025

LLM inference in C/C++

C++ 76,036 10,996 Updated Mar 7, 2025

A Pascal to C/RISC-V compiler based on YACC

C++ 5 Updated Aug 30, 2024

Code for Diversity-Enhanced Learning for Instruction Adaptation in Large Language Models

Python 8 Updated Aug 31, 2024

Linux-capable out-of-order superscalar multicore LoongArch32 (LA32 / LA32R) processor.

SystemVerilog 18 2 Updated Aug 9, 2024

Dynamic Memory Management for Serving LLMs without PagedAttention

C 301 23 Updated Feb 20, 2025

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 14,237 1,166 Updated May 23, 2024

A static web page listing summer camp application deadlines (DDL).

Vue 152 4 Updated Mar 7, 2025

Open-source training data and evaluation tools used in Token-Efficient Leverage Learning

Python 9 Updated Apr 12, 2024

My personal vim/neovim configuration files, dotfiles, docs and other scripts.

Vim Script 13 Updated Mar 7, 2025

A tool to decode RISC-V, LoongArch, and MIPS instructions in gtkwave

C++ 29 7 Updated Apr 8, 2024

A tool to decode RISC-V and LoongArch instructions in gtkwave

C++ 5 Updated Mar 23, 2024
JavaScript 78 79 Updated Feb 2, 2025