A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.

rkinas/triton-resources

Triton

Official Documentation

My daily challenge (Triton day by day)

This project is a step-by-step learning journey where we implement various types of Triton kernels—from the simplest examples to more advanced applications—while exploring GPU programming with Triton. The goal of this repository is to help you (and others) get comfortable with Triton by:

  • Starting simple: begin with basic kernels such as vector addition, and understand the building blocks of writing GPU code with Triton.
  • Incremental learning: each day introduces a new challenge, progressively covering more complex topics, techniques, and optimizations.
  • Hands-on experience: code, test, and benchmark your kernels against standard implementations (e.g., PyTorch) to see performance improvements and better understand GPU behavior.
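
To make the "starting simple" step concrete, here is a minimal sketch of a vector-addition kernel in Triton. It is an illustration, not the code from the daily challenges: the names `add_kernel`, `add`, and `num_programs`, the block size of 1024, and the vector length are all assumptions chosen for the example. The kernel part runs only when PyTorch, Triton, and a CUDA-capable GPU are available; otherwise only the grid-size helper is defined.

```python
# Sketch only: the kernel needs PyTorch, Triton, and a CUDA-capable GPU.
try:
    import torch
    import triton
    import triton.language as tl
    HAVE_GPU = torch.cuda.is_available()
except (ImportError, OSError):
    HAVE_GPU = False


def num_programs(n_elements, block_size):
    """Ceiling division: how many program instances are needed to cover n_elements."""
    return (n_elements + block_size - 1) // block_size


if HAVE_GPU:

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        # Each program instance handles one contiguous block of the vectors.
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        # The last block may run past the end of the data, so mask those lanes off.
        mask = offsets < n_elements
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x, y, block_size=1024):
        """Hypothetical wrapper: allocate the output and launch a 1D grid."""
        out = torch.empty_like(x)
        n = out.numel()
        grid = (num_programs(n, block_size),)
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=block_size)
        return out

    # Check the result against the PyTorch implementation.
    x = torch.rand(98432, device="cuda")
    y = torch.rand(98432, device="cuda")
    assert torch.allclose(add(x, y), x + y)
```

The vector length is deliberately not a multiple of the block size, so the final program instance exercises the mask.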

  • Daily challenges: every day, a new challenge is posted in this repository. Each challenge focuses on a specific aspect of Triton, such as:
      • Basic operations (e.g., vector addition)
      • Memory management and optimizations
      • Advanced indexing and dynamic shapes
      • Multi-dimensional kernels
      • Reduction operations and more
  • Detailed explanations: each kernel comes with an in-depth explanation of the code, helping you understand the concepts behind the implementation.
  • Benchmarking and stress tests: learn how to measure performance by comparing custom Triton kernels with standard PyTorch implementations. Get hands-on experience with benchmarking on real-world GPU workloads.

| Day | Kernel | Description |
|---|---|---|
| #1 | Constant add | The first puzzle in the Daily Triton Challenge series. The goal is to write a Triton kernel that adds a constant value to each element of a vector. |
| #2 | Add two vectors | A simple example that adds two vectors with a custom GPU kernel written in Triton and compares the result with a standard PyTorch implementation. |
| #3 | Add two vectors with speed benchmarking | Almost the same as #2, but we measure kernel execution speed and compare it with the PyTorch implementation. |
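
As a rough sketch of the kind of measurement Day #3 describes, the snippet below times an operation with `triton.testing.do_bench` and converts the result to effective memory bandwidth. The helper `effective_bandwidth_gbps` and the vector size are assumptions made for this example, and the timing is treated as milliseconds. Only the PyTorch baseline is timed here; a Triton wrapper can be passed to `do_bench` as a callable in the same way. The timing part runs only when PyTorch, Triton, and a CUDA-capable GPU are available.

```python
def effective_bandwidth_gbps(n_elements, bytes_per_element, ms):
    """GB/s moved by a vector add: two reads and one write per element."""
    total_bytes = 3 * n_elements * bytes_per_element
    seconds = ms * 1e-3
    return total_bytes / seconds / 1e9


# Sketch only: the timing below needs PyTorch, Triton, and a CUDA-capable GPU.
try:
    import torch
    import triton
    HAVE_GPU = torch.cuda.is_available()
except (ImportError, OSError):
    HAVE_GPU = False

if HAVE_GPU:
    n = 1 << 24  # illustrative size, large enough to be bandwidth-bound
    x = torch.rand(n, device="cuda")
    y = torch.rand(n, device="cuda")

    # do_bench takes care of warm-up and repeated runs; the result is
    # treated as a time in milliseconds.
    ms = triton.testing.do_bench(lambda: x + y)
    gbps = effective_bandwidth_gbps(n, x.element_size(), ms)
    print(f"torch add: {ms:.3f} ms, {gbps:.1f} GB/s")
```

Reporting bandwidth rather than raw time makes results comparable across vector sizes, since element-wise addition is limited by memory traffic rather than arithmetic.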

Articles

Gain deeper insights into Triton through these detailed articles:

Research Papers

Explore the academic foundation of Triton:

Videos

Learn by watching these informative videos:

Triton community meetup

Watch the Triton community meetups to stay up to date with recent Triton topics.

Triton-Puzzles

Challenge yourself with these engaging puzzles:

Tools

Enhance your Triton development workflow with these tools:

Conferences

Catch up on the latest advancements from Triton Conferences:

Sample Kernels

Explore practical implementations with these sample kernels:

Triton integrations

Triton backends

Triton communities


Triton Kernel Index

| Kernel | Description | Resource |
|---|---|---|
| VectorAdd | A simple kernel that performs element-wise addition of two vectors. Useful for understanding the basics of GPU programming in Triton. | 1 2 |
| Matmul | An optimized kernel for matrix multiplication, achieving high performance by leveraging memory hierarchy and parallelism. | 1 2 Grouped GEMM |
| Softmax | A kernel for efficient computation of the softmax function, commonly used in machine learning models like transformers. | 1 2 3 |
| Dropout | A kernel for implementing low-memory dropout, a regularization technique to prevent overfitting in neural networks. | 1 2 |
| Layer Normalization | A kernel for layer normalization, which normalizes activations within a layer to improve training stability in deep learning models. | 1 2 3 |
| Fused Attention | A kernel that efficiently implements attention mechanisms by combining multiple operations, key to transformers and similar architectures. | 1 2 |
| Conv1d | A kernel for 1D convolution, often used in processing sequential data like time series or audio signals. | 1 |
| Conv2d | A kernel for 2D convolution, a fundamental operation in computer vision tasks such as image classification or object detection. | 1 |
| MultiheadAttention | A kernel for multi-head attention, a crucial component in transformer-based models for capturing complex relationships in data. | 1 |
| Hardsigmoid | A kernel for the Hardsigmoid activation function, an efficient approximation of the sigmoid function used in certain neural network layers. | 1 |
| GeLU | GeLU | 1 |
| GeGLU | GeGLU | 1 |
| RMSNorm | RMSNorm | 1 |

Triton updates, news, new features

Contribution

Feel free to contribute more resources or suggest updates by opening a pull request or issue in this repository.


License

This resource list is open-sourced under the MIT license.
