LLaMA 2 from scratch

A LLaMA 2 model implemented from scratch.

This is not a production-ready or complete implementation. The code exists for self-education: to understand and explore the principles of the LLaMA 2 model.

Based on the excellent tutorials by Umar Jamil:

LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU
Coding LLaMA 2 from scratch in PyTorch - KV Cache, Grouped Query Attention, Rotary PE, RMSNorm

Many thanks for the excellent tutorials; all credit goes to @hkproj.
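
As a small illustration of one concept named in the tutorial titles, RMS normalization can be sketched in a few lines. This is a minimal pure-Python sketch, not the code from this repository: the function name `rms_norm`, the `eps` default, and the list-based inputs are assumptions chosen for readability (a real implementation would operate on PyTorch tensors).

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Hypothetical helper: divide x by its root mean square, then apply a per-element gain."""
    # Mean of the squared activations over the feature dimension.
    mean_sq = sum(v * v for v in x) / len(x)
    # eps keeps the division stable when all activations are close to zero.
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


# With unit gains, the output has a root mean square of approximately 1.
normed = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
```

Unlike layer normalization, this sketch does not subtract the mean; it only rescales, which is the property the RMS Norm name refers to.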

Getting started

# Create your virtual env
python3 -m venv venv

# Activate the venv
source venv/bin/activate

# Install required packages
pip install -r requirements.txt

# Visit the Meta Llama website: https://www.llama.com/llama-downloads/

# Go through the wizard and get the link.

# Download the weights (tested with the 7B model)
cd ./weights && ./download.sh

# Run the LLaMA model
# Note: use -m to switch between interactive (i) and batch (b) mode
python3 -m llama2

Repository: ammarik/llama2_from_scratch