entropix

Entropy Based Sampling and Parallel CoT Decoding

The goal is to replicate o1-style CoT with open-source models

Currently supported models: Llama 3.1+

Future supported models: DeepSeekV2+, Mistral Large (123B)
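
To illustrate the core idea, here is a minimal sketch (not this repo's actual implementation; the threshold and temperature values are made-up placeholders). Entropy-based sampling measures the entropy of the model's next-token distribution and switches strategy accordingly: decode greedily while the model is confident, and sample when it is uncertain, which is also the natural point to fork parallel CoT branches:

```python
import torch
import torch.nn.functional as F

def entropy_guided_sample(logits: torch.Tensor,
                          entropy_threshold: float = 2.5,  # hypothetical value
                          temperature: float = 0.7) -> torch.Tensor:
    """Pick the next token greedily when the model is confident,
    sample with temperature when the distribution is high-entropy.

    `logits` is the (vocab_size,) tensor for the next-token position.
    """
    probs = F.softmax(logits, dim=-1)
    # Shannon entropy of the next-token distribution, in nats.
    entropy = -(probs * torch.log(probs + 1e-10)).sum()
    if entropy < entropy_threshold:
        # Low entropy: the model is confident, take the argmax.
        return torch.argmax(logits)
    # High entropy: the model is uncertain, so sample to explore.
    # In parallel CoT decoding, this is where you would fork
    # several continuations and decode them in parallel.
    scaled_probs = F.softmax(logits / temperature, dim=-1)
    return torch.multinomial(scaled_probs, num_samples=1).squeeze()
```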

Getting Started

install poetry

curl -sSL https://install.python-poetry.org | python3 -

install rust to build tiktoken

curl --proto '=https' --tlsv1.3 https://sh.rustup.rs -sSf | sh

install dependencies

poetry install

download weights (Base and Instruct)

poetry run python download_weights.py --model-id meta-llama/Llama-3.2-1B --out-dir weights/1B-Base
poetry run python download_weights.py --model-id meta-llama/Llama-3.2-1B-Instruct --out-dir weights/1B-Instruct

download tokenizer.model from Hugging Face (or wherever) into the entropix folder

run it

 PYTHONPATH=. poetry run python entropix/main.py

NOTE: If you're using only the torch parts, you can export XLA_PYTHON_CLIENT_PREALLOCATE=false to prevent JAX from doing JAX things and hogging your VRAM.
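
If you'd rather set this from Python, a minimal sketch (the variable only takes effect if set before JAX is first imported):

```python
import os

# Equivalent to `export XLA_PYTHON_CLIENT_PREALLOCATE=false`;
# must run before the first `import jax`.
os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"
```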
