
# entropix

Entropy Based Sampling and Parallel CoT Decoding

The goal is to replicate o1-style CoT with open source models.

Currently supported models: llama3.1+

Future supported models: DeepSeekV2+ and Mistral Large (123B)
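To illustrate the core idea, here is a minimal sketch of entropy-based sampling: measure the entropy of the model's next-token distribution and let it steer the decoding strategy. This is not the repo's actual implementation; the threshold values and function names below are hypothetical.

```python
import torch

def entropy_of_logits(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy (in nats) of the softmax distribution over logits."""
    log_probs = torch.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1)

def entropy_guided_sample(logits: torch.Tensor,
                          low: float = 0.5,
                          temperature: float = 0.7) -> torch.Tensor:
    """Hypothetical decision rule (thresholds are made up):
    - low entropy  -> model is confident: take the argmax token
    - otherwise    -> fall back to ordinary temperature sampling
    A fuller version could branch multiple continuations (parallel
    CoT decoding) when entropy is very high.
    """
    ent = entropy_of_logits(logits)
    if ent < low:
        return torch.argmax(logits, dim=-1)
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(-1)
```

A uniform distribution over `n` tokens has entropy `log(n)`, the maximum, while a sharply peaked distribution has entropy near zero; the decoder exploits that gap.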

## Getting Started

Install poetry:

```bash
curl -sSL https://install.python-poetry.org | python3 -
```

Install rust to build tiktoken:

```bash
curl --proto '=https' --tlsv1.3 https://sh.rustup.rs -sSf | sh
```

Install dependencies:

```bash
poetry install
```

Download weights (Base and Instruct):

```bash
poetry run python download_weights.py --model-id meta-llama/Llama-3.2-1B --out-dir weights/1B-Base
poetry run python download_weights.py --model-id meta-llama/Llama-3.2-1B-Instruct --out-dir weights/1B-Instruct
```

Download `tokenizer.model` from Hugging Face (or wherever) into the `entropix` folder.

Run it:

```bash
PYTHONPATH=. poetry run python entropix/main.py
```

NOTE: If you're using only the torch parts, you can `export XLA_PYTHON_CLIENT_PREALLOCATE=false` to prevent jax from doing jax things and hogging your VRAM.