diff --git a/README.md b/README.md
index 0ca2e3812c858..941878bd4b98c 100644
--- a/README.md
+++ b/README.md
@@ -61,6 +61,11 @@ TODO
 Torrent: `magnet:?xt=urn:btih:f3cf71b172129d6b5abccab393bc32253fac8159&dn=ggml-alpaca-13b-q4.bin&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce&tr=udp%3A%2F%https://t.co/zenhelfwRd%3A6969%2Fannounce&tr=https%3A%2F%https://t.co/zenhelfwRd%3A443%2Fannounce&tr=udp%3A%2F%https://t.co/RRAn1X65wE%3A6969%2Fannounce&tr=udp%3A%2F%https://t.co/uTXBeTLUMa%3A2810%2Fannounce`
+
+```
+./chat -m ggml-alpaca-13b-q4.bin
+```
+
 ## Credit
 
 This combines [Facebook's LLaMA](https://github.com/facebookresearch/llama), [Stanford Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html), [alpaca-lora](https://github.com/tloen/alpaca-lora) and [corresponding weights](https://huggingface.co/tloen/alpaca-lora-7b/tree/main) by Eric Wang (which uses [Jason Phang's implementation of LLaMA](https://github.com/huggingface/transformers/pull/21955) on top of Hugging Face Transformers), and [llama.cpp](https://github.com/ggerganov/llama.cpp) by Georgi Gerganov. The chat implementation is based on Matvey Soloviev's [Interactive Mode](https://github.com/ggerganov/llama.cpp/pull/61) for llama.cpp.
 
 Inspired by [Simon Willison's](https://til.simonwillison.net/llms/llama-7b-m2) getting started guide for LLaMA.
diff --git a/chat.cpp b/chat.cpp
index 5b8ead727362f..d142c2e3a533c 100644
--- a/chat.cpp
+++ b/chat.cpp
@@ -798,6 +798,7 @@ int main(int argc, char ** argv) {
     params.temp = 0.1f;
     params.top_p = 0.95f;
+    params.n_ctx = 2048;
     params.interactive = true;
     params.interactive_start = true;
 #if !defined(_WIN32)