Made in Vancouver, Canada by Picovoice
picoLLM Inference Engine is a highly accurate and cross-platform SDK optimized for running compressed large language models. picoLLM Inference Engine is:
- Accurate; picoLLM Compression improves GPTQ by significant margins.
- Private; LLM inference runs 100% locally.
- Cross-Platform:
  - Linux (x86_64)
  - macOS (x86_64, arm64)
  - Windows (x86_64, arm64)
  - Raspberry Pi (4, 5)
- Runs on CPU and GPU.
- Free for open-weight models.
picoLLM Inference Engine supports the following open-weight models. The models are available for download on Picovoice Console.
- Gemma
  - `gemma-2b`
  - `gemma-2b-it`
  - `gemma-7b`
  - `gemma-7b-it`
- Llama-2
  - `llama-2-7b`
  - `llama-2-7b-chat`
  - `llama-2-13b`
  - `llama-2-13b-chat`
  - `llama-2-70b`
  - `llama-2-70b-chat`
- Llama-3
  - `llama-3-8b`
  - `llama-3-8b-instruct`
  - `llama-3-70b`
  - `llama-3-70b-instruct`
- Llama-3.2
  - `llama3.2-1b-instruct`
  - `llama3.2-3b-instruct`
- Mistral
  - `mistral-7b-v0.1`
  - `mistral-7b-instruct-v0.1`
  - `mistral-7b-instruct-v0.2`
- Mixtral
  - `mixtral-8x7b-v0.1`
  - `mixtral-8x7b-instruct-v0.1`
- Phi-2
  - `phi2`
- Phi-3
  - `phi3`
- Phi-3.5
  - `phi3.5`
AccessKey is your authentication and authorization token for deploying Picovoice SDKs, including picoLLM. Anyone who is using Picovoice needs to have a valid AccessKey. You must keep your AccessKey secret. You will need internet connectivity to validate your AccessKey with Picovoice license servers, even though the LLM inference runs 100% offline and is completely free for open-weight models. Everyone who signs up for Picovoice Console receives a unique AccessKey.
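Because the AccessKey must stay secret, avoid hardcoding it in source files. Below is a minimal C# sketch of reading it from an environment variable instead; the `ACCESS_KEY` variable name is an assumption chosen to match the `${ACCESS_KEY}` placeholders used in the demo commands later in this document:

```csharp
using System;

class AccessKeyExample
{
    static void Main()
    {
        // Read the AccessKey from the environment rather than embedding it in
        // code. "ACCESS_KEY" is an assumed variable name matching the
        // ${ACCESS_KEY} placeholder used by the demo commands below.
        string accessKey = Environment.GetEnvironmentVariable("ACCESS_KEY")
            ?? throw new InvalidOperationException("Set the ACCESS_KEY environment variable.");

        Console.WriteLine($"AccessKey loaded ({accessKey.Length} characters).");
    }
}
```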
There are two demos available: completion and chat. The completion demo accepts a prompt and a set of optional parameters and generates a single completion. It can run all models, whether instruction-tuned or not. The chat demo can run instruction-tuned (chat) models such as `llama-3-8b-instruct`, `phi2`, etc. The chat demo enables a back-and-forth conversation with the LLM, similar to ChatGPT.
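To illustrate the kind of loop the chat demo runs, here is a hedged C# sketch of a back-and-forth conversation. The `PicoLLM`, `PicoLLMDialog`, and `PicoLLMCompletion` names are assumed to follow the picoLLM .NET SDK's documented API; treat this as a sketch of the technique, not the demo's exact implementation:

```csharp
using System;
using Pv;

class ChatSketch
{
    static void Main()
    {
        // Assumed API: PicoLLM.Create and the dialog helpers are taken to
        // mirror the picoLLM .NET SDK; this is a sketch, not the demo source.
        using PicoLLM picollm = PicoLLM.Create(
            accessKey: Environment.GetEnvironmentVariable("ACCESS_KEY"),
            modelPath: "/absolute/path/to/model.pllm");

        // A dialog object tracks the conversation so each turn prompts the
        // model with the full chat history in its expected template.
        PicoLLMDialog dialog = picollm.GetDialog();

        while (true)
        {
            Console.Write("> ");
            string request = Console.ReadLine();
            if (string.IsNullOrEmpty(request))
            {
                break;
            }

            dialog.AddHumanRequest(request);
            PicoLLMCompletion res = picollm.Generate(dialog.Prompt());
            dialog.AddLLMResponse(res.Completion);
            Console.WriteLine(res.Completion);
        }
    }
}
```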
NOTE: File path arguments must be absolute paths. The working directory for the following `dotnet` commands is `picollm/demo/dotnet/PicoLLMDemo`.
Build with the dotnet CLI:

```console
dotnet build -c ChatDemo.Release
dotnet build -c CompletionDemo.Release
```
For both demos, you can use `--help`/`-h` to see the list of input arguments.
To run an instruction-tuned model for chat, run the following in the terminal:

```console
dotnet run -c ChatDemo.Release -- --access_key ${ACCESS_KEY} --model_path ${MODEL_PATH}
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console and `${MODEL_PATH}` with the path to a model file downloaded from Picovoice Console.
To get information about all the available options in the demo, run the following:

```console
dotnet run -c ChatDemo.Release -- --help
```
Run the completion demo by entering the following in the terminal:

```console
dotnet run -c CompletionDemo.Release -- --access_key ${ACCESS_KEY} --model_path ${MODEL_PATH} --prompt ${PROMPT}
```

Replace `${ACCESS_KEY}` with yours obtained from Picovoice Console, `${MODEL_PATH}` with the path to a model file downloaded from Picovoice Console, and `${PROMPT}` with a prompt string.
To get information about all the available options in the demo, run the following:

```console
dotnet run -c CompletionDemo.Release -- --help
```
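For reference, a single completion like the one this demo produces can be generated from C# in a few lines. As in the chat sketch above, the `PicoLLM` and `PicoLLMCompletion` names are assumed to follow the picoLLM .NET SDK's documented API, and the prompt string is only an example; treat this as a sketch:

```csharp
using System;
using Pv;

class CompletionSketch
{
    static void Main()
    {
        // Assumed API: PicoLLM.Create/Generate are taken to mirror the
        // picoLLM .NET SDK; this is a sketch, not the demo source.
        using PicoLLM picollm = PicoLLM.Create(
            accessKey: Environment.GetEnvironmentVariable("ACCESS_KEY"),
            modelPath: "/absolute/path/to/model.pllm");

        // One prompt in, one completion out; no conversation state is kept.
        PicoLLMCompletion res = picollm.Generate("Tell me a joke about llamas.");
        Console.WriteLine(res.Completion);
    }
}
```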