Bindings

Precondition

Build the libchatllm target before using any binding:

Windows:

The steps below assume MSVC is used.

  1. Build target libchatllm:

    cmake --build build --config Release --target libchatllm
  2. Copy libchatllm.dll, libchatllm.lib and ggml.dll to the bindings directory.

Linux/MacOS:

  1. Build target libchatllm:

    cmake --build build --target libchatllm

Python

Command line

Run chatllm.py with exactly the same command line options as the native executable.

For example,

  • Linux: python3 chatllm.py -i -m path/to/model

  • Windows: python chatllm.py -i -m path/to/model

If OSError: exception: access violation reading 0x0000000000000000 occurred, try:

Web demo

There is also a chatbot powered by Streamlit.

To start it:

streamlit run chatllm_st.py -- -i -m path/to/model

Note: the "STOP" function is not implemented yet.

OpenAI Compatible API

openai_api.py is a server that provides an OpenAI-compatible API. Note that most of the parameters are ignored. With it, one can start two servers, one for chatting and one for code completion (a base model supporting fill-in-the-middle is required), and set up a fully functional local copilot in Visual Studio Code with the help of tools like twinny.

openai_api.py takes three arguments specifying the models for chatting, code completion, and text embedding, respectively. For example, to use the instruction-tuned DeepSeekCoder for chatting and its base model for code completion:

python openai_api.py path/to/deepseekcoder-1.3b.bin /path/to/deepseekcoder-1.3b-base.bin

Additional arguments for each model can be specified too. For example:

python openai_api.py path/to/chat/model /path/to/fim/model * --temp 0 --top_k 2 --- --temp 0.8

Here, --temp 0 --top_k 2 are passed to the chatting model, while --temp 0.8 is passed to the code completion model.
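To illustrate the separator convention in the example above, here is a minimal sketch that splits a flat argument list on --- into one group per model. It is not taken from openai_api.py: the helper name split_model_args is hypothetical, and the real script may parse its arguments differently.

```python
from typing import List

SEPARATOR = "---"  # separator between per-model argument groups, as in the example above


def split_model_args(args: List[str]) -> List[List[str]]:
    """Split a flat argument list into groups delimited by SEPARATOR.

    Hypothetical helper for illustration only; not part of openai_api.py.
    """
    groups: List[List[str]] = [[]]
    for arg in args:
        if arg == SEPARATOR:
            groups.append([])  # start collecting arguments for the next model
        else:
            groups[-1].append(arg)
    return groups


if __name__ == "__main__":
    chat_args, fim_args = split_model_args(
        ["--temp", "0", "--top_k", "2", "---", "--temp", "0.8"]
    )
    print(chat_args)  # ['--temp', '0', '--top_k', '2']
    print(fim_args)   # ['--temp', '0.8']
```

Each resulting group could then be handed to the corresponding model as its own option list.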

openai_api.py uses the model name and the API path to choose between the chatting and code completion models: when the model name starts or ends with fim, or the API path ends with /generate, the code completion model is selected; otherwise, the chatting model is selected. Here is a reference configuration in twinny:

Note that openai_api.py has been tested to be compatible with the litellm provider.
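The model-selection rule described above can be sketched as a small function. This is only an illustration of the rule as stated in this document, not the actual code in openai_api.py; the function name select_model_kind, the return values "completion" and "chat", and the sample model names and paths are all hypothetical.

```python
def select_model_kind(model_name: str, api_path: str) -> str:
    """Return "completion" or "chat" following the rule described above.

    Hypothetical sketch; the matching is case-sensitive here, which is
    an assumption rather than documented behavior.
    """
    name_matches = model_name.startswith("fim") or model_name.endswith("fim")
    path_matches = api_path.endswith("/generate")
    return "completion" if (name_matches or path_matches) else "chat"


if __name__ == "__main__":
    # Sample names and paths below are made up for demonstration.
    print(select_model_kind("fim-model", "/v1/chat/completions"))  # completion
    print(select_model_kind("some-model", "/api/generate"))        # completion
    print(select_model_kind("some-model", "/v1/chat/completions")) # chat
```

Under this rule, a client only needs to pick a model name such as one beginning with fim to be routed to the code completion model, regardless of the API path it uses.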

Some models that can be used for code completion:

JavaScript/TypeScript

Command line

Run chatllm.ts with exactly the same command line options using Bun:

bun run chatllm.ts -i -m path/to/model

WARNING: Bun looks buggy on Linux.

Other Languages

libchatllm can be used from any language that can call into dynamic libraries.
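As a rough illustration of what "calling into a dynamic library" involves, the sketch below picks the library file name for the current platform and loads it with Python's ctypes. The names libchatllm.dll and libchatllm.so come from the build steps in this document; libchatllm.dylib for macOS is an assumption, as are the helper names library_file_name and load_libchatllm. No libchatllm functions are called here, since their names and signatures are not described in this document.

```python
import ctypes
import os
import platform


def library_file_name(system: str) -> str:
    """Map a platform.system() value to the libchatllm file name."""
    if system == "Windows":
        return "libchatllm.dll"
    if system == "Darwin":
        return "libchatllm.dylib"  # assumption: not stated in this document
    return "libchatllm.so"


def load_libchatllm(directory: str = ".") -> ctypes.CDLL:
    """Load the shared library from `directory` (hypothetical helper).

    Raises OSError if the library or one of its dependencies is missing.
    """
    path = os.path.join(os.path.abspath(directory), library_file_name(platform.system()))
    return ctypes.CDLL(path)


if __name__ == "__main__":
    # Only report the expected file name; loading requires a built library.
    print(library_file_name(platform.system()))
```

Other languages follow the same pattern with their own foreign function interface: locate the platform-specific library file, load it, then declare and call its exported functions.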

C

  • Linux:

    1. Build bindings/main.c:

      export LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH
      gcc main.c libchatllm.so
    2. Test a.out with exactly the same command line options.

  • Windows:

    1. Build bindings\main.c:

      cl main.c libchatllm.lib
    2. Test main.exe with exactly the same command line options.

Pascal (Delphi/FPC)

A Pascal binding is also available.

Examples:

Nim

Examples:

  • main.nim, which highlights code snippets.

    Build:

    nim c -d:Release -d:ssl main.nim
    

Others