
Commit

style(project): format all project
tpoisonooo committed Jan 12, 2024
1 parent a60f966 commit ad5b78c
Showing 26 changed files with 763 additions and 570 deletions.
1 change: 0 additions & 1 deletion .github/workflows/lint.yml
@@ -15,4 +15,3 @@ jobs:
run: |
python .github/scripts/doc_link_checker.py --target README_en.md
python .github/scripts/doc_link_checker.py --target README.md
60 changes: 60 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,60 @@
repos:
  - repo: https://github.com/PyCQA/flake8
    rev: 4.0.1
    hooks:
      - id: flake8
        exclude: ^(__init__.py)$
        args: ["--max-line-length=79", "--exclude=service/__init__.py", "--exclude=tests/*"]
  - repo: https://github.com/PyCQA/isort
    rev: 5.11.5
    hooks:
      - id: isort
  - repo: https://github.com/pre-commit/mirrors-yapf
    rev: v0.32.0
    hooks:
      - id: yapf
        name: yapf
        description: 'Formatter for Python code'
        entry: yapf
        language: python
        args: ['-i', '--style={based_on_style: pep8, column_limit: 79}']

  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.2.0
    hooks:
      - id: trailing-whitespace
      - id: check-yaml
      - id: end-of-file-fixer
      - id: requirements-txt-fixer
      - id: double-quote-string-fixer
      - id: check-merge-conflict
      - id: fix-encoding-pragma
        args: ["--remove"]
      - id: mixed-line-ending
        args: ["--fix=lf"]
  - repo: https://github.com/executablebooks/mdformat
    rev: 0.7.9
    hooks:
      - id: mdformat
        args: ["--number"]
        additional_dependencies:
          - mdformat-openmmlab
          - mdformat_frontmatter
          - linkify-it-py
  - repo: https://github.com/codespell-project/codespell
    rev: v2.1.0
    hooks:
      - id: codespell
        args: ["--skip=third_party/*,*.ipynb,*.proto"]

  - repo: https://github.com/myint/docformatter
    rev: v1.4
    hooks:
      - id: docformatter
        args: ["--in-place", "--wrap-descriptions", "79"]

  - repo: https://github.com/open-mmlab/pre-commit-hooks
    rev: v0.4.1
    hooks:
      - id: check-copyright
        args: ["service"]
194 changes: 102 additions & 92 deletions README.md
@@ -9,6 +9,7 @@
</div>

"HuixiangDou" is a domain-specific knowledge assistant based on the LLM. Features:

1. Deal with complex scenarios like group chats, answer user questions without causing message flooding.
2. Propose an algorithm pipeline for answering technical questions.
3. Low deployment cost, only need the LLM model to meet 4 traits can answer most of the user's questions, see [technical report](./resource/HuixiangDou.pdf).
@@ -19,11 +20,11 @@ View [HuixiangDou inside](./huixiangdou-inside.md).

The following are the hardware requirements for running HuixiangDou. It is suggested to follow this document, starting with the basic version and gradually trying the advanced features.

| Version | GPU Memory Requirements | Features | Tested on Linux |
| :-: | :-: | :-: | :-: |
| Basic Version | 20GB | Answer basic domain knowledge questions, zero cost | ![](https://img.shields.io/badge/3090%2024G-passed-blue?style=for-the-badge) |
| Advanced Version | 40GB | Answer source code level questions, zero cost | ![](https://img.shields.io/badge/A100%2080G-passed-blue?style=for-the-badge) |
| Modified Version | 4GB | Using openai API, operation involves cost | ![](https://img.shields.io/badge/1660ti%206G-passed-blue?style=for-the-badge) |
| Version | GPU Memory Requirements | Features | Tested on Linux |
| :--------------: | :---------------------: | :------------------------------------------------: | :---------------------------------------------------------------------------: |
| Basic Version | 20GB | Answer basic domain knowledge questions, zero cost | ![](https://img.shields.io/badge/3090%2024G-passed-blue?style=for-the-badge) |
| Advanced Version | 40GB | Answer source code level questions, zero cost | ![](https://img.shields.io/badge/A100%2080G-passed-blue?style=for-the-badge) |
| Modified Version | 4GB | Using openai API, operation involves cost | ![](https://img.shields.io/badge/1660ti%206G-passed-blue?style=for-the-badge) |

# 🔥 Run

@@ -76,34 +77,37 @@ Please ensure that the GPU memory is over 20GB (such as 3090 or above). If the m

The first run will automatically download the configuration of internlm2-7B.

* **Non-docker users**. If you **don't** use a docker environment, you can start all services at once.
```shell
# standalone
python3 main.py workdir --standalone
..
ErrorCode.SUCCESS, Could you please advise if there is any good optimization method for video stream detection flickering caused by frame skipping?
1. Frame rate control and frame skipping strategy are key to optimizing video stream detection performance, but you need to pay attention to the impact of frame skipping on detection results.
2. Multithreading processing and caching mechanism can improve detection efficiency, but you need to pay attention to the stability of detection results.
3. The use of sliding window method can reduce the impact of frame skipping and caching on detection results.
```
* **Docker users**. If you are using docker, HuixiangDou's Hybrid LLM Service needs to be deployed separately.
```shell
# Start LLM service
python3 service/llm_server_hybrid.py
```
Open a new terminal, configure the host IP in `config.ini`, run
```shell
# config.ini
[llm]
..
client_url = "http://10.140.24.142:8888/inference" # example
python3 main.py workdir
```
## STEP3. Integrate into Feishu [Optional]
- **Non-docker users**. If you **don't** use a docker environment, you can start all services at once.

```shell
# standalone
python3 main.py --standalone
..
ErrorCode.SUCCESS, Could you please advise if there is any good optimization method for video stream detection flickering caused by frame skipping?
1. Frame rate control and frame skipping strategy are key to optimizing video stream detection performance, but you need to pay attention to the impact of frame skipping on detection results.
2. Multithreading processing and caching mechanism can improve detection efficiency, but you need to pay attention to the stability of detection results.
3. The use of sliding window method can reduce the impact of frame skipping and caching on detection results.
```
- **Docker users**. If you are using docker, HuixiangDou's Hybrid LLM Service needs to be deployed separately.
```shell
# Start LLM service
python3 service/llm_server_hybrid.py
```
Open a new terminal, configure the host IP in `config.ini`, and run the following (a hedged connectivity check is sketched after this list):
```shell
# config.ini
[llm]
..
client_url = "http://10.140.24.142:8888/inference" # example
python3 main.py
```
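Before running `main.py`, docker users can check that the Hybrid LLM Service is reachable at the `client_url` configured above. The request body below is only an assumption for illustration; check `service/llm_server_hybrid.py` for the actual request schema.

```shell
# hypothetical smoke test; the JSON field "prompt" is assumed, not taken from the repo
curl -X POST http://10.140.24.142:8888/inference \
  -H "Content-Type: application/json" \
  -d '{"prompt": "hello"}'
```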
## STEP3. Integrate into Feishu \[Optional\]
Click [Create a Feishu Custom Robot](https://open.feishu.cn/document/client-docs/bot-v3/add-custom-bot) to get the WEBHOOK_URL callback, and fill it into config.ini.
@@ -116,114 +120,120 @@ webhook_url = "${YOUR-LARK-WEBHOOK-URL}"
```
Run the command below. After it finishes, the technical assistant's reply will be sent to the Feishu group chat.
```shell
python3 main.py workdir
python3 main.py
```
<img src="./resource/figures/lark-example.png" width="400">
If you still need to read Feishu group messages, see [Feishu Developer Square - Add Application Capabilities - Robots](https://open.feishu.cn/app?lang=zh-CN).
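To sanity-check the webhook itself, a Feishu custom robot normally accepts a plain text message; the sketch below uses the standard custom-bot payload and is not HuixiangDou-specific.

```shell
# minimal sketch: push a text message to the Feishu custom robot webhook
# replace ${YOUR-LARK-WEBHOOK-URL} with the callback URL you filled into config.ini
curl -X POST "${YOUR-LARK-WEBHOOK-URL}" \
  -H "Content-Type: application/json" \
  -d '{"msg_type": "text", "content": {"text": "HuixiangDou webhook test"}}'
```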
## STEP4. Advanced Version [Optional]
## STEP4. Advanced Version \[Optional\]
The basic version may not perform well. You can enable these features to enhance performance. The more features you turn on, the better.
1. Use higher accuracy local LLM
Adjust the `llm.local` model in config.ini to `internlm2-20B`.
This option has a significant effect, but requires more GPU memory.
Adjust the `llm.local` model in config.ini to `internlm2-20B`.
This option has a significant effect, but requires more GPU memory.
2. Hybrid LLM Service
For LLM services that support the openai interface, HuixiangDou can utilize its Long Context ability.
Using [kimi](https://platform.moonshot.cn/) as an example, below is an example of `config.ini` configuration:
```shell
# config.ini
[llm.server]
..
# open https://platform.moonshot.cn/
remote_type = "kimi"
remote_api_key = "YOUR-KIMI-API-KEY"
remote_llm_max_text_length = 128000
remote_llm_model = "moonshot-v1-128k"
```
We also support the ChatGPT API. Note that this feature will increase response time and operating costs.
For LLM services that support the openai interface, HuixiangDou can utilize its Long Context ability.
Using [kimi](https://platform.moonshot.cn/) as an example, below is an example of `config.ini` configuration:
```shell
# config.ini
[llm.server]
..
# open https://platform.moonshot.cn/
remote_type = "kimi"
remote_api_key = "YOUR-KIMI-API-KEY"
remote_llm_max_text_length = 128000
remote_llm_model = "moonshot-v1-128k"
```
We also support the ChatGPT API. Note that this feature will increase response time and operating costs (a hedged request sketch follows after this list).
3. Repo search enhancement
This feature is suitable for handling difficult questions and requires basic development capabilities to adjust the prompt.
This feature is suitable for handling difficult questions and requires basic development capabilities to adjust the prompt.
* Click [sourcegraph-account-access](https://sourcegraph.com/users/tpoisonooo/settings/tokens) to get a token
- Click [sourcegraph-account-access](https://sourcegraph.com/users/tpoisonooo/settings/tokens) to get a token
```shell
# open https://github.com/sourcegraph/src-cli#installation
curl -L https://sourcegraph.com/.api/src-cli/src_linux_amd64 -o /usr/local/bin/src && chmod +x /usr/local/bin/src
```shell
# open https://github.com/sourcegraph/src-cli#installation
curl -L https://sourcegraph.com/.api/src-cli/src_linux_amd64 -o /usr/local/bin/src && chmod +x /usr/local/bin/src

# Fill the token into config.ini
[sg_search]
..
src_access_token = "${YOUR_ACCESS_TOKEN}"
```
# Fill the token into config.ini
[sg_search]
..
src_access_token = "${YOUR_ACCESS_TOKEN}"
```
* Edit the name and introduction of the repo; we take opencompass as an example
- Edit the name and introduction of the repo; we take opencompass as an example
```shell
# config.ini
# add your repo here, we just take opencompass and lmdeploy as example
[sg_search.opencompass]
github_repo_id = "open-compass/opencompass"
introduction = "Used for evaluating large language models (LLM) .."
```
```shell
# config.ini
# add your repo here, we just take opencompass and lmdeploy as example
[sg_search.opencompass]
github_repo_id = "open-compass/opencompass"
introduction = "Used for evaluating large language models (LLM) .."
```
* Use `python3 -m service.sg_search` for a unit test; the returned content should include opencompass source code and documentation
- Use `python3 -m service.sg_search` for a unit test; the returned content should include opencompass source code and documentation
```shell
python3 service/sg_search.py
..
"filepath": "opencompass/datasets/longbench/longbench_trivia_qa.py",
"content": "from datasets import Dataset..
```
```shell
python3 service/sg_search.py
..
"filepath": "opencompass/datasets/longbench/longbench_trivia_qa.py",
"content": "from datasets import Dataset..
```
Run `main.py`; HuixiangDou will enable search enhancement when appropriate.
Run `main.py`; HuixiangDou will enable search enhancement when appropriate.
4. Tune Parameters
It is often unavoidable to adjust parameters with respect to business scenarios.
* Refer to [data.json](./tests/data.json) to add real data, run [test_intention_prompt.py](./tests/test_intention_prompt.py) to get suitable prompts and thresholds, and update them into [worker](./service/worker.py).
* Adjust the [number of search results](./service/worker.py) based on the maximum length supported by the model.
It is often unavoidable to adjust parameters with respect to business scenarios.
- Refer to [data.json](./tests/data.json) to add real data, run [test_intention_prompt.py](./tests/test_intention_prompt.py) to get suitable prompts and thresholds, and update them into [worker](./service/worker.py).
- Adjust the [number of search results](./service/worker.py) based on the maximum length supported by the model.
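As mentioned in step 2 above, remote services such as kimi expose an openai-style interface. The sketch below assumes Moonshot's public OpenAI-compatible endpoint and the `moonshot-v1-128k` model from the config example; it only illustrates the request shape and is not code from this repository.

```shell
# hedged sketch of an openai-compatible chat request; endpoint assumed to be Moonshot's
curl https://api.moonshot.cn/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR-KIMI-API-KEY" \
  -d '{"model": "moonshot-v1-128k", "messages": [{"role": "user", "content": "hello"}]}'
```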
# 🛠️ FAQ
1. How to access other IMs?
* WeChat. For Enterprise WeChat, see the [Enterprise WeChat Application Development Guide](https://developer.work.weixin.qq.com/document/path/90594); for personal WeChat, we have confirmed with the WeChat team that there is currently no API, so you will need to research it yourself.
* DingTalk. Refer to [DingTalk Open Platform - Custom Robot Access](https://open.dingtalk.com/document/robots/custom-robot-access)
- WeChat. For Enterprise WeChat, see the [Enterprise WeChat Application Development Guide](https://developer.work.weixin.qq.com/document/path/90594); for personal WeChat, we have confirmed with the WeChat team that there is currently no API, so you will need to research it yourself.
- DingTalk. Refer to [DingTalk Open Platform - Custom Robot Access](https://open.dingtalk.com/document/robots/custom-robot-access)
2. What if the robot is too cold/too chatty?
* Fill in the questions that should be answered in the real scenario into `resource/good_questions.json`, and fill the ones that should be rejected into `resource/bad_questions.json`.
* Adjust the theme content in `repodir` to ensure that the markdown documents in the main library do not contain irrelevant content.
- Fill in the questions that should be answered in the real scenario into `resource/good_questions.json`, and fill the ones that should be rejected into `resource/bad_questions.json`.
- Adjust the theme content in `repodir` to ensure that the markdown documents in the main library do not contain irrelevant content.
Re-run `service/feature_store.py` to update thresholds and feature libraries.
Re-run `service/feature_store.py` to update thresholds and feature libraries.
3. Launch is normal, but out of memory during runtime?
Long-text LLM inference based on the native transformers implementation requires more memory. In that case, apply kv cache quantization to the model, for example following the [lmdeploy quantization description](https://github.com/InternLM/lmdeploy/blob/main/docs/en/kv_int8.md). Then use docker to deploy the Hybrid LLM Service independently.
Long-text LLM inference based on the native transformers implementation requires more memory. In that case, apply kv cache quantization to the model, for example following the [lmdeploy quantization description](https://github.com/InternLM/lmdeploy/blob/main/docs/en/kv_int8.md). Then use docker to deploy the Hybrid LLM Service independently.
4. How to access other local LLM / After access, the effect is not ideal?
* Open [hybrid llm service](./service/llm_server_hybrid.py), add a new LLM inference implementation.
* Refer to [test_intention_prompt and test data](./tests/test_intention_prompt.py), adjust prompt and threshold for the new model, and update them into [worker.py](./service/worker.py).
- Open [hybrid llm service](./service/llm_server_hybrid.py), add a new LLM inference implementation.
- Refer to [test_intention_prompt and test data](./tests/test_intention_prompt.py), adjust prompt and threshold for the new model, and update them into [worker.py](./service/worker.py).
5. What if the response is too slow or requests always fail?
* Refer to [hybrid llm service](./service/llm_server_hybrid.py) to add exponential backoff and retransmission.
* Replace local LLM with an inference framework such as [lmdeploy](https://github.com/internlm/lmdeploy), instead of the native huggingface/transformers.
- Refer to [hybrid llm service](./service/llm_server_hybrid.py) to add exponential backoff and retransmission.
- Replace local LLM with an inference framework such as [lmdeploy](https://github.com/internlm/lmdeploy), instead of the native huggingface/transformers.
6. What if the GPU memory is too low?
In that case the local LLM cannot run; only a remote LLM combined with text2vec can execute the pipeline. Please make sure that `config.ini` only uses the remote LLM and that the local LLM is turned off.
In that case the local LLM cannot run; only a remote LLM combined with text2vec can execute the pipeline. Please make sure that `config.ini` only uses the remote LLM and that the local LLM is turned off (a hedged config sketch follows after this list).
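For question 6, a remote-only setup might look like the sketch below; the key names are assumptions made for illustration, so verify them against the `[llm]` section of your own `config.ini`.

```shell
# config.ini (sketch; key names assumed, check your config.ini)
[llm]
enable_local = 0
enable_remote = 1
```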
# 📝 Citation
```shell
@misc{2023HuixiangDou,
title={HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance},