feat(docs): how to add readthedocs (#377)
* feat(docs): how to add readthedocs

* docs(README): update
tpoisonooo authored Aug 28, 2024
1 parent 008f1fa commit 109616c
Showing 8 changed files with 127 additions and 88 deletions.
9 changes: 5 additions & 4 deletions README.md
@@ -51,7 +51,7 @@ If this helps you, please give it a star ⭐

Our Web version has been released to [OpenXLab](https://openxlab.org.cn/apps/detail/tpoisonooo/huixiangdou-web), where you can create knowledge base, update positive and negative examples, turn on web search, test chat, and integrate into Feishu/WeChat groups. See [BiliBili](https://www.bilibili.com/video/BV1S2421N7mn) and [YouTube](https://www.youtube.com/watch?v=ylXrT-Tei-Y) !

- \[2024/08\] `chat_with_repo` [pipeline](./huixiangdou/service/parallel_pipeline.py) 👍
- \[2024/08\] [chat_with_readthedocs](https://huixiangdou.readthedocs.io/en/latest/), see [how to integrate](./docs/zh/doc_add_readthedocs.md) 👍
- \[2024/07\] Image and text retrieval & Removal of `langchain` 👍
- \[2024/07\] [Hybrid Knowledge Graph and Dense Retrieval](./docs/en/doc_knowledge_graph.md) improve 1.7% F1 score 🎯
- \[2024/06\] [Evaluation of chunksize, splitter, and text2vec model](./evaluation) 🎯
@@ -132,8 +132,9 @@ Our Web version has been released to [OpenXLab](https://openxlab.org.cn/apps/det
- WeChat([android](./docs/zh/doc_add_wechat_accessibility.md)/[wkteam](./docs/zh/doc_add_wechat_commercial.md))
- Lark
- [OpenXLab Web](https://openxlab.org.cn/apps/detail/tpoisonooo/huixiangdou-web)
- [Gradio Demo](./huixiangdou/gradio.py)
- [Gradio Demo](./huixiangdou/gradio_ui.py)
- [HTTP Server](./huixiangdou/server.py)
- [Read the Docs](./docs/zh/doc_add_readthedocs.md)

</td>

@@ -227,7 +228,7 @@ python3 -m huixiangdou.main --standalone
💡 Also run a simple Web UI with `gradio`:

```bash
python3 -m huixiangdou.gradio
python3 -m huixiangdou.gradio_ui
```

<video src="https://github.com/user-attachments/assets/9e5dbb30-1dc1-42ad-a7d4-dc7380676554" ></video>
@@ -282,7 +283,7 @@ python3 -m huixiangdou.service.feature_store --config_path config-cpu.ini
# Q&A test
python3 -m huixiangdou.main --standalone --config_path config-cpu.ini
# gradio UI
python3 -m huixiangdou.gradio --config_path config-cpu.ini
python3 -m huixiangdou.gradio_ui --config_path config-cpu.ini
```

If you find the installation too slow, a pre-installed image is provided in [Docker Hub](https://hub.docker.com/repository/docker/tpoisonooo/huixiangdou/tags). Simply replace it when starting the docker.
19 changes: 10 additions & 9 deletions README_zh.md
@@ -7,12 +7,12 @@
<a href="resource/figures/wechat.jpg" target="_blank">
<img alt="Wechat" src="https://img.shields.io/badge/wechat-robot%20inside-brightgreen?logo=wechat&logoColor=white" />
</a>
<!-- <a href="https://huixiangdou.readthedocs.io/zh-cn/latest/" target="_blank">
<img alt="Readthedocs" src="https://img.shields.io/badge/readthedocs-chat%20with%20AI-brightgreen?logo=readthedocs&logoColor=white" />
</a> -->
<a href="https://huixiangdou.readthedocs.io/zh-cn/latest/" target="_blank">
<img alt="Readthedocs" src="https://img.shields.io/badge/readthedocs-black?logo=readthedocs&logoColor=white" />
<img alt="Readthedocs" src="https://img.shields.io/badge/readthedocs-chat%20with%20AI-brightgreen?logo=readthedocs&logoColor=white" />
</a>
<!-- <a href="https://huixiangdou.readthedocs.io/zh-cn/latest/" target="_blank">
<img alt="Readthedocs" src="https://img.shields.io/badge/readthedocs-black?logo=readthedocs&logoColor=white" />
</a> -->
<a href="https://youtu.be/ylXrT-Tei-Y" target="_blank">
<img alt="YouTube" src="https://img.shields.io/badge/YouTube-black?logo=youtube&logoColor=red" />
</a>
@@ -50,7 +50,7 @@

For the Web version video tutorial, see [BiliBili](https://www.bilibili.com/video/BV1S2421N7mn) and [YouTube](https://www.youtube.com/watch?v=ylXrT-Tei-Y).

- \[2024/08\] `chat_with_repo` [pipeline](./huixiangdou/service/parallel_pipeline.py)
- \[2024/08\] ["chat_with readthedocs"](https://huixiangdou.readthedocs.io/zh-cn/latest/), see the [integration guide](./docs/zh/doc_add_readthedocs.md)
- \[2024/07\] Image and text retrieval & removal of `langchain` 👍
- \[2024/07\] [Hybrid knowledge graph and dense retrieval, F1 improved by 1.7%](./docs/zh/doc_knowledge_graph.md) 🎯
- \[2024/06\] [Evaluation of chunksize, splitter, and text2vec model](./evaluation) 🎯
@@ -131,8 +131,9 @@ For the Web version video tutorial, see [BiliBili](https://www.bilibili.com/video/BV1S2421N7mn)
- WeChat([android](./docs/zh/doc_add_wechat_accessibility.md)/[wkteam](./docs/zh/doc_add_wechat_commercial.md))
- Lark
- [OpenXLab Web](https://openxlab.org.cn/apps/detail/tpoisonooo/huixiangdou-web)
- [Gradio Demo](./huixiangdou/gradio.py)
- [Gradio Demo](./huixiangdou/gradio_ui.py)
- [HTTP Server](./huixiangdou/server.py)
- [Read the Docs](./docs/zh/doc_add_readthedocs.md)

</td>

@@ -225,9 +226,9 @@ python3 -m huixiangdou.main --standalone
💡 You can also start `gradio` to build a simple Web UI, bound to port 7860 by default:

```bash
python3 -m huixiangdou.gradio
python3 -m huixiangdou.gradio_ui
# If `llm_server_hybrid.py` is already running separately, you can use
# python3 -m huixiangdou.gradio --no-standalone
# python3 -m huixiangdou.gradio_ui --no-standalone
```

<video src="https://github.com/user-attachments/assets/9e5dbb30-1dc1-42ad-a7d4-dc7380676554" ></video>
@@ -281,7 +282,7 @@ python3 -m huixiangdou.service.feature_store --config_path config-cpu.ini
# Q&A test
python3 -m huixiangdou.main --standalone --config_path config-cpu.ini
# gradio UI
python3 -m huixiangdou.gradio --config_path config-cpu.ini
python3 -m huixiangdou.gradio_ui --config_path config-cpu.ini
```

If installing the dependencies is too slow, an image with the dependencies pre-installed is available on [Docker Hub](https://hub.docker.com/repository/docker/tpoisonooo/huixiangdou/tags); just substitute it when starting docker.
7 changes: 7 additions & 0 deletions docs/en/index.rst
@@ -29,6 +29,13 @@ We warmly welcome users' PRs and Issues!
   doc_architecture.md
   doc_rag_annotate_sft_data.md

.. _readthedocs:
.. toctree::
   :maxdepth: 1
   :caption: readthedocs Integration

   doc_add_readthedocs.md

.. _IMApplicaion:
.. toctree::
   :maxdepth: 1
95 changes: 95 additions & 0 deletions docs/zh/doc_add_readthedocs.md
@@ -0,0 +1,95 @@
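# Implementing `chat_with_repo` on readthedocs

This document shows how to implement `chat_with_repo` on readthedocs at zero cost. For the result, see the [HuixiangDou readthedocs documentation](https://huixiangdou.readthedocs.io).

The deployment diagram is as follows:

<img src="https://github.com/user-attachments/assets/d15935fa-a8fa-49ed-9995-7549ab1f71dc" width="400">

Where:
* [readthedocs](https://readthedocs.io) hosts the Chinese and English documentation
* [OpenXLab](https://openxlab.org.cn/apps) provides the https entry point (readthedocs cannot embed http content) and the CPU
* [SiliconCloud](https://siliconflow.cn/siliconcloud) provides the text2vec, reranker, and LLM model APIs

We need a custom readthedocs theme, with a button added to the theme.

1. When the button is clicked, create an `iframe` that loads the https version of HuixiangDou
2. https requires a reviewed domain name. The random subdomain provided by OpenXLab can be used instead
3. GPU resources on OpenXLab are limited, so we use the free model APIs provided by SiliconCloud

The steps are as follows.

## 1. Prepare the code and documents

Suppose all mmpose documents are used as the knowledge base. Put the knowledge base into `repodir`: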

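```bash
cd HuixiangDou
mkdir repodir
# Clone the knowledge base into repodir
git clone https://github.com/open-mmlab/mmpose --depth=1 repodir/mmpose
# Remove the knowledge base's .git (not the HuixiangDou one)
rm -rf repodir/mmpose/.git
```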

Change the default configuration in `gradio_ui.py` to use `config-cpu.ini`:

```python
# huixiangdou/gradio_ui.py
parser.add_argument(
    '--config_path',
    default='config-cpu.ini',
    type=str,
    ..
```

Commit the knowledge base together with the HuixiangDou project to GitHub, for example the `for-openxlab-readthedocs` branch of [huixiangdou-readthedocs](https://github.com/tpoisonooo/huixiangdou-readthedocs/tree/for-openxlab-readthedocs).

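## 2. Create an OpenXLab application

Open [OpenXLab](https://openxlab.org.cn/apps) and create a `Gradio` application.

1. Fill in the GitHub URL and branch name from the previous step
2. Select CPU as the server

After confirming, modify the application settings:

* Change the custom startup file (`自定义启动文件`) to `huixiangdou/gradio_ui.py`
* Because the code is open source, environment variables need to be configured. HuixiangDou uses the token in the configuration first; if none is found, it checks `SILICONCLOUD_TOKEN` and `LLM_API_TOKEN`, as shown:

<img src="https://github.com/user-attachments/assets/66291c65-1a5e-495a-aad6-e8962bef6bb6" width="400">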
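
The lookup order described above can be sketched as follows. This is an illustration only, not HuixiangDou's actual code: the function name `resolve_token` is hypothetical, and checking `SILICONCLOUD_TOKEN` before `LLM_API_TOKEN` is an assumption, since the text names both variables without stating their priority.

```python
import os


def resolve_token(config_token: str = '') -> str:
    """Return the token from the config if set, else the first non-empty environment variable."""
    if config_token:
        return config_token
    # Assumed order; the document only says both variables are checked.
    for name in ('SILICONCLOUD_TOKEN', 'LLM_API_TOKEN'):
        value = os.environ.get(name, '')
        if value:
            return value
    return ''
```

Keeping the token in an environment variable means the committed `config-cpu.ini` can leave the token field empty.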
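
Start the application. The first run takes **about 10 minutes** to build the feature store; once it finishes, you should see a gradio application. For example:

```bash
https://openxlab.org.cn/apps/detail/tpoisonooo/HuixiangDou-readthedocs
```

Press F12 in the browser and inspect the source to find the https address of this service:

```html
src="https://g-app-center-000704-0786-wrbqzpv.openxlab.space"
```

As long as the application data is not deleted, this address is **fixed**.
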
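
If you prefer not to search the inspector by hand, the address can be pulled out of HTML copied from the browser's element view. The sketch below uses only the Python standard library and assumes the service is embedded through an `<iframe>` tag, which the `src="..."` attribute above suggests but the document does not state; `find_iframe_sources` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class IframeSrcParser(HTMLParser):
    """Collect the `src` attribute of every <iframe> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == 'iframe':
            src = dict(attrs).get('src')
            if src:
                self.sources.append(src)


def find_iframe_sources(html: str) -> list:
    """Return the `src` values of all iframes in `html`, in document order."""
    parser = IframeSrcParser()
    parser.feed(html)
    return parser.sources
```

The page may build the frame with JavaScript, so feed it the rendered markup from the inspector rather than the raw page download.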
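## 3. Use a custom readthedocs theme

Assuming you are already familiar with basic readthedocs usage, you can copy the HuixiangDou docs directory directly:

* the zh or en directory
* requirements/doc.txt, which sets the custom theme

[Here](https://github.com/tpoisonooo/pytorch_sphinx_theme/) is the implementation of our custom theme. The main changes are:

1. A `chatButton` and an empty container are created in [layout.html](https://github.com/tpoisonooo/pytorch_sphinx_theme/blob/3db120b0f1e064425f37e98368dcea49972702e9/pytorch_sphinx_theme/layout.html#L324)
2. An event is bound to `chatButton`. When the button is clicked, the empty container loads the https address, for example the one from the previous step:

```bash
https://g-app-center-000704-0786-wrbqzpv.openxlab.space
```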
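
The theme itself does this in the browser, inside `layout.html`. The Python helper below is not the theme's code; it is a hedged sketch of the markup the empty container ends up holding, with a check for the https requirement mentioned earlier. The name `build_chat_iframe` and the default `width` and `height` values are assumptions made for illustration.

```python
from html import escape
from urllib.parse import urlparse


def build_chat_iframe(url: str, width: str = '100%', height: str = '600') -> str:
    """Return an <iframe> snippet for `url`, rejecting anything that is not https."""
    parsed = urlparse(url)
    if parsed.scheme != 'https' or not parsed.netloc:
        raise ValueError(f'an https URL is required, got: {url!r}')
    # escape() keeps quotes in attribute values from breaking the markup
    return (f'<iframe src="{escape(url, quote=True)}" '
            f'width="{escape(width, quote=True)}" '
            f'height="{escape(height, quote=True)}"></iframe>')
```

Rejecting `http://` early mirrors the reason OpenXLab is used at all: the documentation page is served over https and cannot embed a plain http service.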

In [theme.css](https://github.com/tpoisonooo/pytorch_sphinx_theme/blob/master/pytorch_sphinx_theme/static/css/theme.css), you can change the styles to your liking.

Finally, configure your own project on readthedocs.io and run `Build Version`.
7 changes: 7 additions & 0 deletions docs/zh/index.rst
@@ -29,6 +29,13 @@ HuixiangDou getting-started roadmap
   doc_rag_annotate_sft_data.md
   doc_architecture.md

.. _接入readthedocs:
.. toctree::
   :maxdepth: 1
   :caption: readthedocs Integration

   doc_add_readthedocs.md

.. _接入即时通讯软件:
.. toctree::
   :maxdepth: 1
File renamed without changes.
4 changes: 2 additions & 2 deletions requirements.txt
@@ -19,8 +19,8 @@ redis
requests
scikit-learn
# See https://github.com/deanmalmgren/textract/issues/461
textract @ git+https://github.com/tpoisonooo/textract@master
# textract
# textract @ git+https://github.com/tpoisonooo/textract@master
textract
texttable
tiktoken
torch>=2.0.0
74 changes: 1 additions & 73 deletions tests/test_query_gradio.py
@@ -1,76 +1,4 @@
import argparse
import json
import os
import time
from multiprocessing import Process, Value

import cv2
import gradio as gr
import pytoml
from loguru import logger

from huixiangdou.primitive import Query
from huixiangdou.service import ErrorCode, SerialPipeline, ParallelPipeline, llm_serve, start_llm_server

def parse_args():
    """Parse args."""
    parser = argparse.ArgumentParser(description='SerialPipeline Gradio WebUI.')
    parser.add_argument('--work_dir',
                        type=str,
                        default='workdir',
                        help='Working directory.')
    parser.add_argument(
        '--config_path',
        default='config.ini',
        type=str,
        help='SerialPipeline configuration path. Default value is config.ini')
    parser.add_argument('--standalone',
                        action='store_true',
                        default=True,
                        help='Auto deploy required Hybrid LLM Service.')
    args = parser.parse_args()
    return args


def get_reply(text, image):
    if image is not None:
        filename = 'image.png'
        image_path = os.path.join(args.work_dir, filename)
        cv2.imwrite(image_path, image)
    else:
        image_path = None

    assistant = SerialPipeline(work_dir=args.work_dir, config_path=args.config_path)
    query = Query(text, image_path)

    code, reply, references = assistant.generate(query=query,
                                                 history=[],
                                                 groupname='')
    ret = dict()
    ret['text'] = str(reply)
    ret['code'] = int(code)
    ret['references'] = references

    return json.dumps(ret, indent=2, ensure_ascii=False)


if __name__ == '__main__':
    args = parse_args()

    # start service
    if args.standalone is True:
        # hybrid llm serve
        start_llm_server(config_path=args.config_path)

    with gr.Blocks() as demo:
        with gr.Row():
            input_question = gr.Textbox(label='Input the question.')
            input_image = gr.Image(label='Upload Image.')
        with gr.Column():
            result = gr.Textbox(label='Generate response.')
            run_button = gr.Button()
        run_button.click(fn=get_reply,
                         inputs=[input_question, input_image],
                         outputs=result)
    logger.warning('This file would move to `huixiangdou.gradio`')
    demo.launch(share=False, server_name='0.0.0.0', debug=True)
logger.warning('This file moved to `huixiangdou.gradio_ui`')
