Make WebUI and API code cleaner (+ 1.5 fixes) (#703)
* rename webui.py to run_webui.py

* remove unused imports

* remove unused code

* move inference code and fix all warnings

* move web app code

* make code easier to read

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused function

* remove msgpack_api.py

* rename API files

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* finish updating the doc with the new file names

* finish updating the doc with the new file names

* fix CPU use in the API

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor WebUI inference into a class with submodules

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* re-enable streaming in webui inference code

* generalize inference code in webui

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* make a single, unified inference engine class

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fix

* clean up code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* implement the new API structure (not yet working)

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refactor API

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* minor fixes

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* reimplement chat endpoint

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Picus303 and pre-commit-ci[bot] authored Dec 7, 2024
1 parent 954cae1 commit 62eae26
Showing 45 changed files with 1,959 additions and 1,697 deletions.
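
The diffs below apply three renames consistently across the documentation and launch scripts. As a quick reference assembled from the changes (arguments and flags are otherwise unchanged):

```bash
# Entry-point renames in this commit; shown as comments for reference only.
# python tools/webui.py        ->  python tools/run_webui.py     (WebUI launcher)
# python -m tools.api          ->  python -m tools.api_server    (HTTP API server)
# python -m tools.post_api     ->  python -m tools.api_client    (API client)
```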
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.yml
@@ -45,7 +45,7 @@ body:
description: |
Include detailed steps, screenshots, and logs. Use the correct markdown syntax for code blocks.
placeholder: |
1. Run the command `python -m tools.post_api -t "xxxxx"`
1. Run the command `python -m tools.api_client -t "xxxxx"`
2. Observe the console output error: `ModuleNotFoundError: No module named 'pyaudio'` (with screenshots or logs will be better)
validations:
required: true
2 changes: 1 addition & 1 deletion docs/en/index.md
@@ -185,7 +185,7 @@ pip install -e .[stable]
4. Configure environment variables and access WebUI

In the terminal inside the docker container, enter `export GRADIO_SERVER_NAME="0.0.0.0"` to allow external access to the gradio service inside docker.
Then in the terminal inside the docker container, enter `python tools/webui.py` to start the WebUI service.
Then in the terminal inside the docker container, enter `python tools/run_webui.py` to start the WebUI service.

If you're using WSL or MacOS, visit [http://localhost:7860](http://localhost:7860) to open the WebUI interface.
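
Taken together, the two steps from this hunk, run inside the Docker container's terminal, are:

```bash
# Inside the docker container (as described in docs/en/index.md above)
export GRADIO_SERVER_NAME="0.0.0.0"   # allow external access to the gradio service
python tools/run_webui.py             # start the WebUI with the renamed script
```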
10 changes: 5 additions & 5 deletions docs/en/inference.md
@@ -67,7 +67,7 @@ python tools/vqgan/inference.py \
We provide a HTTP API for inference. You can use the following command to start the server:

```bash
python -m tools.api \
python -m tools.api_server \
--listen 0.0.0.0:8080 \
--llama-checkpoint-path "checkpoints/fish-speech-1.5" \
--decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" \
@@ -78,10 +78,10 @@ python -m tools.api \
After that, you can view and test the API at http://127.0.0.1:8080/.

Below is an example of sending a request using `tools/post_api.py`.
Below is an example of sending a request using `tools/api_client.py`.

```bash
python -m tools.post_api \
python -m tools.api_client \
--text "Text to be input" \
--reference_audio "Path to reference audio" \
--reference_text "Text content of the reference audio" \
@@ -93,7 +93,7 @@ The above command indicates synthesizing the desired audio according to the refe
The following example demonstrates that you can use **multiple** reference audio paths and reference audio texts at once. Separate them with spaces in the command.

```bash
python -m tools.post_api \
python -m tools.api_client \
--text "Text to input" \
--reference_audio "reference audio path1" "reference audio path2" \
--reference_text "reference audio text1" "reference audio text2"\
@@ -109,7 +109,7 @@ The currently supported reference audio has a maximum total duration of 90 secon


!!! info
To learn more about available parameters, you can use the command `python -m tools.post_api -h`
To learn more about available parameters, you can use the command `python -m tools.api_client -h`

## GUI Inference
[Download client](https://github.com/AnyaCoder/fish-speech-gui/releases)
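
Assembling the renamed commands from this file, a minimal end-to-end sketch looks as follows; the checkpoint paths are the ones shown in the hunks above, the text and reference values are placeholders, and any flags truncated by the diff are listed by `python -m tools.api_client -h`:

```bash
# Start the HTTP API server (renamed module: tools.api_server)
python -m tools.api_server \
    --listen 0.0.0.0:8080 \
    --llama-checkpoint-path "checkpoints/fish-speech-1.5" \
    --decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth"

# Send a request with the renamed client (placeholder text and paths)
python -m tools.api_client \
    --text "Text to be input" \
    --reference_audio "Path to reference audio" \
    --reference_text "Text content of the reference audio"
```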
2 changes: 1 addition & 1 deletion docs/en/start_agent.md
@@ -44,7 +44,7 @@ pip install -e .[stable]
To build fish-agent, please use the command below under the main folder:

```bash
python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
python -m tools.api_server --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
```

The `--compile` args only support Python < 3.12 , which will greatly speed up the token generation.
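
Because the doc notes that `--compile` only works on Python < 3.12, one way to guard the flag at launch time is sketched below; the guard itself is illustrative and not part of this commit, while the server command is the one shown in the hunk above:

```bash
# Pass --compile only when the interpreter is older than 3.12 (illustrative guard)
COMPILE=""
if python -c 'import sys; sys.exit(0 if sys.version_info < (3, 12) else 1)'; then
    COMPILE="--compile"
fi
python -m tools.api_server \
    --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ \
    --mode agent ${COMPILE}
```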
2 changes: 1 addition & 1 deletion docs/ja/index.md
@@ -184,7 +184,7 @@ pip install -e .[stable]
4. 環境変数の設定と WebUI へのアクセス

Docker コンテナ内のターミナルで、`export GRADIO_SERVER_NAME="0.0.0.0"` と入力して、外部から Docker 内の gradio サービスにアクセスできるようにします。
次に、Docker コンテナ内のターミナルで `python tools/webui.py` と入力して WebUI サービスを起動します。
次に、Docker コンテナ内のターミナルで `python tools/run_webui.py` と入力して WebUI サービスを起動します。

WSL または MacOS の場合は、[http://localhost:7860](http://localhost:7860) にアクセスして WebUI インターフェースを開くことができます。

8 changes: 4 additions & 4 deletions docs/ja/inference.md
@@ -67,7 +67,7 @@ python tools/vqgan/inference.py \
推論のための HTTP API を提供しています。次のコマンドを使用してサーバーを起動できます:

```bash
python -m tools.api \
python -m tools.api_server \
--listen 0.0.0.0:8080 \
--llama-checkpoint-path "checkpoints/fish-speech-1.5" \
--decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" \
@@ -78,10 +78,10 @@ python -m tools.api \
その後、`http://127.0.0.1:8080/`で API を表示およびテストできます。

以下は、`tools/post_api.py` を使用してリクエストを送信する例です。
以下は、`tools/api_client.py` を使用してリクエストを送信する例です。

```bash
python -m tools.post_api \
python -m tools.api_client \
--text "入力するテキスト" \
--reference_audio "参照音声へのパス" \
--reference_text "参照音声テキスト" \
@@ -91,7 +91,7 @@ python -m tools.post_api \
上記のコマンドは、参照音声の情報に基づいて必要な音声を合成し、ストリーミング方式で返すことを示しています。

!!! info
使用可能なパラメータの詳細については、コマンド` python -m tools.post_api -h `を使用してください
使用可能なパラメータの詳細については、コマンド` python -m tools.api_client -h `を使用してください

## WebUI 推論

2 changes: 1 addition & 1 deletion docs/ja/start_agent.md
@@ -47,7 +47,7 @@ pip install -e .[stable]
fish-agentを構築するには、メインフォルダで以下のコマンドを使用してください:

```bash
python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
python -m tools.api_server --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
```

`--compile`引数はPython < 3.12でのみサポートされており、トークン生成を大幅に高速化します。
2 changes: 1 addition & 1 deletion docs/ko/index.md
@@ -185,7 +185,7 @@ pip install -e .[stable]
4. 환경 변수 설정 및 WebUI 접근

Docker 컨테이너 내부의 터미널에서 `export GRADIO_SERVER_NAME="0.0.0.0"`를 입력하여 Docker 내부에서 Gradio 서비스에 외부 접근을 허용합니다.
이후, 터미널에서 `python tools/webui.py` 명령어를 입력하여 WebUI 서비스를 시작합니다.
이후, 터미널에서 `python tools/run_webui.py` 명령어를 입력하여 WebUI 서비스를 시작합니다.

WSL 또는 macOS를 사용하는 경우 [http://localhost:7860](http://localhost:7860)에서 WebUI 인터페이스를 열 수 있습니다.

10 changes: 5 additions & 5 deletions docs/ko/inference.md
@@ -67,7 +67,7 @@ python tools/vqgan/inference.py \
추론을 위한 HTTP API를 제공하고 있습니다. 아래의 명령어로 서버를 시작할 수 있습니다:

```bash
python -m tools.api \
python -m tools.api_server \
--listen 0.0.0.0:8080 \
--llama-checkpoint-path "checkpoints/fish-speech-1.5" \
--decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" \
@@ -78,10 +78,10 @@ python -m tools.api \

이후, http://127.0.0.1:8080/ 에서 API를 확인하고 테스트할 수 있습니다.

아래는 `tools/post_api.py`를 사용하여 요청을 보내는 예시입니다.
아래는 `tools/api_client.py`를 사용하여 요청을 보내는 예시입니다.

```bash
python -m tools.post_api \
python -m tools.api_client \
--text "입력할 텍스트" \
--reference_audio "참고 음성 경로" \
--reference_text "참고 음성의 텍스트 내용" \
@@ -93,7 +93,7 @@ python -m tools.post_api \
다음 예시는 여러 개의 참고 음성 경로와 텍스트를 한꺼번에 사용할 수 있음을 보여줍니다. 명령에서 공백으로 구분하여 입력합니다.

```bash
python -m tools.post_api \
python -m tools.api_client \
--text "입력할 텍스트" \
--reference_audio "참고 음성 경로1" "참고 음성 경로2" \
--reference_text "참고 음성 텍스트1" "참고 음성 텍스트2"\
@@ -107,7 +107,7 @@ python -m tools.post_api \
`--reference_audio``--reference_text` 대신에 `--reference_id`(하나만 사용 가능)를 사용할 수 있습니다. 프로젝트 루트 디렉토리에 `references/<your reference_id>` 폴더를 만들어 해당 음성과 주석 텍스트를 넣어야 합니다. 참고 음성은 최대 90초까지 지원됩니다.

!!! info
제공되는 파라미터는 `python -m tools.post_api -h`를 사용하여 확인할 수 있습니다.
제공되는 파라미터는 `python -m tools.api_client -h`를 사용하여 확인할 수 있습니다.

## GUI 추론
[클라이언트 다운로드](https://github.com/AnyaCoder/fish-speech-gui/releases)
2 changes: 1 addition & 1 deletion docs/ko/start_agent.md
@@ -47,7 +47,7 @@ pip install -e .[stable]
fish-agent를 구축하려면 메인 폴더에서 아래 명령어를 사용하세요:

```bash
python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
python -m tools.api_server --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
```

`--compile` 인자는 Python < 3.12에서만 지원되며, 토큰 생성 속도를 크게 향상시킵니다.
2 changes: 1 addition & 1 deletion docs/pt/index.md
@@ -181,7 +181,7 @@ pip install -e .[stable]
4. Configure as variáveis de ambiente e acesse a WebUI

No terminal do contêiner Docker, digite `export GRADIO_SERVER_NAME="0.0.0.0"` para permitir o acesso externo ao serviço gradio dentro do Docker.
Em seguida, no terminal do contêiner Docker, digite `python tools/webui.py` para iniciar o serviço WebUI.
Em seguida, no terminal do contêiner Docker, digite `python tools/run_webui.py` para iniciar o serviço WebUI.

Se estiver usando WSL ou MacOS, acesse [http://localhost:7860](http://localhost:7860) para abrir a interface WebUI.

8 changes: 4 additions & 4 deletions docs/pt/inference.md
@@ -67,7 +67,7 @@ python tools/vqgan/inference.py \
Fornecemos uma API HTTP para inferência. O seguinte comando pode ser usado para iniciar o servidor:

```bash
python -m tools.api \
python -m tools.api_server \
--listen 0.0.0.0:8080 \
--llama-checkpoint-path "checkpoints/fish-speech-1.5" \
--decoder-checkpoint-path "checkpoints/fish-speech-1.4/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" \
@@ -78,10 +78,10 @@ python -m tools.api \
Depois disso, é possível visualizar e testar a API em http://127.0.0.1:8080/.

Abaixo está um exemplo de envio de uma solicitação usando `tools/post_api.py`.
Abaixo está um exemplo de envio de uma solicitação usando `tools/api_client.py`.

```bash
python -m tools.post_api \
python -m tools.api_client \
--text "Texto a ser inserido" \
--reference_audio "Caminho para o áudio de referência" \
--reference_text "Conteúdo de texto do áudio de referência" \
@@ -91,7 +91,7 @@ python -m tools.post_api \
O comando acima indica a síntese do áudio desejada de acordo com as informações do áudio de referência e a retorna em modo de streaming.

!!! info
Para aprender mais sobre parâmetros disponíveis, você pode usar o comando `python -m tools.post_api -h`
Para aprender mais sobre parâmetros disponíveis, você pode usar o comando `python -m tools.api_client -h`

## Inferência por WebUI

2 changes: 1 addition & 1 deletion docs/pt/start_agent.md
@@ -47,7 +47,7 @@ pip install -e .[stable]
Para construir o fish-agent, use o comando abaixo na pasta principal:

```bash
python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
python -m tools.api_server --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
```

O argumento `--compile` só suporta Python < 3.12, o que aumentará muito a velocidade de geração de tokens.
2 changes: 1 addition & 1 deletion docs/zh/index.md
@@ -188,7 +188,7 @@ pip install -e .[stable]
4. 配置环境变量,访问 WebUI

在 docker 容器内的终端,输入 `export GRADIO_SERVER_NAME="0.0.0.0"` ,从而让外部可以访问 docker 内的 gradio 服务。
接着在 docker 容器内的终端,输入 `python tools/webui.py` 即可开启 WebUI 服务。
接着在 docker 容器内的终端,输入 `python tools/run_webui.py` 即可开启 WebUI 服务。

如果是 WSL 或者是 MacOS ,访问 [http://localhost:7860](http://localhost:7860) 即可打开 WebUI 界面。

10 changes: 5 additions & 5 deletions docs/zh/inference.md
@@ -73,7 +73,7 @@ python tools/vqgan/inference.py \
运行以下命令来启动 HTTP 服务:

```bash
python -m tools.api \
python -m tools.api_server \
--listen 0.0.0.0:8080 \
--llama-checkpoint-path "checkpoints/fish-speech-1.5" \
--decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" \
@@ -88,10 +88,10 @@ HF_ENDPOINT=https://hf-mirror.com python -m ...(同上)

随后, 你可以在 `http://127.0.0.1:8080/` 中查看并测试 API.

下面是使用`tools/post_api.py`发送请求的示例。
下面是使用`tools/api_client.py`发送请求的示例。

```bash
python -m tools.post_api \
python -m tools.api_client \
--text "要输入的文本" \
--reference_audio "参考音频路径" \
--reference_text "参考音频的文本内容" \
@@ -102,7 +102,7 @@ python -m tools.post_api \

下面的示例展示了, 可以一次使用**多个** `参考音频路径``参考音频的文本内容`。在命令里用空格隔开即可。
```bash
python -m tools.post_api \
python -m tools.api_client \
--text "要输入的文本" \
--reference_audio "参考音频路径1" "参考音频路径2" \
--reference_text "参考音频的文本内容1" "参考音频的文本内容2"\
@@ -117,7 +117,7 @@ python -m tools.post_api \
里面放上任意对音频与标注文本。 目前支持的参考音频最多加起来总时长90s。

!!! info
要了解有关可用参数的更多信息,可以使用命令`python -m tools.post_api -h`
要了解有关可用参数的更多信息,可以使用命令`python -m tools.api_client -h`

## GUI 推理
[下载客户端](https://github.com/AnyaCoder/fish-speech-gui/releases)
2 changes: 1 addition & 1 deletion docs/zh/start_agent.md
@@ -49,7 +49,7 @@ pip install -e .[stable]
你需要使用以下指令来构建 fish-agent

```bash
python -m tools.api --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
python -m tools.api_server --llama-checkpoint-path checkpoints/fish-agent-v0.1-3b/ --mode agent --compile
```

`--compile`只能在小于 3.12 版本的 Python 使用,这个功能可以极大程度上提高生成速度。
2 changes: 1 addition & 1 deletion entrypoint.sh
@@ -7,4 +7,4 @@ if [ "${CUDA_ENABLED}" != "true" ]; then
DEVICE="--device cpu"
fi

exec python tools/webui.py ${DEVICE}
exec python tools/run_webui.py ${DEVICE}
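
For context, this is how the updated entrypoint resolves the device flag when invoked directly (an illustration of the script above, not an addition from the commit):

```bash
# CUDA_ENABLED is the variable tested in entrypoint.sh above.
CUDA_ENABLED=false ./entrypoint.sh   # runs: python tools/run_webui.py --device cpu
CUDA_ENABLED=true  ./entrypoint.sh   # runs: python tools/run_webui.py
```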
2 changes: 1 addition & 1 deletion fish_speech/webui/manage.py
@@ -176,7 +176,7 @@ def change_infer(
p_infer = subprocess.Popen(
[
PYTHON,
"tools/webui.py",
"tools/run_webui.py",
"--decoder-checkpoint-path",
infer_decoder_model,
"--decoder-config-name",
2 changes: 1 addition & 1 deletion inference.ipynb
@@ -83,7 +83,7 @@
},
"outputs": [],
"source": [
"!python tools/webui.py \\\n",
"!python tools/run_webui.py \\\n",
" --llama-checkpoint-path checkpoints/fish-speech-1.4 \\\n",
" --decoder-checkpoint-path checkpoints/fish-speech-1.4/firefly-gan-vq-fsq-8x1024-21hz-generator.pth \\\n",
" # --compile"
2 changes: 1 addition & 1 deletion start.bat
@@ -82,7 +82,7 @@ if not "!flags!"=="" set "flags=!flags:~1!"
echo Debug: flags = !flags!

if "!mode!"=="api" (
%PYTHON_CMD% -m tools.api !flags!
%PYTHON_CMD% -m tools.api_server !flags!
) else if "!mode!"=="infer" (
%PYTHON_CMD% -m tools.webui !flags!
)