Add Windows Setup Help (#264)

fishaudio · Jun 5, 2024 · 009f32f
1 parent bfbd32a
commit 009f32f

Showing 2 changed files with 103 additions and 8 deletions.
diff --git a/docs/en/index.md b/docs/en/index.md
@@ -13,7 +13,7 @@
 </div>
 
 !!! warning
-    We assume no responsibility for any illegal use of the codebase. Please refer to the local laws regarding DMCA (Digital Millennium Copyright Act) and other relevant laws in your area.
+We assume no responsibility for any illegal use of the codebase. Please refer to the local laws regarding DMCA (Digital Millennium Copyright Act) and other relevant laws in your area.
 
 This codebase is released under the `BSD-3-Clause` license, and all models are released under the CC-BY-NC-SA-4.0 license.
 
@@ -22,12 +22,61 @@ This codebase is released under the `BSD-3-Clause` license, and all models are r
 </p>
 
 ## Requirements
+
 - GPU Memory: 4GB (for inference), 16GB (for fine-tuning)
 - System: Linux, Windows
 
-We recommend Windows users to use WSL2 or docker to run the codebase, or use the integrated environment developed by the community.
+~~We recommend Windows users to use WSL2 or docker to run the codebase, or use the integrated environment developed by the community.~~
+
+## Windows Setup
+
+Advanced Windows users may prefer to run the codebase under WSL2 or Docker.
+
+Other Windows users can run the codebase without a Linux environment by following the steps below (model compilation via `torch.compile` is supported):
+
+0. Extract the project zip file.
+1. Double-click `install_env.bat` to install the environment.
+
+   1. To choose whether downloads come from a mirror site, edit the `USE_MIRROR` item in `install_env.bat`.
+   2. The default, `preview`, uses a mirror site and the latest development version of torch (the only option that enables model compilation).
+   3. `false` downloads the environment from the original site. `true` downloads the stable version of torch and the other dependencies from the mirror site.
+
+2. (Optional; this step enables the model compilation environment)
+
+   1. Download the `LLVM` compiler from one of the following links.
+
+      - [LLVM-17.0.6 (original site download)](https://huggingface.co/fishaudio/fish-speech-1/resolve/main/LLVM-17.0.6-win64.exe?download=true)
+      - [LLVM-17.0.6 (mirror site download)](https://hf-mirror.com/fishaudio/fish-speech-1/resolve/main/LLVM-17.0.6-win64.exe?download=true)
+      - After downloading `LLVM-17.0.6-win64.exe`, double-click it to install, choose an installation location, and, most importantly, check `Add Path to Current User` to add the environment variable.
+      - Confirm the installation is complete.
+
+   2. Download and install the Microsoft Visual C++ Redistributable Package to resolve potential missing `.dll` issues.
+      - [MSVC++ 14.40.33810.0 download](https://aka.ms/vs/17/release/vc_redist.x64.exe)
+
+3. Double-click `start.bat` to open the Fish-Speech training and inference configuration WebUI page.
+
+   - To go directly to the inference page, edit `API_FLAGS.txt` in the project root directory and change the first three lines as follows:
+
+   ```text
+   --infer
+   # --api
+   # --listen ...
+   ...
+   ```
+
+   - To start the API server, edit `API_FLAGS.txt` in the project root directory and change the first three lines as follows:
+
+   ```text
+   # --infer
+   --api
+   --listen ...
+   ...
+   ```
+
+4. (Optional) Double-click `run_cmd.bat` to enter the conda/python command line environment of this project.
+
+## Linux Setup
 
-## Setup
 ```bash
 # Create a python 3.10 virtual environment, you can also use virtualenv
 conda create -n fish-speech python=3.10
@@ -55,6 +104,7 @@ apt install libsox-dev
 - 2023/12/13: Beta version released, includes VQGAN model and a language model based on LLAMA (phoneme support only).
 
 ## Acknowledgements
+
 - [VITS2 (daniilrobnikov)](https://github.com/daniilrobnikov/vits2)
 - [Bert-VITS2](https://github.com/fishaudio/Bert-VITS2)
 - [GPT VITS](https://github.com/innnky/gpt-vits)
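
Review note on step 3 of the Windows setup: the two `API_FLAGS.txt` edits amount to commenting or uncommenting the first three lines. A minimal Python sketch of that toggle follows, assuming the file has exactly the layout shown in the docs (`--infer`, `--api`, and `--listen ...` on the first three lines). `set_mode` is a hypothetical helper written for illustration and is not part of the project.

```python
from typing import List


def set_mode(lines: List[str], mode: str) -> List[str]:
    """Return a copy of `lines` with the first three flags toggled for `mode`.

    Hypothetical helper: "infer" enables --infer only, "api" enables
    --api and --listen, mirroring the two documented examples.
    """
    if mode not in ("infer", "api"):
        raise ValueError("mode must be 'infer' or 'api'")
    enabled = {"infer": ("--infer",), "api": ("--api", "--listen")}[mode]

    result = []
    for index, line in enumerate(lines):
        if index >= 3:
            result.append(line)  # lines after the first three are left untouched
            continue
        # Drop any existing comment marker to recover the bare flag.
        bare = line.strip().lstrip("#").strip()
        if not bare:
            result.append(line)
        elif bare.startswith(enabled):
            result.append(bare)
        else:
            result.append("# " + bare)
    return result


if __name__ == "__main__":
    sample = ["--infer", "# --api", "# --listen ..."]
    print(set_mode(sample, "api"))
```

The sketch works on a list of lines so it can be tried without touching the real file; reading and writing `API_FLAGS.txt` is left to the caller.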

diff --git a/docs/zh/index.md b/docs/zh/index.md
@@ -13,7 +13,7 @@
 </div>
 
 !!! warning
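-    We assume no responsibility for any illegal use of the codebase. Please refer to your local laws regarding the DMCA (Digital Millennium Copyright Act) and other relevant laws and regulations.
+We assume no responsibility for any illegal use of the codebase. Please refer to your local laws regarding the DMCA (Digital Millennium Copyright Act) and other relevant laws and regulations.
 
 This codebase is released under the `BSD-3-Clause` license, and all models are released under the CC-BY-NC-SA-4.0 license.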
 
@@ -22,12 +22,57 @@
 </p>
 
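 ## Requirements
-- GPU memory: 4GB (for inference), 16GB (for fine-tuning)
+
+- GPU memory: 4GB (for inference), 16GB (for fine-tuning)
 - System: Linux, Windows
 
-We recommend that Windows users run the codebase with WSL2 or docker, or use the integrated environment developed by the community.
+~~We recommend that Windows users run the codebase with WSL2 or docker, or use the integrated environment developed by the community.~~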
+
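+## Windows Setup
+
+Advanced Windows users may prefer to run the codebase under WSL2 or docker.
+
+Other Windows users can use the following basic method to run the codebase without a Linux environment (with model compilation support, i.e. `torch.compile`):
+
+0. Extract the project archive.
+1. Double-click `install_env.bat` to install the environment.
+   - To choose whether downloads come from a mirror site, edit the `USE_MIRROR` item in `install_env.bat`.
+   - The default is `preview`, which uses the mirror site and the latest development version of torch (the only option that enables compilation).
+   - `false` downloads the environment from the original site. `true` downloads the stable version of torch and the other dependencies from the mirror site.
+2. (Optional; this step enables the model compilation environment)
+
+   1. Download the `LLVM` compiler from one of the following links.
+      - [LLVM-17.0.6 (original site download)](https://huggingface.co/fishaudio/fish-speech-1/resolve/main/LLVM-17.0.6-win64.exe?download=true)
+      - [LLVM-17.0.6 (mirror site download)](https://hf-mirror.com/fishaudio/fish-speech-1/resolve/main/LLVM-17.0.6-win64.exe?download=true)
+      - After downloading `LLVM-17.0.6-win64.exe`, double-click it to install, choose a suitable installation location, and, most importantly, check `Add Path to Current User` to add the environment variable.
+      - Confirm that the installation is complete.
+   2. Download and install the `Microsoft Visual C++ Redistributable Package` to resolve potential missing `.dll` issues.
+      - [MSVC++ 14.40.33810.0 download](https://aka.ms/vs/17/release/vc_redist.x64.exe)
+
+3. Double-click `start.bat` to open the Fish-Speech training and inference configuration WebUI page.
+
+   - To go directly to the inference page, edit `API_FLAGS.txt` in the project root directory and change the first three lines as follows: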
+
+   ```text
+   --infer
+   # --api
+   # --listen ...
+   ...
+   ```
+
+   - To start the API server, edit `API_FLAGS.txt` in the project root directory and change the first three lines as follows:
+
+   ```text
+   # --infer
+   --api
+   --listen ...
+   ...
+   ```
+
+4. (Optional) Double-click `run_cmd.bat` to enter the conda/python command line environment of this project.
+
+## Linux Setup
 
-## Setup
 ```bash
 # Create a python 3.10 virtual environment, you can also use virtualenv
 conda create -n fish-speech python=3.10
@@ -43,7 +88,6 @@ pip3 install -e .
 apt install libsox-dev
 ```
 
-
 ## Changelog
 
 - 2024/05/10: Updated Fish-Speech to version 1.1, introducing a VITS Decoder to reduce mispronunciations and improve timbre similarity.
@@ -56,6 +100,7 @@ apt install libsox-dev
 - 2023/12/13: Beta version released, includes the VQGAN model and a language model based on LLAMA (phoneme support only).
 
 ## Acknowledgements
+
 - [VITS2 (daniilrobnikov)](https://github.com/daniilrobnikov/vits2)
 - [Bert-VITS2](https://github.com/fishaudio/Bert-VITS2)
 - [GPT VITS](https://github.com/innnky/gpt-vits)
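
Review note on step 2 of the Windows setup: the optional compilation step depends on the LLVM installer adding its directory to the user's `PATH`. A minimal Python sketch for checking that afterwards, for example from the `run_cmd.bat` prompt, is given below. The tool name `clang` is an assumption (it is the compiler driver shipped by the LLVM Windows installer); the docs do not say which executable the compilation path actually invokes. `find_tools` is a hypothetical helper, not part of the project.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # "clang" is an assumed name; adjust it to whatever your setup requires.
    for tool, path in find_tools(["clang"]).items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

If the tool is reported as not found right after installation, opening a new command prompt is worth trying first, since a running prompt keeps the `PATH` it started with.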