Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 11 changed files with 283 additions and 25 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# --infer | ||
# --api | ||
- --listen 0.0.0.0:8000 \ | ||
+ --listen 0.0.0.0:8080 \ | ||
--llama-checkpoint-path "checkpoints/fish-speech-1.2" \ | ||
--decoder-checkpoint-path "checkpoints/fish-speech-1.2/firefly-gan-vq-fsq-4x1024-42hz-generator.pth" \ | ||
--decoder-config-name firefly_gan_vq |
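The hunk above moves the `--listen` address from port 8000 to port 8080, so any client has to target the new port. The following is a minimal sketch, not part of the commit, for checking that something accepts TCP connections on the configured address before running a client. The helper names `parse_listen` and `is_port_open` are hypothetical, and mapping `0.0.0.0` to loopback assumes the check runs on the same machine as the server.

```python
import socket


def parse_listen(value: str) -> tuple[str, int]:
    """Split a 'host:port' string such as '0.0.0.0:8080' into (host, port)."""
    host, _, port = value.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {value!r}")
    return host, int(port)


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    # 0.0.0.0 is a bind address meaning "all interfaces"; connect via loopback
    # instead (assumption: the check runs on the server's own machine).
    target = "127.0.0.1" if host == "0.0.0.0" else host
    try:
        with socket.create_connection((target, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = parse_listen("0.0.0.0:8080")
    state = "reachable" if is_port_open(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

A successful connection only shows that some process is listening on the port; it does not confirm that the process is the API server.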
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,211 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Command-line inference" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### For Windows" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "bat" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!chcp 65001" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### For Linux" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import locale\n", | ||
"locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## API Client\n", | ||
"\n", | ||
"The API server must be running in a terminal first.\n", | ||
"\n", | ||
"> For audio, use a local path.\n", | ||
"\n", | ||
"> For text, pass either a file path or the text itself." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "shellscript" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!python -m tools.post_api \\\n", | ||
" --text \"Hello everyone, I am an open-source text-to-speech model developed by Fish Audio.\" \\\n", | ||
" --reference_audio \"D:\\PythonProject\\原神语音中文\\胡桃\\vo_hutao_draw_appear.wav\" \\\n", | ||
" --reference_text \"D:\\PythonProject\\原神语音中文\\胡桃\\vo_hutao_draw_appear.lab\" \\\n", | ||
" --streaming True" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## For Test" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### 0. Download the model" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Each `!` command runs in its own subshell, so set the variable with %env to make it persist.\n", | ||
"%env HF_ENDPOINT=https://hf-mirror.com\n", | ||
"!huggingface-cli download fishaudio/fish-speech-1.2 --local-dir checkpoints/fish-speech-1.2/" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### 1. Generate a prompt from audio:\n", | ||
"> If you plan to let the model pick a voice at random, you can skip this step.\n", | ||
"\n", | ||
"You should get a `fake.npy` file." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "shellscript" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"## Enter the path to your audio file here:\n", | ||
"src_audio = r\"D:\\PythonProject\\原神语音中文\\胡桃\\vo_hutao_draw_appear.wav\"\n", | ||
"\n", | ||
"!python tools/vqgan/inference.py \\\n", | ||
"    -i \"{src_audio}\" \\\n", | ||
" --checkpoint-path \"checkpoints/fish-speech-1.2/firefly-gan-vq-fsq-4x1024-42hz-generator.pth\"\n", | ||
"\n", | ||
"from IPython.display import Audio, display\n", | ||
"audio = Audio(filename=\"fake.wav\")\n", | ||
"display(audio)" | ||
] | ||
}, | ||
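{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### 2. Generate semantic tokens from text:\n", | ||
"> This command creates codes_N files in the working directory, where N is an integer starting from 0.\n", | ||
"\n", | ||
"> You can use --compile to fuse CUDA kernels for faster inference." | ||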
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "shellscript" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!python tools/llama/generate.py \\\n", | ||
" --text \"人间灯火倒映湖中,她的渴望让静水泛起涟漪。若代价只是孤独,那就让这份愿望肆意流淌。流入她所注视的世间,也流入她如湖水般澄澈的目光。\" \\\n", | ||
" --prompt-text \"唷,找本堂主有何贵干呀?嗯?你不知道吗,往生堂第七十七代堂主就是胡桃我啦!嘶,不过瞧你的模样,容光焕发,身体健康,嗯…想必是为了工作以外的事来找我,对吧?\" \\\n", | ||
" --prompt-tokens \"fake.npy\" \\\n", | ||
" --checkpoint-path \"checkpoints/fish-speech-1.2\" \\\n", | ||
" --num-samples 2\n", | ||
" # --compile" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### 3. Generate speech from semantic tokens:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"vscode": { | ||
"languageId": "shellscript" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!python tools/vqgan/inference.py \\\n", | ||
" -i \"codes_0.npy\" \\\n", | ||
" --checkpoint-path \"checkpoints/fish-speech-1.2/firefly-gan-vq-fsq-4x1024-42hz-generator.pth\"\n", | ||
"\n", | ||
"from IPython.display import Audio, display\n", | ||
"audio = Audio(filename=\"fake.wav\")\n", | ||
"display(audio)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.14" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
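Step 2 of the notebook writes one `codes_N` file per sample (two files with `--num-samples 2`), while step 3 decodes only `codes_0.npy`. The sketch below, which is an illustration and not part of the commit, collects every `codes_N.npy` file in numeric order and builds the matching decoder command for each, using only the flags that appear in the notebook. The helper names `find_code_files` and `build_decode_command` are hypothetical.

```python
import re
from pathlib import Path

# Matches the codes_N.npy names described in step 2 (N is a non-negative integer).
CODES_PATTERN = re.compile(r"^codes_(\d+)\.npy$")


def find_code_files(directory="."):
    """Return the codes_N.npy files in `directory`, sorted by N numerically."""
    found = []
    for path in Path(directory).iterdir():
        match = CODES_PATTERN.match(path.name)
        if match and path.is_file():
            found.append((int(match.group(1)), path))
    # Sort on the integer so that codes_10 comes after codes_2.
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]


def build_decode_command(code_file, checkpoint_path):
    """Build the step 3 command line as an argument list (nothing is executed)."""
    return [
        "python",
        "tools/vqgan/inference.py",
        "-i",
        str(code_file),
        "--checkpoint-path",
        str(checkpoint_path),
    ]


if __name__ == "__main__":
    checkpoint = (
        "checkpoints/fish-speech-1.2/"
        "firefly-gan-vq-fsq-4x1024-42hz-generator.pth"
    )
    for code_file in find_code_files("."):
        print(" ".join(build_decode_command(code_file, checkpoint)))
```

Judging by the notebook, each decoder run appears to write `fake.wav`, so when several files are decoded in a row the output would probably need to be renamed or moved between runs.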