
--compile crash #860

Open
6 tasks done
ardentillumina opened this issue Jan 26, 2025 · 0 comments
Labels
bug Something isn't working

Comments

ardentillumina commented Jan 26, 2025

Self Checks

  • This template is only for bug reports. For questions, please visit Discussions.
  • I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English, otherwise they will be closed. Thank you! :)
  • Please do not modify this template and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Source)

Environment Details

Windows 11 - WSL (Ubuntu)

AMD 5090X + 128 GB RAM
RTX 3090 + 24 GB VRAM

Followed all the documentation:
  • conda, Python 3.10
  • pip install -e .

Steps to Reproduce

python -m tools.api_server \
    --listen 0.0.0.0:8080 \
    --llama-checkpoint-path "checkpoints/fish-speech-1.5" \
    --decoder-checkpoint-path "checkpoints/fish-speech-1.5/firefly-gan-vq-fsq-8x1024-21hz-generator.pth" \
    --decoder-config-name firefly_gan_vq \
    --compile
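
FWIW, a minimal torch.compile smoke test (a diagnostic sketch, not from the fish-speech docs; the file name is made up) can show whether Inductor/Triton work at all in this conda env before involving the full server:

# smoke_test_compile.py: hypothetical standalone check, not part of fish-speech
import torch

@torch.compile
def f(x):
    return torch.sin(x) + torch.cos(x)

x = torch.randn(8, device="cuda")
print(f(x))  # a broken Triton install should fail here with a similar compile_worker error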

✔️ Expected Behavior

The API server launches locally.

❌ Actual Behavior

(fish-speech) xxx@win11:/mnt/d/lab/fish-speech$ ./server.sh
INFO:     Started server process [1354]
INFO:     Waiting for application startup.
2025-01-26 22:27:15.968 | INFO     | fish_speech.models.text2semantic.inference:load_model:681 - Restored model from checkpoint
2025-01-26 22:27:15.968 | INFO     | fish_speech.models.text2semantic.inference:load_model:687 - Using DualARTransformer
2025-01-26 22:27:15.968 | INFO     | fish_speech.models.text2semantic.inference:load_model:695 - Compiling function...
2025-01-26 22:27:16.587 | INFO     | tools.server.model_manager:load_llama_model:99 - LLAMA model loaded.
/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:445: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/vector_quantize_pytorch/vector_quantize_pytorch.py:630: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/vector_quantize_pytorch/finite_scalar_quantization.py:147: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/vector_quantize_pytorch/lookup_free_quantization.py:209: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @autocast(enabled = False)
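
(Note: the four FutureWarnings above come from the vector_quantize_pytorch dependency, not from fish-speech, and look harmless. The non-deprecated spelling the warning points to would be roughly this sketch, with a made-up stand-in function:)

from torch.amp import autocast

@autocast('cuda', enabled=False)
def quantize(x):  # hypothetical stand-in for the library's decorated methods
    return x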
Uncaught exception in compile_worker subprocess
Traceback (most recent call last):
  File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/torch/_inductor/compile_worker/__main__.py", line 38, in main
    pre_fork_setup()
  File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/torch/_inductor/async_compile.py", line 62, in pre_fork_setup
    from triton.compiler.compiler import triton_key
  File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/__init__.py", line 8, in <module>
    from .runtime import (
  File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/runtime/__init__.py", line 1, in <module>
    from .autotuner import (Autotuner, Config, Heuristics, OutOfResources, autotune, heuristics)
  File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/runtime/autotuner.py", line 7, in <module>
    from ..testing import do_bench
  File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/testing.py", line 7, in <module>
    from . import language as tl
  File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/triton/language/__init__.py", line 6, in <module>
    from .standard import (
2025-01-26 22:27:22.006 | INFO     | fish_speech.models.vqgan.inference:load_model:46 - Loaded model: <All keys matched successfully>
2025-01-26 22:27:22.006 | INFO     | tools.server.model_manager:load_decoder_model:107 - Decoder model loaded.
2025-01-26 22:27:22.016 | INFO     | fish_speech.models.text2semantic.inference:generate_long:788 - Encoded text: Hello world.
2025-01-26 22:27:22.016 | INFO     | fish_speech.models.text2semantic.inference:generate_long:806 - Generating sentence 1/1 of sample 1/1
  0%|                                                                                                         | 0/1023 [00:00<?, ?it/s]/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/contextlib.py:103: FutureWarning: `torch.backends.cuda.sdp_kernel()` is deprecated. In the future, this context manager will be removed. Please see `torch.nn.attention.sdpa_kernel()` for the new context manager, with updated signature.
  self.gen = func(*args, **kwds)
  0%|                                                                                                         | 0/1023 [00:16<?, ?it/s]
ERROR:    Traceback (most recent call last):
  File "/home/xxx/miniconda3/envs/fish-speech/lib/python3.10/site-packages/kui/asgi/lifespan.py", line 36, in __call__
    await result
  File "/mnt/d/lab/fish-speech/tools/api_server.py", line 78, in initialize_app
    app.state.model_manager = ModelManager(
  File "/mnt/d/lab/fish-speech/tools/server/model_manager.py", line 65, in __init__
    self.warm_up(self.tts_inference_engine)
  File "/mnt/d/lab/fish-speech/tools/server/model_manager.py", line 121, in warm_up
    list(inference(request, tts_inference_engine))
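
The compile_worker traceback above dies partway through importing triton.language and is cut off before the actual error line. Importing Triton directly in the same env (diagnostic sketch, made-up file name) should reproduce the failure and print the missing error, which would point at a torch/triton mismatch rather than at fish-speech itself:

# check_triton.py: hypothetical diagnostic, not part of fish-speech
import torch
print("torch:", torch.__version__, "cuda:", torch.version.cuda)

import triton            # if Triton is broken, the full error should appear here
import triton.language   # the module the compile_worker was importing when it died
print("triton:", triton.__version__)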

If I remove --compile, it works, but I need --compile to improve the tokens/s.
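
For context on why --compile matters for tokens/s: the flag routes the per-token decode step through torch.compile (the "Compiling function..." log line above). A simplified sketch of the idea, with a stand-in decode function; the real code in fish_speech/models/text2semantic/inference.py differs:

import torch

def decode_one_token(model, x):
    # stand-in for the per-token decode step of the DualARTransformer
    return model(x)

# compiling the hot decode loop is what raises tokens/s;
# "reduce-overhead" uses CUDA graphs to cut per-step launch overhead
decode_one_token = torch.compile(decode_one_token, mode="reduce-overhead", fullgraph=True)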

ardentillumina added the bug label on Jan 26, 2025