Merge pull request #244 from ahmetoner/upgrade-whisper-v20240930
Upgrade OpenAI Whisper to v20240930 (turbo)
ahmetoner authored Oct 6, 2024
2 parents 005c7e9 + a6169fb commit b3daab8
Showing 10 changed files with 72 additions and 386 deletions.
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,15 @@ Changelog
Unreleased
----------

+### Changed
+
+- Upgraded
+  - [openai/whisper](https://github.com/openai/whisper)@[v20240930](https://github.com/openai/whisper/releases/tag/v20240930)
+  - fastapi to v0.115.0
+  - uvicorn to v0.31.0
+  - tqdm to v4.66.5
+  - python-multipart to v0.0.12
+
[1.5.0] (2024-07-04)
--------------------

2 changes: 1 addition & 1 deletion Dockerfile
@@ -63,4 +63,4 @@ RUN poetry install

EXPOSE 9000

-ENTRYPOINT ["gunicorn", "--bind", "0.0.0.0:9000", "--workers", "1", "--timeout", "0", "app.webservice:app", "-k", "uvicorn.workers.UvicornWorker"]
+ENTRYPOINT ["whisper-asr-webservice"]
2 changes: 1 addition & 1 deletion Dockerfile.gpu
@@ -81,4 +81,4 @@ RUN $POETRY_VENV/bin/pip install torch==1.13.1+cu117 -f https://download.pytorch

EXPOSE 9000

-CMD gunicorn --bind 0.0.0.0:9000 --workers 1 --timeout 0 app.webservice:app -k uvicorn.workers.UvicornWorker
+CMD whisper-asr-webservice
2 changes: 1 addition & 1 deletion README.md
@@ -11,7 +11,7 @@ Whisper is a general-purpose speech recognition model. It is trained on a large

Current release (v1.5.0) supports following whisper models:

-- [openai/whisper](https://github.com/openai/whisper)@[v20231117](https://github.com/openai/whisper/releases/tag/v20231117)
+- [openai/whisper](https://github.com/openai/whisper)@[v20240930](https://github.com/openai/whisper/releases/tag/v20240930)
- [SYSTRAN/faster-whisper](https://github.com/SYSTRAN/faster-whisper)@[v1.0.3](https://github.com/SYSTRAN/faster-whisper/releases/tag/1.0.3)

## Quick Usage
33 changes: 30 additions & 3 deletions app/webservice.py
@@ -1,11 +1,13 @@
import importlib.metadata
import os
from os import path
-from typing import Annotated, BinaryIO, Union
+from typing import Annotated, BinaryIO, Optional, Union
from urllib.parse import quote

+import click
import ffmpeg
import numpy as np
+import uvicorn
from fastapi import FastAPI, File, Query, UploadFile, applications
from fastapi.openapi.docs import get_swagger_ui_html
from fastapi.responses import RedirectResponse, StreamingResponse
@@ -14,9 +16,9 @@

ASR_ENGINE = os.getenv("ASR_ENGINE", "openai_whisper")
if ASR_ENGINE == "faster_whisper":
-    from .faster_whisper.core import language_detection, transcribe
+    from app.faster_whisper.core import language_detection, transcribe
else:
-    from .openai_whisper.core import language_detection, transcribe
+    from app.openai_whisper.core import language_detection, transcribe

SAMPLE_RATE = 16000
LANGUAGE_CODES = sorted(tokenizer.LANGUAGES.keys())
Expand Down Expand Up @@ -122,3 +124,28 @@ def load_audio(file: BinaryIO, encode=True, sr: int = SAMPLE_RATE):
out = file.read()

return np.frombuffer(out, np.int16).flatten().astype(np.float32) / 32768.0

@click.command()
@click.option(
"-h",
"--host",
metavar="HOST",
default="0.0.0.0",
help="Host for the webservice (default: 0.0.0.0)",
)
@click.option(
"-p",
"--port",
metavar="PORT",
default=9000,
help="Port for the webservice (default: 9000)",
)
@click.version_option(version=projectMetadata["Version"])
def start(
host: str,
port: Optional[int] = None
):
uvicorn.run(app, host=host, port=port)

if __name__ == "__main__":
start()
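
The `--host` and `--port` options introduced in this hunk can be exercised without binding a socket. The following is a minimal sketch, assuming click's `CliRunner` test helper; `demo_start` and `run_cli` are hypothetical stand-ins that echo the resolved address instead of calling `uvicorn.run`, so this is not the project's actual entry point.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("-h", "--host", metavar="HOST", default="0.0.0.0", help="Host for the webservice")
@click.option("-p", "--port", metavar="PORT", default=9000, help="Port for the webservice")
def demo_start(host: str, port: int) -> None:
    # Hypothetical stand-in: report the bind address rather than starting uvicorn.
    click.echo(f"would serve on {host}:{port}")


def run_cli(args: list) -> str:
    """Invoke the command in-process and return what it printed."""
    result = CliRunner().invoke(demo_start, args)
    if result.exit_code != 0:
        raise RuntimeError(result.output)
    return result.output.strip()


if __name__ == "__main__":
    print(run_cli([]))
    print(run_cli(["--host", "127.0.0.1", "-p", "8080"]))
```

Because the port default is an integer, click infers an integer type for `--port`, so a non-numeric value is rejected before the command body runs.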
2 changes: 1 addition & 1 deletion docs/build.md
@@ -24,7 +24,7 @@ poetry install
Starting the Webservice:

```sh
-poetry run gunicorn --bind 0.0.0.0:9000 --workers 1 --timeout 0 app.webservice:app -k uvicorn.workers.UvicornWorker
+poetry run whisper-asr-webservice --host 0.0.0.0 --port 9000
```

### Build
2 changes: 1 addition & 1 deletion docs/environmental-variables.md
@@ -15,7 +15,7 @@
export ASR_MODEL=base
```

-Available ASR_MODELs are `tiny`, `base`, `small`, `medium`, `large` (only OpenAI Whisper), `large-v1`, `large-v2` and `large-v3`.
+Available ASR_MODELs are `tiny`, `base`, `small`, `medium`, `large`, `large-v1`, `large-v2`, `large-v3`, `turbo`(only OpenAI Whisper) and `large-v3-turbo`(only OpenAI Whisper).

For English-only applications, the `.en` models tend to perform better, especially for the `tiny.en` and `base.en` models. We observed that the difference becomes less significant for the `small.en` and `medium.en` models.
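
A hedged sketch of how a deployment script might check the engine restriction described above before starting the service. The restricted model names come from the updated documentation line and the `ASR_ENGINE` default from `app/webservice.py`; the helper `resolve_model`, the `base` fallback for `ASR_MODEL`, and the error message are assumptions, and the project itself may handle an unsupported combination differently.

```python
import os

# Documented as available only with the OpenAI Whisper engine.
OPENAI_ONLY_MODELS = {"turbo", "large-v3-turbo"}


def resolve_model(env=None):
    """Return (engine, model), rejecting a turbo model on another engine.

    Hypothetical helper: `env` defaults to the process environment and can be
    replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    engine = env.get("ASR_ENGINE", "openai_whisper")
    model = env.get("ASR_MODEL", "base")  # assumed fallback, for illustration
    if model in OPENAI_ONLY_MODELS and engine != "openai_whisper":
        raise ValueError(
            f"ASR_MODEL={model!r} is documented for openai_whisper only, "
            f"but ASR_ENGINE={engine!r}"
        )
    return engine, model


if __name__ == "__main__":
    print(resolve_model({"ASR_MODEL": "turbo"}))
```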

2 changes: 1 addition & 1 deletion docs/index.md
@@ -4,7 +4,7 @@ Whisper is a general-purpose speech recognition model. It is trained on a large

Current release (v1.5.0) supports following whisper models:

-- [openai/whisper](https://github.com/openai/whisper)@[v20231117](https://github.com/openai/whisper/releases/tag/v20231117)
+- [openai/whisper](https://github.com/openai/whisper)@[v20240930](https://github.com/openai/whisper/releases/tag/v20240930)
- [SYSTRAN/faster-whisper](https://github.com/SYSTRAN/faster-whisper)@[v1.0.3](https://github.com/SYSTRAN/faster-whisper/releases/tag/1.0.3)

## Quick Usage

2 comments on commit b3daab8

@easonwanger

"--workers 1" should not be removed; inference usually runs in different processes.

@ahmetoner
Owner Author


@easonwanger I'll reinstate the Gunicorn worker count. Do you also need an environment variable to configure it?
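
One possible shape for such a setting, sketched under stated assumptions: the variable name `ASR_WORKERS` and the helper `gunicorn_command` are hypothetical, and the remaining arguments are copied from the Gunicorn entrypoint that this commit removed.

```python
import os


def gunicorn_command(env=None):
    """Build the Gunicorn argv, reading the worker count from the environment.

    `ASR_WORKERS` is an assumed name for illustration; it falls back to the
    single worker used by the previous entrypoint.
    """
    env = os.environ if env is None else env
    raw = env.get("ASR_WORKERS", "1")
    try:
        workers = int(raw)
    except ValueError:
        raise ValueError(f"ASR_WORKERS must be an integer, got {raw!r}") from None
    if workers < 1:
        raise ValueError(f"ASR_WORKERS must be at least 1, got {workers}")
    return [
        "gunicorn",
        "--bind", "0.0.0.0:9000",
        "--workers", str(workers),
        "--timeout", "0",
        "app.webservice:app",
        "-k", "uvicorn.workers.UvicornWorker",
    ]


if __name__ == "__main__":
    print(" ".join(gunicorn_command()))
```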
