Deploying DeepSeek R1 Models Using Modal and OpenWebUI

Introduction

Deploying machine learning models can be complex, especially when dealing with varying model sizes and ensuring scalability. The GitHub repository by mohsiniscoding aims to simplify this process by providing a production-ready setup for deploying DeepSeek R1 models using Modal, Ollama and OpenWebUI.

Tools Overview

Modal: A cloud platform designed for deploying machine learning models, offering flexibility in scaling resources based on model size.
OpenWebUI: Developed by togethercomputer, it is an open-source project focused on creating user-friendly interfaces for interacting with AI models.
Ollama: A lightweight, easy-to-use, and open-source server for running large language models.

Key Features of the Repository

Multi-model Support:
- The repository supports models ranging from 1.5B to 671B parameters, accommodating various deployment needs.
Persistent Storage:
- Utilizes Modal volumes to cache models persistently across restarts, reducing the need for repeated downloads and saving time.
GPU Optimization:
- Offers support for different GPU options (T4 and A100), ensuring that larger models with higher parameter counts can be efficiently deployed.
Enterprise Readiness:
- Includes features like logging, error handling, and a secure and stable production environment.

Deployment Steps

Clone the Repository:

git clone https://github.com/mohsiniscoding/deepseekr1-openwebui.git
cd deepseekr1-openwebui

Install Dependencies (locally, if needed):
- Install modal
```
 pip install modal  
```
Configure the Model:
- Edit handler.py to set your desired model under the MODEL constant.
- Ensure that the model exists in the DEEPSEEK_R1_MODELS mapping.

Deploy Using Modal:

modal token new  # If not already authenticated
modal serve handler.py

Access OpenWebUI:
- Once deployed, Modal provides a public URL accessible (even prior to launching ollama or openwebui)

Troubleshooting

Ollama Issues: If Ollama fails to start or models take too long to pull, consider increasing the time.sleep() delay in handler.py.
GPU Configuration: Ensure your GPU is properly configured and meets the VRAM requirements for your model size.
Secrets Management: Verify that all secrets and environment variables are correctly set up for seamless operation.

Future Enhancements

Determine GPU Configurations:
- Investigate GPU settings for models beyond 70B parameters to optimize hardware usage.
Session Persistence:
- Develop features to save chat sessions and maintain OpenWebUI sessions across system restarts.
Additional Features:
- Explore adding more functionalities as needed, enhancing the deployment experience further.

Contributing

To contribute to this project:

Fork the repository.
Create a new branch from main.
Make your changes.
Commit and push your branch.
Open a pull request for review.

This setup provides a robust framework for deploying DeepSeek R1 models in production, leveraging Modal's scalability and OpenWebUI's interface capabilities. While some specifics like model configuration details may require further exploration, the repository offers a solid foundation for deployment needs.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
handler.py		handler.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Deploying DeepSeek R1 Models Using Modal and OpenWebUI

Introduction

Tools Overview

Key Features of the Repository

Deployment Steps

Troubleshooting

Future Enhancements

Contributing

About

Releases

Packages

Languages

License

mohsiniscoding/deepseekr1-openwebui

Folders and files

Latest commit

History

Repository files navigation

Deploying DeepSeek R1 Models Using Modal and OpenWebUI

Introduction

Tools Overview

Key Features of the Repository

Deployment Steps

Troubleshooting

Future Enhancements

Contributing

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages