Connect your knowledge to any RAG system
Simba is an open source, portable KMS (knowledge management system) designed to integrate seamlessly with any Retrieval-Augmented Generation (RAG) system. With a modern UI and modular architecture, Simba allows developers to focus on building advanced AI solutions without worrying about the complexities of knowledge management.
- 🧩 Modular Architecture: Plug in various vector stores, embedding models, chunkers, and parsers.
- 🖥️ Modern UI: Intuitive user interface to visualize and modify every document chunk.
- 🔗 Seamless Integration: Easily integrates with any RAG-based system.
- 👨‍💻 Developer Focus: Simplifies knowledge management so you can concentrate on building core AI functionality.
- 📦 Open Source & Extensible: Community-driven, with room for custom features and integrations.
Before you begin, ensure you have met the following requirements:
- Python 3.11+ & Poetry
- Redis 7.0+
- Node.js 20+
- Git for version control
- (Optional) Docker for containerized deployment
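If you want a quick sanity check of these prerequisites, a small script like the one below can help. It is illustrative only (not part of Simba) and assumes Redis listens on localhost:6379:

# check_prereqs.py - illustrative sanity check, not part of Simba
import shutil
import socket
import sys

# Python 3.11+ is required
assert sys.version_info >= (3, 11), f"Python 3.11+ required, found {sys.version}"

# CLI tools expected on PATH
for tool in ("poetry", "node", "git"):
    print(f"{tool}: {'found' if shutil.which(tool) else 'MISSING'}")

# Is Redis reachable on the default port? (assumes localhost:6379)
try:
    socket.create_connection(("localhost", 6379), timeout=2).close()
    print("redis: reachable on localhost:6379")
except OSError:
    print("redis: NOT reachable on localhost:6379")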
Install simba-core:
pip install simba-core
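To confirm the install from Python, this stdlib-only snippet (illustrative, not a Simba command) prints the installed version:

# Illustrative: check the installed version via the standard library
from importlib.metadata import version

print(version("simba-core"))  # "simba-core" is the distribution name on PyPI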
Clone the repository and install dependencies:
git clone https://github.com/GitHamza0206/simba.git
cd simba
poetry config virtualenvs.in-project true
poetry install
source .venv/bin/activate
Create a .env file in the root directory:
OPENAI_API_KEY=your_openai_api_key
REDIS_HOST=localhost
CELERY_BROKER_URL=redis://localhost:6379/0
CELERY_RESULT_BACKEND=redis://localhost:6379/1
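To verify that these variables are picked up, a short check with python-dotenv works (illustrative only; it requires pip install python-dotenv, and Simba's own loading may differ):

# Illustrative: confirm .env values load (requires `pip install python-dotenv`)
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory

for key in ("OPENAI_API_KEY", "REDIS_HOST", "CELERY_BROKER_URL", "CELERY_RESULT_BACKEND"):
    value = os.getenv(key)
    # Mask the API key so it never ends up in logs
    shown = value[:6] + "..." if key == "OPENAI_API_KEY" and value else value
    print(f"{key} = {shown}")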
Create or update the config.yaml file in the root directory:
# config.yaml
project:
  name: "Simba"
  version: "1.0.0"
  api_version: "/api/v1"

paths:
  base_dir: null # Will be set programmatically
  faiss_index_dir: "vector_stores/faiss_index"
  vector_store_dir: "vector_stores"

llm:
  provider: "openai"
  model_name: "gpt-4o-mini"
  temperature: 0.0
  max_tokens: null
  streaming: true
  additional_params: {}

embedding:
  provider: "huggingface"
  model_name: "BAAI/bge-base-en-v1.5"
  device: "mps" # Use "cpu" in Docker; MPS is not available inside containers
  additional_params: {}

vector_store:
  provider: "faiss"
  collection_name: "simba_collection"
  additional_params: {}

chunking:
  chunk_size: 512
  chunk_overlap: 200

retrieval:
  k: 5

celery:
  broker_url: ${CELERY_BROKER_URL:-redis://redis:6379/0}
  result_backend: ${CELERY_RESULT_BACKEND:-redis://redis:6379/1}
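Note that the ${VAR:-default} references use shell-style defaults; a plain YAML parser returns them as literal strings, so whatever loads the file has to expand them. The sketch below is illustrative only (Simba's own loader may differ) and shows one way to do it with PyYAML:

# Illustrative: load config.yaml and expand ${VAR:-default} references
import os
import re
import yaml  # pip install pyyaml

_ENV_REF = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")

def _expand(value: str) -> str:
    # Replace ${VAR:-default} with the env var's value, falling back to the default
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), m.group(2) or ""), value)

with open("config.yaml") as f:
    raw = yaml.safe_load(f)

config = {
    section: (
        {k: _expand(v) if isinstance(v, str) else v for k, v in body.items()}
        if isinstance(body, dict)
        else body
    )
    for section, body in raw.items()
}

print(config["celery"]["broker_url"])  # falls back to redis://redis:6379/0 if unset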
Run the server:
simba server
Run the frontend:
simba front
Run the parsers:
simba parsers
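Once the server is up, a quick reachability check looks like this. It is illustrative only: the port (8000) is an assumption, so check the server's startup logs for the actual address:

# Illustrative: check that the API server answers (port 8000 is an assumption)
import urllib.error
import urllib.request

API_BASE = "http://localhost:8000"  # adjust to the address in your server logs

try:
    with urllib.request.urlopen(API_BASE, timeout=5) as resp:
        print(f"Server is up (HTTP {resp.status})")
except urllib.error.HTTPError as exc:
    print(f"Server is up but returned HTTP {exc.code}")
except OSError as exc:
    print(f"Server not reachable at {API_BASE}: {exc}")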
To run with Docker instead, for CPU:
DEVICE=cpu make build
DEVICE=cpu make up
For NVIDIA GPU with Ollama:
DEVICE=cuda make build
DEVICE=cuda make up
For Apple Silicon:
# Note: MPS (Metal Performance Shaders) is NOT supported in Docker containers
# For Docker, always use CPU mode even on Apple Silicon:
DEVICE=cpu make build
DEVICE=cpu make up
Run with Ollama service (for CPU):
DEVICE=cpu ENABLE_OLLAMA=true make up
Run in background mode:
# All make commands already run in detached mode by default; no extra flag is needed
For detailed Docker instructions, see the Docker deployment guide.
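As noted above, MPS only works natively on macOS, never inside a container. If you want to pick the right device value for config.yaml programmatically, a short PyTorch check (illustrative; assumes torch is installed) looks like this:

# Illustrative: choose the embedding device the same way you would in config.yaml
import torch

if torch.backends.mps.is_available():
    device = "mps"   # native Apple Silicon, outside Docker only
elif torch.cuda.is_available():
    device = "cuda"  # NVIDIA GPU
else:
    device = "cpu"   # safe everywhere, including containers
print(f'Set device: "{device}" in config.yaml')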
- 💻 pip install simba-core
- 🔧 pip install simba-sdk
- 🌐 www.simba-docs.com
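The SDK's interface is documented at www.simba-docs.com, not in this README; the snippet below is purely hypothetical (SimbaClient and every method on it are invented names for illustration):

# Hypothetical sketch -- SimbaClient and its methods are invented for illustration;
# see www.simba-docs.com for the real simba-sdk API
from simba_sdk import SimbaClient  # hypothetical import

client = SimbaClient(base_url="http://localhost:8000")  # hypothetical constructor
client.documents.upload("handbook.pdf")                 # hypothetical upload call
hits = client.retrieve(query="vacation policy", k=5)    # hypothetical retrieval call
for hit in hits:
    print(hit)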
- 🔒 Adding authentication & access management
- 🕸️ Adding web scraping
- ☁️ Pulling data from Azure / AWS / GCP
- 📚 More parsers and chunkers
- 🎨 Better UX/UI
Contributions are welcome! If you'd like to contribute to Simba, please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Commit your changes with clear messages.
- Open a pull request describing your changes.
For support or inquiries, please open an issue 📌 on GitHub or contact the repo owner, Hamza Zerouali.