Enterprise Search is a Retrieval-Augmented Generation (RAG) system designed for efficient local information retrieval over document collections. It combines vector search techniques with language models to provide context-aware answers to your queries.

Enterprise Search is designed to be dynamic and adaptable, fitting into both development and production workflows. It offers a RESTful API for indexing and querying document collections, making it ideal for businesses and developers seeking to deploy question-answering solutions in their own infrastructure.
*High-level overview of user interaction with the Enterprise Search system*
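As intuition for the flow described above, here is a deliberately tiny sketch of the RAG loop: embed documents, retrieve the one nearest to a query, and build a context-grounded prompt for a language model. The real system uses HuggingFace embedding models and Qdrant in place of the toy bag-of-words vectors below.

```python
# Toy sketch of the RAG flow Enterprise Search implements at scale.
# The "embedding" here is a bag-of-words vector; production systems
# use learned embeddings and an approximate-nearest-neighbor index.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': a term-frequency vector over whitespace tokens."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str]) -> str:
    """Return the document most similar to the query."""
    q = embed(query)
    return max(corpus, key=lambda d: cosine(q, embed(d)))

docs = [
    "Invoices are processed within 30 days of receipt.",
    "The VPN requires two-factor authentication to connect.",
]
question = "how do I connect to the VPN"
context = retrieve(question, docs)
# The retrieved context grounds the LLM's answer:
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(context)
```

In the full pipeline, the retrieved chunks are also reranked before being handed to the LLM.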
- Key Features
- Prerequisites
- Quick Start
- Configuration
- API
- Testing
- Evaluation
- Deployment
- UI
- License
- Acknowledgements
## Key Features

*Enterprise Search - Key Features*
## Prerequisites

Before setting up Enterprise Search, ensure you have:
- Python 3.9 or higher
- Docker and Docker Compose
- CUDA 11 or higher (for GPU acceleration)
**Note:** Depending on your Docker Compose version, you may need to use `docker-compose` (with a hyphen) instead of `docker compose`.
## Configuration

1. Rename `.env.example` to `.env` and update the values to match your setup.

2. Update the configuration in `config/config.dev.yaml`. Default settings are defined in `llamasearch/settings.py`:

   - `application`: Application settings
   - `vector_store_config`: Qdrant settings for vector storage
   - `qdrant_client_config`: Qdrant client connection settings
   - `redis_config`: Redis settings for the document store and cache
   - `embedding`: Embedding model configuration (uses a model from HuggingFace)
   - `llm`: Language model configuration (uses a model from Ollama/OpenAI)
   - `reranker`: Reranker model settings (uses a model from HuggingFace)

3. Set up the LLM of your choice:
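For orientation, the settings groups above might map onto a `config/config.dev.yaml` shaped like the following. The exact keys, nesting, and defaults are defined in `llamasearch/settings.py`, so treat every name and value below as an illustrative assumption:

```yaml
# Hypothetical config.dev.yaml sketch; verify keys against llamasearch/settings.py
application:
  data_path: ./data                 # directory of documents to index
vector_store_config:
  collection_name: documents
qdrant_client_config:
  url: http://localhost:6333
redis_config:
  host: localhost
  port: 6379
embedding:
  model: BAAI/bge-small-en-v1.5     # any HuggingFace embedding model
llm:
  use_openai: false                 # set to true to use OpenAI instead of Ollama
  model: llama3
reranker:
  model: BAAI/bge-reranker-base
```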
- **Run the Ollama Docker container:** Use the `docker/docker-compose-ollama.yml` file to run Ollama:

  ```bash
  docker-compose -f docker/docker-compose-ollama.yml up -d
  ```

  On startup, the container pulls the models listed in `config/config.dev.yaml`. Downloading the models may take a few minutes; check the Ollama container logs for progress.
To use OpenAI's proprietary models instead, set `OPENAI_API_KEY` in the `.env` file.

- **Set up the OpenAI API key:** Export your OpenAI API key:

  ```bash
  export OPENAI_API_KEY=your_api_key_here
  ```

- **Configure for OpenAI:** Update the `config/config.dev.yaml` file to use an OpenAI model and set the `use_openai` flag to `True`. Check the OpenAI models list.
## Quick Start

- Clone the repository and set up the environment:

  ```bash
  conda create --name es_env python=3.9
  conda activate es_env
  pip install -r requirements.txt
  ```

- Start Jupyter Notebook:

  ```bash
  pip install jupyter
  jupyter notebook
  ```

- Open the `quick_start.ipynb` file in your browser and follow the step-by-step instructions to set up and test the pipeline.
To run the full pipeline locally:

- Set up the environment (same as above).

- Configure the application: modify `config/config.dev.yaml` to match your setup.

- Start the Redis and Qdrant services:

  ```bash
  docker-compose -f docker/docker-compose.yml up -d redis qdrant
  ```

- Run the pipeline:

  ```bash
  python -m llamasearch.pipeline
  ```

The pipeline loads documents from the `application->data_path` defined in the config file, then processes and indexes them on startup. Enter your query when prompted; results are displayed in the terminal.
## API

We provide a RESTful API for document indexing, querying, and management. Follow the steps below to test the pipeline and backend server (API) locally using curl.

Default API settings are defined in `llamasearch/api/core/config.py`. These settings can be customized via environment variables defined in the `.env` file in the project root.

**Important:** When deploying to production, ensure you set appropriate values for the server and authentication settings.
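For illustration, a `.env` might look like the following. Apart from `OPENAI_API_KEY` and `FIREBASE_CREDENTIALS_PATH` (mentioned elsewhere in this README), every variable name here is an assumption; check `llamasearch/api/core/config.py` for the real names:

```
# Illustrative .env sketch; verify variable names against llamasearch/api/core/config.py
OPENAI_API_KEY=your_api_key_here
FIREBASE_CREDENTIALS_PATH=/secrets/firebase-credentials.json
HOST=0.0.0.0
PORT=8010
```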
- **Build the Docker image:** Run the following command to build the Docker image:

  ```bash
  docker build -t es:latest -f docker/Dockerfile .
  ```

- **Authentication:** Update `FIREBASE_CREDENTIALS_PATH` in the `.env` file to point to your Firebase credentials file for user authentication. Refer to the Firebase README for instructions.

  **Note:** Currently, we only support testing API endpoints with authentication enabled, so a Firebase account is required to test the API endpoints.
- **Set up the LLM:** Set up the LLM of your choice (if you haven't already) as described in the Configuration section, and ensure the LLM service is running.

- **Run the Docker image:** Adjust the mount points in the `docker/docker-compose.yml` file to match your local setup. By default, the API server runs on port 8010.

  ```bash
  docker-compose -f docker/docker-compose.yml up -d
  ```

- **Test the API:** For API usage examples, including request and response formats and example curl requests, please refer to our API Documentation.
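As a sketch of what a client call might look like, the snippet below assembles a request for a hypothetical `/query` endpoint. The endpoint path, field names, and response format are assumptions, so defer to the API Documentation for the real contract:

```python
# Illustrative only: the /query path and JSON field names are assumptions;
# see the API Documentation for the actual request/response formats.
import json

API_URL = "http://localhost:8010"  # default port from docker/docker-compose.yml

def build_query_request(query: str, session_id: str = "demo"):
    """Assemble the URL and JSON body for a hypothetical query endpoint."""
    url = f"{API_URL}/query"
    body = json.dumps({"query": query, "session_id": session_id})
    return url, body

url, body = build_query_request("What is our refund policy?")
print(url)   # http://localhost:8010/query
print(body)
```

An actual call would POST `body` to `url` with a Firebase auth token attached, since authentication is required.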
## Testing

We use pytest for testing. To run the test suite:

- Ensure you're in the project root directory.

- Start the API server as described in the API section.

- In another terminal, set up the Python path:

  ```bash
  export PYTHONPATH=$PYTHONPATH:$(pwd)
  ```

- Run the tests:

  ```bash
  pytest
  ```

For more detailed testing instructions, including how to run specific tests, please refer to our Testing Guide.
## Evaluation

Enterprise Search includes a robust evaluation module based on DeepEval to assess the RAG pipeline's performance:

- 🧪 Synthetic dataset generation to simulate queries over your documents
- 📈 Industry-standard metrics including faithfulness, relevancy, and coherence
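As intuition for what a faithfulness-style metric measures, here is a naive token-overlap score: the fraction of the generated answer that is supported by the retrieved context. This is only an illustration, not DeepEval's actual implementation:

```python
# Toy faithfulness score: how much of the answer appears in the context.
# Real metrics (e.g. DeepEval's) use an LLM judge, not token overlap.
def faithfulness(answer: str, context: str) -> float:
    """Fraction of answer tokens that also occur in the context."""
    answer_tokens = answer.lower().split()
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    supported = sum(1 for t in answer_tokens if t in context_tokens)
    return supported / len(answer_tokens)

ctx = "invoices are processed within 30 days"
print(faithfulness("invoices are processed within 30 days", ctx))  # 1.0
print(faithfulness("refunds take 90 days", ctx))                   # 0.25
```

A low score flags answers that drift away from the retrieved evidence, i.e. likely hallucinations.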
For step-by-step instructions on running evaluations, see our Evaluation Guide.
## Deployment

Enterprise Search can be deployed using Kubernetes and Helm. Here's a high-level overview of the deployment process:

- Build and push the Docker image to your Docker registry:

  ```bash
  docker build -t es:latest .
  docker push es:latest
  ```

- Configure your Kubernetes cluster and ensure `kubectl` is set up correctly.

- Update the `k8s/values.yaml` file with your configuration settings (namespace, service names, etc.).

- Deploy using Helm:

  ```bash
  cd k8s/
  helm install enterprise-search . --values values.yaml
  ```

- Monitor the deployment:

  ```bash
  kubectl get pods,svc -n {{YOUR_NAMESPACE}}
  ```
For detailed deployment instructions, please refer to our Deployment Guide.
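For orientation, a `k8s/values.yaml` might contain entries like the following. The actual keys are defined by the Helm chart, so treat every name below as an assumption:

```yaml
# Hypothetical values.yaml sketch; verify keys against the chart in k8s/
namespace: enterprise-search
image:
  repository: your-registry/es
  tag: latest
service:
  port: 8010
```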
## UI

We provide an experimental frontend application built with Next.js 14 (App Router) to interact with the Enterprise Search API. For detailed setup instructions and usage guidelines, refer to its README.
## License

This project is licensed under the SOFTWARE LICENCE AGREEMENT; see the LICENSE file for details.
## Acknowledgements

The Enterprise Search project is built on top of valuable open-source projects. We'd like to acknowledge the following projects and their contributors:

- LlamaIndex for a stable foundation for RAG capabilities with a wide array of integrations
- DeepEval for the RAG evaluation framework
- Qdrant for the vector database functionality
- FastAPI for the high-performance web framework
- Ollama for local LLM inference
- Redis for caching and document storage
- Docker for containerization
- Kubernetes for orchestration