This repository contains a movie recommender application that provides personalized movie recommendations based on user preferences and text-based queries. The system also integrates a chatbot for interactive recommendations.
- User-Based Collaborative Filtering: Recommends movies based on a user's rating history compared to other users.
- Content-Based Filtering: Suggests movies with plots similar to user-provided text input.
- Interactive Chatbot: Engages with users to provide recommendations and contextual movie-related information.
The MovieRecommender class leverages Retrieval-Augmented Generation (RAG) techniques to retrieve movies with relevant plots and descriptions. It utilizes SentenceTransformers embeddings to store and search movie descriptions efficiently within a vector database.
The Chatbot class processes user queries, incorporating chat history to generate contextual responses. It employs the Deepseek-R1 model, optimized and served via Ollama, for inference.
The backend is powered by FastAPI, exposing REST endpoints to facilitate interaction between the frontend and recommendation models.
The user interface is built with Streamlit, enabling an interactive and user-friendly experience for exploring recommendations and interacting with the chatbot.
- Build a vector database and integrate RAG using a small dataset.
- Integrate an LLM-powered chatbot for user interaction.
- Develop FastAPI endpoints to handle API requests.
- Create a functional Streamlit UI.
- Integrate the MovieLens dataset to enrich user preference data.
- Scrape additional movie elements (actors, plots, genres, etc.) to enhance recommendations.
- Transition from CSV-based storage to SQL and vector databases for better scalability.
- Implement an NER (Named Entity Recognition) model to extract keywords from user queries, enhancing RAG-based retrieval.
- Develop a query/response validation or self-reflection module to avoid chatbot hallucinations.
- Implement collaborative filtering for personalized recommendations.
- Apply clustering techniques to group similar users and movies.
- Improve plot chunking strategies to enhance retrieval accuracy.
- Experiment with incorporating additional movie metadata into descriptions.
- Further refinements (To Be Determined).
Ensure you have the following installed:
python == 3.11
fastapi
streamlit
langchain_community
sentence-transformers
ollama (for Deepseek-R1 inference)
chromadb as vector database
🚧 Coming Soon 🚧
For any questions, feedback or ideas, feel free to reach out by email!