Movie Recommendation System:

— Web App with Advanced Graph and Computing Method

Introduction

The primary goal is to develop a robust Movie Recommendation System that gives users personalized movie recommendations based on their previous movie ratings and movie preferences. The system consists of a user-friendly frontend for interaction and a backend that processes data and generates recommendations.

Dataset:

Netflix Prize Dataset:
The Netflix Prize dataset was a famous dataset released by Netflix for a competition to improve the accuracy of their movie recommendation system.

TMDB 5000 Dataset:
The Movie Database (TMDB) is a popular, user-editable database for movies and TV shows. It includes a wide range of data such as titles, genres, release dates, budgets, revenues, production companies, countries, vote counts, and average vote scores.

Modules and Contributors

We divided the project into four modules:

Note: Because this is a comprehensive project that includes frontend, backend, and recommendation modules, we created four branches, each containing the related code. The main branch therefore contains no code; please click the links above to go to the corresponding branch and review the code.

Team members: Samuel Wang, Shengyi Liu, Rachel Huang, Zoey Zhang, Zitong Li, Guodong Sun

Languages and Tools Used:

  • Frontend: HTML/CSS/JavaScript, React, Tailwind CSS
  • Backend: Python, Node.js with Express.js
  • Database: MongoDB, Neo4j Graph Database, JSON format
  • Recommendation System: Neo4j Advanced Knowledge Graph, Collaborative Filtering, Cosine Similarity, Fuzzy Matching, Scikit-learn (old version)

👉🏽 For this project, we will be using:

How to run our code

Installation

Install Python, Node.js, MongoDB, and the Neo4j Graph Database.

Database preparation

- Create a new Neo4j DBMS and update recommender_graph.py in backend-module with your own credentials
- Copy all the .csv files in backend-module into the 'import' folder of the DBMS
- Run recommender_graph.py to load the data into the graph database

Backend

git clone https://github.com/samuelusc/CSCI596-Project.git
cd CSCI596-Project
git checkout backend-module
cd user
npm install
node app.js

Frontend

git clone https://github.com/samuelusc/CSCI596-Project.git
cd CSCI596-Project
git checkout frontend-module
npm install
npm start

Note: There could be some exceptions for certain movies since the API we used might not provide all the movie info in our database.

Product development feature map

*Evaluation Metrics: These were used in the previous version and will be integrated into the latest version in the future.

Table of Four modules

Recommender

Design Objectives

Our recommendation module primarily aims to solve two main problems:

  1. How to enable new users to quickly discover movies they'll love.
  2. How to effectively increase the engagement of our existing users.

Four Core Strategies

Relevance: Offer movie recommendations as closely aligned as possible with user preferences and needs.

Novelty: Suggest films that users might not have encountered but are likely to find intriguing.

Serendipity: Ensure that our recommendations exceed user expectations, creating a sense of surprise and delight.

Diversity: Provide a diverse range of recommended genres to cater to the varied tastes and requirements of our users.

Four strategies

Measures and Features

Personalized Recommendations: Utilizing machine learning algorithms, we provide individualized suggestions based on a user's search history, viewing history, and rating data.

New User Questionnaire: New users are asked to complete a brief interest survey or rate movies during registration, which will allow us to quickly understand their preferences.

Interactive Interface: An intuitive and user-friendly interface is designed to make it easier for users to discover and explore new movies.

Intelligent Sorting: Movies are sorted to prominently display those that are likely to align with a user's tastes.

Editor's Picks: We showcase a list of movies recommended by editors or based on popular trends.

Tagging System: Movies are categorized using tags such as genre, mood, director, or actors, enabling users to swiftly filter according to their interests.

User Reviews: Displaying other users' ratings and reviews helps new users discover popular movies.

Possible Tech Stacks

Latest Version:

Pandas: For data handling and analysis.

Neo4j Database: We use Neo4j, an advanced graph database, to store and manage data.

Dynamic Query Building: Constructs Cypher queries based on user input, such as filtering movies by genre or calculating similarity.
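To illustrate the idea, here is a minimal sketch of building a parameterized Cypher query from optional user filters. The `Movie` and `Genre` labels, the `IN_GENRE` relationship, and the `avg_rating` property are hypothetical names chosen for this example; the actual graph schema in recommender_graph.py may differ.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_movie_query(
    genres: Optional[List[str]] = None,
    min_rating: Optional[float] = None,
    limit: int = 10,
) -> Tuple[str, Dict[str, Any]]:
    """Build a parameterized Cypher query from optional user filters.

    Labels, relationship type, and property names are hypothetical.
    """
    match = "MATCH (m:Movie)"
    conditions: List[str] = []
    params: Dict[str, Any] = {"limit": int(limit)}

    if genres:
        # Filter through the genre relationship; values travel as parameters.
        match = "MATCH (m:Movie)-[:IN_GENRE]->(g:Genre)"
        conditions.append("g.name IN $genres")
        params["genres"] = list(genres)
    if min_rating is not None:
        conditions.append("m.avg_rating >= $min_rating")
        params["min_rating"] = float(min_rating)

    clauses = [match]
    if conditions:
        clauses.append("WHERE " + " AND ".join(conditions))
    clauses.append("RETURN DISTINCT m.title AS title, m.avg_rating AS rating")
    clauses.append("ORDER BY rating DESC")
    clauses.append("LIMIT $limit")
    return "\n".join(clauses), params


query, params = build_movie_query(genres=["Action", "Adventure"], min_rating=3.5, limit=5)
print(query)
print(params)
```

Passing user input as query parameters rather than pasting it into the query string avoids injection problems and lets the database reuse the query plan. With the official Neo4j Python driver, the pair could be executed with something like `session.run(query, params)`.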

Cold Start Problem Handling: The user interacts with the system through the command line, inputting data and receiving recommendations.

Fuzzy Matching for Movie Titles: To handle partial or imprecise movie title inputs, the script employs a fuzzy matching technique.
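The sketch below shows the general idea of mapping an imprecise title to a catalog entry. It uses Python's standard-library `difflib` as a stand-in, not the fuzzy matching library used in the project, and the catalog and the 0.6 cutoff are assumptions for illustration.

```python
import difflib
from typing import List, Optional


def match_title(user_input: str, titles: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the catalog title closest to user_input, or None if nothing is close."""
    # Compare case-insensitively, but return the original spelling.
    lowered = {t.lower(): t for t in titles}
    hits = difflib.get_close_matches(user_input.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[hits[0]] if hits else None


# Hypothetical catalog for the example.
catalog = ["Iron Man", "Iron Man 2", "The Avengers", "Aquaman and the Lost Kingdom"]
print(match_title("iron mann", catalog))
print(match_title("avengrs", catalog))
print(match_title("zzzz", catalog))
```

Returning `None` below the cutoff lets the caller ask the user to retype the title instead of recommending from a poor match.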

Cosine Similarity for User Similarity: Implemented with the Pearson correlation coefficient, which is the cosine similarity of mean-centered vectors. The Pearson correlation coefficient is used to calculate the similarity between different movies. Each movie is represented as a vector of pre-collected user ratings. For each movie, the correlation coefficients between its rating vector and the vectors of the other movies are computed and sorted. The recommended movies are those with the largest correlation coefficients.
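As a rough illustration of this step, the following sketch computes the Pearson correlation between movie rating vectors and ranks the other movies by it. The rating data and the function names are hypothetical, and the project's own code may organize the computation differently (for example with Pandas).

```python
import math
from typing import Dict, List, Tuple

# Hypothetical toy data: ratings[movie][user] = stars given by that user.
ratings: Dict[str, Dict[str, float]] = {
    "Iron Man":     {"u1": 5, "u2": 4, "u3": 1, "u4": 2},
    "Iron Man 2":   {"u1": 4, "u2": 5, "u3": 2, "u4": 1},
    "The Notebook": {"u1": 1, "u2": 2, "u3": 5, "u4": 4},
}


def pearson(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Pearson correlation over the users who rated both movies."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    xs = [a[u] for u in common]
    ys = [b[u] for u in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant rating vector has no defined correlation
    return cov / (sx * sy)


def most_similar(title: str, k: int = 2) -> List[Tuple[str, float]]:
    """Rank every other movie by its correlation with the given movie."""
    scores = [(other, pearson(ratings[title], ratings[other]))
              for other in ratings if other != title]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]


print(most_similar("Iron Man"))
```

Restricting the computation to users who rated both movies keeps missing ratings from being treated as zeros.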

Collaborative Filtering for Recommendations: A collaborative filtering recommender based on user behavior. It analyzes user ratings, finds users with similar movie rating habits, and recommends movies based on the preferences of those similar users.
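A minimal sketch of user-based collaborative filtering along these lines is shown below. The data, the use of cosine similarity between users, and the similarity-weighted sum used for scoring are assumptions for the example, not necessarily what the project's code does.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical toy data: user_ratings[user][movie] = stars.
user_ratings: Dict[str, Dict[str, float]] = {
    "alice": {"Iron Man": 5, "The Avengers": 5, "Titanic": 1},
    "bob":   {"Iron Man": 5, "The Avengers": 4, "Thor": 5, "Titanic": 1},
    "carol": {"Iron Man": 1, "Titanic": 5, "The Notebook": 5},
}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two users' rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[m] * b[m] for m in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user: str, k: int = 3) -> List[Tuple[str, float]]:
    """Score unseen movies by the similarity-weighted ratings of other users."""
    seen = user_ratings[user]
    scores: Dict[str, float] = {}
    for other, theirs in user_ratings.items():
        if other == user:
            continue
        sim = cosine(seen, theirs)
        if sim <= 0:
            continue
        for movie, stars in theirs.items():
            if movie not in seen:
                # Ratings from more similar users count for more.
                scores[movie] = scores.get(movie, 0.0) + sim * stars
    return sorted(scores.items(), key=lambda s: s[1], reverse=True)[:k]


print(recommend("alice"))
```

In this toy data bob rates much like alice, so his unseen favorite outranks the movie that only the dissimilar user carol liked.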


Old Version:

Scikit-surprise or scikit-learn: A Python library we used to build and analyze recommender systems. It provides efficient collaborative filtering algorithms, including user-based collaborative filtering, item-based collaborative filtering, and matrix factorization algorithms.

SVD (Singular Value Decomposition / matrix factorization): A powerful matrix factorization technique used for collaborative filtering. This algorithm identifies latent features by decomposing the user-item rating matrix.
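The following sketch shows the latent-factor idea with a small matrix factorization trained by stochastic gradient descent, the style that recommender libraries often label "SVD". It is not the library implementation used in the old version; the ratings, factor count, learning rate, and regularization values are all assumptions for illustration.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]

# Hypothetical toy ratings: (user index, movie index, stars).
observed: List[Tuple[int, int, float]] = [
    (0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
    (2, 1, 1), (2, 2, 5), (3, 0, 5), (3, 2, 2),
]
N_USERS, N_ITEMS, N_FACTORS = 4, 3, 2


def init_factors(seed: int = 0) -> Tuple[Matrix, Matrix]:
    """Small random user factors P and item factors Q."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(N_FACTORS)] for _ in range(N_USERS)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(N_FACTORS)] for _ in range(N_ITEMS)]
    return P, Q


def predict(P: Matrix, Q: Matrix, u: int, i: int) -> float:
    """Predicted rating is the dot product of user and item factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def rmse(P: Matrix, Q: Matrix) -> float:
    """Root mean squared error over the observed ratings."""
    sq = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in observed)
    return (sq / len(observed)) ** 0.5


def train(epochs: int = 300, lr: float = 0.02, reg: float = 0.02, seed: int = 0) -> Tuple[Matrix, Matrix]:
    """Fit P and Q to the observed ratings with regularized SGD."""
    P, Q = init_factors(seed)
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - predict(P, Q, u, i)
            for f in range(N_FACTORS):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


P, Q = train()
print("RMSE before/after:", round(rmse(*init_factors()), 3), round(rmse(P, Q), 3))
# User 0 never rated movie 2; the dot product gives the model's estimate.
print("Estimated rating:", round(predict(P, Q, 0, 2), 2))
```

Only observed ratings contribute to the updates, which is what lets the learned factors fill in the missing cells of the user-item matrix.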

Movie Recommendation System Flowchart:


flowchart-recommender module training

The matrix factorization illustration:

(Matrix image sourced from Buomsoo-kim) matrix picture

Method: A movie matrix is assembled from the collected data. Each column of the matrix represents the ratings that all reviewers gave to one movie. For each column, the correlation coefficients with all other columns are calculated, and the columns with the highest coefficients are recorded. The movies represented by these columns are taken as the recommended movies.

matrix formular

  • User Matrix: X = (x1, x2, x3…, xn)
  • Item matrix: Y = (y1, y2, y3…, ym)

Evaluation Metrics :

  • Personalized Picks: Suggesting 5 movies tailored to individual user preferences.
  • Related Discoveries: Presenting 4 related movies based on user input, using advanced filtering methods.
  • Trending Now: Showcasing the top 3 trending movies to keep users engaged with popular content.

Part of Output Test

The interactive interface lets a user enter any movie title to get recommendations. The fuzzywuzzy module maps the user input to one of the movies in MovieMat, and the interface then shows the recommended movies. A sample input/output result is shown below.

Relevance Recommendation

Evaluation metrics:

Old Version

  • MSE: The average squared difference between the predicted and actual values.
  • RMSE: The square root of the mean squared error (MSE).
  • Precision : True Positive / (True Positive + False Positive)
  • Recall: True Positive / (True Positive + False Negative)
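The four metrics above can be sketched in a few lines. The function names and the sample values are hypothetical, and "relevant" here simply means the set of movies the user actually liked.

```python
import math
from typing import List, Sequence, Set, Tuple


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average squared difference between predicted and actual ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Square root of the MSE, in the same units as the ratings."""
    return math.sqrt(mse(actual, predicted))


def precision_recall(recommended: List[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    true_positives = len(set(recommended) & relevant)
    precision = true_positives / len(set(recommended)) if recommended else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


print(mse([5, 3, 4], [4, 3, 2]), rmse([5, 3, 4], [4, 3, 2]))
print(precision_recall(["A", "B", "C", "D"], {"A", "C", "E"}))
```

MSE and RMSE measure how close predicted ratings are to the real ones, while precision and recall measure how well a recommended list overlaps with what the user liked.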

Latest Demo

Create New User 999111 Graph_recommender_demo1

Cold Start For New User Graph_recommender_demo2

Graph Representation For User 999111 Graph_recommender_demo2

Frontend

Description: The frontend mainly includes the following pages:

  • User Sign In/Sign Up/Forget Password
  • Home page displaying top rated movies, recommended movies and providing searching functionality
  • Single movie page displaying the basic information of the movie, reviews (0-5 stars), and related movies

Tech Stacks:

  • React.js
  • Tailwind CSS

Preview: image9 image2 image6

Backend

Features:

  • User sign up, user sign in, email verification, reset password
  • Send movie information to the frontend (user ID, review, recommendation list, popular movie)
  • Send search engine data to the frontend

Tech Stacks:

  • Node.js
  • Express.js

Testing with Postman

Create User:

Screenshot 2023-11-23 at 5 18 41 PM

Mailtrap—Email Verification:

Screenshot 2023-11-23 at 5 19 16 PM

Get User ID from MongoDB:

Screenshot 2023-11-23 at 5 20 15 PM

Email Verification:

Screenshot 2023-11-23 at 5 20 26 PM

Get List of Top Rated Movies:

Screenshot 2023-12-12 at 7 44 20 PM

Get List of Related Movies based on a Movie Title:

Screenshot 2023-12-12 at 7 44 42 PM

Get Movie Rating:

Screenshot 2023-12-12 at 7 44 53 PM

Get Search Engine Results:

Screenshot 2023-12-12 at 7 45 09 PM

Get List of Movie Recommendation for a User:

Screenshot 2023-12-12 at 7 45 21 PM

Database used

  • MongoDB
  • Neo4j Graph Database

Things Stored in Database

  • User information
  • Movie reviews
  • Pre-trained result for movie recommendation

Movie Detail API

Request movie details (movie title, movie overview, movie poster, etc.) from TMDB.

Example Response

{
  adult: false,
  backdrop_path: '/bckxSN9ueOgm0gJpVJmPQrecWul.jpg',
  genre_ids: [ 28, 12, 14 ],
  id: 572802,
  original_language: 'en',
  original_title: 'Aquaman and the Lost Kingdom',
  overview: "Black Manta, still driven by the need to avenge his father's death and wielding the power of the mythic Black Trident, will stop at nothing to take Aquaman down once and for all. To defeat him, Aquaman must turn to his imprisoned brother Orm, the former King of Atlantis, to forge an unlikely alliance in order to save the world from irreversible destruction.",
  popularity: 253.712,
  poster_path: '/8xV47NDrjdZDpkVcCFqkdHa3T0C.jpg',
  release_date: '2023-12-20',
  title: 'Aquaman and the Lost Kingdom',
  video: false,
  vote_average: 0,
  vote_count: 0
}
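Since a response like the one above may be missing some fields for certain movies, a consumer can read it defensively. The sketch below is one possible way to do that in Python; the `summarize_movie` helper is hypothetical, and the image base URL with the `w500` size segment is an assumption about how TMDB poster paths are turned into full URLs.

```python
from typing import Any, Dict, Optional

# Assumption: TMDB poster paths are appended to this base URL (size w500).
IMAGE_BASE = "https://image.tmdb.org/t/p/w500"


def summarize_movie(movie: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Pick the fields a movie page needs, with fallbacks for missing data."""
    poster_path = movie.get("poster_path")
    release_date = movie.get("release_date") or ""
    return {
        "title": movie.get("title") or movie.get("original_title") or "Unknown title",
        "overview": movie.get("overview") or "No overview available.",
        "release_year": release_date[:4] or None,
        "poster_url": IMAGE_BASE + poster_path if poster_path else None,
    }


sample = {
    "id": 572802,
    "title": "Aquaman and the Lost Kingdom",
    "poster_path": "/8xV47NDrjdZDpkVcCFqkdHa3T0C.jpg",
    "release_date": "2023-12-20",
}
print(summarize_movie(sample))
print(summarize_movie({}))  # every field falls back to a default
```

Falling back to defaults keeps a movie page renderable even when the API returns no poster or overview for a title.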

Database presentation

Display the graph database interface graph_database_presentation

Present the relationship network by movie keywords graph_database_keyword_presentation

Show the relationship network by movie producers graph_database_productor

Final Demo

Home page

Home_wo_login

Sign up

Sign_up

Search result

The result after we input "iron man". Search_result

Single movie page

We can see the related movies provided by the recommendation system. Single_movie

Rate movie

Rate_movie

Recommended movies

We rated Iron Man 3, Iron Man 2 and The Avengers 5 stars. The recommendation system gave us other related sci-fi movies. Recommendation