This project is a standalone backend for sentiment analysis of video and audio files. It processes both complete files and segmented audio from one-sided or multi-speaker videos. For each audio or video segment, the system returns the transcription and its corresponding sentiment, keyed by timestamp.
Developed as one of the main RuxAiLab projects for GSoC '24, this tool is integrated into the RuxAiLab project through pull requests to RuxAiLab. The pipeline involves:
- Input: Video or audio files are uploaded to the system.
- Pipeline Modules:
  - Audio Extraction: Extract audio from video files using MoviePy.
  - Transcription: Transcribe the audio using Whisper.
  - Sentiment Analysis: Analyze the transcription for sentiment using RoBERTa. Sentiment labels are POS (Positive), NEU (Neutral), and NEG (Negative).
- Output: Generate a detailed report that includes the transcription along with sentiment analysis for each timestamp.
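The last two pipeline stages can be pictured with a minimal sketch like the one below. It is not the project's implementation: `Segment`, `ReportRow`, `build_report`, and `keyword_classifier` are hypothetical names, and the toy keyword classifier only stands in for the RoBERTa model so the example runs without any model download. It assumes transcription has already produced timestamped text segments.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Labels as named in the pipeline description above.
LABELS = ("POS", "NEU", "NEG")


@dataclass
class Segment:
    """One transcribed chunk of audio (hypothetical shape)."""
    start: float  # seconds from the beginning of the audio
    end: float
    text: str


@dataclass
class ReportRow:
    """One line of the output report: timestamps, text, and sentiment."""
    start: float
    end: float
    text: str
    label: str


def build_report(
    segments: Iterable[Segment],
    classify: Callable[[str], str],
) -> List[ReportRow]:
    """Attach a sentiment label to every timestamped segment."""
    rows = []
    for seg in segments:
        label = classify(seg.text)
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        rows.append(ReportRow(seg.start, seg.end, seg.text, label))
    return rows


def keyword_classifier(text: str) -> str:
    """Toy stand-in for the sentiment model, for demonstration only."""
    lowered = text.lower()
    if "great" in lowered:
        return "POS"
    if "terrible" in lowered:
        return "NEG"
    return "NEU"


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.5, "This feature is great."),
        Segment(2.5, 5.0, "The menu has three items."),
    ]
    for row in build_report(demo, keyword_classifier):
        print(f"[{row.start:.1f}-{row.end:.1f}] {row.label}: {row.text}")
```

Passing the classifier in as a callable keeps the report logic independent of the model, so a real model wrapper could be swapped in without touching `build_report`.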
This project aims to extend RuxAiLab's capabilities by giving its users transcriptions together with detailed sentiment insights, making uploaded recordings easier to understand and analyze.
Here is the folder structure of the Sentiment Analysis API project:
sentiment-analysis-api/
├── app/
│ ├── __init__.py # Initializes the app and its components
│ ├── config.py # Configuration for app settings
│ ├── data/ # Data handling module
│ ├── models/ # Contains the models for sentiment analysis
│ ├── routes/ # Defines the routes for the API
│ ├── services/ # Contains the business logic services
│ └── ... # Additional app files
├── docs/ # Documentation files
├── docker-compose.yml # Defines Docker Compose configuration
├── Dockerfile # Defines Docker container setup
├── .env # Environment variables
├── pytest.ini # Pytest configuration
├── README.md # Project documentation
├── CONTRIBUTING.md # Contribution guidelines
├── requirements.txt # Lists required Python dependencies
├── run.py # Entry point to run the app
├── samples/ # Sample input files for testing
├── static/ # Static files (e.g., images)
├── tests/ # Contains unit and integration tests
│ ├── coverage/ # Coverage reports
│ ├── integration/ # Integration tests
│ ├── unit/ # Unit tests
│ └── ... # Other test files
└── ... # Other files
This structure helps separate the application logic, configuration files, test files, and Docker-related configurations.
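To illustrate the kind of separation this layout suggests, here is a small sketch of a data layer, a service layer, and a route-style handler. Every class and function name is hypothetical and chosen for illustration only; the actual modules under `app/data/`, `app/services/`, and `app/routes/` may be organized differently, and the handler is a plain function rather than a real Flask route.

```python
from typing import Dict


class TranscriptStore:
    """Data layer (think app/data/): hides where transcripts come from."""

    def __init__(self, transcripts: Dict[str, str]) -> None:
        self._transcripts = transcripts

    def get(self, file_id: str) -> str:
        # Raises KeyError when the file is unknown.
        return self._transcripts[file_id]


class TranscriptService:
    """Service layer (think app/services/): business logic only."""

    def __init__(self, store: TranscriptStore) -> None:
        self._store = store

    def word_count(self, file_id: str) -> int:
        return len(self._store.get(file_id).split())


def handle_word_count(service: TranscriptService, file_id: str) -> dict:
    """Route layer (think app/routes/): turns a result into a response body."""
    try:
        return {"status": "success", "words": service.word_count(file_id)}
    except KeyError:
        return {"status": "error", "message": f"unknown file: {file_id}"}
```

Because each layer only talks to the one below it, each can be tested on its own with a stub in place of its dependency.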
- Clone the Repository
~$ git clone https://github.com/ruxailab/sentiment-analysis-api.git
~$ cd sentiment-analysis-api
- Create the Virtual Environment
- Linux
~/sentiment-analysis-api$ python3 -m venv env
- Windows
python -m venv env
- Activate the virtual environment
- Linux
~/sentiment-analysis-api$ source env/bin/activate
- Windows
env\Scripts\activate
- Install pip Dependencies
(env) ~/sentiment-analysis-api$ pip install -r requirements.txt
- Install FFmpeg
- Linux
~$ sudo apt-get -y update && sudo apt-get -y upgrade && sudo apt-get install -y --no-install-recommends ffmpeg
- Windows (Win10)
- Follow Tutorial here
- Add FFmpeg to the system path.
- Run Flask App
- In Debug Mode [port 8001]
~/sentiment-analysis-api$ python3 -m run
- Run API Documentation
- Access the API documentation at: http://localhost:8001/apidocs
- Install Docker
~$ sudo apt install docker.io
~$ docker --version ## Docker version 26.1.3, build 26.1.3-0ubuntu1~22.04.1
- Clone the Repository
~$ git clone https://github.com/ruxailab/sentiment-analysis-api.git
~$ cd sentiment-analysis-api
- Build Image
~/sentiment-analysis-api$ docker build -t sentiment_analysis_api .
- Start Docker Container (Port 8001)
~/sentiment-analysis-api$ docker run --name sentiment_analysis_api_app -p 8001:8001 -v ./:/sentiment_analysis_app sentiment_analysis_api
- Run API Documentation
- Access the API documentation at: http://localhost:8001/apidocs
- Install Docker Compose
~$ sudo apt install docker-compose-v2
~$ docker compose version ## Docker Compose version 2.27.1+ds1-0ubuntu1~22.04.1
- Clone the Repository
~$ git clone https://github.com/ruxailab/sentiment-analysis-api.git
~$ cd sentiment-analysis-api
- Build Image Using Docker Compose
~/sentiment-analysis-api$ docker compose build
- Start Docker Container
~/sentiment-analysis-api$ docker compose up
- Run API Documentation
- Access the API documentation at: http://localhost:8001/apidocs
You can access the API documentation at http://localhost:8001/apidocs after running the Flask App.
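As a quick way to confirm the app is up before opening the browser, a small standard-library check along these lines could be used. The helper names `docs_url` and `is_reachable` are hypothetical and not part of the project; only the host, port, and `/apidocs` path come from this README.

```python
import http.client
import urllib.error
import urllib.request


def docs_url(host: str = "localhost", port: int = 8001) -> str:
    """Build the documentation URL given in this README."""
    return f"http://{host}:{port}/apidocs"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP 4xx/5xx, or a malformed reply.
        return False


if __name__ == "__main__":
    url = docs_url()
    print(url, "is up" if is_reachable(url) else "is not reachable")
```

If the check reports that the URL is not reachable, make sure the Flask app or the Docker container is running and that port 8001 is published.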
For testing the API endpoints, you can use the following Postman collection:
- Unit Tests
- Run Data Layer unit tests using the following command:
~/sentiment-analysis-api$ coverage run -m pytest ./tests/unit/test_data/
- Run Service Layer unit tests using the following command:
~/sentiment-analysis-api$ coverage run -m pytest ./tests/unit/test_service/
- Run API Layer unit tests using the following command:
~/sentiment-analysis-api$ coverage run -m pytest ./tests/unit/test_routes/
- Run all the unit tests using the following command:
~/sentiment-analysis-api$ coverage run -m pytest ./tests/unit/
- Integration Tests
- Run the integration tests using the following command:
~/sentiment-analysis-api$ coverage run -m pytest ./tests/integration/
- Run all the tests
- Run all the tests using the following command:
~/sentiment-analysis-api$ coverage run -m pytest ./tests/
- View the Coverage Report
- View the coverage report using the following command:
~/sentiment-analysis-api$ coverage report
- View the coverage report in HTML format using the following command:
~/sentiment-analysis-api$ coverage html
- Open the HTML file in the browser:
~/sentiment-analysis-api$ firefox htmlcov/index.html
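For orientation, a unit test that isolates one layer from the one beneath it might look like the sketch below. The function `transcript_length` and the `transcribe` method are hypothetical and do not come from this repository; the point is only to show how a stub from `unittest.mock` lets a service-level test run without a model or a media file. pytest collects any function whose name starts with `test_`.

```python
from unittest.mock import Mock


def transcript_length(data_layer, file_path: str) -> int:
    """Hypothetical service function: number of characters in a transcript."""
    text = data_layer.transcribe(file_path)
    return len(text)


def test_transcript_length_uses_data_layer():
    # Replace the data layer with a stub so no model or file is needed.
    fake_data_layer = Mock()
    fake_data_layer.transcribe.return_value = "hello"

    assert transcript_length(fake_data_layer, "sample.mp3") == 5
    # The service should have asked the data layer exactly once.
    fake_data_layer.transcribe.assert_called_once_with("sample.mp3")
```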
This repository is part of the Google Summer of Code (GSoC) 2024 program.
- Contributor: Basma Elhoseny
- Mentors: Karine, Marc, Vinícius, Murilo
- GSoC'24 Project Page: Sentiment Analysis API Project GSoC 24 Program
- Progress Tracking Docs: GSOC'24 Project Progress and Follow Up Sheet
- Meetings Presentations: Slides
- Main Repository for the Project: sentiment-analysis-api Repo
- Integration PRs to RUXAILAB:
- Wikis:
This software is licensed under the MIT License. See the LICENSE file for more information.