Sentiment-Analysis-Project

40.220 The Analytics Edge Twitter Sentiment Analysis Kaggle Competition

Details

Project Title: Twitter Sentiment Analysis Kaggle Competition

Description: The task is to develop an algorithm that determines what sort of weather the tweets reference. Specifically, the challenge is to determine whether a tweet has a negative, neutral, or positive sentiment. The following datasets are provided:

• train.csv: 22,500 tweets with the corresponding classification / sentiment. The integers 1, 2, and 3 indicate negative, neutral, and positive sentiment, respectively.

• test.csv: 7,500 tweets. Naturally, this dataset has no labels. It will be used to quantify the performance of the algorithms.
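The label encoding described above can be sketched in a few lines of Python. This is only an illustration of the 1/2/3 convention, not code from the project: the names `SENTIMENT_LABELS`, `decode_labels`, and `class_distribution` are hypothetical, and the sample codes are made up rather than taken from train.csv.

```python
from collections import Counter

# Encoding stated in the dataset description: 1 = negative, 2 = neutral, 3 = positive.
SENTIMENT_LABELS = {1: "negative", 2: "neutral", 3: "positive"}


def decode_labels(codes):
    """Translate integer sentiment codes into readable class names."""
    return [SENTIMENT_LABELS[int(code)] for code in codes]


def class_distribution(codes):
    """Return the share of each sentiment class among the given codes."""
    names = decode_labels(codes)
    counts = Counter(names)
    total = len(names)
    return {name: counts[name] / total for name in SENTIMENT_LABELS.values()}


# Made-up codes for illustration only (not real competition data).
sample_codes = [1, 1, 2, 3]
print(decode_labels(sample_codes))
print(class_distribution(sample_codes))
```

Checking the class distribution before training is a common first step, since a heavily imbalanced dataset makes raw accuracy harder to interpret.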

The algorithms are evaluated on how accurately they classify the sentiment of each tweet in the test dataset. The evaluation metric is accuracy, defined as the ratio of correctly classified samples to the total number of samples. Kaggle calculates accuracy on two subsets of the test dataset, named public and private. Results on the public subset are available during the competition (public leaderboard), while results on the private subset are released at the end of the competition (private leaderboard).
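As a minimal sketch of the accuracy metric defined above (correctly classified samples divided by total samples), assuming predictions and true labels are available as equal-length lists. The function name `accuracy` and the example labels are hypothetical; Kaggle computes the real score on its own hidden labels.

```python
def accuracy(predicted, actual):
    """Ratio of correctly classified samples to the total number of samples."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels using the 1/2/3 encoding (negative/neutral/positive).
actual = [1, 2, 3, 3, 2]
predicted = [1, 2, 3, 1, 2]
print(accuracy(predicted, actual))  # 4 of 5 correct -> 0.8
```

Because the test labels are hidden, a score like this can only be computed locally on a held-out portion of train.csv.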

Team 7 Members

Lee Min Shuen (1004244)
Sim Wei Xuan, Samuel (1004657)
Muhammad Hazwan Bin Mohamed Hafiz (1004122)

Dependencies

R version 4.0.4
Python 3.8 (used through reticulate for R-Python interoperability with the Keras library)
Keras 2.7.0
TensorFlow 2.4.0

Note: when reticulate is loaded in R, rminiconda is installed by default to provide a Python environment. Follow the instructions below to install Keras and TensorFlow in the r-reticulate conda environment.

In conda prompt, type:

$ conda activate r-reticulate
$ pip install tensorflow==2.4.0
$ pip install keras

If Python is already installed on the system, you can force reticulate to use that installation by setting its path.

In R console, type:

reticulate::use_python("path to python.exe", required = TRUE)

Then, to install TensorFlow and Keras, type in the command prompt:

$ pip install tensorflow==2.4.0
$ pip install keras
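After installing, it can help to confirm which versions the Python environment actually sees. The snippet below is a small sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helper name `installed_version` is hypothetical, and the snippet assumes it is run with the same Python interpreter that reticulate uses.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a pip package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as used in the pip commands above.
for package in ("tensorflow", "keras"):
    found = installed_version(package)
    print(f"{package}: {found if found is not None else 'not installed'}")
```

If a package is reported as not installed, the pip commands were most likely run in a different environment from the one being checked.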

How to Run

We have already created word embeddings specific to our dataset from pre-trained word embeddings, using Word Embedding Weights/create_word_embeddings.R.
For model training, open Final_Submission.Rmd, which contains all our steps.
For viewing, open Final_Submission.html, a readable HTML version of the same notebook.

Directories

├───Data
└───Word Embedding Weights
    └───Pre-Trained Weights
