Project 1 - Analyse Twitter reaction vs CDC surveillance report on influenza activity in US States. Repo..
The project fetches tweets about influenza and plots a heat map to compare Twitter activity with CDC-reported influenza activity in the affected states. It is implemented entirely in R, using twitteR and a geocode API to collect tweets.
Project 2 - Sentiment Analysis on Gun Violence using Hadoop. Repo..
Performed sentiment analysis of public opinion on gun violence using Twitter data and compared the results with NYTimes articles. Hadoop is used to compute word counts and the co-occurrence of top words in the two data sets. I used d3 for the word cloud and Python to implement the mapper and reducer for the Hadoop framework.
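As an illustration of the mapper and reducer step, here is a minimal Python sketch in the Hadoop Streaming style. It is not the project's actual code: the function names (`tokenize`, `map_word_count`, `map_cooccurrence`, `reduce_counts`, `run`), the tokenizing regex, and the choice to count co-occurrence per line (one tweet or article sentence per line) are all assumptions made for the example.

```python
import itertools
import re
import sys

# Assumption: words are runs of lowercase letters and apostrophes.
TOKEN = re.compile(r"[a-z']+")


def tokenize(line):
    """Lowercase a line and split it into word tokens."""
    return TOKEN.findall(line.lower())


def map_word_count(lines):
    """Mapper: emit 'word<TAB>1' for every token."""
    for line in lines:
        for word in tokenize(line):
            yield f"{word}\t1"


def map_cooccurrence(lines):
    """Mapper: emit 'w1,w2<TAB>1' for each unordered pair of distinct
    words that appear on the same line (assumed co-occurrence window)."""
    for line in lines:
        words = sorted(set(tokenize(line)))
        for first, second in itertools.combinations(words, 2):
            yield f"{first},{second}\t1"


def reduce_counts(sorted_lines):
    """Reducer: sum the counts of consecutive identical keys.
    Hadoop sorts mapper output by key before it reaches the reducer."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in itertools.groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


def run(stage, stdin=sys.stdin, stdout=sys.stdout):
    """Wire one stage to stdin/stdout, as a streaming script would."""
    stages = {
        "map_count": map_word_count,
        "map_pairs": map_cooccurrence,
        "reduce": reduce_counts,
    }
    for record in stages[stage](stdin):
        stdout.write(record + "\n")
```

In a streaming job, a small script would call `run("map_count")` or `run("reduce")` depending on its role. Locally, the same flow can be simulated by sorting the mapper output before passing it to `reduce_counts`, which stands in for Hadoop's shuffle and sort phase.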
Project 3 - Document Classification using Spark Infrastructure. Repo..
News articles fall into different categories such as sports and business. This project uses Spark infrastructure with machine learning to predict the category of an article. The first step is to train the model on the training set, then test it, and finally predict the categories of an unknown set of articles and evaluate the performance of the trained model.
DocumentClassification Python Code
Prediction Result: