The author implemented logistic regression for topic labelling using two feature extraction methods, Bag-of-Words (CountVectorizer) and TF-IDF (TfidfVectorizer), and analysed the results for both. Both methods achieved an accuracy of 96%.
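
As an illustration of this setup, the following is a minimal sketch of logistic regression paired with each of the two vectorizers in scikit-learn. The toy corpus, the labels, the helper name `evaluate_logreg` and the `max_iter` setting are assumptions made for the example; they are not the author's dataset or configuration, and the scores it prints say nothing about the 96% reported above.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus; the author's dataset is not reproduced here.
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "the government passed a new tax law",
    "parliament voted on the budget bill",
]
train_labels = ["sport", "sport", "politics", "politics"]
test_texts = ["the goal decided the match", "the tax bill was voted down"]
test_labels = ["sport", "politics"]


def evaluate_logreg(vectorizer):
    """Fit vectorizer + logistic regression, return accuracy on the test texts."""
    # max_iter=1000 is an assumed setting, chosen only to avoid convergence warnings.
    model = make_pipeline(vectorizer, LogisticRegression(max_iter=1000))
    model.fit(train_texts, train_labels)
    return model.score(test_texts, test_labels)


# Same classifier, two feature extraction methods.
logreg_scores = {
    "bag_of_words": evaluate_logreg(CountVectorizer()),
    "tfidf": evaluate_logreg(TfidfVectorizer()),
}
print(logreg_scores)
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned from the training texts only, so the test texts do not leak into feature extraction.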
The author improved on the previous result by implementing a different machine learning classifier, a support vector machine, using the same two feature extraction methods, Bag-of-Words (CountVectorizer) and TF-IDF (TfidfVectorizer). The results will be analysed and discussed.
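
A support vector machine can be combined with both vectorizers in the same way. The sketch below assumes a linear kernel (`SVC(kernel="linear")`), since the kernel used by the author is not stated; the toy corpus and the helper name `evaluate_svm` are likewise hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus; the author's dataset is not reproduced here.
train_texts = [
    "the team won the football match",
    "the striker scored a late goal",
    "the government passed a new tax law",
    "parliament voted on the budget bill",
]
train_labels = ["sport", "sport", "politics", "politics"]
test_texts = ["the goal decided the match", "the tax bill was voted down"]
test_labels = ["sport", "politics"]


def evaluate_svm(vectorizer):
    """Fit vectorizer + SVM, return accuracy on the test texts."""
    # The linear kernel is an assumption; the source does not name one.
    model = make_pipeline(vectorizer, SVC(kernel="linear"))
    model.fit(train_texts, train_labels)
    return model.score(test_texts, test_labels)


# Same classifier, two feature extraction methods.
svm_scores = {
    "bag_of_words": evaluate_svm(CountVectorizer()),
    "tfidf": evaluate_svm(TfidfVectorizer()),
}
print(svm_scores)
```

Linear kernels are a common starting point for sparse, high-dimensional text features, but other kernels can be substituted by changing the `kernel` argument.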
The author will further analyse and critically appraise the performance of the logistic regression and support vector machine methods for topic labelling using the same feature extraction methods, and discuss the suitability, advantages and drawbacks of these methods for text analysis.