Project information
- Title: Movie Reviews: sentiment analysis
- Category: Machine Learning
- Author: Andrea Alberti
- Project date: 2 May, 2023
- Project URL: Github-Sentiment_Analysis
Movie Review Sentiment Analysis
Developing a Multinomial Naive Bayesian classifier for sentiment prediction
Exploring vocabulary sizes and alternative approaches
Overview • Dataset • Approach • Results • Conclusion • License
Overview
This project aims to develop a Multinomial Naive Bayesian classifier that can accurately predict the sentiment of movie reviews based on their textual content. The available data comprises a collection of labeled movie reviews classified as positive or negative. The study explores various versions of the Multinomial Naive Bayesian model, including different vocabulary sizes, to evaluate the classifier's performance. Additionally, the project investigates the use of logistic regression as an alternative approach for predicting the reviews' sentiment.
Dataset
The dataset used in this project consists of movie reviews that are labeled as positive or negative, representing the sentiment associated with each review. The textual content of the reviews serves as the input data for training and evaluating the classifiers.
Approach
- Multinomial Naive Bayesian Classifier: The project implements a Multinomial Naive Bayesian classifier, a probabilistic model commonly used for text classification tasks. Different vocabulary sizes are explored to understand their impact on the classifier's performance.
- Logistic Regression: As an alternative approach, logistic regression is investigated for sentiment prediction. The model's performance is evaluated in comparison to the Multinomial Naive Bayesian classifier.
Results
The analysis of the results demonstrates that, despite its simplicity, the Multinomial Naive Bayesian classifier is a robust model for sentiment analysis tasks. The classifier achieves promising accuracy in predicting the sentiment of movie reviews. Furthermore, the logistic regression model also performs well with a relatively small training effort.
Conclusion
The project successfully develops a Multinomial Naive Bayesian classifier and explores its variations with different vocabulary sizes. The study also investigates logistic regression as an alternative approach for sentiment prediction. Both models show promising results in accurately predicting the sentiment of movie reviews.
The findings of this project are valuable for sentiment analysis tasks in the context of movie reviews and demonstrate the effectiveness of the Multinomial Naive Bayesian classifier as a simple yet powerful approach for such tasks.
License
This project is licensed under the MIT License. Feel free to use and modify the code as per the terms of the license.