Study of sampling methods in sentiment analysis of imbalanced data

06/12/2021
by Zeeshan Ali Sayyed, et al.

This work investigates the application of sampling methods to sentiment analysis on two highly imbalanced datasets. One dataset contains online user reviews from the cooking platform Epicurious, and the other contains comments addressed to the Planned Parenthood organization. In both datasets, the classes of interest are rare. Word n-grams are used as features. A feature selection technique based on information gain is first applied to reduce the feature space to a manageable size. Several sampling methods are then applied to mitigate the class imbalance, and their effects are analyzed.
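The pipeline described in the abstract (word n-gram features, information-gain feature selection, then resampling) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the abstract does not name the specific sampling methods, so random oversampling of the rare class stands in here as one common choice, and the toy texts, labels, and every function name below are hypothetical rather than taken from the paper.

```python
import math
import random
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return the set of word n-grams (n = 1..n_max) present in a text."""
    tokens = text.lower().split()
    return {" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, feature):
    """Entropy reduction obtained by splitting on presence of `feature`."""
    with_f = [y for d, y in zip(docs, labels) if feature in d]
    without_f = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = (len(with_f) / n) * entropy(with_f) \
        + (len(without_f) / n) * entropy(without_f)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Keep the k n-grams with the highest information gain (ties: A-Z)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab,
                    key=lambda f: (-information_gain(docs, labels, f), f))
    return ranked[:k]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest.

    Assumption: random oversampling is used only as an example of a
    sampling method; the abstract does not say which methods were studied.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    target = max(len(items) for items in by_class.values())
    out_rows, out_labels = [], []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for row in items + extra:
            out_rows.append(row)
            out_labels.append(y)
    return out_rows, out_labels


# Hypothetical toy data: one rare-class review among five others.
texts = ["great recipe loved it", "tasty and easy", "loved it",
         "nice dinner", "made me sick", "easy recipe"]
labels = ["other", "other", "other", "other", "rare", "other"]

docs = [word_ngrams(t) for t in texts]
selected = select_features(docs, labels, k=5)
# Represent each document by the selected n-grams it contains.
reduced = [sorted(d & set(selected)) for d in docs]
balanced_rows, balanced_labels = random_oversample(reduced, labels)
```

Feature selection runs before resampling in this sketch, matching the order given in the abstract. In an evaluation setting, resampling would normally be restricted to the training split so that duplicated examples do not leak into the test data.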
