According to Wikipedia, sentiment analysis (also known as opinion mining or emotion AI) refers to the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information.
How does Sentiment Analysis work?
A sentiment analyzer assigns a probabilistic value between 0 and 1 to the input text. There are two main approaches to sentiment analysis: knowledge-based techniques and statistical methods.
Knowledge-based techniques are the more intuitive of the two. They are built upon unequivocal vocabulary, plus vocabulary that has a high affinity with those unequivocal words. For example, we can consider the word "happy" to be unequivocally positive and therefore give it a value of 1. In this fashion we can compute a first sentiment analysis based on a predefined vocabulary list.
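The vocabulary-based idea can be sketched in a few lines of Python. This is only an illustration: apart from "happy" = 1, the word list and its scores are made up, and averaging the scores of the known words is one simple assumption about how to combine them.

```python
import re

# Hypothetical lexicon: 1.0 is unequivocally positive, 0.0 unequivocally negative.
# Only "happy" = 1.0 comes from the text; the other entries are invented.
LEXICON = {
    "happy": 1.0,
    "great": 0.9,
    "good": 0.8,
    "bad": 0.2,
    "awful": 0.1,
    "sad": 0.0,
}

def lexicon_score(text, lexicon=LEXICON, default=0.5):
    """Average the scores of the words that appear in the lexicon.

    Returns `default` (treated as neutral) when no word is in the lexicon.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return default
    return sum(scores) / len(scores)

print(lexicon_score("I am happy"))
print(lexicon_score("The plot was bad and the ending was sad"))
```

A real lexicon would be far larger, and a sketch like this ignores negation ("not happy") and word order entirely.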
Statistical methods are more advanced and leverage different machine learning techniques.
One of the best-known algorithms for bag-of-words sentiment analysis is Naive Bayes. In the last 10 years, statistical models built on bag-of-words features and on word embeddings such as Google’s word2vec have achieved over 90% accuracy on binary sentiment analysis tasks.
Currently, state-of-the-art sentiment analysis models are built upon the Transformer architecture: a layered encoder-decoder model that uses attention mechanisms to produce state-of-the-art results in many NLP tasks, such as sentiment analysis and question answering.
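To make the bag-of-words idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier written with the standard library only. The four training sentences, the function names, and the choice of add-one smoothing are assumptions made for illustration; a real system would use a large labeled corpus and a tested library.

```python
import math
from collections import Counter

def tokenize(text):
    # Bag of words: lowercase, split on whitespace, ignore word order.
    return text.lower().split()

def train_naive_bayes(examples):
    """examples: list of (text, label) pairs. Returns the fitted counts."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training documents
    vocabulary = set()
    for text, label in examples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocabulary.add(word)
    return word_counts, doc_counts, vocabulary

def positive_probability(text, model):
    """Return P("positive" | text); assumes a label named "positive" exists."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        log_scores[label] = score
    # Normalize the log scores into probabilities.
    top = max(log_scores.values())
    exp_scores = {label: math.exp(s - top) for label, s in log_scores.items()}
    return exp_scores["positive"] / sum(exp_scores.values())

# Made-up toy training set.
TRAIN = [
    ("a great and happy film", "positive"),
    ("good acting great story", "positive"),
    ("a bad and boring film", "negative"),
    ("awful acting bad story", "negative"),
]
model = train_naive_bayes(TRAIN)
print(positive_probability("great story", model))
print(positive_probability("bad film", model))
```

The output is a value between 0 and 1, matching the probabilistic score described earlier: values near 1 lean positive and values near 0 lean negative.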
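In an encoder-decoder model, the encoder is a stack of identical units that each take a single element of the input, collect the necessary information, and propagate it forward to the next unit. In earlier sequence-to-sequence models these units were recurrent networks (usually long short-term memory networks); in the Transformer they are layers that combine self-attention with a feed-forward network.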
A decoder is also a set of identical neural networks, but their task is to accept a hidden state from the previous unit, create a new hidden state, and return the output.
To this, we add a self-attention mechanism: a function that creates a query, a key, and a value for every element of the input sent to the model, so that the key-value pairs of all elements can be looked up with the query of any given word. This lets the model access the states of every word during processing, which in turn allows it to inspect all the related words in the input and calculate the attention that must be given to each of them.
As of today, the state-of-the-art models are based on Google’s BERT (Bidirectional Encoder Representations from Transformers).
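The attention computation can be sketched as scaled dot-product attention in plain Python. This is a simplified illustration, not a full Transformer layer: it reuses the toy embeddings directly as queries, keys, and values, whereas a real model first multiplies them by learned projection matrices. The embedding values are made up.

```python
import math

def softmax(values):
    # Subtract the maximum for numerical stability.
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns (outputs, weights): weights[i][j] is the attention that
    element i pays to element j.
    """
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Compare this query against every key, then normalize.
        scores = [dot(q, k) / scale for k in keys]
        weights = softmax(scores)
        # The output is the attention-weighted average of the values.
        output = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights

# Toy 2-dimensional embeddings for a three-word input (invented values).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(embeddings, embeddings, embeddings)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights sums to 1, so every word distributes a fixed budget of attention across all the words in the input, itself included.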
Binary sentiment analysis
A binary sentiment analysis establishes two categories, negative and positive, where 0 means absolutely negative and 1 absolutely positive. Sometimes a neutral category is added and applied to values with a probability between 0.4 and 0.6. Currently, it is possible to perform very accurate binary sentiment analysis: BERT-based models achieve over 95% accuracy on the IMDb and Stanford Sentiment Treebank binary classification datasets.
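Mapping a score to a category can be sketched as a small function. The function name is hypothetical, and treating the 0.4 and 0.6 boundaries as part of the neutral band is an assumption, since the text does not say which side they fall on.

```python
def label_from_probability(p, neutral_low=0.4, neutral_high=0.6):
    """Map a score in [0, 1] to "negative", "neutral", or "positive".

    Assumption: the boundaries 0.4 and 0.6 themselves count as neutral.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p < neutral_low:
        return "negative"
    if p > neutral_high:
        return "positive"
    return "neutral"

for score in (0.1, 0.5, 0.95):
    print(score, label_from_probability(score))
```

For a strictly binary setup, setting both thresholds to 0.5 makes the neutral band collapse to that single value.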
Multi-class sentiment analysis
Multi-class sentiment analysis, also known as fine-grained sentiment analysis, is still far from the levels of performance achieved on binary sentiment analysis. As the number of labels increases, so does the difficulty. Currently, state-of-the-art models fall short of 60% accuracy on the Stanford fine-grained classification dataset (SST-5). You can visit NLP-progress for more details on state-of-the-art sentiment analysis.
bag of words
Harvard’s annotated version of the “Attention Is All You Need” paper