Sentiment Analysis with Hugging Face

Hugging Face provides Transformers, an NLP library built on the deep learning models of the same name. We will use the library to perform sentiment analysis with just a few lines of code.

In this blog post, we will use a pre-trained, off-the-shelf model.

You can install the library using pip, which is similar to NuGet, npm, and Cargo.

How it Works
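As a minimal sketch of the idea, assuming the `transformers` package is installed, the library's `pipeline` function can build a ready-made sentiment classifier. The helper names `make_sentiment_pipeline`, `describe`, and `demo` are made up for this illustration; only `pipeline("sentiment-analysis")` comes from the library. The first call downloads a default pre-trained model, so it needs network access.

```python
def make_sentiment_pipeline():
    """Build the default sentiment-analysis pipeline (downloads a model on first use)."""
    # Imported lazily so the formatting helper below works without transformers.
    from transformers import pipeline
    return pipeline("sentiment-analysis")


def describe(result):
    """Format one pipeline result, a dict with 'label' and 'score' keys."""
    return f"{result['label']} ({result['score']:.2f})"


def demo():
    classifier = make_sentiment_pipeline()
    # The pipeline returns one dict per input text.
    for result in classifier(["I love this product!"]):
        print(describe(result))
```

Calling `demo()` prints the predicted label together with the model's confidence score for the sample sentence.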

Similarly, you can create a pipeline for other tasks:

  • Text generation (in English): provide a prompt, and the model will generate what follows.
  • Named entity recognition (NER): in an input sentence, label each word with the entity it represents (person, place, etc.)
  • Question answering: provide the model with some context and a question, and extract the answer from the context.
  • Filling masked text: given a text with masked words (e.g., replaced by [MASK]), fill the blanks.
  • Summarization: generate a summary of a long text.
  • Translation: translate a text into another language.
  • Feature extraction: return a tensor representation of the text.

Courtesy: Hugging Face
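Each of the tasks listed above corresponds to a task identifier that can be passed to `pipeline`. The sketch below maps the list to those identifiers; the dictionary `TASKS` and the helper `make_pipeline` are hypothetical names for this post, and the English-to-French pair is just one example of a translation task.

```python
# Task identifiers understood by transformers.pipeline for the tasks listed above.
TASKS = {
    "text generation": "text-generation",
    "named entity recognition": "ner",
    "question answering": "question-answering",
    "filling masked text": "fill-mask",
    "summarization": "summarization",
    "translation (English to French)": "translation_en_to_fr",
    "feature extraction": "feature-extraction",
}


def make_pipeline(task_name):
    """Create a pipeline for one of the tasks above (downloads a default model)."""
    # Imported lazily so the mapping can be inspected without transformers.
    from transformers import pipeline
    return pipeline(TASKS[task_name])
```

For example, `make_pipeline("summarization")` would return a summarizer that can be called on a long text.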

We are going to run our data through the model. Let us start with the data. You can find out how I got the data in this blog post.

To get a glimpse of the data

We will load the data using pandas. It contains two main fields: the review text, which we pulled from Amazon about a product, and the ratings, which are human-annotated.
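Loading could look roughly like this. The real file is not shown here, so this sketch reads a small in-memory stand-in; the column names `review` and `rating` and the sample rows are assumptions made for illustration, not the actual data.

```python
import io

import pandas as pd

# Stand-in for the real file; column names and rows are made up for this sketch.
SAMPLE_CSV = io.StringIO(
    "review,rating\n"
    "Works great and arrived early,5\n"
    "It is okay for the price,3\n"
    "Stopped working after a week,1\n"
)

# With a real file, pass its path to read_csv instead of the StringIO object.
df = pd.read_csv(SAMPLE_CSV)

# Show the first few rows to get a glimpse of the data.
print(df.head())
```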

Transformer Model-Based Sentiment Analysis

Once we have the label and score, we can compare them with the human-annotated ratings.
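One way to collect the label and score for every review is sketched below. The function `predict_all` is a hypothetical helper, and `fake_classifier` is a stub that only mimics the shape of a pipeline's output (a list of dicts with `label` and `score`) so the example runs without downloading a model. A real `pipeline("sentiment-analysis")` object could be passed in its place, since it also accepts a list of strings.

```python
def predict_all(texts, classify):
    """Run `classify` over the texts and return a list of (label, score) pairs."""
    results = classify(list(texts))
    return [(r["label"], r["score"]) for r in results]


def fake_classifier(texts):
    # Stub with the same output shape as a sentiment pipeline; not a real model.
    return [
        {"label": "POSITIVE" if "great" in t else "NEGATIVE", "score": 0.9}
        for t in texts
    ]


predictions = predict_all(["Works great", "Stopped working"], fake_classifier)
print(predictions)
```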

Results

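The model returns a label and a score, while the human annotations are ratings, so the two cannot be compared directly. To solve this impedance mismatch, let us find common ground by normalizing both the predicted and the human data to Negative (0), Neutral (1), and Positive (2).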

To normalize the predicted value, we use a function that takes the label and score and determines whether the review is negative, neutral, or positive.
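A minimal sketch of such a function follows. The rule used here is an assumption: a prediction whose score falls below a cutoff is treated as neutral, and otherwise the label decides. The cutoff value `0.75` and the name `normalize_prediction` are hypothetical choices, not necessarily the ones behind the results in this post.

```python
NEGATIVE, NEUTRAL, POSITIVE = 0, 1, 2

# Hypothetical confidence cutoff; tune it for your own data.
NEUTRAL_BELOW = 0.75


def normalize_prediction(label, score, neutral_below=NEUTRAL_BELOW):
    """Map a (label, score) pair to 0 (negative), 1 (neutral), or 2 (positive)."""
    # A low-confidence prediction is treated as neutral (assumed rule).
    if score < neutral_below:
        return NEUTRAL
    return POSITIVE if label.upper() == "POSITIVE" else NEGATIVE
```

Moving the cutoff up or down changes how many reviews land in the neutral bucket, so it is worth trying a few values.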

To normalize the target (human) data, we will follow a similar approach.
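As a sketch, assuming the human ratings are 1 to 5 stars (the rating scale is not stated in this post), the mapping could look like this. The bucket boundaries and the name `normalize_rating` are illustrative assumptions.

```python
def normalize_rating(stars):
    """Map a star rating (assumed 1 to 5) to 0 (negative), 1 (neutral), or 2 (positive)."""
    if stars <= 2:
        return 0  # 1 or 2 stars: negative
    if stars == 3:
        return 1  # 3 stars: neutral
    return 2      # 4 or 5 stars: positive


print([normalize_rating(s) for s in (1, 2, 3, 4, 5)])
```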

Visualize the Results

The output of this function looks like this.

Here it is in a much better visual form.

In the previous blog post, we saw that a dictionary-based sentiment analyzer achieved 41% accuracy. With the Transformer-based model, we get 63%.
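For reference, accuracy is the share of reviews where the normalized prediction matches the normalized human label. The sketch below computes it, along with a confusion matrix, in plain Python. The two label lists are made-up toy values, not the data behind the numbers above, and the function names are hypothetical.

```python
def confusion_matrix(actual, predicted, n_classes=3):
    """Rows are human labels, columns are model predictions."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def accuracy(actual, predicted):
    """Fraction of positions where the prediction equals the human label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Toy labels for illustration only: 0 = negative, 1 = neutral, 2 = positive.
human = [2, 2, 1, 0, 0]
model = [2, 1, 1, 0, 2]

for row in confusion_matrix(human, model):
    print(row)
print(accuracy(human, model))
```

The diagonal of the matrix counts the agreements, so accuracy is the diagonal sum divided by the number of reviews.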

In an upcoming blog post, we will fine-tune the model to improve its accuracy.
