
How to Build a Twitter Sentiment Analysis Tool

Hundreds of millions of people willingly spew their opinions in under 280 characters per post, 6,000 times per second. Sentiment analysis on social media platforms such as Twitter is an effective way for analysts to gauge consumer reactions to products and services, and machine learning is necessary to gather those reactions at scale. An individual sifting through tweets ranging from “Where the Buffalo Wild Things Are” to fiery, researched political statements can easily get lost in the noise of the Twitter void, especially when a business just wants to gauge attitudes toward a particular product or service.

Tweets are 280-character posts that users can share with their followers either publicly or privately. Because of this character limit, tweets are well suited to text mining. Sentiment analysis is the process of reading and understanding many tweets about a subject and extracting the target audience's general reaction to it. Organizations that can benefit from Twitter sentiment analysis projects range from political groups to consumer brands to celebrity influencers to public relations firms. This includes news and media sites, which can harness sentiment analysis to track evolving opinions on certain laws, movements, and emerging technology.

Social Media Sentiment Analysis

In the past, many companies used traditional business intelligence tools to monitor social media. This is inefficient, however, because traditional BI tools cannot perform true sentiment analysis, capture sarcasm, or learn new slang. Instead, you need an ML model that uses natural language processing to identify keywords and phrases while comprehending the negative, positive, or neutral tone of each tweet.
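To make the task concrete, here is a deliberately naive rule-based sketch of tweet classification. The word lists are invented for illustration; a real NLP model learns these associations from labeled data instead of relying on a fixed vocabulary, which is exactly why it can handle slang and context where a lookup table cannot.

```python
# Naive rule-based sentiment scoring (illustrative only -- a trained NLP
# model learns word associations instead of using a hand-picked list).
POSITIVE = {"love", "great", "amazing", "delicious", "best"}
NEGATIVE = {"hate", "awful", "terrible", "worst", "broken"}

def classify_tweet(text: str) -> str:
    """Label a tweet as positive, negative, or neutral by keyword counts."""
    words = {w.strip(".,!?#@").lower() for w in text.split()}
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify_tweet("I love these wings, best sauce ever!"))  # positive
```

A list like this immediately fails on sarcasm ("oh great, another outage") — the gap a learned model is meant to close.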

However, building an ML project from scratch is expensive and time-consuming. It takes a large, dedicated team of data scientists and analysts to build the infrastructure, outsource data collection, and then clean the data algorithmically or by hand. From there, project leaders need to retrain and recompile the model several times to fix bugs and other inconsistencies during debugging, which multiplies the expense and timeline.

An end-to-end ML platform is the best tool for sentiment analysis projects. It simplifies data collection by letting users pull data from multiple sources, such as an API, CSV uploads, forms, or a collaborative mobile application. Both form and mobile data input can be run as collaborative “jobs,” meaning a manager can delegate the data input and labeling process, decreasing the likelihood of inaccurate data being entered. Data labeling jobs can be assigned to non-specialized workers because they come with customizable instructions and an easy-to-use interface. A single tab gives an overview of all your data, with graphs and statistics for tracking what has been entered into the dataset, and a table view shows every record in real time so users can review and even edit it. For this model, we used a set of 100 tweets taken from Heroku, a cloud platform that provides multiple support services.
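Of the collection options above, a CSV import is the simplest to sketch. The column names below are assumptions, not the platform's actual export format; the point is that each record carries the tweet text plus a slot for its eventual label.

```python
import csv
import io

# Hypothetical CSV layout: one tweet per row, "text" and "label" columns.
# io.StringIO stands in for a real exported file on disk.
sample = io.StringIO(
    "text,label\n"
    '"Loving the new wing flavors!",positive\n'
    '"My order arrived cold again.",negative\n'
)

tweets = list(csv.DictReader(sample))
print(len(tweets), tweets[0]["label"])  # 2 positive
```

Swapping `io.StringIO` for `open("tweets.csv")` would read the same structure from a real export.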

Data Labeling

Next up is the data labeling part of the project. You can label data in two ways: through forms or on the mobile app. As with data collection, both approaches let users view each tweet and label it as positive, negative, or neutral. The data labeling section features an overview page showing the categories, the number of records labeled, outliers in the data, and how far along each collaborator is in the labeling process.
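When labeling is delegated to non-specialized workers, a validation pass helps keep bad labels out of the dataset. This is a hypothetical helper, not the platform's API; it simply enforces the three-label scheme described above.

```python
# Hypothetical validation step for a collaborative labeling "job": reject any
# submitted label outside the allowed set before it enters the dataset.
ALLOWED_LABELS = {"positive", "negative", "neutral"}

def validate_labels(records):
    """Split submitted records into accepted rows and flagged rows."""
    accepted, flagged = [], []
    for rec in records:
        target = accepted if rec.get("label") in ALLOWED_LABELS else flagged
        target.append(rec)
    return accepted, flagged

good, bad = validate_labels([
    {"text": "Great service", "label": "positive"},
    {"text": "Meh", "label": "mixed"},  # off-list label gets flagged
])
```

Flagged rows can then be routed back to the collaborator for correction rather than silently polluting the training data.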

Data Visualization

The platform also features a final data visualization page. Here, users can see the different parts of the dataset. This page is intended for finalizing and cleaning the data, as it is the last step before creating subsets or “feature sets” for training. It highlights information such as whether any records are missing labels and how the data is distributed across labels. Here, you can check whether your dataset is a useful representation of the population; if positive tweets heavily outnumber negative ones, for example, consider rebalancing it.
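The distribution check the visualization page performs can be sketched in a few lines. The 50% skew threshold here is an illustrative choice, not a rule from the platform:

```python
from collections import Counter

# Count labels and flag a skewed dataset (threshold is illustrative).
labels = ["positive"] * 60 + ["negative"] * 25 + ["neutral"] * 15

counts = Counter(labels)
share = {label: n / len(labels) for label, n in counts.items()}
skewed = max(share.values()) > 0.5  # one class dominates the dataset

print(counts)          # Counter({'positive': 60, 'negative': 25, 'neutral': 15})
print("skewed:", skewed)  # skewed: True
```

A dataset like this one, with 60% positive tweets, would train a model biased toward predicting "positive".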

Feature Set Engineering

You can then create feature sets, or subsets of the dataset, to use for model training. Users can either accept the standard 70:30 random train/test split or customize many aspects, such as selecting the data themselves or handing the platform the reins to adjust the training and testing ratios. Feature sets matter because the whole dataset will not always produce accurate models. Training on a variety of tweets and random subsets lets users fine-tune their models, making them more accurate by controlling which pieces of data the algorithm learns from.
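The standard 70:30 random split mentioned above can be sketched with the standard library alone. Seeding the shuffle is my addition, so the same feature set can be reproduced later:

```python
import random

# Sketch of a 70:30 random train/test split for building a feature set.
def split_feature_set(records, train_ratio=0.7, seed=42):
    shuffled = records[:]                  # copy so the original order survives
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

train, test = split_feature_set(list(range(100)))
print(len(train), len(test))  # 70 30
```

Changing `train_ratio` (or the seed) yields the different subsets you would compare when fine-tuning.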

The variables to tweak in this particular example might be the distribution of positive, negative, and neutral tweets; the topics or themes covered; or how many times Twitter users retweeted and replied to a tweet.


Creating a feature set

Model Training

Once the user creates feature sets, they can move on to training the model. This is where the model is created and trained. Here you choose your feature set, name the model, and add notes in the model description. Training takes some time, though this model trained in only 10 minutes even with 100 records averaging 20 words each. The platform uses a Convolutional Neural Network (CNN), a class of deep neural networks. CNNs learn their own representations and tweak their own filters, extracting features that would otherwise have to be engineered by hand. This independence from human intervention is a great advantage because it reduces bias. Artificial neurons respond to stimuli over overlapping regions of the input; this is how they “learn” to recognize patterns.
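The core CNN operation on text can be illustrated in isolation: slide a filter over short windows of word embeddings and keep the strongest response. All shapes and values below are made up purely to show the mechanics, not the platform's actual network.

```python
import numpy as np

# Toy illustration of a 1D convolution over text: a filter spanning 3-word
# windows of embeddings, followed by max-pooling over the responses.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 8))   # a 20-word tweet, 8-dim word vectors
conv_filter = rng.normal(size=(3, 8))   # one filter covering 3-word windows

responses = np.array([
    np.sum(embeddings[i:i + 3] * conv_filter)  # filter response per window
    for i in range(len(embeddings) - 2)        # 18 windows for 20 words
])
feature = responses.max()  # max-pooling keeps the strongest n-gram match

print(responses.shape)  # (18,)
```

During training, it is the filter weights themselves that get adjusted — the "tweaking their own filters" the text describes — so each filter comes to respond to a sentiment-bearing phrase pattern.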

One-click model training using deep learning networks

Then the model is ready to deploy. The platform displays a training summary that includes graphs showing how accuracy, loss, precision, and recall changed over time, along with the description of the feature sets used. It also lets you view and choose some aspects of the algorithm used to train the model, and then provides code snippets you can use to call the model from a variety of languages, including Java, Python, and Ruby.
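A Python call to a deployed model typically boils down to posting JSON to an inference endpoint. The URL and payload fields below are assumptions for illustration, not the platform's actual API:

```python
import json

# Hypothetical shape of an inference call to a deployed model -- the endpoint
# URL, model ID, and payload schema here are invented for illustration.
def build_inference_request(text, model_id="sentiment-v1"):
    url = f"https://api.example.com/models/{model_id}/predict"
    payload = json.dumps({"inputs": [text]})
    headers = {"Content-Type": "application/json"}
    return url, payload, headers

url, payload, headers = build_inference_request("These wings are incredible!")
print(url)
```

The assembled request could then be sent with any HTTP client; the generated Java and Ruby snippets would build the same payload in their own idioms.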

Once deployed, you can monitor the model on the monitoring page. Here you can see real-time data on how the model interacts with your code, including the inference count, requests made per minute, accuracy, and the time it takes to make a decision. You can analyze how well the model works for your particular project and, if needed, repeat this cycle to build a better, more accurate model.
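The statistics that monitoring page reports can be sketched as a small aggregator. This is a minimal hypothetical sketch of the bookkeeping, not the platform's monitoring implementation:

```python
# Minimal sketch of rolling monitoring statistics: inference count,
# average decision latency, and live accuracy against known outcomes.
class ModelMonitor:
    def __init__(self):
        self.latencies = []
        self.correct = 0
        self.total = 0

    def record(self, latency_ms, was_correct):
        """Log one inference: how long it took and whether it was right."""
        self.latencies.append(latency_ms)
        self.total += 1
        self.correct += int(was_correct)

    @property
    def summary(self):
        return {
            "inference_count": self.total,
            "avg_latency_ms": sum(self.latencies) / len(self.latencies),
            "accuracy": self.correct / self.total,
        }

mon = ModelMonitor()
for latency, ok in [(12, True), (18, True), (15, False), (11, True)]:
    mon.record(latency, ok)

print(mon.summary)  # {'inference_count': 4, 'avg_latency_ms': 14.0, 'accuracy': 0.75}
```

A sustained drop in the accuracy figure is the signal to loop back, revise the feature set, and retrain.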

Learn more about how to create other natural language processing and computer vision projects!
