
Taking A Data Science Approach to Social Media Monitoring

Social media comments offer a rich source of insight into people’s interests, opinions and desires. While surveys and focus groups show what people say they believe in an artificial research environment, social conversations show how they communicate when unguarded. In combination with other information sources, anonymised and aggregated conversation data can help brands build powerful, accurate audience profiles and uncover previously hidden insights. These insights can even feed the predictive models that help brands make better business decisions.

It has never been easier to implement high-quality social listening. Here are a few things to consider before you get started:

One size rarely fits all

Each analysis has different goals.

One brand may wish to understand which owned content gets the most engagement, while another may want to track negative reviews and understand which experiences drive low star ratings. The right analysis approach and tools for each job will also differ. In one case, the volume of comments and interactions may be the most important metric, and an off-the-shelf monitoring tool may be the best for the job. In another, detailed understanding of the topics discussed and sentiment contained in comments may be needed, and a well-trained Machine Learning classifier will perform best.

Consider what “social” really means

Many social analyses focus on what “Twitter” thinks about an issue. But Twitter users aren’t a representative sample of your audience, unless that audience happens to be urban, millennial and disproportionately populated by journalists.

People use different sites for different purposes. If you want feedback on your services, focus on reviews. If you’re interested in people’s thoughts and concerns, forum comments can be a much richer data source than traditional social channels. Popular news stories garner hundreds of comments on news websites and aggregators like Reddit, but these sources are rarely included in analyses. Where easily accessible APIs don’t exist, third-party monitoring tools (or simple web scrapers, where sites permit scraping) can be used to gather data.

Conversation analysis is moving beyond just text, too. Speech APIs are improving, making it feasible to convert audio from vlogs, for example, into text for inclusion in analysis.

Use good data…

“Bad data” is information that is incomplete, inaccurate or irrelevant. Good analysts validate and contextualise their data, and this is as important in social monitoring as anywhere else.

Third-party tools are a great resource. Why gather data from several channels manually when you can easily access everything in one dashboard? However, be sure to check the quality of the data they provide. For example, inspect a blog-comment feed and you may find that most of it is irrelevant spam. The accuracy of off-the-shelf sentiment identification is also known to be patchy, and even with more sophisticated methods like Machine Learning, it’s important to check a sample of the raw data your algorithm has analysed and make adjustments where necessary.
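One simple way to run that kind of spot check is to pull a random sample of comments, have a person label them, and see how often the tool agrees. The sketch below is only an illustration: the function names and the example labels are made up, and a real check would use a much larger sample.

```python
import random

def draw_review_sample(comments, k, seed=0):
    """Pick up to k comments at random for a human reviewer to re-label."""
    rng = random.Random(seed)  # fixed seed so the same sample can be redrawn
    return rng.sample(comments, min(k, len(comments)))

def agreement_rate(tool_labels, human_labels):
    """Share of sampled comments where the tool and the reviewer agree."""
    if len(tool_labels) != len(human_labels):
        raise ValueError("label lists must be the same length")
    if not tool_labels:
        return 0.0
    matches = sum(t == h for t, h in zip(tool_labels, human_labels))
    return matches / len(tool_labels)

# Hypothetical labels for five sampled comments.
tool = ["positive", "negative", "positive", "neutral", "positive"]
human = ["positive", "negative", "negative", "neutral", "positive"]
print(agreement_rate(tool, human))  # 0.8
```

A low agreement rate on the sample is a signal to adjust the tool’s settings or retrain the model before trusting the headline numbers.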

Remember to provide the context that will make your reports meaningful. “75% of comments were positive” doesn’t mean much on its own, but answer questions like:

  • Positive towards what?
  • How positive?
  • How has that number changed over time?

and your analysis starts to become insightful.
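To make this concrete, here is a minimal sketch that breaks a single “percent positive” figure down by topic and by month. The records, topic names and the `positive_share` helper are hypothetical, and the sketch assumes each comment has already been given a topic and a sentiment label.

```python
from collections import defaultdict

# Hypothetical records: (month, topic, sentiment label).
comments = [
    ("2024-01", "delivery", "positive"),
    ("2024-01", "delivery", "negative"),
    ("2024-01", "price", "positive"),
    ("2024-02", "delivery", "positive"),
    ("2024-02", "delivery", "positive"),
    ("2024-02", "price", "negative"),
]

def positive_share(records):
    """Return {(month, topic): share of comments labelled positive}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for month, topic, label in records:
        totals[(month, topic)] += 1
        if label == "positive":
            positives[(month, topic)] += 1
    return {key: positives[key] / totals[key] for key in totals}

shares = positive_share(comments)
for (month, topic), share in sorted(shares.items()):
    print(f"{month}  {topic:<10} {share:.0%} positive")
```

In this toy data two thirds of all comments are positive, yet the breakdown shows sentiment about delivery improving while sentiment about price falls, which is the part a brand can act on.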

…and good tools

One way of getting more accurate insights is to curate your own corpus and apply Machine Learning techniques. To prepare a corpus of comments for training, we either manually tag them with particular sentiments and topics, or let an unsupervised algorithm search for patterns. Data Scientists have many Open Source tools to choose from, and paid-for tools can help guide you through the classification process. Our Data Science team possess extensive knowledge of ML along with all the tools to implement this type of analysis and drive brand strategies.
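For a feel of what the supervised, manually tagged route involves, here is a deliberately tiny sketch: a multinomial Naive Bayes classifier written in plain Python and trained on six hand-labelled movie comments. The comments, labels and class name are invented for illustration, and Naive Bayes is a stand-in technique chosen for brevity; in practice you would use an established open-source library and a far larger corpus.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical hand-tagged corpus of movie comments.
texts = [
    "the lead actor absolutely killed it",
    "loved the soundtrack and the ending",
    "a brilliant, funny film",
    "the plot was dull and far too long",
    "terrible acting, I walked out",
    "boring script and a weak ending",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

model = NaiveBayes().fit(texts, labels)
print(model.predict("she killed it in the final scene"))  # positive
print(model.predict("dull plot and terrible acting"))     # negative
```

Because the model learned from movie comments, it treats “killed it” as praise. The same phrase in a different domain would need its own tagged examples.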

A few of the advantages of this approach are:

  • Better classification. Movie reviewers use very different language to car enthusiasts, who use very different language to contact lens customers. This is why generic sentiment classification can be so inaccurate: it wasn’t trained on the particular topic you are interested in. Train a model on a corpus of comments that only includes movie reviews, or only car discussions, and you may be able to more accurately identify trends in your data.

  • Shades of grey. Rather than just “positive” or “negative”, ML tools can show you a sentiment score on a scale, allowing for more detailed analysis.

  • Phrases, not words. ML algorithms can even identify comments with mixed sentiment and understand which elements of a comment contributed to the overall sentiment score.

  • Sentiment towards what? By identifying topics specific to your area of interest, you can accurately identify comments discussing that particular topic, and compare the sentiment of comments across various topics. You can see how often topics are discussed together or how often your brand is mentioned alongside a particular topic. You don’t even have to create your own topics. Google’s NLP suite, for example, automatically identifies references to people, objects or locations.

  • Sarcasm, jokes, regionalisms and context. Phrases can take on different meanings depending on their context. We know the phrase “killed it” is often used positively, but a simple rules-based sentiment model is unlikely to agree. Because an ML algorithm assesses many different patterns in data to come up with classifications, it is more likely to correctly identify where an ambiguous comment should fall than simpler tools.

  • Everything beyond the text. ARE ALL CAPS COMMENTERS ANGRY? Do multiple punctuation marks imply enthusiasm?!! Do emojis convey emotions that text might not? All of these indicators can be included in an ML analysis to indicate how a commenter feels, beyond the words they used.
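As a rough illustration of that last point, the sketch below turns capitalisation, repeated punctuation and emoji use into numeric features that could sit alongside word-based features in a model. The feature names are hypothetical, and the emoji pattern covers only a few common Unicode ranges, so treat it as an assumption and not a complete detector.

```python
import re

# Rough emoji ranges only; an assumption for illustration, not a complete list.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

def style_features(comment):
    """Describe how a comment is written, separately from the words it uses."""
    letters = [ch for ch in comment if ch.isalpha()]
    upper = sum(ch.isupper() for ch in letters)
    return {
        # Share of letters that are upper case (1.0 means all caps).
        "caps_ratio": upper / len(letters) if letters else 0.0,
        # Number of runs of two or more "!" or "?" characters.
        "repeated_punctuation": len(re.findall(r"[!?]{2,}", comment)),
        # Number of characters that fall in the emoji ranges above.
        "emoji_count": len(EMOJI_PATTERN.findall(comment)),
    }

print(style_features("WORST SERVICE EVER!!!"))
print(style_features("Loved it 😍"))
```

Whether all caps really signals anger is something the model learns from tagged examples; the features only make the signal available to it.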

To learn more about ML-aided social monitoring, and how it could help you uncover meaningful audience insights, get in touch. 
