Novacene AI co-publishes peer-reviewed article on clustering

Image of lego blocks sorted by colour

Clustering is an essential technology for public opinion research, and it is revolutionizing the way data analysts across all sectors do their jobs

In the weeks leading up to the 2021 federal election, Canadians across the country took to social media to post about what made them tick: what issues they cared about, and who they wanted to see in power.

For our client Advanced Symbolics Inc. (ASI), a research organization that provides election predictions using artificial intelligence, these social media posts are extremely valuable for forecasting election winners. But this time around, ASI didn’t just look at individual tweets to predict the results. When they called upon Novacene AI for our expertise, we provided them with a clustering algorithm that allowed ASI to “cluster” individual tweets into conversations, which ultimately gave them a deeper understanding of not just who Canadians were more likely to vote for, but also why.

What is clustering?

Novacene AI and ASI recently co-published the findings in a peer-reviewed article entitled “Forecasting and Understanding the 2021 Canadian Federal Election Using Twitter Conversations,” which was featured at the Canadian AI 2022 Conference, held in Toronto from May 30 to June 5. The article discusses how ASI used our clustering algorithm to correctly predict the outcome of the 2021 Canadian federal election.

Clustering is at the core of everything Novacene AI does: we provide our clustering algorithm to clients, and we also offer a technology platform that has clustering as a core feature.

The clustering algorithm can be compared to a piece of Lego, while the technology platform is like a bucket of Lego. Just as a piece of Lego works with other pieces, our clustering algorithm works with other algorithms to solve a problem. When clients choose to purchase the technology platform as a product, it is like buying a bucket of Lego. Clients have access to Novacene AI’s clustering algorithm (the “Lego piece”) along with other algorithms (the “other Lego pieces”) to assemble solutions to their problems.

In ASI’s case, they were able to correctly predict the winner of the Canadian federal election based on Novacene AI’s clustering algorithm – or “one Lego piece.”

Applications beyond election predictions

Our founder and CEO, Marcelo Bursztein, says that clustering ultimately allows the researcher to group things together and start to understand the different topics that are driving the conversation. For example, if there are 10,000 tweets about COVID-19, clustering would allow the researcher to see that 20 per cent are about lockdowns, 30 per cent are about vaccines, and 50 per cent are about epidemiology.
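To make the idea concrete, here is a minimal sketch in Python. It is not Novacene AI's algorithm, which this article does not describe. It is a toy stand-in that groups posts by shared vocabulary and then reports what share of the conversation each group represents. The sample posts, the stop-word list, the 0.5 threshold, and the function names are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a much fuller one.
STOP_WORDS = {"a", "again", "and", "are", "in", "is", "my", "new", "of", "the", "this", "to"}

# Made-up sample posts, chosen to mirror the 20/30/50 split described above.
POSTS = [
    "Lockdown rules are closing restaurants again",
    "Vaccine appointments open next week",
    "Case counts and infection rates keep rising",
    "New lockdown rules start Monday",
    "Booked my vaccine appointments today",
    "Infection rates are rising in Ontario",
    "Hospital case counts keep rising",
    "Second vaccine dose next week",
    "Modelling shows infection rates rising",
    "Case counts rising again this week",
]

def tokenize(text):
    """Lower-case a post and return its set of words, minus stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS}

def overlap(words, vocabulary):
    """Fraction of a post's words that already appear in a cluster's vocabulary."""
    return len(words & vocabulary) / len(words) if words else 0.0

def cluster(posts, threshold=0.5):
    """Single-pass clustering: each post joins the best-matching existing
    cluster if the overlap reaches the threshold, otherwise it starts a new one."""
    vocabularies = []  # one accumulated word set per cluster
    labels = []        # cluster index assigned to each post, in input order
    for post in posts:
        words = tokenize(post)
        scores = [overlap(words, vocab) for vocab in vocabularies]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            vocabularies[best] |= words
        else:
            vocabularies.append(set(words))
            best = len(vocabularies) - 1
        labels.append(best)
    return labels

def topic_shares(labels):
    """Percentage of posts that fall into each cluster."""
    counts = Counter(labels)
    return {label: 100 * n / len(labels) for label, n in counts.items()}

labels = cluster(POSTS)
shares = topic_shares(labels)
print(shares)  # {0: 20.0, 1: 30.0, 2: 50.0}
```

The algorithm only produces numbered groups. Deciding that group 0 is “lockdowns,” group 1 is “vaccines,” and group 2 is “epidemiology” is still a judgment the analyst makes after reading the posts in each group.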

He refers to clustering as a fundamental approach to understanding public opinion that has applications beyond predicting election results.

Think tanks, for instance, could also find many clustering applications to help them understand how Canadians feel about present-day issues, such as the country’s level of economic inclusion.

While clustering helps analysts across a variety of sectors harvest insight from raw data, that data doesn’t always have to come from social media. For example, Novacene AI recently worked with a client who surveyed 3,000 people with open-ended questions and needed to analyze what respondents said. Our clustering technology was able to take this data and categorize the responses within 30 seconds, a job that would have been extremely time-consuming for a person.

Keeping humans in the loop

However, Marcelo stresses that clustering doesn’t replace the analyst. Instead, he says it is a tool analysts can use so they don’t have to start from scratch. It is still critical that humans be involved in the process of understanding public opinion, and they are needed to review the clustering output for accuracy.

But most importantly, humans provide empathy and can take action when understanding public opinion. For example, in the months leading up to the February protests in Ottawa, clustering would have given government officials the opportunity to understand why some Canadians felt frustrated.

Marcelo says that clustering isn’t about collecting intelligence and monitoring people. Rather, as the peer-reviewed article outlines, it allows institutions to gain a deeper understanding of society and to build empathy and resilience.

To read the paper in full, visit: https://caiac.pubpub.org/pub/ewuzz3aj/release/1