AI and the importance of ‘why?’

Posted by Mark Vincett - 27 August, 2023

In a time when everyone seems to be grappling with AI and how it’s going to impact their industry, I wanted to write an article about AI and how it is at odds with Sapien’s mindset of ‘speak human’. But as I explored the subject, I came to realise that AI and humans are not all that different and, in fact, share many of the same vulnerabilities. We’re biased, we invent, and 98% of the time we take mental shortcuts. AI does these things too, echoing the System 1 behaviours our human brains are so adept at.

“In many ways, these models are like very powerful pattern-matching machines. And pattern matching is what System 1 essentially is.” — Alexander Reshytko, Towards Data Science[1]

Just like us, AI struggles to engage in deeper, more analytical thinking: the equivalent of our System 2 brain (which we only use about 2% of the time).

These limitations mean it’s doubly important for us to be aware of our own flaws as we navigate this brave new world of AI. Whether an ‘insight’ comes from AI or a human, if it is marred by bias or incorrect information it could have disastrous consequences, especially if it’s used to guide business decisions.


Humans and AI both satisfice

Research has found that, more often than not, consumers make decisions by satisficing, a blend of ‘satisfy’ and ‘suffice’. In other words, we don’t go for the best-in-class, most optimal choice. In light of other pressures, such as cost and availability, we’ll settle for ‘near enough is good enough’.[2]

Why? Well, it’s because we’re kind of lazy. It takes a lot of time and a lot of mental energy (and calories to feed that mental energy) to weigh up all the information required to make the very best decision every time we have to make one. So, we don’t. We just like to think we’ve made a good choice, and if you ask us about it, we can tell ourselves a pretty convincing story as to why it was the right choice.

Perhaps it should come as no surprise, then, that AI is similar. After all, a large language model like ChatGPT is trained on vast amounts of very human data, and it’s very good at delivering a ‘good enough’ recommendation.

“The immediate end game of AI isn’t to beat humans at their own game, but to tackle the tasks where ‘good enough’ is more than adequate.” — Jonathan Bailey, ‘AI and the Danger of Good Enough’, Plagiarism Today

While good enough might be acceptable when you’re choosing which peanut butter to buy this week, it’s not enough if you’re using it to identify opportunities for your brand strategy.

The problem with ‘good enough’

It reinforces ingrained biases

AI hallucination aside, large language model (LLM) AIs such as ChatGPT were trained on data from the internet, which, as we know from experience, is riddled with biases and inaccuracies. These models continue to be trained on user input, and let’s face it, we’re not without our blind spots either. Combine that with the fact that these AI interfaces are designed to feel comfortable for the user, and the answers LLM AIs provide often feel ‘right’ precisely because they reinforce stereotypes and reward our ingrained biases. Even ChatGPT’s creator, OpenAI, admits that its product has some problems in this area:

“GPT-4 still has many known limitations that we are working to address, such as social biases, hallucinations, and adversarial prompts.” — GPT-4 – OpenAI, 13 March 2023

While the answers or solutions an LLM AI provides may feel familiar, safe, and easy to accept, they don’t necessarily challenge our perspectives, and when they do, they’re easy to dismiss. After all, it’s not a ‘human’ opinion that we’re choosing to ignore.

It doesn’t explore the edges

Time and time again, I’ve seen one key insight lead to powerful business transformation. The problem with AI is that it can’t determine how valuable a particular insight is to a business, especially your business. By generalising, it risks missing the best insight: that one business-transforming nugget that unlocks a new creative platform or a new product niche.

AIs are fantastic at simplifying a large dataset and highlighting the common themes. The problem is that innovation and transformative ideas rarely come from the mainstream. Instead, innovation comes from the edges, which AI tends to ignore.

So what can we use AI for? And what can’t we?

LLM AIs have huge potential to change the insights industry, but only as a tool to augment and accelerate our work. AI can speed up typically time-consuming tasks, such as cleaning data by isolating poor or dubious-quality responses, or synthesising diverse data sources to summarise emerging themes. That frees up time for us to focus on drawing out real, valuable insights.
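As a rough illustration of that data-cleaning task, here’s a minimal sketch of how an LLM might be asked to flag dubious open-text responses. It assumes the 2023-era openai Python library and the gpt-3.5-turbo model; the prompt wording and the flag_response helper are hypothetical, and anything the model flags should still be reviewed by a human.

    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder

    def flag_response(text: str) -> bool:
        """Ask the model whether a survey response looks unusable
        (gibberish, copy-paste, off-topic). Hypothetical helper."""
        result = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            temperature=0,  # keep the judgement as repeatable as possible
            messages=[
                {"role": "system",
                 "content": "You are a survey data-cleaning assistant. "
                            "Answer only YES or NO."},
                {"role": "user",
                 "content": "Is this open-text survey response gibberish, "
                            "off-topic, or otherwise unusable?\n\n" + text},
            ],
        )
        answer = result["choices"][0]["message"]["content"]
        return answer.strip().upper().startswith("YES")

    responses = ["asdf asdf asdf", "Price matters most when I buy peanut butter."]
    dubious = [r for r in responses if flag_response(r)]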

Closer to home, I’m hoping AI will help the Sapien team speed up analysis of the thousands of interactions a typical shopper generates in-store when wearing eye-tracking glasses. Currently, we do this manually and it’s a task that can take days, if not weeks. Having AI quickly process this data and identify common patterns of interaction would give us the ability to upscale these programmes to include more shoppers and more stores, allowing us to quickly uncover big picture insights around shelf engagement and product or POS positioning. The potential here is hugely exciting for us.
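To give a feel for the pattern-counting we’d hand over to AI, here’s a toy sketch of the aggregation step, assuming a hypothetical CSV export with one row per eye-tracking fixation. The file and column names are invented; the hard part we want AI for is labelling each raw interaction in the first place.

    import pandas as pd

    # Hypothetical export: one row per fixation, labelled with the shelf
    # zone the shopper was looking at and how long the fixation lasted.
    fixations = pd.read_csv("eye_tracking_sessions.csv")

    # How often, and for how long, does each shelf zone get looked at?
    engagement = (
        fixations.groupby("shelf_zone")
        .agg(fixation_count=("fixation_id", "count"),
             total_dwell_ms=("duration_ms", "sum"))
        .sort_values("fixation_count", ascending=False)
    )
    print(engagement.head(10))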

Meanwhile, on the LLM front, we’re already seeing generative AI help market researchers gather richer open-text answers in qualitative surveys. Specially trained models probe respondents with follow-up questions, encouraging them to expand on their initial answers. This leads to more engaging experiences for respondents, richer responses, and deeper, more powerful insights.

What AI can’t do

We’ve spent a few months looking at how LLM AI can be incorporated into a project to complement or augment our current ways of working. In doing so, we’ve found a few additional ‘can’t dos’:

Comparing data and drawing out differences

In my experience so far, while LLM AI can detect common themes from a spreadsheet of open-text responses, it is less effective at identifying differences between data sets, and when it does find them, it is reluctant to commit to a comparative conclusion.

Getting the AI to identify and draw out specific differences requires a lot of prompting and repetition. You can become locked in a cycle of asking the AI more and more ‘pointy’ questions to try and get it to reveal these differences. By this stage, you might as well have looked at the data yourself—at least then you can be sure there’s no nugget hiding in it.
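For illustration, the escalation from an open question to a ‘pointy’ one might look something like this (the wording is hypothetical):

    # Hypothetical prompts showing the escalation from open to 'pointy'.
    open_prompt = (
        "Here are open-text responses from two customer segments, A and B. "
        "What are the key differences between them?"
    )  # typical answer: hedged; 'both segments mention price and quality...'

    pointy_prompt = (
        "Compare segments A and B specifically on price sensitivity. "
        "Which segment mentions price more often, and in what context? "
        "Commit to a conclusion."
    )  # works, but only because we already suspected price was the difference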

Of course, pointed prompting only works if you already know the kind of answers you’re looking for. That’s fine in some situations, but a core principle of market research is to uncover what you don’t know; to identify the things you wouldn’t think to look for—things that aren’t on your radar. More often than not, it’s the uncovering of the unexpected that has the most value to a business.

All this said, we’re still in the early stages of AI augmentation, so the ability of AI to compare data may change. If it does, this could significantly cut down the time it takes us to manually analyse and compare data.

Numbers are a struggle

While this might appear obvious, given that the descriptor for LLM AI is ‘large language model’, we nonetheless thought we’d give ChatGPT (GPT-3.5) a whirl to see if it had any numerical capability at all. Suffice it to say, it’s not great.

While it can tell you a key theme, it’s reluctant to quantify its results. For example, ChatGPT might detect a common response in a sheet of open responses and report something along the lines of ‘this issue is a concern among Kiwis’. However, it can’t say how much of an issue something is or how many people are affected by it. In other words, it can’t tell us ‘this issue is a major concern for 37% of Kiwis’. It lacks specificity, and that makes it difficult to validate how serious an issue truly is. In short, it’s easier just to crunch the numbers ourselves.
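Crunching those numbers directly is trivial by comparison. A minimal sketch, assuming a hypothetical spreadsheet with one row per respondent and a column flagging whether their answer was coded to the theme:

    import pandas as pd

    # Hypothetical export: one row per respondent; 'mentions_issue' is True
    # where their open-text answer was coded to the theme in question.
    df = pd.read_csv("coded_responses.csv")

    share = df["mentions_issue"].mean()  # boolean column -> proportion
    print(f"This issue is a major concern for {share:.0%} of respondents.")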

As a side note, even when we asked ChatGPT to provide a schedule of days and dates for a particular survey deployment, it kept getting the days and dates wrong.

Some things still need a human

From drawing connections between behaviour and underlying needs and motivations, to overlaying those insights with behavioural data, AI presents a huge opportunity as a tool to augment how the industry operates day to day. While LLM and generative AI have huge potential to help us work smarter, better, faster, and more nimbly, they’re not at a stage where we can take our hands off the steering wheel. This is not a set-and-forget tool, but a machine that requires constant steering to get the most out of it. Like a self-driving car, it might be able to get you from A to B, but it still needs a human to decide where to go.

Meanwhile, if you’re hoping LLM AI will replace human-powered thinking and in-depth analysis, you’re going to be disappointed. The reality is that LLM AI outputs are a sophisticated version of predictive text combining your request parameters, your data, learnings scraped from the internet, and its algorithm. It is not drawing conclusions so much as putting one word after another to form an output that makes sense and meets that ‘good enough’ baseline. Moreover, in the event you do manage to get an LLM AI to uncover important points, they are often reported in a way that fails to convey their impact and value to a business. Instead, everything becomes a ‘good enough’ vanilla sameness.
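To make the ‘predictive text’ point concrete, here’s a toy sketch of putting one word after another with a tiny bigram table. A real LLM uses a transformer over subword tokens and billions of learned weights, but the generation loop has the same shape: pick a plausible next token, append it, repeat.

    import random

    # Toy bigram table: for each word, some plausible next words. A real
    # LLM learns these statistics from the internet rather than a dict.
    bigrams = {
        "the": ["insight", "data", "brand"],
        "insight": ["is", "was"],
        "is": ["good"],
        "was": ["good"],
        "good": ["enough"],
    }

    def generate(word, length=5):
        """Put one plausible word after another; no conclusions drawn."""
        out = [word]
        for _ in range(length):
            options = bigrams.get(out[-1])
            if not options:
                break
            out.append(random.choice(options))
        return " ".join(out)

    print(generate("the"))  # e.g. 'the insight is good enough'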

So, before you go replacing your human market research experts with an AI counterpart, I’ll leave you with this thought: if you’re deciding what's next for your brand, would you want ‘good enough’?

 


Footnotes

[1] Reshytko, A. 2022. 'Large Language Models and Two Modes of Human Thinking'. Towards Data Science, 13 September 2022. Available at: https://towardsdatascience.com/large-language-models-and-two-modes-of-human-thinking-1322160755e8

[2] Behavioural Science Solutions Ltd. 2023. 'Choice architecture'. Available at: www.behaviouraleconomics.com

Topics: Market Insights, Technology