ChatGPT and UX audits: the current limits of generative AI

By Merlyn Meredith | Senior UI/UX Designer

Brands and retailers weathering an uncertain economic climate are reshuffling their priorities to help navigate turbulent waters.

In calmer conditions, businesses zeroed in on offering the highest-quality products at the most affordable price. Today, with higher competition and lower expectations, businesses are confronted with a crowded marketplace and constrained budgets.

This has led to the emergence of a new battleground for customer acquisition and growth: the customer experience (CX). A survey of almost 2,000 business leaders found that their number one priority for the next five years is CX, above both product and pricing.

This shift underscores the critical role of getting CX right to sustain growth through difficult conditions.

The best way to measure CX performance is to conduct UX audits, which help to ensure that a business delivers on its promises and meets the ever-evolving expectations of its customers.

A UX audit is a process of analysing how customers interact with a brand in order to discover areas for improvement. It assesses all key touchpoints along the customer journey (from awareness to post-purchase) and identifies strengths and weaknesses of a brand’s customer experience strategy.

With insight into your customers’ preferences, behaviours, needs, and pain points, a UX audit is essential for improving a host of key metrics, from customer satisfaction and loyalty to average order value (AOV) and customer lifetime value (CLV).

Its emphasis on analysis and assessment requires a deep understanding of customer behaviour and interaction. In an effort to shortcut this process, some have turned to generative artificial intelligence (AI) to reduce the time it takes to carry out audits.

But is AI capable of carrying out UX audits?

A fascinating study by the Baymard Institute, an independent UX research centre that carries out large-scale research studies on all aspects of the online user experience, set out to answer that question. It tested ChatGPT-4’s ability to conduct a UX audit and compared the results with those of human UX professionals.

Human v AI UX audits

When OpenAI enabled image uploads to ChatGPT-4, the platform gained the ability to analyse screenshots of webpages. That’s when Baymard decided to test how accurately it could discover UX issues and to measure the gap between AI-powered and human-driven UX audits.

In the study, researchers uploaded screenshots of 12 different ecommerce webpages to ChatGPT-4 and asked for UX improvement suggestions. They then compared the AI’s responses to the recommendations of 6 highly trained UX professionals, who relied on extensive UX testing with over 130,000 hours of data from real end users.

The human experts spent 2-10 hours on each webpage, and the researchers invested a further 50 hours in a detailed comparison of the 257 suggestions made by the humans and the 178 suggestions provided by ChatGPT-4. The webpages tested included a variety of product, listing, and checkout pages from brands such as LEGO, Cabela's, and Argos.

The headline findings: 

ChatGPT has a UX discovery rate of 14.1% 

The platform highlighted only a small percentage of the UX issues actually present on the live webpage. That’s because it could only analyse screenshots of the webpage, whereas the UX professionals used the live interactive website.

ChatGPT has a 19.9% accuracy rate in UX suggestions

With such a low accuracy rate, the experts commented that generative AI would not be a useful supplemental tool for UX audits, given the amount of time it would take professionals to sift through its suggestions.

ChatGPT has an 80.1% error rate

Within this high false-positive rate, one in eight suggestions was deemed detrimental to the user experience, for example suggesting that LEGO simplify its footer by essentially removing it. The other seven in eight were simply considered a waste of time: generic, unhelpful comments that were repeated from page to page, such as adding elements that the site already had, or making the site mobile responsive even though the screenshot was clearly of a desktop page.

Across the 12 webpages that were analysed, ChatGPT correctly identified 2.9 UX issues on average. However, it overlooked 18.5 issues, made 10.6 suggestions that were a waste of time, and made 1.3 suggestions that would be harmful to act on.
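As a rough sanity check, the per-page averages above can be turned back into rates. The short Python sketch below does that arithmetic. The function name and the rate definitions are assumptions made for illustration (discovery rate as issues found divided by all issues present, accuracy as correct suggestions divided by all suggestions); the study itself may calculate its figures differently.

```python
# Per-page averages as quoted in this article.
CORRECT = 2.9    # UX issues ChatGPT correctly identified
MISSED = 18.5    # UX issues it overlooked
WASTEFUL = 10.6  # suggestions that were a waste of time
HARMFUL = 1.3    # suggestions that would be harmful to act on


def audit_rates(correct, missed, wasteful, harmful):
    """Derive rates from per-page averages (definitions are assumptions)."""
    suggestions = correct + wasteful + harmful  # everything ChatGPT suggested
    issues = correct + missed                   # everything actually wrong with the page
    errors = wasteful + harmful                 # suggestions that were not real issues
    return {
        "suggestions_per_page": suggestions,
        "discovery_rate": correct / issues,
        "accuracy_rate": correct / suggestions,
        "error_rate": errors / suggestions,
        "harmful_share_of_errors": harmful / errors,
    }


rates = audit_rates(CORRECT, MISSED, WASTEFUL, HARMFUL)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the averages give roughly 14.8 suggestions per page (about 178 across 12 pages), a discovery rate of about 13.6%, an accuracy rate of about 19.6%, and an error rate of about 80.4%. These sit close to the headline figures of 14.1%, 19.9%, and 80.1%, with the small differences presumably down to rounding in the published averages.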

So what are the broader implications?

These results clarify the current gap between AI and human performance when it comes to carrying out a UX audit. They should serve as a warning to brands and retailers who are relying on AI, or considering doing so, to evaluate and assess their user experience.

Given its current limitations, businesses are better off leveraging generative AI in other areas of operations where it’s having a greater, more positive impact. For example, here are 5 areas to embed generative AI into your digital commerce strategy.

For UX audits, there remains a need for human expertise when it comes to assessing, interpreting, and addressing nuanced issues, which AI currently overlooks.

This is all true today. As AI evolves, it may gain greater capabilities to carry out UX audits tomorrow. The explosive rise of generative AI makes it feel inevitable that it will learn to overcome its current limitations. As Baymard called out, interaction-related UX issues can’t be identified from an image.

What about when it’s able to analyse live interactive websites? Or when it’s capable of digesting more comprehensive prompt engineering? These are questions worth thinking over. But tomorrow has the answers. For now, it is the human values of empathy, understanding, and creativity that are the pillars of conducting effective UX audits.

Tryzens’ approach to UX audits

As an international digital commerce agency, Tryzens has a dedicated team of CX specialists who are well-versed in navigating the complexities of customer behaviour and elevating the customer journey for a diverse group of global brands.

Our approach involves conversion rate optimisation (CRO) initiatives, which include UX audits, to identify strategic areas to enhance and facilitate growth. 

We understand that seamless, connected experiences are fundamental in today’s crowded marketplace. That’s why we emphasise the significance of UX audits in digital strategy. By conducting thorough assessments, we identify pain points and uncover opportunities for optimisation. When brands can meet their customers’ expectations, they’re in a position to increase AOV, CLV, and brand loyalty.

In an era where the customer experience is the differentiator, our CX expertise reflects our dedication to helping brands understand their customers better and delivering insight-driven, customer-first solutions.

Is it time for a UX audit? Then connect with Tryzens.
