AI EDUCATION: Yes, AI Hallucinates. Is That A Bad Thing?


Each week we find a new topic for our readers to learn about in our AI Education column. 

Science fiction author Philip K. Dick titled one of his novels (the inspiration for the films Blade Runner (1982) and Blade Runner 2049 (2017)) Do Androids Dream of Electric Sheep? With the rise of generative artificial intelligence, we can now answer that titular question with a yes, or at least, sort of.

Maybe they don’t dream, but they certainly hallucinate. 

As more people and businesses use generative artificial intelligence, awareness of AI hallucinations is rising.

Not only is awareness rising, but solutions are emerging as well.

Does an AI hallucination sound more like a psychedelic flashback than a technological problem? Read on to learn more!  And check out our prior post on AI hallucinations while you’re at it!


What Are AI Hallucinations? 

Newfangled generative AI engines like ChatGPT are pretty handy for quickly conducting research and drafting readable, digestible content based on that research. In fact, they’re superior to many student writers and cub reporters in that they actually do the work of looking for information and translating it to something readable. 

However, they can’t be left to their own devices to publish the content they create, and one huge reason is that, rather than conduct research, generative AI tends to occasionally make up its own facts to suit its content. Which, if you think about it, is really not so different from certain student writers and cub reporters. 

One problem with AI hallucinations is that a generative AI engine can be awfully persuasive when presenting fictional material as fact, so much so that most human readers would be unable to distinguish fact from fiction in the output it presents. 

How We Got Here 

In recent months, technological solutions to the AI hallucination problem have started to proliferate. One such solution, Lynx from Patronus AI, is designed to automatically detect AI hallucinations without the intervention of an editor by repeatedly testing the large language models that power generative artificial intelligence. 

OpenAI, the creator of ChatGPT, has done something similar by turning its own GPT-4 engine on ChatGPT to try to detect—and correct—hallucinations. 
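To picture how that kind of check works, here is a minimal sketch of using one model as a judge of another model's answer. It illustrates the general idea only, not OpenAI's actual pipeline, and the prompt wording and the "gpt-4" model name are our own assumptions.

```python
# Minimal "model as judge" sketch, assuming the OpenAI Python SDK (pip install openai)
# and an API key in the OPENAI_API_KEY environment variable.
# This is an illustration of the general approach, not OpenAI's internal method.
from openai import OpenAI

client = OpenAI()

def judge_answer(question: str, answer: str) -> str:
    """Ask a second model whether an answer appears factually supported."""
    prompt = (
        "You are a fact-checking assistant.\n"
        f"Question: {question}\n"
        f"Answer to check: {answer}\n"
        "Reply with SUPPORTED if the answer is factually sound, "
        "or HALLUCINATION if it appears fabricated, and explain briefly."
    )
    response = client.chat.completions.create(
        model="gpt-4",  # judge model; the name here is illustrative
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep the judge deterministic
    )
    return response.choices[0].message.content

# Example: flag a suspicious claim produced by a chatbot.
# print(judge_answer("Who wrote Do Androids Dream of Electric Sheep?",
#                    "It was written by Isaac Asimov."))
```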

Another similar method, tested by researchers at Oxford University, uses a statistical model based on “semantic uncertainty” to determine when a generative AI is most likely to be hallucinating. The method works, they say, because there are subtle markers when GenAI makes up an incorrect or inaccurate answer that is not based on actual research. 
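For the technically curious, the intuition behind semantic uncertainty can be sketched in a few lines: ask the same question several times, group answers that mean the same thing, and measure how much the groups disagree. The `ask_model` and `means_the_same` helpers below are hypothetical stand-ins, and this is a simplification of the idea, not the Oxford researchers' code.

```python
import math

def semantic_uncertainty(question, ask_model, means_the_same, n_samples=10):
    """Sample several answers and measure how much they disagree in meaning.

    ask_model(question) -> one sampled answer (hypothetical helper)
    means_the_same(a, b) -> True if two answers express the same claim (hypothetical helper)
    Returns an entropy score: near 0 = consistent answers, higher = likely hallucination.
    """
    answers = [ask_model(question) for _ in range(n_samples)]

    # Group the sampled answers into clusters of semantically equivalent responses.
    clusters = []
    for ans in answers:
        for cluster in clusters:
            if means_the_same(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])

    # Entropy over cluster sizes: one dominant cluster means the model keeps
    # giving the same answer; many small clusters mean it keeps changing its story.
    sizes = [len(c) for c in clusters]
    total = sum(sizes)
    return -sum((s / total) * math.log(s / total) for s in sizes)
```

A model that is confident keeps giving the same answer, so the score stays near zero; a model that is making things up keeps contradicting itself, and the score climbs.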

And this week, a company named Aporia boasted that its hallucination detection technology, which uses multiple small language models rather than a single large language model, achieves a 98% detection rate for hallucinations. Aporia argues that by distributing the work across multiple small language models, its detection system is more reliable. 
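Aporia has not published its implementation, but the distributed idea can be pictured as a simple majority vote across several small checker models. Everything below, including the `small_checkers` functions, is a hypothetical sketch rather than Aporia's technology.

```python
def ensemble_hallucination_check(claim, source_text, small_checkers, threshold=0.5):
    """Flag a claim as a likely hallucination if most small checkers reject it.

    small_checkers: list of hypothetical functions, each returning True if the
    claim is supported by source_text and False otherwise.
    """
    votes = [checker(claim, source_text) for checker in small_checkers]
    support_ratio = sum(votes) / len(votes)
    return support_ratio < threshold  # True means "likely hallucination"
```

The design intuition is the same as asking several junior editors instead of one senior one: any single checker can be fooled, but it is less likely that most of them are fooled at once.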

Why Does GenAI Hallucinate? 

Generative AI doesn’t actually know anything about anything. When you query ChatGPT or any other GenAI engine, it isn’t digging into its own trove of wisdom to create an answer; it’s stringing together the best combination of characters, words, sentences and paragraphs to generate your desired output, with no consideration for what’s true or accurate and what’s not. 
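Here is a toy illustration of what "stringing together the best combination of words" means in practice. The probabilities are invented for this example; the point is that the model picks a likely-sounding next word, not a verified fact.

```python
import random

# Toy next-word probabilities for the prompt "The capital of Australia is".
# The numbers are invented for illustration; a real model learns them from text.
next_word_probs = {
    "Canberra":  0.55,  # correct
    "Sydney":    0.35,  # plausible-sounding but wrong
    "Melbourne": 0.10,  # plausible-sounding but wrong
}

# Sampling by probability: the model has no notion of "true", only "likely",
# so a fluent but wrong answer comes out a sizable fraction of the time.
words, weights = zip(*next_word_probs.items())
print(random.choices(words, weights=weights, k=1)[0])
```

Most runs of this toy example answer correctly, but a meaningful share of the time it confidently prints the wrong city, which is exactly the shape of a hallucination.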

AI is only as good as the information it’s trained on, and there’s a lot of questionable information that is published on the internet. That’s one of the reasons that early attempts to create social media AI bots often went awry—when you train a large language model on the interactions across a network like Twitter (now known as X), it picks up every comment and remark, the good, the bad and the ugly. 

AI won’t necessarily be able to tell the difference between satire and reality, opinion and news, or an academic journal and a religious text without some form of human intervention. 

Outside the hands of a creator who can guide and effectively query generative AI to create the desired output and then verify the results of their query, generative AI is like a brush without a painter or a cello without a cellist.  

Why It’s a Problem 

For one thing, hallucinations are disturbingly common in GenAI deployments. Chatbots can hallucinate in up to one in four answers to user queries, according to tracking published on GitHub. OpenAI’s GPT-4 hallucinated in 3% of its answers in that research. 

AI hallucinations have already caused some headline-grabbing havoc, from GenAI getting lawyers into hot water over legal briefs that cited hallucinated case law, to chatbots exposing companies to litigation risk after explaining non-existent policies to users. 

As long as hallucinations go unchecked, GenAI will suffer a trust deficit with consumers and businesses, and its adoption will be delayed in trust-based industries like finance and law. 

The Silver Lining 

Artificial intelligence hallucinations may also reveal vulnerabilities in companies’ existing deployments of GenAI technology and illustrate their readiness, or lack thereof, for adopting or expanding their use of AI.  

In a recent piece for Forbes, DataRobot CEO Debanjan Saha also illuminates the concept of synthetic creativity: sometimes, the unexpected result of an AI hallucination may be a new idea or combination of ideas. Just as human creatives find inspiration by manipulating information in their brains, artificial intelligence manipulates large amounts of data into new and different forms across its neural networks.  

Saha likens AI hallucinations to creative leaps and argues that, in some cases, hallucination should be embraced. 

The Bottom Line 

At this stage, few technologists, let alone business leaders and content creators, are ready to look at AI hallucinations as more of a creative tool than a nuisance or liability. 

Generative AI is a great timesaver, but there must still be a human intermediary to make sure that clients and consumers aren’t harmed by AI hallucinations. Hopefully, new technologies will help solve this problem and close the trust gap. 

In the months and years to come, AI hallucinations may become a creative tool to synthesize new ideas and paths of inquiry, and perhaps to create art and technologies yet unimagined by human minds.


DWN Staff & AI