AI Hallucinates and Invents Facts

Do you trust this AI?

The latest AI language models, such as ChatGPT, can generate high-quality content faster than any human, which is why we use them.

However, you should be aware that not everything an AI says can be trusted, even if it responds in a particularly convincing manner.

AI models have a tendency to hallucinate: They can invent their own facts, arguments, and citations – without basis in the dataset they were trained on. Therefore, you should always validate the content an AI generates before using it, as it can harm others, or the company you work for – and thus yourself.

For that reason, OpenAI has placed a discreet warning below the input field in the desktop version of ChatGPT, though unfortunately not in the mobile app:

ChatGPT may produce inaccurate information about people, places, or facts.

The warning is written in a small, faint gray font. One can only hope that all ChatGPT users notice it and understand its implications.

The issue of hallucination is far from unique to ChatGPT; it is a fundamental problem with AI.

When your lawyer uses AI, and your AI loves you

On May 27, 2023, The New York Times published the frightening story of the lawsuit Roberto Mata v. Avianca Inc., in which Mata’s lawyer uncritically used ChatGPT to prepare the case against the airline. The problem was that ChatGPT had hallucinated and completely made up a series of earlier lawsuits, which were then cited as precedents in the current case. The lawyer defended himself by saying that he had never used ChatGPT before and did not know that its answers could be incorrect.

Unfortunately, I fear he is far from alone. In a moment of inattention, laziness, or haste, any of us might fail to validate an AI’s answers properly. We may also simply lack the expertise to assess the answers, or the capacity to analyze a task that is too large for any single individual.

There is another story that is rather amusing but also deeply disturbing. If you are not already familiar with it, I recommend reading New York Times journalist Kevin Roose’s account of how he got Microsoft’s Bing chatbot to hallucinate wildly, until it declared its love for him and encouraged him to leave his wife for it: “A Conversation With Bing’s Chatbot Left Me Deeply Unsettled” (February 16, 2023).

It seems unbelievable, but if you can so easily get an AI to do this, what else can you get it to do?

Would you read an entire book to check if your AI is mistaken?

The blame for this problem of hallucinations cannot simply be laid on us users. Of course, we have the responsibility of using the necessary methods of source criticism, but hallucinations are first and foremost an inherent problem with AI:

  • At the heart of an AI is a gigantic statistical model. It does not actually understand the words you write to it. Words and fragments of words are represented with numbers, so it can calculate the likelihood of what the next word or fragment of word should be. And it usually does this frighteningly well. The New York Times has a nice article with animations that show how a language model like ChatGPT guesses the next word in its response based on your conversation with it.
  • Even though an AI is trained on vast amounts of text, including material from the internet, there will still be gaps in its knowledge. The texts also vary in quality and in how subjective they are. Both factors increase the risk of hallucinations.
  • There is, furthermore, a problem in the very way an AI is trained. This process includes a form of human supervision, in which people rate the answers an AI produces and give feedback to improve the quality of its responses (for ChatGPT, this step is known as reinforcement learning from human feedback). But what is the objectively right answer? Is there even such a thing? And are the humans who have to assess this capable of answering that themselves? There may be right and wrong answers to many mathematical problems, but much knowledge is subjective and open to interpretation.
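
To make the first point above more concrete, here is a minimal sketch in Python of next-word prediction. It is a toy model that only counts which word follows which, not how ChatGPT is actually built: real language models use neural networks over word fragments and far more data. The training text and all function names are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy training text; a real model is trained on vastly more data.
TRAINING_TEXT = (
    "the court cited the case . "
    "the court dismissed the case . "
    "the lawyer cited the ruling ."
)


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.split()
    follower_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follower_counts[current_word][next_word] += 1
    return follower_counts


def next_word_probabilities(model, word):
    """Turn the follower counts for `word` into next-word probabilities."""
    counts = model.get(word, Counter())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {candidate: count / total for candidate, count in counts.items()}


def sentence_probability(model, sentence):
    """Multiply the next-word probabilities along a whole sentence."""
    words = sentence.split()
    probability = 1.0
    for current_word, next_word in zip(words, words[1:]):
        step = next_word_probabilities(model, current_word)
        probability *= step.get(next_word, 0.0)
    return probability


if __name__ == "__main__":
    model = train_bigram_model(TRAINING_TEXT)

    # How likely is each word to follow "the"?
    print(next_word_probabilities(model, "the"))

    # This sentence never appears in the training text ...
    unseen = "the lawyer cited the case ."
    print(unseen in TRAINING_TEXT)

    # ... yet the model still gives it a probability above zero,
    # because every individual word-to-word step was seen somewhere.
    print(sentence_probability(model, unseen))
```

The point of the sketch is the last step: the toy model considers "the lawyer cited the case" a plausible sentence even though nothing in its training text says so. It only knows which words tend to follow each other, not which statements are true.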

In the TED Talk “The Inside Story of ChatGPT’s Astonishing Potential,” OpenAI co-founder Greg Brockman humorously and honestly illustrates how hard it is to check the quality of an AI’s answers when humans supervise its training:

But even summarizing a book, like, that’s a hard thing to supervise. Like, how do you know if this book summary is any good? You have to read the whole book. No one wants to do that.

At which point the audience bursts out laughing.

The exact same problem applies to us users. Marcel Proust’s novel “In Search of Lost Time” contains approximately 1.2 million words. Who would read the book to find out if the AI’s summary is accurate? One might argue that this is not fundamentally different from two literary scholars each writing a summary: their summaries will also differ, because they interpret the work differently.

But there’s a significant difference here: There is a named author behind these summaries. With an AI, we do not know where it got its information, and whether it is something it made up itself. The story about the lawyer shows us how wrong it can go when one does not know how to use an AI.

Will the issue with AI hallucination be solved?

According to an article in Fortune (April 17, 2023), Google CEO Sundar Pichai stated in the 60 Minutes program that AI hallucination is expected:

No one in the field has yet solved the hallucination problems. All models do have this as an issue.

And when asked if the problem will be solved in the future, he was quoted as saying that it is a subject of intense debate, but that his team will “eventually” “make progress.”

The debate is so intense because there is great disagreement about whether the problem can be solved at all. And when Sundar Pichai does not give a simple yes or no answer to whether the problem will be solved in the future, we should probably prepare to live with it for a long time.

AI hallucination is a risk for companies

Can an AI be held accountable for lying, or will the responsibility always fall on us users, as in the story of the lawyer? For now it falls on us, which means that we, as users of AI, are often faced with the difficult, or even impossible, task of fact-checking the answers we receive from our AI.

If we do not fact-check, we can easily damage both the reputation and the earnings of the business we work for, like the lawyer in the story above, whose irresponsible use of ChatGPT became a global news story.

Does that mean we should refrain from using AI? No, absolutely not. We should reap all the benefits AI can offer us, but we all need to learn to use this technology appropriately and responsibly.

And if you do not already have a strategy for the application of AI at your workplace, you should create one as soon as possible, and preferably before your competitors.

Jakob Styrup Brodersen

I have worked with data-driven online optimization for 20 years in 5 different industries. Now, I am a freelance CRO and AI consultant: I teach and advise on how to utilize the benefits of AI, and I do prompt engineering.