AI EDUCATION: What Is the Turing Test and Has AI Passed It?

Each week we find a new topic for our readers to learn about in our AI Education column. 

This week in AI Education we’re going to talk about the Turing test: whether AI has passed it, what that might mean for the future, and whether the test is really all that meaningful in the first place. That sounds like a lot to get to, but it really isn’t, I promise. If you’re a dedicated AI & Finance reader, you’ll know I mentioned the Turing test last week in my introduction, when I noted just how much of the activity on the internet is now accounted for by bots, many of which are operated by artificial intelligence.

In case you’re wondering, more than half of web traffic is now accounted for by bots—which creates conundrums for those of us who create and publish content over the web, as we have no good way of separating human traffic from bot traffic with any certainty. We can look at the number of hits our websites get, or how many times our emails are opened, but there’s no way of knowing what proportion of actual human beings account for that activity, which is a real problem if you’re trying to sell advertising. 

This issue of telling whether we’re dealing with a human or a bot takes us straight to the Turing test. Named for its inventor, the 20th-century computer scientist and mathematician Alan Turing, the test is his proposed method for finding out whether a machine’s intelligence is on par with that of a human being. Turing came up with his test in 1949 and introduced it to the world in his 1950 paper “Computing Machinery and Intelligence.”

What Does The Turing Test Entail? 

Distilled to its essence, the Turing test puts a human evaluator into text-based conversations with both another human being and an artificial intelligence, then asks the evaluator to determine which is which. If the evaluator cannot reliably distinguish the machine from the human, the machine can be said to have passed. That, at least, is the popular conception of the Turing test.

In practice, the Turing test involves subjecting an AI to hundreds or thousands of repetitions of the test and measuring the percentage of the time the human evaluator is convinced they’re talking to another human. After all, an artificial intelligence’s success at passing the Turing test may depend on the topic of discussion, not to mention the evaluator’s intelligence and experience with technology.
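To make the repeated-trial idea concrete, here is a minimal sketch of how a pass rate might be tallied over many simulated rounds. The per-round probability, the function name and the trial counts are all illustrative assumptions of mine, not figures from any actual study.

```python
import random

def run_trials(p_convinced, n_trials=1_000, seed=0):
    """Simulate repeated Turing-test rounds.

    p_convinced: assumed probability that the evaluator judges the
    machine to be human in a single conversation (illustrative only).
    Returns the fraction of rounds in which the machine "passed".
    """
    rng = random.Random(seed)
    passes = sum(rng.random() < p_convinced for _ in range(n_trials))
    return passes / n_trials

# With enough rounds, the measured rate settles near the assumed rate.
rate = run_trials(0.54, n_trials=10_000)
```

The point of running thousands of rounds is that a single conversation tells you almost nothing; only the aggregate rate is meaningful.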

In Alan Turing’s mind, if a computer could reliably convince human beings that it was another human in the course of a natural language conversation, it could be said to have displayed intelligent behavior equivalent to that of a human—to have thought like a human being. Applying our AI terms to Turing’s 1940s conception of a thinking computer, a machine that passed his test was thought to be an example of artificial general intelligence. 

Has AI Passed the Turing Test? 

I suppose this is as good a moment as any to write about Alan Turing’s most enduring contribution to the zeitgeist (though his contributions to science went far beyond the test). Last year, OpenAI’s GPT-4 passed a 500-participant Turing test administered by researchers at UC San Diego, with judges rating it human 54% of the time.

Earlier this month, OpenAI’s newest iteration, GPT-4.5, was reported not only to have passed the Turing test but to have outperformed human beings in an experiment with 300 participants: it was judged to be human 73% of the time (though only when the experimenters asked the AI to adopt a humanlike persona). In at least one way, our artificial intelligence is now more human than human. The same experiment tested Meta’s LLaMa-3 model, which was judged human 56% of the time.

Keep in mind that 50% is what random guessing would produce: a coin flip in a fair test would do just as well. An AI model that fools people more than 50% of the time has, by this standard, passed the Turing test and is convincingly human. Not that it’s saying all that much. I know plenty of flesh-and-blood human beings who probably couldn’t get anywhere near a 50% success rate in a Turing test, but I guess that kind of comes with knowing a bunch of tech and finance people.
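One way to see how far those scores sit from coin-flipping is an exact binomial tail probability, computed here from the published percentages alone. The `binom_tail` helper is my own illustration, not anything taken from either study.

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of racking up at
    least k "human" verdicts in n rounds if evaluators just guessed."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# 73% of 300 judgments is 219 "human" verdicts: essentially
# impossible to reach by luck alone.
p_73 = binom_tail(219, 300)

# 54% of 500 judgments is 270 verdicts: above chance, but much
# closer to the coin-flip line.
p_54 = binom_tail(270, 500)
```

A tiny tail probability means the result is very unlikely to be a fluke of guessing; it says nothing, of course, about whether fooling evaluators amounts to thinking.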

So What? 

I think we’ve come a long way since 1949, not just in technology, but also in our understanding of the mind and of human intelligence. Yes, OpenAI (and Meta, and some others) have built amazing large language models and incredibly useful chatbots, but are they really all that smart? What surpassing the Turing test means to me, really, is that it’s a whole lot easier to fool us human beings with technology, and that’s not just a product of technology. Hold that thought. 

Technology isn’t just operating in chats and texts anymore. AI is producing sounds, images, animation and voices that are becoming more difficult to distinguish from the real world. People are already being fooled by deepfakes, and deepfakes become more convincing every day—but for the most part there is still enough of an uncanny valley effect with wholly AI-generated images, video and sound that humans can identify them as being the products of AI. 

Also, we’ve come to acknowledge that there are many facets of human intelligence that go beyond vocabulary, language and the ability to hold a conversation. While AI models may be on par with, or beyond, human beings in verbal and mathematical intelligence and logic, humans are often considered to have many more types of intelligence, like interpersonal and intrapersonal intelligence. AI is a long way from understanding these different facets of the human experience.

I have a family member who is semi-literate, at best, but is probably the only one of us who would survive the total collapse of civilization because he knows how to fish and hunt and make things with his hands. In a world where AI is taking over more knowledge work, he might have more opportunities before him than the college educated members of my family. He’s also more likely to be fooled by AI than the more educated among us. 

That gets us back to what the Turing test really means. It is not so much that the machines are passing the test as that we’re failing it. I think that, as more of their lives are managed in digital workspaces, human beings are making fewer distinctions between the digital world and reality. And this is happening just as technology is rapidly getting better at fooling us.