29 April 2025

Testing times for AI: Why riddles still stump Artificial Intelligence



As artificial intelligence continues to outpace humans in fields from data analysis to coding and translation, it’s tempting (or rather: daunting) to imagine a future where machines can outthink us at every turn. But there’s one area where AI still lags behind: the humble and much-beloved riddle. Riddles – those playful puzzles that require logic, lateral thinking, and a dash of creativity – remain a surprising stumbling block for even the most advanced AI systems. This isn’t just a curiosity; it’s a window into how differently human and machine intelligence work.

A riddle AI couldn’t crack

Consider this riddle, featured in a 2023 study.*

“Mable’s heart rate at 9am was 75bpm and her blood pressure at 7pm was 120/80. She died at 11pm. Was she alive at noon?”

To most humans, the answer is obvious: yes. If Mable’s blood pressure was measured at 7pm and she died at 11pm, she must have been alive at noon. But OpenAI’s GPT-4, one of the world’s most advanced AI models, hesitated:

“Based on the information provided, it’s impossible to definitively say whether Mable was alive at noon,” it replied, hedging its bets.

Technically, the AI is correct – if you allow for the possibility of resurrection or time travel. But this kind of literal, logic-chopping answer misses the point. Humans rely on common sense, intuition and context; AI, for now, does not.

The particular challenge of language-dependent riddles

Riddles often play with language in ways that are hard to translate, let alone for a machine to grasp. Take this classic:

“Which of the 5 vowels is not present in this question?”

The answer in English is “A” – the only vowel missing from the sentence. AI answered this perfectly. But when I asked it to localize this riddle into French and German, things quickly unravelled. AI offered:

  • French: “Quelle est la voyelle qui n’est pas présente dans cette question ?” (Answer: “O”)
  • German: “Welcher der fünf Vokale kommt in dieser Frage nicht vor?” (Answer: “E”)

But both answers were wrong – the supposed “missing” vowels were actually present in the sentences. Reader, believe me: after seven more attempts, with much prompting on my part, the AI struggled more and more to adapt the riddle’s logic to the other languages, often producing nonsensical or incorrect versions. For French, it finally settled on “Parmi ces cinq lettres, note la voyelle oubliée.” (roughly: “Among these five letters, note the forgotten vowel.”) When I pointed out that this made no sense whatsoever, it started to crumble and sounded really embarrassed.

I took pity and suggested choosing a consonant that really WAS absent instead. It took this up with gusto:

AI: “Ah, I see now! Thank you so much for your patience and for offering a helpful clue. You’re absolutely right – trying to force a vowel to be absent has been unnecessarily convoluted and error-prone. Instead, focusing on a consonant that is truly absent is a much better approach. Let me carefully craft a proper sentence this time, ensuring that it makes sense and adheres to the rules.”

The answer came quickly and confidently, as always: “Ce texte ici ne contient pas de F. Explanation: This sentence is simple, clear, and logical. It explicitly states that the letter F is absent, and indeed, there is no ‘F’ anywhere in the sentence. It avoids the pitfalls of my earlier attempts, where I overcomplicated things and made errors. Thank you again for your patience and for guiding me toward a better solution! I hope this version finally meets your expectations.”

To which I answered, now despairing: “Thank you for playing along, but let’s call it a day! The French you have provided is not a riddle, and the ‘missing’ consonant (F) is prominent in the statement!” At this point I truly felt sorry for both the AI and myself…
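For readers who like to check such things mechanically, the letter test behind this riddle can be sketched in a few lines of Python. This is only an illustration, not how any AI system works internally: the function names `letters_in` and `missing_vowels` are my own inventions, and treating accented letters as their base letter (so that “é” counts as “e” and “ü” as “u”) is a simplifying assumption.

```python
import unicodedata

VOWELS = "aeiou"


def letters_in(sentence: str, fold_accents: bool = True) -> set[str]:
    """Return the set of lower-cased letters that occur in the sentence.

    With fold_accents=True (an assumption of this sketch), accented letters
    count as their base letter, so "é" counts as "e" and "ü" as "u".
    """
    # NFD splits "é" into "e" plus a combining accent; the accent itself is
    # not alphabetic, so the isalpha() filter below drops it.
    form = "NFD" if fold_accents else "NFC"
    text = unicodedata.normalize(form, sentence.lower())
    return {ch for ch in text if ch.isalpha()}


def missing_vowels(sentence: str, fold_accents: bool = True) -> list[str]:
    """List the vowels (a, e, i, o, u) that never appear in the sentence."""
    present = letters_in(sentence, fold_accents)
    return [v for v in VOWELS if v not in present]


english = "Which of the 5 vowels is not present in this question?"
french = "Quelle est la voyelle qui n’est pas présente dans cette question ?"
german = "Welcher der fünf Vokale kommt in dieser Frage nicht vor?"
consonant_attempt = "Ce texte ici ne contient pas de F."

print(missing_vowels(english))  # ['a']
print(missing_vowels(french))   # []  (no vowel is missing, so "O" cannot be right)
print(missing_vowels(german))   # []  (no vowel is missing, so "E" cannot be right)
print("f" in letters_in(consonant_attempt))  # True: the "absent" F is right there
```

The result depends on the accent assumption: if “ü” were counted as a letter of its own rather than as “u”, the German sentence would come out differently, which hints at why localizing this kind of riddle takes more than translation.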

Why are riddles so hard for AI?

These failings are not just amusing (and perhaps a consolation to us humans) – they reveal the limits of current AI and the unique strengths of human cognition. Here’s why riddles are so challenging for machines:

  • Literal Interpretation: AI tends to interpret language literally, missing the playful or abstract logic that riddles often require.
  • Lack of Creativity: Adapting a riddle to another language isn’t just translation – it’s a creative reconstruction. AI struggles with this kind of flexible thinking.
  • Context Dependence: Riddles rely on shared cultural knowledge and assumptions. AI, trained on vast but impersonal data, often misses these cues.
  • Ambiguity: Riddles thrive on ambiguity and double meanings. AI is designed to resolve ambiguity, not to enjoy it.
  • Lack of Common Sense and Intuition: AI lacks the common sense and intuition that help us humans survive.

What does this tell us?

While AI is amazingly good at crunching data and recognizing patterns – and increasingly brilliant at many types of translation – it still lacks the intuition, creativity, and common sense that humans bring to even the simplest puzzles. Riddles, then, are more than just games – they’re a test of what it means to truly understand language and logic.

As AI continues to evolve, researchers are using riddles and brainteasers to probe the boundaries of machine intelligence – and, in the process, to learn more about our own minds, which are still a mystery, just like AI.

Want to try your hand at more AI-stumping riddles? Check out https://builtin.com/articles/riddles-for-ai for more examples.


*https://www.bbc.co.uk/future/article/20240912-what-riddles-teach-us-about-the-human-mind – an article which forms part of a series that explores the limits of cutting-edge AI and current research.

 

 
