CTO AI Corner: Sometimes GenAI sucks

With all the success stories and hype surrounding AI, we often forget that it doesn't magically solve everything. In fact, it’s quite bad at a few things. Those of us who’ve spent enough time wrestling with AI have learned to spot the traps or work around them. But for new users, there are some common scenarios where AI fails spectacularly. Hitting one of these early on can leave you questioning why anyone thought this stuff was useful in the first place.

Where AI fails

First up: handling lists in ChatGPT or almost any other chat service. Say your job is to make a small change to a list with a few hundred rows. Sounds like a perfect AI task, right? Wrong. This is exactly where GenAI flops. It will return only some of the rows. It will make mistakes. It will silently drop things. And this isn’t even surprising: the output token limit means the model can’t always include everything, and the probability of small errors compounds fast across hundreds of items. Models equipped with tools or extended reasoning modes do a bit better, but I haven’t found one I would blindly trust with list manipulation without serious babysitting.
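
The compounding is easy to see with a little arithmetic. Here's a minimal Python sketch, assuming a purely hypothetical 0.5% chance that the model mangles any single row — the rate is made up for illustration; the shape of the curve is the point:

    # Probability that at least one row comes back wrong,
    # given a hypothetical per-row error rate of 0.5%.
    p_row_error = 0.005

    for rows in (10, 100, 300):
        p_clean = (1 - p_row_error) ** rows
        print(f"{rows:>3} rows -> P(at least one bad row) = {1 - p_clean:.0%}")

    # Output:
    #  10 rows -> P(at least one bad row) = 5%
    # 100 rows -> P(at least one bad row) = 39%
    # 300 rows -> P(at least one bad row) = 78%

Even a per-row error rate that sounds negligible makes a clean few-hundred-row result the exception, not the rule.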

Another problem: integrating new information into existing knowledge. Models can summarize webpages or search the internet, but give them genuinely new information and ask them to reason with it intelligently? Good luck. Often the results are... creatively incorrect. Again, this makes sense: the model's "understanding" was baked in during training on massive amounts of pre-existing data. The new info you feed it is treated as just more context on top, not a revision of that understanding.

Then there's estimation. Sure, AI models are better at math now. They can solve surprisingly tough problems and even use external tools. But when it comes to translating a real-world problem into the math itself, they often trip over their own virtual feet. They have little sense of scale or relevance, and they rarely sanity-check their numbers unless it's a textbook case they've seen a million times.
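
For contrast, here's the kind of sanity check humans do almost unconsciously, sketched in Python. Every figure is an assumed, illustrative value: estimate a quantity, then bound it before trusting it.

    # Back-of-the-envelope check for a city's daily water use.
    # All numbers below are made-up, illustrative assumptions.
    population = 600_000                  # assumed city size
    liters_per_person_per_day = 150       # rough household figure

    estimate = population * liters_per_person_per_day

    # Scale check: most cities land somewhere in 50-500 L/person/day,
    # so anything outside these bounds should trigger suspicion.
    low, high = population * 50, population * 500
    assert low <= estimate <= high, "estimate fails the scale check"

    print(f"~{estimate / 1e6:.0f} million liters per day")  # ~90

A model will happily report an answer that's off by three orders of magnitude with the same confidence; that bounding step is exactly the part it tends to skip.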

And of course, image models have their own party tricks. They don't understand relative sizes. They don’t get that you can’t walk through glass, even if it's transparent. Gravity? Never heard of it. But again, what do you expect? They're guessing what pixels should go next to each other based on your words, not thinking about how objects actually behave. Newer omni models can analyze and modify parts of images more intelligently, but that doesn’t mean the generation process involves any real object awareness.

May 5, 2025
Author
Tomi Leppälahti

Pondering AI questions? Leave us a message and let's map out together how and where to put AI to use.
