
Friday Squid Blogging: Live Colossal Squid Filmed – Source: www.schneier.com

Source: www.schneier.com – Author: Bruce Schneier

Clive Robinson April 18, 2025 8:55 PM

@ ALL,

Over at OpenAI, the new models try to reason more but just hallucinate more.

The link says nearly all you need to know,

https://techcrunch.com/2025/04/18/openais-new-reasoning-ai-models-hallucinate-more/

And it gets amplified in the first paragraph with,

“OpenAI’s recently launched o3 and o4-mini AI models are state-of-the-art in many respects. However, the new models still hallucinate, or make things up — in fact, they hallucinate more than several of OpenAI’s older models.”

So not just new “soft bullshit”, but more of it, by at least the bucket load.

Potential users and investors take note of,

“Perhaps more concerning, the ChatGPT maker doesn’t really know why it’s happening.”

Funny that; perhaps as it’s “soft bullshit”, it’s down to the diet it’s been fed[1].

But more seriously, to “reason”, current AI LLM and ML systems need not just to understand “context” but also curated input.

As I’ve indicated before, the current AI LLM and ML systems

1, Do not have context or the agency to learn it.
2, Are fed input that is generally volume over quality, so the effective signal-to-noise ratio may well be negative (see the toy sketch just below).
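
As a toy illustration of that second point (a rough sketch with made-up token counts, not a description of anything OpenAI actually does): treat the curated data as the “signal” and the uncurated bulk scrapings as the “noise”, and the ratio in decibels goes negative as soon as the noise power exceeds the signal power.

```python
import math

def snr_db(signal_power: float, noise_power: float) -> float:
    """Signal-to-noise ratio in decibels; negative once noise dominates."""
    return 10 * math.log10(signal_power / noise_power)

# Hypothetical corpus mix: a fixed amount of curated "signal" data,
# plus ever more uncurated "noise" scraped purely for volume.
curated_tokens = 1e9
for uncurated_tokens in (1e8, 1e9, 1e10, 1e11):
    print(f"{uncurated_tokens:.0e} uncurated tokens -> "
          f"{snr_db(curated_tokens, uncurated_tokens):+.1f} dB")
# Prints +10.0, +0.0, -10.0 and -20.0 dB: past parity, more volume only
# pushes the effective signal-to-noise further below zero.
```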

Put simply, if you look at a lot of apparently random pictures and are then asked “what do you see?”, you first have to understand the context of the whole image in some detail before you start listing features.

Humans usually have implicit context and thus recognise “key indicators” to base their reasoning on. As such, humans will ignore sky, clouds and other “background”, and thus lift the effective signal of the foreground features from which they build context.

An AI mostly cannot tell foreground from background features, so the signal of the foreground context-identifying features is near or below the noise floor. Thus it has less ability to identify context and to move forward to “valid” reasoning.
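
To make the noise-floor point concrete, here is a minimal sketch (the image, the patch size and the amplitudes are all invented for illustration): a small patch of “foreground” pixels carries the context, but averaged naively over the whole frame its contribution nearly vanishes, whereas an implicit foreground mask, of the kind humans apply without thinking, lifts it well clear of the background.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 100x100 "image": background clutter everywhere, plus a faint 5x5
# foreground patch that carries the context-identifying features.
image = rng.normal(loc=0.0, scale=1.0, size=(100, 100))  # background noise
image[40:45, 40:45] += 3.0                                # foreground feature

foreground_mask = np.zeros(image.shape, dtype=bool)
foreground_mask[40:45, 40:45] = True

# Naive view (no notion of foreground): the 25 feature pixels are diluted
# across 10,000 pixels and sit near the noise floor.
print("whole-image mean:", image.mean())

# "Human" view: implicit context supplies a foreground mask, which lifts
# the effective signal well above the background.
print("foreground mean: ", image[foreground_mask].mean())
print("background mean: ", image[~foreground_mask].mean())
```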

If the AI had the agency to move, and stereo cameras, then in a real-world setting it might acquire a sense of image depth and go on from there to identify some kind of context.

It’s one of the reasons human labour has to be used when the AI “reasons” about whales from the clouds or sea state in a picture… something that humans just “blank out” without thought.

Hence you see,

“Our hypothesis is that the kind of reinforcement learning used for o-series models may amplify issues that are usually mitigated (but not fully erased) by standard post-training pipelines,”

Maybe they should consider a change of strategy away from,

1, Move fast and break things.
2, Load in more uncurated input data.

As a starting point. But they don’t want to think about that, as shown by,

“One promising approach to boosting the accuracy of models is giving them web search capabilities.”

I suspect all readers of this blog know just how “uncurated” and mostly “inaccurate” the web is… And how AI itself is polluting it further, not just decreasing the signal but distorting it as well…

I could go on, but I’ll let folks read the article and report for themselves.

But as a thought exercise,

“How long did it take primates to go from using fire to cook to having functioning chemistry labs?”

[1] For those that have never raised “livestock”, especially cattle: they are not great in the brains dept. So if you just turn them out onto pasture there is a good chance they will eat something windborne-seeded that will make them sick, dead, or both in quick succession. Thus a stockman has to walk the pasture with an open eye to check there is nothing poisonous that could be consumed.

Original Post URL: https://www.schneier.com/blog/archives/2025/04/friday-squid-blogging-live-colossal-squid-filmed.html

Category & Tags: Uncategorized,squid – Uncategorized,squid
