Intelligent...ish #2
Large language models are often judged on complex benchmarks, but some of their most interesting failures show up on questions that seem trivial at first glance. In this article, we test a range of OpenAI models using a small set of deliberately easy questions, the kind that have a track
Claude Opus 4.5 has been drawing attention for its coding skills, so I decided to put it to the test on a problem I’ve struggled with before: getting an AI to write shaders that actually work. I was curious to see whether it could handle the challenge better
In a recent development out of Samsung’s AI lab in Montreal, researchers have introduced a new “Tiny Recursive Model” (TRM) that challenges the prevailing notion that more parameters = more intelligence.
Key ideas & claims
* Tiny footprint: TRM has only ~7 million parameters, orders of magnitude smaller than typical large