2 votes

o3 - wow

2 comments

  1. Amarok

    Author's summary:

    o3 isn’t one of the biggest developments in AI for 2+ years because it beats a particular benchmark. It is so because it demonstrates a reusable technique through which almost any benchmark could fall, and at short notice. I’ll cover all the highlights, benchmarks broken, and what comes next. Plus, the costs OpenAI didn’t want us to know, Genesis, ARC-AGI 2, Gemini-Thinking, and much more.

    All the talk about AI models 'hitting a wall' and slowing down has been vaporized - in fact, we don't appear to be approaching any walls at all. That was just wishful thinking on the part of some researchers, and it has now been demonstrated as false. This model, in a nutshell, can beat subject-matter-expert humans at anything you can create a benchmark to measure, with only a few holdouts like spatial reasoning and other niche areas whose data is still too 'soft' (for now) to succumb to rigorous reasoning steps. The o3 model does this at an absurd cost in compute and electricity, but in minutes it still solves what would take any human expert days to work through. Results are well into the 90th percentile now on even the most grueling math and physics thrown at it.

    It's not quite an AGI but it is blurring the line and closing in on that goal faster than any experts or futurists (including the crazy ones) ever predicted it could. There are still no brakes on this train, and safety is becoming a very real concern with increasing urgency. I look forward to learning how to install a nuclear reactor in my data centers just to power these things.

    2 votes
  2. Wes

    AI Explained is always a good source for keeping up with the latest in this field. His videos are well-researched and clear.

    It's incredible how these models keep leapfrogging each other. Every time it seems like progress is slowing, we see new techniques emerge like mixture of experts, multimodality training, and now chain of thought. The capabilities continue to improve, and benchmarks are being quickly obsoleted.
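The core idea behind chain of thought is simple enough to sketch: instead of asking a model for an answer directly, you prompt it to emit intermediate reasoning steps first, then extract the final answer from the completion. This is a minimal, hypothetical illustration of that prompting pattern - the prompt wording and the `Answer:` convention are my own, not OpenAI's actual o3 pipeline, which is not public.

```python
# Hypothetical sketch of chain-of-thought prompting: ask the model to
# reason step by step, then parse its final answer out of the completion.
# The prompt format and "Answer:" marker are illustrative assumptions.

def build_cot_prompt(question: str) -> str:
    """Wrap a question so the model reasons before answering."""
    return (
        f"Question: {question}\n"
        "Think through the problem step by step, then give the final "
        "answer on a line starting with 'Answer:'."
    )

def parse_final_answer(completion: str) -> str:
    """Pull the final answer line out of a chain-of-thought completion."""
    for line in completion.splitlines():
        if line.startswith("Answer:"):
            return line[len("Answer:"):].strip()
    return completion.strip()  # fall back to the raw completion

# A hand-written completion standing in for a real model response:
completion = "17 x 3 = 51, and 51 + 9 = 60.\nAnswer: 60"
print(parse_final_answer(completion))  # prints "60"
```

The point of the pattern is that the tokens spent on intermediate steps give the model room to compute, which is also why inference cost scales with how long it "thinks".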

    This is the first model to show serious adaptation of its training to new, unseen material, which is closer to rational thinking than anything we've seen before. It seems we really are getting to the point of needing to consider what AGI actually means.

    I look forward to seeing new quantization and optimization techniques explored for chain of thought reasoning. It should be possible to reduce inference costs to more reasonable levels, even on weaker hardware. Like MoE, it would just need to run serially, and not in parallel.
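Quantization is one of the more concrete levers for cutting that inference cost. As a rough sketch of what it buys you (not tied to any particular model), symmetric int8 weight quantization stores each tensor as 8-bit integers plus a single float scale, cutting memory roughly 4x versus float32 at a small accuracy cost:

```python
# Minimal sketch of symmetric per-tensor int8 weight quantization.
# Illustrative only; real schemes use per-channel scales, calibration, etc.
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float weights onto int8 with one per-tensor scale."""
    scale = float(np.abs(w).max()) / 127.0
    if scale == 0.0:
        scale = 1.0  # avoid dividing by zero for an all-zero tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for inference."""
    return q.astype(np.float32) * scale

w = np.array([0.51, -1.27, 0.002, 0.89], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Rounding error per weight is bounded by scale / 2 (about 0.005 here).
print(np.max(np.abs(w - w_hat)))
```

The same memory savings apply whether the weights feed a single forward pass or a long serial chain of reasoning steps, which is why cheaper weights translate directly into cheaper chain-of-thought inference.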