34 votes

OpenAI, Google and Anthropic are struggling to build more advanced AI

34 comments

  1. [29]
    creesch
    (edited )
    Link

    Ah, so the sky isn't the limit? Or maybe they have sucked up all the atmosphere (training data), so the sky is a hard limit?

    This could be a temporary bump in the road. But, given how iterative most LLM development seems to have been, I personally think we might have reached a plateau in what can be achieved with these types of models. At the very least, it is becoming clear they are not improving at the rate the hype suggests, though that shouldn't come as a surprise to anyone who has paid attention to the actual improvements between GPT-4 versions.

    Edit:

    Another thought I had. If all the training data in the world is not enough to create the next step forward, then maybe these models are indeed not the way forward for anything remotely resembling actual intelligence and, more importantly, the ability to improvise. This should also not be a surprise for most people who have paid attention.
    To be clear, because I know people will misread what I wrote, that doesn't make these models useless. I quite enjoy my fancy rubber ducky that can actually talk back, and pitching two models against each other. They're like the juniors our team needs but never gets, while also not having to be gentle with their emotions when the code sucks ;). But at the end of the day these models are just tools, and like all tools they have very clear limitations.
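
    For the curious, a minimal sketch of what pitting two models against each other can look like, assuming an OpenAI-compatible client; the model names and prompts are placeholders, not a prescription:

    ```python
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def review(code: str, model: str) -> str:
        """Ask one model for a blunt review of a code snippet."""
        resp = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": "You are a blunt senior code reviewer."},
                {"role": "user", "content": f"Review this code:\n\n{code}"},
            ],
        )
        return resp.choices[0].message.content

    snippet = "def add(a, b): return a - b"
    first_opinion = review(snippet, "gpt-4o-mini")  # placeholder model names
    second_opinion = review(
        f"{snippet}\n\nAnother reviewer said:\n{first_opinion}\n\nDo you agree?",
        "gpt-4o",
    )
    print(second_opinion)
    ```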

    36 votes
    1. [11]
      Habituallytired
      Link Parent

      Thank you so much for spelling out what I've struggled to express to people. As someone who will forever hate GenAI, I have no problem with AI in tech or with using it to make my administrative work easier, but I can always tell when something is AI, and I genuinely hope that continues to be the case. For one, because it doesn't really feel like "intelligence" so much as a really good compilation of everything we've learned thus far. For another, I think people need to understand that there will always need to be human intervention, for the sake of community and society.

      15 votes
      1. [2]
        teaearlgraycold
        Link Parent

        Sorry to use a gross analogy, but I think you’re falling for the selection bias that transphobes use when they say they can “always tell”. They can tell until they realize that cis people cover a wide range and trans people can pass really well.

        16 votes
        1. WeAreWaves
          Link Parent

          This is known as the toupee fallacy if you want an example that doesn’t involve transphobia.

          9 votes
      2. creesch
        Link Parent

        but I can always tell when something is AI

        I have come around to the view that it is possible you can't always tell. However, that also means that someone has done all the work to double-check the output, rewrite/rework bits of it, etc., effectively supplying the human intervention. Which also brings these models back to the realm of simply being tooling, and not standalone miracle workers.

        15 votes
      3. [4]
        raze2012
        Link Parent

        It's a shame, because I don't actually hate AI itself; I hate product teams that keep insisting it needs to replace labor instead of making existing talent more efficient. I feel like even 10 years ago, a technology like this would have had at least 30% of the industry using it as the next step in automation.

        How things shift so quickly...

        7 votes
        1. [2]
          skybrian
          Link Parent

          I think it's only partially boosters. It's also critics who talk about what they fear. Sometimes it seems like the critics are boosting the "AI will take our jobs" meme more than the boosters.

          Hype is kind of like that - everyone makes a lot of noise, boosters, skeptics, and everyone else who wants to talk about it.

          2 votes
          1. raze2012
            Link Parent

            Some of it is fluff, yes. But I'm already seeing the effects here and there. Some friends were laid off under this "shift towards AI" initiative (insert the prettified way to say how this benefits workers when it doesn't), I'm having to talk to bots more for customer service, and animation studios are cutting people in the hope that a mix of outsourcing and LLM tech will produce cheaper movies. AI-generated slop has completely poisoned most asset stores for professional development (even if it looks good, it's a copyright nightmare to use it).

            It may all revert in a few years, but this hype is definitely going to hurt a lot of people who otherwise were doing fine jobs.

            3 votes
        2. Fiachra
          Link Parent

          I suspect it isn't the boosters exactly: when the supporters have too high a ratio of salesman/influencer types, you intuitively sense a scam.

          1 vote
      4. [3]
        trim
        Link Parent

        I can always tell when something is AI

        Some of the AI voice impersonations are incredibly accurate. I've put myself through some blind tests and had no idea which was which. I have no doubt I could be completely fooled by an AI voice in the right circumstances.

        3 votes
        1. [2]
          Habituallytired
          Link Parent

          Sorry, I should have said something written or an AI image.

          1. lintful
            (edited )
            Link Parent

            LLMs can respond with text of almost any length, style, and word choice given the right prompt, though. You can even ask them to mess up the grammar and spelling, something they are otherwise nearly flawless at.

            Also, most people only have experience with "instruct" models trained for conversation, but base models simply continue a given text, making it even lower effort to hide in plain sight.

            A consequence of this is that any "AI detection" software is snake oil. It only catches the lowest-effort stuff, and false positives and negatives are unavoidable.
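
            As a minimal illustration of that kind of prompt (an OpenAI-compatible client is assumed; the model name is a placeholder):

            ```python
            from openai import OpenAI

            client = OpenAI()  # assumes OPENAI_API_KEY is set

            # Illustrative only: ask the model to break its usual near-perfect
            # spelling and grammar so the output reads like a hurried human.
            resp = client.chat.completions.create(
                model="gpt-4o-mini",  # placeholder model name
                messages=[{
                    "role": "user",
                    "content": "Reply to this post in two casual sentences, "
                               "with one typo and one small grammar slip: "
                               "'Anyone else think the new update is worse?'",
                }],
            )
            print(resp.choices[0].message.content)
            ```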

            7 votes
    2. [5]
      Wolf_359
      (edited )
      Link Parent

      I think we are discovering that consciousness is exactly as complicated as we thought it was (before LLMs maybe gave some people false hope).

      The thing is, I don't think they're off base in thinking this is a step toward AGI or a fully sapient machine.

      Instead, an LLM is probably just one of the many components that will be required to create this artificial mind, much like our own brains have multiple different parts with distinct functions which only coalesce into consciousness when they are able to communicate and interact with one another.

      Audio/visual sensors, access to a calculator, access to memory, some form of LLM, and probably a hundred other components we haven't figured out yet may very well create consciousness someday.

      Edit: Just realized I forgot to add that I actually think our brains have LLM-like components in them already. If you really examine a lot of the autopilot conversations you have, you might come to the same conclusion. Now, our personal observations with regard to consciousness could be absolutely meaningless. It's possible that the best case scenario is that we can't escape our own biases and poor interpretations. It's hard to be objective about subjective experiences and our memories are notoriously unreliable on top of that. But worst case? Consciousness could literally just be an illusion and we may not be able to reflect on it at all.

      Anyway, the fact that I have ADHD might be relevant here. I process information incredibly quickly and often respond on autopilot - usually to my own detriment, lol.

      7 votes
      1. creesch
        Link Parent

        Yeah, it could be that LLMs or some future derivative are part of what makes an actual consciousness. For now, though, LLMs as a standalone technology seem to have hit a bit of a ceiling. Whether it is just a speed bump or a plateau remains to be seen, although I personally am leaning towards the latter.

        And again, that doesn't mean they are useless; I greatly enjoy using LLMs as part of the toolset available to me.

      2. [3]
        R3qn65
        Link Parent

        Instead, an LLM is probably just one of the many components that will be required to create this artificial mind, much like our own brains have multiple different parts with distinct functions which only coalesce into consciousness when they are able to communicate and interact with one another.

        I genuinely haven't heard this take before and it's very interesting.

        1. [2]
          Wolf_359
          Link Parent

          Oh man... if that intrigues you, you should really take a dive into this:

          The Split Brain Experiments

          That video is a very short and simple introduction to one of the most insane rabbit holes I've ever explored.

          After reading/watching about 25 different articles/videos on that topic, I ended up reading a book called The Origin of Consciousness in the Breakdown of the Bicameral Mind by Julian Jaynes. It's purely speculative but goddamn if it didn't blow my mind and open up a million questions about consciousness that I had never even considered.

          2 votes
          1. R3qn65
            Link Parent

            Thanks for the book rec - will check it out.

    3. [7]
      winther
      (edited )
      Link Parent

      The public hype has been treating the current AI models as something of a first step that will only improve over time. And at the same rate! When in fact, what we have now is more like the peak of 20 years of research and development. Of course there will still be improvements to make, but not at the same rate. Some of the fastest cars can go from 0-100 in 1.8s, but 100-200 takes another 2.5s on top of that, and getting to 300 takes even longer. The hype has just been mindlessly extrapolating the jump from GPT-2 to GPT-3 and concluding it would continue at that rate every year.

      6 votes
      1. [2]
        papasquat
        Link Parent
        • Exemplary

        The hype isn't necessarily about an expectation for LLMs to get better; they're already very good at what they do, which is generating convincing-looking text. The hype has more to do with the potential applications of that technology, which we've barely scratched the surface of at this point. Right now, we're still using the technology mostly in its rawest form, that is, passing text to it via a field and receiving its text output directly. There are far more ways for it to be integrated into applications and pipelines, most of which will be pure gimmicky garbage, but much of which will likely be genuinely useful.

        It sort of reminds me of the development of lasers. At first, a super powerful beam of coherent light was merely super cool. Shortly after, they started being used in their most raw and obvious applications: dumping tons of power into a small spot to cut or destroy things.

        Now, the modern world wouldn't be remotely achievable without them. They enable high-bandwidth long-distance communication, optical disc storage, lithography, audio, microscopy, and about a million other things. The average person likely uses hundreds of lasers every day without even knowing it, taking them entirely for granted. It just took a while to figure those use cases out.

        15 votes
        1. mordae
          Link Parent

          Yesterday I asked Claude to draw me an illustration of a memory layout with its various segments, and to use it to show a student a pointer from the stack to a vector on the heap.

          It did a very good job, producing a correct ~1 KB SVG, except for small issues like text overlapping some border lines.
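
          Something along these lines; a hand-written sketch of that kind of diagram (not Claude's actual output):

          ```python
          # A hand-written sketch, not Claude's actual output: boxes for the
          # stack and heap segments, and an arrow for the pointer between them.
          svg = """<svg xmlns="http://www.w3.org/2000/svg" width="340" height="150">
            <defs>
              <marker id="arrow" markerWidth="8" markerHeight="8" refX="8" refY="4"
                      orient="auto"><path d="M0,0 L8,4 L0,8 z"/></marker>
            </defs>
            <text x="55" y="20">stack</text>
            <rect x="10" y="30" width="130" height="100" fill="none" stroke="black"/>
            <text x="245" y="20">heap</text>
            <rect x="200" y="30" width="130" height="100" fill="none" stroke="black"/>
            <text x="30" y="85">vec: ptr</text>
            <line x1="140" y1="80" x2="200" y2="80" stroke="black"
                  marker-end="url(#arrow)"/>
          </svg>"""
          with open("memory_layout.svg", "w", encoding="utf-8") as f:
              f.write(svg)
          ```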

          1 vote
      2. aetherious
        Link Parent

        I agree, there will be advancement, but it's unreasonable to expect it at the same rate just because of the hype. The companies are feeling pressure to show improvements in model performance because of their ridiculous valuations. What we have now with GPT-4 and comparable models is still a great piece of tech in itself. I hope the hype dies down, and that the trend of shoehorning AI into everything just to sell more dies with it.

        2 votes
      3. [3]
        carsonc
        Link Parent

        I don't think it's quite so clear. How would you discuss Moore's Law in relation to automotive design? Though easily grasped, automotive analogies might be inapt. If the public believes that AI progress will resemble exponential growth, then the expectation of eye-popping advances every year is warranted. The hype might be wrong, but such a belief would not be unreasonable, given the track record of computing advancement over the past 60 years.

        To me, the "growth" of AI is both unambiguous and highly subjective. Perhaps it's more like the adoption of the automobile than it is the mechanics of an automobile. When cars started to take over roads, we made the roads better for cars, then cars got better, and more people drove them, and the roads got better, and so on. At any point, you could say, "We couldn't possibly become more car-friendly!" But here we are, expanding lanes and spending more money on cars.

        1 vote
        1. [2]
          sparksbet
          Link Parent

          Moore's law is also not really directly relevant here -- certainly no more so than the automotive metaphor. While hardware is necessary to train models like this, it generally has not been the thing holding further growth back. The issue for models prior to GPT was always that, after a certain number of parameters was reached, returns diminished so significantly that the model more or less couldn't be improved further. The big thing with GPT-3 was that it went way past where this should have happened, but returns hadn't diminished and it kept improving. This was surprising and allowed for a lot of really rapid progress on LLMs, but it seems that we may now be hitting that same "ceiling" for these types of models.
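
          One concrete way to picture those diminishing returns is the scaling-law fit from Hoffmann et al. (2022), the "Chinchilla" paper; the constants below are that paper's published estimates, used here purely as illustration, not as a claim about GPT-4-class models:

          ```python
          # Chinchilla scaling law: predicted loss for a model with N parameters
          # trained on D tokens. Constants are Hoffmann et al.'s published fits.
          E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

          def predicted_loss(n_params: float, n_tokens: float) -> float:
              return E + A / n_params**alpha + B / n_tokens**beta

          # Each doubling of parameters (at fixed data) buys less than the last:
          for n in [1e9, 2e9, 4e9, 8e9, 16e9]:
              print(f"{n:.0e} params -> predicted loss {predicted_loss(n, 1e12):.3f}")
          ```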

          3 votes
          1. carsonc
            Link Parent

            I didn't really think about it that way. Good point!

    4. Raspcoffee
      Link Parent

      I wouldn't be surprised myself. A lot of hype around LLMs is with the idea that they are intelligent and make decisions that require intelligence.

      And by that I don't mean making associations and doing guesswork based on them (which I would personally argue is more or less what LLMs do), but actually making predictions based on a certain understanding of what the world is like.

      And making something like that is going to take a whole lot more understanding. Given how memory might be more intertwined with the rest of the body than we previously thought, there's a good chance that copying or reproducing that part of 'intelligence' (yes, I know I am being vague here; that's unfortunately unavoidable in discussions around this topic) is more difficult than we thought.

      I could be wrong of course. But I imagine that a lot of the infrastructure currently built for LLMs will be used for something else by the end of this decade.

      4 votes
    5. [3]
      skybrian
      Link Parent

      I think they've reached the limits of a particular approach where you throw more hardware at the problem, but it's still quite hard to tell what machine learning researchers will come up with next, or how long it will take to hit on something better.

      Moore's law aside, technological innovation usually doesn't happen on a schedule.

      3 votes
      1. creesch
        Link Parent

        Sure, but the article is about specific machine learning technologies. Those seem to have hit either a speed bump or a plateau at the moment.

        I am sure there is research being done in other areas that might or might not lead to other possibilities in the future.

        2 votes
      2. sparksbet
        Link Parent

        The field tends to go through alternating phases of "just throw more resources at it and add more data" vs. "innovate ways for the models to do more with less data". We've been in the former for a while now, so it seems natural that a switch to the latter is coming soon.

        2 votes
    6. glesica
      Link Parent

      It occurs to me that a reasonable next step (in terms of making these things more useful, not in terms of getting the hype train revved up) is to go back to the 80s and start building things like expert systems with LLM building blocks. In a way, this is basically what the whole "tool use" thing is, but I bet there's a whole lot more that could be done there to achieve really magical things, though with a lot more effort than just taking some "intelligent" agent and telling it to do the thing.
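
      As a rough sketch of that idea, an LLM front-end routing into a hard-coded rule base through the tool-use mechanism; an OpenAI-compatible client is assumed, and the tool, rules, and model name are made up for illustration:

      ```python
      import json
      from openai import OpenAI

      client = OpenAI()  # assumes OPENAI_API_KEY is set

      # The "expert system" part: deterministic, hand-written rules.
      RULES = {"refunds": "Refunds are allowed within 30 days with a receipt."}

      tools = [{
          "type": "function",
          "function": {
              "name": "lookup_rule",  # hypothetical tool name
              "description": "Return the hard-coded policy rule for a topic.",
              "parameters": {
                  "type": "object",
                  "properties": {"topic": {"type": "string"}},
                  "required": ["topic"],
              },
          },
      }]

      resp = client.chat.completions.create(
          model="gpt-4o-mini",  # placeholder model name
          messages=[{"role": "user", "content": "Can I still get a refund?"}],
          tools=tools,
      )
      calls = resp.choices[0].message.tool_calls
      if calls:  # the model chose to consult the rule base
          topic = json.loads(calls[0].function.arguments)["topic"]
          print(RULES.get(topic, "No rule on file."))
      ```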

  2. JesusShuttlesworth
    (edited )
    Link

    If one were to look at a graph with A.I. progress on the Y axis and time on the X axis, one would see a curve full of flat stretches and massive jumps.

    Are we in the middle of a jump or at the start of a flat line? Tough to say. However, if I may speculate...

    First of all, it is important to recognize that many of the rapid increases in A.I. progress have come not from improvements to the techniques, but rather from improvements in compute power. For example, it is well known that least-squares linear regression was performed by Legendre and Gauss in the 1800s, but it wasn't until the computer was developed that linear regression became such a powerful technique. The work done by Gauss and Legendre would eventually lead to the creation of the neural network in the 1950s. Despite this revolutionary discovery, by 1974 we were at a flat line, an "AI winter". Only 20-ish years after this AI winter, the world would witness Deep Blue become the first computer chess-playing system to beat a reigning world chess champion. As stated on Wikipedia:

    These successes were not due to some revolutionary new paradigm, but mostly on the tedious application of engineering skill and on the tremendous increase in the speed and capacity of computers by the 90s. In fact, Deep Blue's computer was 10 million times faster than the Ferranti Mark 1 that Christopher Strachey taught to play chess in 1951. This dramatic increase is measured by Moore's law, which predicts that the speed and memory capacity of computers doubles every two years. The fundamental problem of "raw computer power" was slowly being overcome.

    Even the current models have been enabled by the power of the modern GPU. Of course, some now speculate that Moore's law is slowing down; if this is true, it means that A.I. improvements cannot come from compute power increases alone.
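
    As a quick sanity check, the quoted "10 million times" figure is roughly what one Moore's-law doubling every two years predicts:

    ```python
    # Ferranti Mark 1 (1951) to Deep Blue (1997), one doubling every 2 years:
    doublings = (1997 - 1951) / 2      # 23 doublings
    print(f"{2 ** doublings:,.0f}x")   # 8,388,608x, the same order as "10 million"
    ```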

    Additionally, it seems obvious to me that the biggest problem with modern LLMs is simply that the creators have poisoned the well. Now that AI content is everywhere, it becomes much more complicated to find good quality data to train on. The current approach seems to have been "just throw more data at it". An approach like this is guaranteed to have massive returns at first and diminishing returns later, especially so as the quality of the data decreases.

    In my opinion, improvements in the near future are unlikely to come from the models themselves, since model improvements require a lot of research and effort. Short-term improvements are most likely to come from techniques that use the current models more effectively, such as agents or prompting techniques like tree-of-thought reasoning.
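
    For illustration, here is a bare-bones sketch of tree-of-thought-style prompting: branch into several candidate first steps, score them, and expand only the best one. An OpenAI-compatible client is assumed, and the model name and prompts are placeholders:

    ```python
    from openai import OpenAI

    client = OpenAI()      # assumes OPENAI_API_KEY is set
    MODEL = "gpt-4o-mini"  # placeholder model name

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model=MODEL, messages=[{"role": "user", "content": prompt}]
        )
        return resp.choices[0].message.content

    def tree_of_thought(question: str, branches: int = 3) -> str:
        # 1. Branch: sample several independent candidate first steps.
        thoughts = [ask(f"{question}\nSuggest one possible first step, briefly.")
                    for _ in range(branches)]

        # 2. Evaluate: have the model rate each candidate step.
        def score(thought: str) -> int:
            reply = ask(f"Question: {question}\nProposed step: {thought}\n"
                        "Rate this step from 1 to 10. Answer with the number only.")
            try:
                return int(reply.strip())
            except ValueError:
                return 0  # unparseable rating counts as a dead branch

        best = max(thoughts, key=score)

        # 3. Expand: continue only from the highest-rated branch.
        return ask(f"{question}\nStarting from this step, finish the answer:\n{best}")

    print(tree_of_thought("How many weighings find the odd coin among 12?"))
    ```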

    That being said, a lot of these articles remind me of Flying Machines Which Do Not Fly. Many people are rooting against A.I. and would like to see it fail. Therefore, people are willing to latch onto any ounce of information that paints the future of A.I. as bleak. In my opinion, the current models are more than enough to cause issues in many parts of society. There needs to be less of a focus on the future and more of a focus on the now. These models already exist and they are actively changing the world.

    While the derivative of the A.I. progress curve may change over time, the long-term trend is monotonically increasing. If the past is any indication of the future, A.I. will continue to improve; it just may take time. Or maybe it will rapidly accelerate again.

    Regardless, run with it or run from it, my friends.

    12 votes
  3. skybrian
    Link

    From the article (archive):

    [OpenAI’s] model, known internally as Orion, did not hit the company’s desired performance, according to two people familiar with the matter, who spoke on condition of anonymity to discuss company matters. As of late summer, for example, Orion fell short when trying to answer coding questions that it hadn’t been trained on, the people said. Overall, Orion is so far not considered to be as big a step up from OpenAI’s existing models as GPT-4 was from GPT-3.5, the system that originally powered the company’s flagship chatbot, the people said.

    At Alphabet Inc.’s Google, an upcoming iteration of its Gemini software is not living up to internal expectations, according to three people with knowledge of the matter. Anthropic, meanwhile, has seen the timetable slip for the release of its long-awaited Claude model called 3.5 Opus.

    Similar to its competitors, Anthropic has been facing challenges behind the scenes to develop 3.5 Opus, according to two people familiar with the matter. After training it, Anthropic found 3.5 Opus performed better on evaluations than the older version but not by as much as it should, given the size of the model and how costly it was to build and run, one of the people said.

    Tech companies are also beginning to wrestle with whether to keep offering their older AI models, perhaps with some additional improvements, or to shoulder the costs of supporting hugely expensive new versions that may not perform much better.

    5 votes
  4. [3]
    RNG
    Link

    So advanced LLMs have been around for what, two years now? It feels premature to think we are hitting some fundamental limit with these models. And there have been recent, meaningful upgrades.

    Llama 3 is far better than Llama 2. GPT-4o is leaps and bounds above the performance of the original GPT-4. We probably won't see the level of improvement we saw in late 2022, but this technology is still progressing very quickly in the grand scheme of things.

    4 votes
    1. [2]
      skybrian
      Link Parent

      I don't think the article implies any fundamental limit. It's a report that empirically, brute-force scaling seems to be getting harder for multiple companies, which is worth noticing.

      7 votes
      1. RNG
        Link Parent

        Yeah, totally agree; however, I'm responding more to how folks here and on HN have interpreted these findings, namely as an indicator that we are approaching some fundamental technical limit with LLMs.

        4 votes