Does generative AI have a natural limit without a major innovation? - ~comp

[2]

arqalite

June 14

I'm far from an LLM researcher, but I work with AI on the daily and have deployed multiple applications using models from all major frontier labs.

I don't think LLMs will get us to AGI. They might get very close, but I do think that a sufficiently trained human will always outperform the very best LLM there is in a given task.

Where LLMs will outperform us (and maybe already do for some narrow cases) though, is in cross-functional tasks across multiple domains, since they can distill knowledge from so many fields at once while people generally specialize in one or two fields.

I do think we need a breakthrough in model architecture to get past this barrier. It will probably happen eventually, but not right now.

It's interesting to see the attempts at recursive self-improvement and how they develop in the future, though.

17 votes

kaffo (OP)
June 14

Sensible take.

It does seem that gen AI is getting better at having more, and broader, models. Way back when the hype started in like 2023/2024 I wondered if we'd see extremely good, but specialised models. Maybe they could talk to each other, or they could work together in some environment. But it seems they've gone the general route and it's actually working out reasonably well.

Re self improvement, I'm sure it's being attempted. It's got to be in similar veins to reinforcement learning where they give the models a reward metric. But it must be slow as hell right now with how expensive training is.

3 votes

[11]

NonoAdomo

June 14

Is gen AI going to take us to general intelligence?

No. That's not an end state for LLM/Generative AI. The AI companies want you to think that it is, but they're honestly just using marketing smoke and mirrors to talk about how cool their products are.

The first hurdle is for everyone to agree on what "intelligence" is. What does it mean? Clearly, us awesome humans have it, but are there other species that have it? I could go and write a doctoral thesis level post on this, but the problem I wish to highlight is that we don't have an agreed upon definition on what intelligence is.

The second hurdle is that LLMs are prediction models. They can only spit out responses based on what they were trained to do. Most LLMs are trained with whatever they could get their hands on (acquired legally or illegally) but they don't come up with new ideas. Every response is a prediction based on what they expect the answer to be. There is no "logic" or "reasoning" with this on the part of the software (again, things we have no clear philosophical definitions for). To these AI companies' credit, they tuned the training process so well that they give impressive answers that appear like intelligence to the layperson.
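To make "prediction model" a bit more concrete, here is a toy sketch. It assumes a simple bigram counter can stand in for the idea; a real LLM is a vastly larger neural network, and the corpus and the names `train_bigrams` and `predict_next` are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows which in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower seen in training, or None if the word was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Illustrative training text (an assumption, not real training data)
corpus = "two plus two equals four . two plus three equals five . two plus two equals four ."
model = train_bigrams(corpus)

print(predict_next(model, "equals"))  # "four" follows "equals" twice, "five" only once
print(predict_next(model, "seven"))   # None: nothing to predict from
```

The toy only ever echoes what it has counted. How far that picture carries over to models with billions of parameters is the open question here.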

The third and final hurdle is understanding what the achievement state of a "general intelligence" is. Let's take an easy route, and simply say: General Intelligence will be when we can replicate the human brain and how intelligence exists in humans. Well, the challenging part there is that we honestly don't know what that means from a medical position either. We know the high levels: Brains have neurons and neurons talk to one another. (Hence the term you might hear of "Neural net" for various simple models), but we don't understand how human consciousness exists within our brains. It works, we can even see it happening in other animals but it's such a challenging web of chemicals and electrical signals that we don't understand it. Someday we will, but right now we don't and LLMs/Generative AI is not the path there.

Will gen AI get to a place where its "intelligence" and reasoning is actually better than the sum of Humanity?

To put it simply, no. The best we can hope for is equal to, but that seems unlikely as well. Don't get me wrong, it's likely going to get to like, 95%-99% range but it can never get better than the comparison of where it started. The only reason why I can't confidently give it 100% is that AI as we know it now currently works by guessing what it should respond next. It's gotten DAMN good at guessing, but stuff that we can reason as true (like 2+2=4) is not what LLMs do. It responds that 2+2=4 because it read enough places that this is true. It does not see two rocks in one hand and two rocks in another count to four total rocks when brought together like humans do. When you ask how it got the answer, it will give you a pretty good bit of stuff that looks like reasoning, but it's again just taking the most likely answer. This is why for the longest time LLMs struggled with stuff that's simple for us like "How many instances of the letter r are in the word strawberry?" There are tons of these edge cases and the AI companies try their best to shut these down every time the public loudly discovers one to show that the models are learning and growing.
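One commonly cited contributing factor for the strawberry example is that models read subword tokens rather than individual letters. The sketch below is only an illustration under that assumption: the split into "str", "aw", "berry" and the ID numbers are hypothetical, and real tokenizers may split the word differently.

```python
word = "strawberry"

# What a program working on characters sees: the letters themselves.
letter_count = word.count("r")
print(letter_count)  # 3

# What a model working on tokens sees: opaque integer IDs.
# The split and the IDs below are hypothetical, chosen for illustration only.
hypothetical_vocab = {"str": 101, "aw": 102, "berry": 103}
pieces = ["str", "aw", "berry"]
token_ids = [hypothetical_vocab[piece] for piece in pieces]

assert "".join(pieces) == word  # the pieces still spell the word
print(token_ids)  # [101, 102, 103]
```

Nothing in `[101, 102, 103]` says how many times "r" appears, so the count has to be learned indirectly rather than read off the input.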

Now none of this is to disparage LLMs or GenAI. This is a remarkable achievement in computational history. When I started to learn about computers and software as a kid, I wanted to learn about how to do those cool things in science fiction like AI and robots. Turns out, as I got into actually learning about AI, I also learned that the ethics are really REALLY complicated and it didn't take long to see how these models would be abused by everyone to do morally ambiguous things, which is why I ultimately didn't feel comfortable pursuing a career in it.

I hope these answers help!

17 votes

[7]
R3qn65
June 14 (edited June 14)

The second hurdle is that LLMs are prediction models. They can only spit out responses based on what they were trained to do. Most LLMs are trained with whatever they could get their hands on (acquired legally or illegally) but they don't come up with new ideas.

I get where you're coming from and this isn't 100% wrong, but it's a simplification to the point that it's starting to be a little bit wrong. Models have, at this point, proved multiple novel mathematical proofs. Yes, that's built on other work and other ideas, but so is the entirety of human innovation. "They can only combine existing ideas" isn't really an indictment, since that's how human creativity basically functions also.

The third and final hurdle is understanding what the achievement state of a "general intelligence" is. Let's take an easy route, and simply say: General Intelligence will be when we can replicate the human brain and how intelligence exists in humans. Well, the challenging part there is that we honestly don't know what that means from a medical position either. We know the high levels: Brains have neurons and neurons talk to one another. (Hence the term you might hear of "Neural net" for various simple models), but we don't understand how human consciousness exists within our brains. It works, we can even see it happening in other animals but it's such a challenging web of chemicals and electrical signals that we don't understand it. Someday we will, but right now we don't and LLMs/Generative AI is not the path there.

For what it's worth, most researchers do not consider fully understanding and replicating the human brain to be necessary for general intelligence. Most have a functionalist definition of general intelligence, not structuralist. If we met intelligent alien life, presumably it would not function exactly the same way as the human brain, but that would not preclude it being generally intelligent.

20 votes
1. [6]
  Blakdragon
  June 14
  
  Models have, at this point, proved multiple novel mathematical proofs
  
  Honestly hadn't heard of this before - can you share more?
  
  2 votes
  1. [5]
    R3qn65
    June 14
    
    You bet!
    
    A few months ago OpenAI's model disproved one of the Erdős conjectures. It's probably the most significant result thus far. But there have been others as well.
    
    Those two results were comprehensible to humans. (Mathematicians, that is. Not me.) But there's a very interesting open question right now about what it will mean when models are proving things mathematically that nobody really understands. That last link is probably the most worth reading - it's a fascinating look at what math really means, not just what it is, if that makes sense.
    
    10 votes
    
    [3]
    Blakdragon
    June 16
    
    Exemplary
    
    Took me a couple days to get around to reading the David Bessis article, but I'm back!
    
    I didn't understand a darn thing about the AI proof (disproof?) (I was a strong math student right up until university, where it all fell apart on me lol), but still interesting to read about.
    
    My degree is in computer science though, so I'm a bit more familiar with how AI was, at least 15 years ago, and was aware of the direction it was heading. I find myself being a bit of a luddite in the AI space, and I think this article kind of put words to some of the things I've been feeling. When people were saying "AI will replace software developers!" my reaction was, well then you really don't understand my job all that well. Because simply writing code isn't the important part of what I do. Understanding requirements and building a maintainable and changeable codebase is the important part of what I do. (Writing code is the fun part though, and I'm loath to give that up to some algorithm that I have to clean up after.)
    
    I think within a decade, a lot of things that mathematicians currently do—what we spend the bulk of our time doing and a lot of stuff we put in our papers today—can be done by AI. But we will find that that actually wasn’t the most important part of what we do.
    
    I liked where the article concluded. I like understanding things, and I feel like AI is being used in a lot of places to skip understanding, and maybe that's why I'm having such a negative reaction to it. I know that LLM AIs aren't inherently bad; they're just tools.
    
    But also I'm tired of the world completely changing around me every 15 years. Anyways. Thanks for sharing! This was definitely enlightening stuff to read.
    
    10 votes
    
    post_below
    June 16
    
    This is entirely unsolicited, feel free to ignore it, and I don't intend to be condescending in any way.
    
    But also I'm tired of the world completely changing around me every 15 years.
    
    Unless you're close to retirement (and maybe even if you are), be wary of this tendency! There isn't much that ages you faster than the disconnection that comes from feeling like the world has left you behind and doesn't make sense anymore.
    
    And the speed with which "everything changes" is likely to keep increasing provided some catastrophe doesn't send us back to the dark ages.
    
    3 votes
    
    R3qn65
    June 16
    
    You’re very welcome, and I loved reading your thoughts here.
    
    2 votes
    
    tauon
    June 15 (edited June 15)
    
    To pile onto that, ErdosBench(mark) is a thing now, comparing multiple models’ performance on (previously not published) adaptations of Erdős problems.
    
    4 votes
delphi
June 14 (edited June 14)

I get the mechanisms you're referring to, I know how backpropagation and reinforcement learning work, I understand pretraining as a concept, and you're spot on with the analysis in every other part, but I have to ask - "they just regurgitate text and can't have any original insight" - is that really true?

Like, get this. I've absolutely seen the model do "original" things before, even if they were just deterministic flukes. I can absolutely get the model to string together words and sentences in a way that - aside from the library of Babel, grumble grumble - no human being has ever said. Now granted, I can also do that with a random number generator, but therein I think lies the point.

If we can get it to do that, we can get it to do that in code, or maths, or poetry, or whatever. Will the output be coherent, good, any of those? Doubtful, doubtful, but it will absolutely be "original". This is pedantic, sure, but I kind of reject the notion that "LLMs can't be meaningfully creative" when the mechanisms in place are in concept so close to a human synthesising knowledge from their learned experiences. And god knows humans are capable of writing nonsense as well.

6 votes
F13
June 14

I would say that LLMs are getting closeish, if you really squint, to something that could arguably be called reasoning.

Yes, they know 2+2=4 because they have internalized enough data that basically says that. But also, good models can "reason", even if they were never told 2+2=4, if they were told things like 1+1=2, 1+2=3, 1+1+1+1=4, etc etc. There really truly is a sense of reasoning going on, using their multi-dimensional relationship model of one token to another.

You could argue that a dimension of an LLM might be an object's "blueness", even if it's not codified as such, based on the relative weights of other blue and non-blue things represented in that dimension, and similarly it might encode an emotional concept like "sadness". As such, it could tell you with an amount of logic whether a blue thing is likely to be associated with sad things, even if it was never trained on that directly, based on how often blue things tend to be also sad things.
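A rough sketch of that idea, assuming hand-made three-dimensional vectors whose axes are labelled blueness, sadness and yellowness. Real models learn thousands of unlabelled dimensions, so every number and axis name here is an illustrative assumption.

```python
import math

# Hand-made toy "embeddings": [blueness, sadness, yellowness] (illustrative values only)
embeddings = {
    "ocean":     [0.90, 0.30, 0.00],
    "rain":      [0.60, 0.70, 0.00],
    "funeral":   [0.10, 0.90, 0.00],
    "sunflower": [0.05, 0.05, 0.90],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

sad_axis = [0.0, 1.0, 0.0]  # the direction we are calling "sadness"

# Rank the words by how strongly they point along the sadness direction.
ranking = sorted(embeddings, key=lambda w: cosine(embeddings[w], sad_axis), reverse=True)
print(ranking)
```

Nobody told the toy that rain is sadder than a sunflower; the ordering falls out of where the vectors sit relative to each other.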

4 votes
kaffo (OP)
June 14

Thanks for the detailed reply, I agree it's very interesting and very in-depth.

On your point of "what is intelligence". I, like probably many others, have been thinking about it and there's a good reason we've not got agreement.
I suspect, in my opinion, if we knew the "source code" of how something like an individual sheep worked then we would look at it in a very different way. When Humans don't understand something, it's put on a pedestal (sometimes worshipped!) and I think we do that with consciousness and intelligence to some degree.
That's not to say it's not extremely impressive, especially as a biological evolution, but I think that once we fully understand how something works, we have a natural tendency to demote it.

That said. I'm convinced that gen AI is not intelligent, but it is able to mimic intelligence. Which is confusing to a lot of users who don't understand what they are talking to.
How do we decide what is "intelligent" like you said? Who knows! I thought about it somewhat and I haven't come to a conclusion. But I think that at least a "thing" has to be able to make its own decisions and those decisions must have some kind of reasoning behind them based on both external input and also their own internal memories and thoughts.
I don't think models (or agents) today meet these criteria. They mimic it well, especially well sometimes, but it's essentially the same as a broken clock being right twice a day.

I agree with much you've said in the rest of your comment, I can see us getting a long way in the right direction with gen AI. But it's not taking over the world quite yet.

[8]

pete_the_paper_boat

June 14

As I see it, surely the bottleneck will soon become the data they are trained on?

Have the diminishing returns since GPT 3 not been clearly visible?

9 votes

[2]
R3qn65
June 14

Have the diminishing returns since GPT 3 not been clearly visible?

Most labs and think-tanks believe capabilities since GPT-3 have either continued to advance linearly or have even accelerated. Even skeptics generally insist that progress is "only" linear. I'm sure there are some who hold that it's slowing, but that's far from a mainstream opinion.

Not saying it can't still be your opinion, of course. But that's the consensus for context.

21 votes
1. EpicAglet
  June 18
  
  I understood it as progress being logarithmic as a function of resource cost. So that's why progress seems limited despite massive investments.
  
  1 vote
[3]
V17
June 14

As a regular user since about ChatGPT's release, I don't think so. ChatGPT 4 was a huge step forward, and so was o1, the change to "reasoning" models that are now standard. Since then the gains have been seemingly small. But I also haven't tested any of the frontier models that are hidden behind the higher tier subscriptions or, in the case of Anthropic, currently paused. And in the grand scheme of things, the time since "reasoning" models proliferated has been incredibly short; we're just used to really fast development.

13 votes
1. [2]
  vord
  June 14
  
  And the important question:
  
  How much is improvement to the actual model, how much is just the non-AI framework around it, and how much is just pumping nitrous in the fuel line?
  
  6 votes
  1. kru
    June 15
    
    Most of the gains over the past cycle have come from better tooling. I think this is widely recognized. But it's a tit-for-tat thing. Tools get made. Models get better at generic tool use. Tools get better/standardized. Models get better at using those tools. Rinse. Repeat.
    
    I liken it to the development of software for creatives. Back in the days of yore, if you wanted to edit a digital photograph (not that those were easy to come by in the 80s/90s, heh), you were doing direct pixel manipulation. Then Photoshop (and similar) came out and there was a suite of nifty editing tools - but you had to learn how best to use them. Then those tools got better/more advanced and you learned the new usages. Then the tools got even better, and you learned more to keep up. Then the tools started being able to use themselves and here we are.
    
    5 votes
updawg
June 14

I regularly think about how incredible some of the things are that Claude does for me and how I could never have expected to be doing this when I was using GPT-4. I know I've been seeing a lot of pessimism around Claude the last few weeks, but my projects shifted and it is even better for me than when I was singing its praises in the past.

8 votes
tauon
June 15 (edited June 15)

Without delivering concrete proof here, I am fairly certain most major American labs, and for sure (like, confirmed by them) some of the Chinese labs known for “distillation” work (Moonshot, Zhipu/Z.ai, DeepSeek, Alibaba’s Qwen), are already using synthetic training data, which is to say data originally produced by an LLM (or a specific/deterministic code-driven process), and then (eventually) fact-checked and/or refined by a human.

It’s worked pretty well so far in the cases that were published, for example Moonshot’s Kimi K-model series:

[Step] 4. Simulate Usage of the Synthetic Agents: The team simulated multi-turn tool-use scenarios in order to generate “trajectories” – a fancy way of saying the detailed set of steps documenting the inputs and steps models take to accomplish their goals. Some of these scenarios simulated “users” – fake people with diverse communication styles – interacting with these agents, while others simulated autonomous usage.

Edit: This is not to say I believe synthetic training data, for LLMs specifically, will get us to “AGI”/further-than-human intelligence. I’m sure there’s an inherent quality ceiling we’ll encounter somewhere.

1 vote

[2]

post_below

June 14

To clarify the vocab: Gen AI = LLM powered agents = LLM fine tuned for reasoning and tool use running in a harness that provides tools and other functionality.

Boiling it down there are two steps:

1. Pre training. The giant dataset, tokenizing it (converting it into numbers) and generating embeddings (mathematical relationships between the tokens). This step is constrained by the available data like you said.
2. Post training (or fine tuning). This step turns the LLM, which can't really do anything except output plausible text in response to input, into a tool that can do useful work. It's where it learns to be an assistant, to use tools, do multi-step reasoning, write code that mostly works, develop an em-dash kink, etc.

The above compresses a bunch of important sub steps for brevity.
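For anyone who wants to see the "converting it into numbers" part, here is a minimal sketch. It assumes whitespace splitting and a randomly initialised embedding table; real pipelines use subword tokenizers and learn the embedding values during training, and the function names are made up for this example.

```python
import random

def build_vocab(text):
    """Give each distinct word an integer ID, in order of first appearance."""
    vocab = {}
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab

def tokenize(text, vocab):
    """Turn text into a list of integer token IDs."""
    return [vocab[word] for word in text.lower().split()]

text = "the cat sat on the mat"
vocab = build_vocab(text)
token_ids = tokenize(text, vocab)
print(token_ids)  # [0, 1, 2, 3, 0, 4]

# One vector per vocabulary entry. Random here; in a real model these
# numbers are what training adjusts.
rng = random.Random(0)
embedding_dim = 4
embedding_table = [[rng.uniform(-1, 1) for _ in range(embedding_dim)] for _ in vocab]

# Looking up the vector for each token ID gives the model its actual input.
vectors = [embedding_table[token_id] for token_id in token_ids]
print(len(vectors), len(vectors[0]))  # 6 4
```

Both occurrences of "the" map to ID 0 and therefore to the same vector, which is the sense in which the text becomes numbers before any learning happens.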

Innovation can happen in various parts of both steps, so there's still a lot of room for improvement. There are undoubtedly better ways to do everything involved, much of it has been replaced with better methods multiple times already.

Model size is likely to become a limiting factor, both because of the limit of what exists in terms of training data and because bigger models are more computationally expensive to train and to run. But that's assuming better ways of getting, vetting and tagging pre-training data aren't discovered. I'd assume that, yes, eventually there will be a ceiling. In terms of compute, the tech is going to keep getting more efficient and the hardware will keep getting better so likely any limits imposed by compute will be temporary.

Will recursive self improvement hit an event horizon where LLMs will start improving themselves so fast they start rocketing towards AGI? Probably not with the current state of the art. When models generate their own training data they end up entrenching and exaggerating their flaws, and there are a lot of flaws. Some amount of artificial training data is fine (especially if it comes from a better model), but 100% artificial training isn't viable at this point.
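A very loose toy of the "training on your own output" problem, assuming a one-dimensional Gaussian can stand in for a model: each generation fits a mean and spread to samples drawn from the previous generation's fit. The function name and parameters are made up, and this is far simpler than anything an actual lab does.

```python
import random
import statistics

def simulate_generations(generations=30, sample_size=20, seed=0):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit."""
    rng = random.Random(seed)
    mean, std = 0.0, 1.0          # generation 0: the "real" data distribution
    history = [std]
    for _ in range(generations):
        # The current model generates its own training data...
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        # ...and the next model is fitted to nothing but that data.
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)
        history.append(std)
    return history

history = simulate_generations()
print([round(s, 3) for s in history])
```

Each refit sees only a small sample, so estimation errors compound from one generation to the next, and the fitted spread typically drifts downward over many generations, though any single run is noisy. Mixing real data back in at each step is the obvious counter, which matches the "some amount is fine" point.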

Even if LLMs were to achieve the ability to recursively self improve without ensloppifying themselves, there's no room in the math for the kind of awareness or understanding we'd associate with AGI. The models don't have a conceptual understanding of reality, they only appear to. They would need to invent new technology to get there, not just iterate on existing LLM tech.

However, will LLM tools contribute to whatever sort of AGI is someday created? It's hard to imagine they won't.

I can imagine a future world model with pre-training on a much wider dataset that strives to tokenize reality, as opposed to just language and other creative output, having a more realistic path to AGI. Especially if it was fine tuned with some sort of feedback mechanism that could approximate real world cause and effect. Maybe you'd need sensory feedback. But that's speculating on technology that doesn't exist yet. Right now world models are mostly focused on improving robotics. As far as I know, no one has tried to make a super-sized general world model. It would take the resources of one of the frontier labs to attempt it.

My perspective is that AGI is still roughly comparable to stable fusion power. There's no reason to believe it can't be done, but it will most likely be "just around the corner" for years and years.

9 votes

kaffo (OP)
June 14

Thanks for the detailed reply.

Very interesting take on the "world model" idea, that makes a lot of sense in terms of giving the model some context of the real world as opposed to just our language.

I do agree with the take that gen AI won't lead to general AI but will help pave the way. Though I suspect there will be a lot of media coverage along the way (not that we don't get plenty of it already!) about how gen AI is actually already general AI and has thoughts and feelings.

2 votes

TurtleCracker

June 15 (edited June 15)

Lots of great replies here, so I’ll keep mine short.

My issue with framing LLMs as a path to “real” AI/AGI is that they can’t effectively learn in real time. If you ask it something, it does it wrong, and you correct it, then it won’t answer that question correctly for the next person. This is a key thing that human intelligence is capable of.

8 votes

[3]

Greg

June 14

Gen AI should (in theory) never be able to outperform or push the boundary of the sum of humanity at time of training.

Why not? Genuine question, I’m interested to hear it in your words, because this is a very big assumption that seems to have slipped in kind of unexamined. I’m being a little annoyingly Socratic here, but I do think it’s fascinating to trace how people think about these things.

To be clear, this isn’t my way of obliquely saying I think models are going to go full AGI on current or near future tech or anything, it’s more just that I think it’s an interesting axiom to have built the question on when it doesn’t match up with what I see of even 2025-era models.

6 votes

[2]
kaffo (OP)
June 15

I considered a rant in the post but decided against it; I didn't want to put too much opinion in the top level. Maybe I should have put it in a collapsed section or just my own comment.

Anyway yes, thanks for asking. My opinion is that models and/or agents, given the right training and data, then have the capability to produce content which exceeds that boundary. But I would put it down to a combination of randomness and any software (the agent) to capture the high quality output and drop the low quality stuff.
Also I don't believe the step change would ever be large. We've proven that turning up the randomness on these models just produces more noise. I think there's a sweet spot where it'll start to produce content within +/- 5 or 10% of the threshold. You could capture that content above the line using some metric (would be difficult with the kind of boundary pushing content we are talking about) then feed that back into the next training set.
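Here's a toy sketch of that loop, assuming output quality can be boiled down to a single number and that we have a perfect metric for it (which, as said, is the hard part). The function name, the 10% spread and the sample count are all made-up illustration values.

```python
import random

def improvement_round(level, rng, n_samples=200, spread=0.10):
    """Sample outputs around the current level, keep those above it, retrain on the keepers."""
    # Outputs land within +/- spread of the current quality threshold.
    candidates = [level * (1 + rng.uniform(-spread, spread)) for _ in range(n_samples)]
    # The (assumed perfect) metric keeps only content above the line.
    kept = [c for c in candidates if c > level]
    # "Retraining" is modelled as moving the level to the average of what was kept.
    return sum(kept) / len(kept) if kept else level

rng = random.Random(42)
levels = [1.0]
for _ in range(5):
    levels.append(improvement_round(levels[-1], rng))

print([round(level, 3) for level in levels])
```

In this toy each round can only nudge the level up by a fraction of the spread, so the gains are incremental rather than a jump. With a noisy metric instead of a perfect one, some below-the-line content would get kept and the gains would shrink further.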

So yeah, I think it's possible, but I don't think it's reliably pushing the boundary nor is it a large jump.

What I would like to see is models getting more specific and less general. Iterate training on a model that only does math, or law, or software engineering, etc. Give it focus, cut out the context it doesn't need and see if it can seriously push the boundary.

1 vote
1. Greg
  June 23
  
  Sorry for the very late reply, I appreciate the discussion and I missed your response on this one!
  
  I guess what I’d say is that your line of thinking reminds me strongly of biological evolution on long timescales, and of human societal and technological development on shorter ones. The vast majority of random genetic mutations go nowhere, and almost none will ever be a step change for a species on their own. The vast majority of people can live long, productive, fulfilling lives without ever really hitting on an idea so new that humanity as a whole hasn’t already considered it somewhere along the way - and again, of those tiny number of truly new ideas that do happen and are valuable, the number that make a large impact on their own rather than being incremental improvements are vanishingly small.
  
  So if that’s the case for us as people, I see the occasional model making the occasional leap beyond what was known at the time it was trained to be pretty absurdly impressive, really. I see at least conceptual scope for those models being able to stack those incremental improvements autonomously if we set them on a path to doing so, especially since they can be copied and fine tuned at a scale and speed vastly higher than biology allows.
  
  It’s all far too abstract to say with any confidence that letting loose a few million self-modifying models interacting with each other in a “society” would or wouldn’t lead to true development of the “individuals” and the system as a whole - and like I said in another comment a week or two ago, I certainly don’t have access to the billions of dollars in hardware that it’d take to test that at meaningful scale with no real pitch beyond “it’s gonna be so interesting”. But seeing that they’re capable of creating new knowledge at all is enough to prevent me from confidently betting against those capabilities…

[3]

Staross

June 14

Link

I think the lack of online learning, together with catastrophic forgetting, is a major limitation currently. To be truly smart, a model should be able to learn new information in a reliable way, without a finicky and costly separate training procedure. The training procedure where all the data is learned at once is probably an issue in itself (that's not how we learn).

https://en.wikipedia.org/wiki/Catastrophic_interference
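
To make the term concrete, here is a deliberately tiny sketch of the effect: a one-parameter model `y = w * x` is trained on one task and then on a conflicting one, with no replay of the old data. The tasks, learning rate, and function names are all made up for illustration; real networks forget in far messier ways, so treat this as a toy and not a faithful model of LLM training.

```python
def train(w, data, lr=0.05, epochs=200):
    """Plain SGD on squared error for the one-parameter model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)^2
            w -= lr * grad
    return w


def mse(w, data):
    """Mean squared error of the model on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


# Two hypothetical tasks that want different values of w.
task_a = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0)]
task_b = [(x, -3.0 * x) for x in (1.0, 2.0, 3.0)]

w = train(0.0, task_a)           # learn task A first
loss_a_before = mse(w, task_a)   # close to zero

w = train(w, task_b)             # then learn task B, never revisiting A
loss_a_after = mse(w, task_a)    # large: task A has been overwritten

print(f"task A loss before: {loss_a_before:.6f}, after: {loss_a_after:.2f}")
```

Training on both tasks together would at least find a compromise; it is the strictly sequential setup that wipes out the earlier task.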

5 votes

[2]
kaffo (OP)
June 14
Link Parent
Ah, I didn't know it had a proper term, but that's been on my mind. It's very valid.
The context is a weird format for memory, especially for gen AI. Since it essentially drives the output and it also has perfect "recall" of everything in the context, the output would always be something "silicon based" and unnatural to us in my opinion.

I suspect there's a format for memory that we haven't thought of yet. The current implementations of "memory" all suck, and they have no real signs of getting better. Especially since at the end of the day, all they do is modify the context.

2 votes
1. Greg
  June 14 (edited June 14)
  Link Parent
  I strongly suspect that allowing models to recursively fine tune their own weights is a better analogy for experiential memory (as much as such a term can apply to a system that probably isn’t yet capable of experiencing), possibly with something like DeepSeek’s engram lookups as short term/factual memory. Although perhaps not having those lookups would be more organic… just relying on an imperfect fine tuning process that modifies the “brain” as a whole does seem a better analogy for organic remembering.
  
  A lot of the really cool stuff just isn’t being done right now because it’s not particularly likely to be commercially beneficial. A model that modifies itself and remembers imperfectly but holistically is fascinating, but likely less useful than the way we do it now, and/or the way DeepSeek are looking at doing it.
  
  Which is a shame, because I much prefer fascinating, but research grants are limited in a way that AI company budgets don’t seem to be.
  
  [Edit] Having now thoroughly nerd sniped myself with this, I'm seeing something like a system that runs a couple of lightweight LoRA training steps using the entire context window as the training data after every input or output is completed, merging those back to the underlying model weights each time to get a new overall model state.
  
  Probably literally less than 20 training steps, I'd imagine, on a tensor bootstrapped directly from the existing state of the weights at that point in time (although a smallish LoRA layer would converge quickly anyway), because most of the same context window is going to be passed back in to the next brief training session, and the next, and the next, with things being repeated and "thought about" until they fall past the limits of the window.
  
  The context window remains in place as well, to serve as working memory of the actual conversation, but it's capped at a shorter length than modern systems are capable of, on the basis that we're trying to push the model into making stronger use of its "experiential" memory baked into the weights.
  
  I imagine each fine tuning pass would also need to apply some kind of exponential decay function to the context window, maybe breaking it into shorter chunks of conversation (couple of sentences each) and skewing the training sampler heavily towards selecting more recent ones - things that are "fresh in the model's mind" are more likely to be "dwelled upon" and encoded into long term memory, altering the "brain" and "mind" as a whole (although altering them far more strongly in the small targeted areas related to that memory), but things from earlier in the conversation might also be sampled with lower probability, "popping back into the model's mind" and similarly reinforcing as memories. This seems far closer to a true continuity of experience for the model, again at least in as much as that term can make sense at all here.
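
The recency-skewed sampler in that idea is the one piece that is easy to sketch without any hardware. Below is a rough Python version, assuming the context window has already been split into chunks ordered oldest first. The function names, the `decay` value, and the chunk contents are hypothetical, and the LoRA training step that would consume the sampled chunks is not shown at all.

```python
import random


def recency_weights(num_chunks, decay=0.8):
    """Exponential-decay weights, oldest chunk first; the newest gets 1.0."""
    return [decay ** (num_chunks - 1 - i) for i in range(num_chunks)]


def sample_training_chunks(chunks, k, decay=0.8, rng=None):
    """Draw k chunks with replacement, skewed towards the most recent ones.

    Older chunks keep a small but non-zero probability, so they can still
    occasionally "pop back" into a training pass.
    """
    rng = rng or random.Random()
    weights = recency_weights(len(chunks), decay)
    return rng.choices(chunks, weights=weights, k=k)


# Hypothetical context window, already chunked, oldest first.
window = ["chunk about lunch", "chunk about LoRA", "chunk about memory"]
batch = sample_training_chunks(window, k=8, decay=0.5, rng=random.Random(0))
print(batch)
```

A lower `decay` makes the sampler dwell almost entirely on the latest exchange, while a value near 1.0 treats the whole window evenly. Where the sweet spot lies is exactly the kind of thing the experiment would have to find out.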
  
  Of course this is absurdly expensive, compute intensive, slow, potentially buggy (garbage in, garbage out, apart from anything else), risky (it would invalidate basically all guardrails and alignment training), storage intensive (especially because you likely couldn't safely share a model tuned like this between users without horrible data leaks and just straight up confusion between threads of conversation), and really just not viable as a user facing idea in any way. But damn it would be fun as an experiment to run with a single-session model on dedicated hardware and a small cohort of "friends and colleagues" of the model to converse with it.
  
  3 votes

[2]

V17

June 14

Link

This is literally a billion dollar question. Nobody really knows. Imo the answers are

Is gen AI going to take us to general intelligence?

Yes, though no idea how different the technology is going to be from the one we have now - it could be just incremental development from LLMs gradually taking us someplace else, not necessarily a paradigm change. I think it can be as little as a decade away, depending on whether/when we manage to get to recursive improvements, using AI to improve itself.

(to be clear this worries me a lot, and I wish it didn't happen, but I think it will)

Will gen AI get to a place where its "intelligence" and reasoning is actually better than the sum of Humanity?

"sum of Humanity" is very strong, I wouldn't bet on that specifically, though I guess it depends on definition. I think there's a big difference between a theoretical best possible sum of humanity, our potential, and a realistic sum of humanity, humanity that is uncooperative, tribalistic, irrational and full of conflicts. The latter, of course, seems more likely to be beaten.

4 votes

kaffo (OP)
June 15
Link Parent
Interesting, one of the few people who thinks we will get to general AI "soon"!
In your opinion how far away do you think current models are from "general AI" in terms of capability?

1 vote

ThrowdoBaggins

June 17

Link

This has been a vomit of words and thoughts, barely structured, so I urge the potential reader to exercise agency when choosing whether to read further. On my lunch break, I decided this has been worth my time to quickly write but wasn’t worth my time to edit further, so you, dear reader, should consider whether it’s worth your time to read.

My personal view is that current LLMs are incapable of becoming AGI because of their fundamental structure, in the same way that a propeller aeroplane won’t work in space. Sure, the plane can move fast through air and generate lift, and you could make the assumption that if you just give it more and more powerful engines and more and more efficient lift surfaces, eventually it can go infinitely up, but eventually you run up against the fact that propellers can’t work in a vacuum, and no matter how powerful your engines or how astonishingly efficient your lift, that doesn’t change the fact that it structurally requires air to function.

I view LLMs as having similar limitations. If you look at the very early models, you can see how they’re a very sophisticated word prediction machine, and that as it gets more and more powerful and absorbs more and more information from the wider internet, it will get more and more advanced. But by definition of being a word prediction machine, it will never be particularly great at novel ideas unless those ideas can be described by existing words.

You could argue the same applies to humans, but I believe humans are capable of much more complex thoughts than a language machine because we can think in concepts rather than exclusively in words, and therefore we can push new boundaries where words do not yet exist.

Additionally, where LLMs have struggled with non-language concepts, they have been given tools and harnesses to guide them, but I think these solutions hide the underlying problems. For example, if your entire universe is words, even words about mathematics, you will still struggle with maths. You can be given mathematics tools and when someone asks you a maths question, you can remember to use this tool to get you the answer, but fundamentally you still struggle with the maths if not for the tool.
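
As a rough illustration of what such a maths tool can look like on the harness side, here is a small Python sketch of a calculator that a model could be told to call instead of doing arithmetic "in words". The name `calculator_tool` and the set of supported operators are assumptions for this example and do not correspond to any particular vendor's tool API.

```python
import ast
import operator

# Arithmetic operators the hypothetical tool is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def calculator_tool(expression):
    """Evaluate a plain arithmetic expression without using eval()."""

    def walk(node):
        # Numeric literal such as 42 or 3.5.
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        # Binary operation such as a + b or a * b.
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        # Unary minus such as -a.
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        # Anything else (names, calls, attributes) is rejected.
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


print(calculator_tool("12345 * 6789"))
```

The point stands either way: the arithmetic happens in ordinary code, and the model only has to recognise when to hand the question over.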

Likewise, I think there is a substantial limitation in the fact that most LLMs are designed for a question-answer structure. There are millions of thoughts I have which aren’t questions, and aren’t answers to questions, but which still help me gain a better understanding of the universe around me. I wonder how much this inability to just “think thoughts” on their own limits their final form.

4 votes

[2]

tildes-user-101

June 14

Link

Definitely interested to hear answers from folks that understand the process better than I do. IMO Fable was a genuine generational leap over the previous models (based purely on the scope of the projects I was able to undertake with it compared to Opus and how much more reliable its output was), and my guess is that it was in the post training step.

My very limited understanding is that the biggest models have already been trained on almost all publicly available data so I don’t see big leaps coming from there. Which is why my guess is that Fable was a post training tuning/harness improvement. So I would love to better understand how models will continue to improve and grow more useful going forward.

3 votes

kaffo (OP)
June 14
Link Parent
That's interesting to know! Thanks for sharing, especially now Fable was locked down.

[3]

delphi

June 14

Link

"GenAI" is a really useless term for what the question here actually is, and without getting into it I really don't like how the term has come to be used, especially in circles that don't much think about the topic and use it as a shorthand for "the product I don't like". Yes, obviously you mean LLMs and image diffusion models, but stay with me here.

Let's say that we get full, real Artificial Intelligence. Cmdr Data, Durandal, GLaDOS, Nick Valentine. A computer that is indistinguishable from a human in terms of their inner life. Would this not still be generative? Would these systems, human by any philosophical definition, generate their output? Don't humans do that now? I don't think it's a meaningful distinction.

As for your question, I personally do not think that LLMs are or can ever be conscious, but I'm not an expert. I don't think LLMs get us to the point of Strong AGI. It's certainly worth examining: the research that Anthropic did a while back, where they injected thought vectors into an LLM's reasoning space and it could retrieve the general "shape" of these ideas, was fascinating. While I'm pretty cynical about this, I'll err on the side of caution and say that, sure, maybe, in some way, whatever's going on inside any given model may approximate the same mechanisms that in humans eventually cause sentience to emerge.

But are we weeks, months or even years away from OpenAI releasing Consciousness-as-a-Service? I don't think so.

2 votes

creesch
June 14 (edited June 14)
Link Parent
especially in circles that don't much think about the topic and use it as a shorthand for "the product I don't like".

The company I work for and many others are all in on ai use and explicitly use the term genAI. To me the term is one used by those in management and suffering from the corporate fomo.

3 votes
kaffo (OP)
June 15
Link Parent
No, you have a point, it's not a good name. I mean "AI" isn't a good name for LLMs right now either, it's all marketing.

Though, I'm afraid it might be one of those things we're stuck with for now until the "next thing" comes along. But yeah, it will likely be the case that whatever is next is better at "generating" than generative AI.

2 votes

skybrian

June 14

Link

Now that AI chatbots are typically an LLM augmented with tools, there’s no natural limit beyond what computers can do. The tools could do anything the LLM can’t do on its own.

AI research is moving rapidly and putting any bound on what researchers might come up with is very hard.

2 votes

CrypticCuriosity629

June 23

Link

I know this topic's old, but it just popped up to the top so I wanted to add my two cents here as well.

I don't think the future is with LLMs only. I think the future is with large complex harnesses of LLMs with each LLM basically acting as a neuron.

Once we create what amounts to an LLM that is trained entirely on efficient computer code as opposed to human language, and then wire those together into a complex network where each node works together with the nodes around it to adapt and specialize, I think that's when we'll start seeing more AGI.

Right now that's pretty high concept if you're only taking ChatGPT like LLMs into account.

I've been working with OpenClaw and Hermes, and those are a bit scary capable when compared to barebones LLMs. And extrapolating that kind of harness into a complex network of specialized LLMs starts to get scary when you think about it.

1 vote

LumaBop

June 14

Link

In the general case, it seems likely. However I think there are some domains where it’s possible LLMs/agents will be able to improve indefinitely. By the way, this is informed speculation, not an evidence-backed claim (and in general I’m an LLM sceptic).

There are certain domains and types of problems where constructing a solution is hard, but verifying a solution once found is relatively simple. To give a concrete example: solving integrals is hard in general, but verifying a solution involves a relatively easier differentiation process; contrast this with a problem such as finding the shortest tour of a large number of locations (e.g. “find the shortest route, starting at Paris, which visits every European capital exactly once and returns to Paris”) - even if I told you a route, it’s not trivial to confirm that it is indeed the shortest.
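
A small Python sketch of the integral example, assuming we are happy with a numerical spot check instead of a symbolic proof: a proposed antiderivative `F` is accepted if its finite-difference derivative matches the integrand `f` at a few sample points. The function name, the step size `h`, and the tolerance are choices made for this illustration.

```python
import math


def looks_like_antiderivative(F, f, points, h=1e-5, tol=1e-6):
    """Spot check that F'(x) is close to f(x) using a central difference."""
    for x in points:
        derivative = (F(x + h) - F(x - h)) / (2 * h)
        if abs(derivative - f(x)) > tol * max(1.0, abs(f(x))):
            return False
    return True


def integrand(x):
    return x * math.cos(x)


def good_answer(x):
    # Correct antiderivative of x*cos(x), found by integration by parts.
    return x * math.sin(x) + math.cos(x)


def bad_answer(x):
    # Plausible-looking but wrong: the cos(x) term is missing.
    return x * math.sin(x)


sample_points = [0.5, 1.0, 2.0]
print(looks_like_antiderivative(good_answer, integrand, sample_points))
print(looks_like_antiderivative(bad_answer, integrand, sample_points))
```

Finding `good_answer` needs a trick; checking it needs nothing beyond evaluating two functions, which is the asymmetry being described.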

For domains concerned with problems in the “hard to solve, easy to check” category, it seems at least in principle possible that, if agents are paired with a suitable “checking” (verification) tool, they could always have a good learning signal to continuously improve (since all agent output can be accurately labelled as “good” or “bad”). So, hypothetically, recursively training models on prior model output would allow continuous improvement in problem solving abilities. That’s the opposite of what is understood to happen in the general case where LLMs are trained on their own output, which is model collapse.
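
The "checker as learning signal" loop might look roughly like the Python sketch below. Integer factoring stands in for the hard problem, and a random guesser stands in for the model, so `noisy_solver` is a hypothetical placeholder and not how any real system proposes answers. The only point is that the verifier decides what gets kept as training data.

```python
import random


def verify(n, factors):
    """Cheap check: both factors are non-trivial and multiply back to n."""
    a, b = factors
    return a > 1 and b > 1 and a * b == n


def noisy_solver(n, rng):
    """Hypothetical stand-in for a model: guesses a divisor at random."""
    a = rng.randint(2, n - 1)
    return (a, n // a)


def collect_verified(problems, attempts, rng):
    """Keep only (problem, answer) pairs that pass the verifier."""
    kept = []
    for n in problems:
        for _ in range(attempts):
            guess = noisy_solver(n, rng)
            if verify(n, guess):
                kept.append((n, guess))  # labelled "good": usable for training
                break                    # rejected guesses are simply dropped
    return kept


dataset = collect_verified([15, 21, 35], attempts=1000, rng=random.Random(0))
print(dataset)
```

Because nothing unverified ever enters `dataset`, errors in the solver's output cannot accumulate across rounds the way they would if raw model output were fed straight back in.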

Certainly several sub-fields of maths and computer science are “hard to solve, easy to check”, so I wonder if LLM ability may not hit a ceiling in those domains.

Weldawadyathink

June 14

Link

One thing that works very well is synthetic training data. I believe this is much of what has powered the last 1-2 years of LLM innovation. It's actually a pretty simple concept. For a coding example, take an existing codebase that is not AI generated. Take your "dumb" AI and have it remove a feature. Even the AIs that were not good at coding could do that pretty well. Also have it generate a user query requesting that the feature be implemented. Again, small models that are bad at coding can do this easily. Now you have a training problem that includes a codebase without a feature, a user request to implement it, and a codebase with the feature. Everything except the final state was AI generated, and can be done easily with very old model technology (Haiku, GPT 3.5 turbo, etc). And this process can be easily automated to generate a ton of training data.
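
In code, the shape of that pipeline is roughly the following Python sketch. The two callables stand in for calls to a small model; here they are trivial string-based stubs so the example runs on its own. `TrainingExample`, `make_example`, and the stub functions are names invented for this illustration, not anyone's actual tooling.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TrainingExample:
    start_code: str   # codebase with the feature removed (model generated)
    request: str      # user-style request for the feature (model generated)
    target_code: str  # the original, human-written codebase


def make_example(
    original_code: str,
    remove_feature: Callable[[str], str],
    write_request: Callable[[str, str], str],
) -> TrainingExample:
    """Build one training problem by working backwards from real code."""
    stripped = remove_feature(original_code)
    request = write_request(original_code, stripped)
    return TrainingExample(stripped, request, original_code)


ORIGINAL = (
    "def add(a, b):\n    return a + b\n\n"
    "def sub(a, b):\n    return a - b\n"
)


def stub_remove_feature(code: str) -> str:
    # Stand-in for a small model: drop everything from "def sub" onwards.
    return code.split("def sub")[0].rstrip() + "\n"


def stub_write_request(original: str, stripped: str) -> str:
    # Stand-in for a small model describing the missing feature.
    return "Please add a sub(a, b) function that subtracts b from a."


example = make_example(ORIGINAL, stub_remove_feature, stub_write_request)
print(example.request)
```

Swapping the stubs for real model calls and looping over many repositories is what turns this into the automated data generator described above.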

I kinda assumed the same back in the GPT 3.5 turbo era. But techniques like this seem to have worked to get us past that.