35 votes

OpenAI: Introducing Superalignment

33 comments

  1. [8]
    delphi
    Link
    The fact that OpenAI believes superintelligence (established as “more powerful” than AGI) is within a decade is kind of exciting and worrying to me.

    14 votes
    1. [6]
      teaearlgraycold
      Link Parent
      Personally I think that having a stateless model that has mastered the English language and attained mediocre models of many other systems does not mean we’re on track to get AGI soon. LLMs are awesome but trying to extrapolate and assume they are indicative of imminent superintelligence does not make sense to me. Although surely they know many things I don’t.

      35 votes
      1. [2]
        Moonchild
        Link Parent
        I place even odds that they've drunk their own kool-aid and that this is a deliberate marketing move. Maybe some combination of the two.

        34 votes
        1. matpower64
          Link Parent
          Being more cynical, I believe this is a move to back up their claims for AI regulations. They can point it out and say "See? It's a real problem and we are doing our part!", and sit out if/when the banhammer strikes competition.

          14 votes
      2. [3]
        Comment deleted by author
        Link Parent
        1. ignorabimus
          Link Parent
          I don't think the data is the limiting factor, but rather computer architecture. Some people say something along the lines of "producing text is an emergent property of large language models" and "if we scale our models up they'll become 'intelligent' for some definition of intelligent". Leaving aside all the problems with whether this is true, it turns out that the amount of computing power needed to 'scale up' is far beyond what we have available today.

          You really only have to look at the success of specialised hardware (GPUs, TPUs/systolic arrays) to see why this is the case.
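
          To put rough numbers on it, here's a hedged back-of-envelope sketch using the common ~6·N·D rule of thumb for training FLOPs (N parameters, D tokens) from the scaling-laws literature; the model size, token count, and hardware throughput below are illustrative assumptions, not figures from any actual training run:

          ```python
          # Rough training-compute estimate with the ~6 * N * D FLOPs rule of
          # thumb (N = parameters, D = training tokens). All numbers below are
          # illustrative assumptions.
          params = 1e12          # hypothetical 1-trillion-parameter model
          tokens = 10e12         # hypothetical 10 trillion training tokens
          total_flops = 6 * params * tokens        # ~6e25 FLOPs

          per_gpu = 1e15         # ~1 PFLOP/s sustained per accelerator (optimistic)
          n_gpus = 10_000
          seconds = total_flops / (per_gpu * n_gpus)
          print(f"{total_flops:.0e} FLOPs ~= {seconds / 86_400:.0f} days on {n_gpus:,} GPUs")
          ```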

          3 votes
        2. gf0
          Link Parent
          AI is an ill-defined term: a few decades ago a simple-ish chess engine was considered AI, now it’s just called a program.

          that the underlying algorithms are able to basically solve every problem you throw at them at super-human performance

          But if you mean machine learning in general, then no, while in theory they can indeed do general approximations, for plenty of functions it will need an exponentially increasing network size, so there is only so much hardware and data we can throw at a problem.

          Also, people can learn things from just a few examples; as far as I know, machine learning still sorely lacks in this area.
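
          A toy sketch of that exponential blow-up: resolving the unit cube [0,1]^d at a fixed resolution eps takes (1/eps)^d cells, so the "table" a network would have to represent for a generic function grows exponentially with the input dimension (the resolution here is an arbitrary assumption):

          ```python
          # Curse-of-dimensionality count: cells needed to cover [0,1]^d
          # at resolution eps is (1/eps)^d.
          eps = 0.1  # arbitrary target resolution
          for d in (1, 2, 10, 100):
              print(f"d={d:>3}: ~{(1 / eps) ** d:.0e} cells")
          # d=  1: ~1e+01 cells ... d=100: ~1e+100 cells
          ```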

          3 votes
      3. Felicity
        Link Parent
        This is pretty much my stance right now.

        I see a lot of articles telling us that an AGI or ASI could pop up within the next few years, but all I'm actually seeing are advanced LLMs. I haven't seen actual AI researchers elaborate on these things. The last article I read kept constantly referencing pop culture and using appeals to authority to tell me I should be terrified of our imminent robot overlords.

        Even other specialized AI that can do specific jobs to replace humans cannot all magically be baked into the same machine. At the end of the day neural net technology is very good at learning to do specific things, but in order to approach anything resembling intelligence, much less superintelligence, we will likely need to develop new systems.

        3 votes
    2. Autoxidation
      Link Parent
      Not just within a decade... within this decade, by 2030.

  2. [3]
    PositiveNoise
    Link
    I think that humanity is making a giant mistake by charging full steam ahead in the development of AI without being careful and trying to avoid disaster, yet since humanity is just countless different people and many countries each doing their own thing, it's not surprising that this is how the technology is progressing. So, I find the goals the article mentions to be much better than nothing, even if they are kind of vague right now. I'd hate to be so cynical that I think it would be better for people to not even try to control this new branch of technology.

    Will AI Superalignment work? Maybe. Can it even work at all? Maybe. Will bad actors such as authoritarian governments ignore the idea of safeguards? You betcha. But trying to save ourselves from an obvious potential disaster is better than not trying.

    11 votes
    1. [2]
      JackA
      Link Parent
      Agreed, the fact that a random private company with no government or democratic oversight says this about their own technology that they're still actively developing is terrifying:

      But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction.

      Like I get we're all desensitized to corporate values, but it's absolutely insane that the acknowledged possibility of human extinction is only worth a single team of researchers and 20% of their compute.

      13 votes
      1. teaearlgraycold
        Link Parent
        On the other hand, superintelligent AGI could be told "here's enough compute power to effectively simulate 10,000 years of research from 10,000 genius humans in one month - please create a cheap fusion reactor" and then "please create a CO2 scrubber to hook up to the fusion reactor". Of course, at that point you need to treat it like a WMD. But hey - it might work out in our favor.
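
        For scale, under one possible reading of that hypothetical (10,000 researchers each doing 10,000 years of work, delivered in one month), the implied throughput multiplier is around a billion; a quick hedged calculation:

        ```python
        # Back-of-envelope on the hypothetical above; how to read the
        # numbers is an assumption.
        researchers, years_each = 10_000, 10_000
        researcher_years = researchers * years_each   # 1e8 researcher-years
        speedup = researcher_years / (1 / 12)         # delivered in one month
        print(f"~{speedup:.1e}x one researcher's output rate")  # ~1.2e+09x
        ```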

        1 vote
  3. waxwing
    Link
    It seems to me that not only is superintelligence alignment something which nobody really knows how to solve, but we don't even really know what success would look like, or how to evaluate it in the absence of superintelligent systems.

    On that basis, setting a timeline of four years to "solve the core technical challenges of superintelligence alignment" seems optimistic but also meaningless. No research grant would get funded on a goal this broad.

    9 votes
  4. [6]
    PleasantlyAverage
    Link
    But humans won’t be able to reliably supervise AI systems much smarter than us, and so our current alignment techniques will not scale to superintelligence. We need new scientific and technical breakthroughs.

    This seems to be a futile arms race where we develop ever smarter AI systems just to barely be able to monitor the currently smartest of them. Surely at some point an AI would be able to outsmart any supervision. The only actual limitation could be something physical.

    4 votes
    1. [5]
      rickartz
      Link Parent
      The only actual limitation could be something physical.

      Like a bat?

      There's always a risk associated with playing with the unknown, in this case, developing something smarter than us.

      But what would even be the realistic application of a super smart AI? Could it be the gateway to developing quantum computers, curing cancer, and creating a government that doesn't suck? Or could it only be used to serve even more ads? I really don't know the end goal, but I'm interested.

      I mean, what's our motivation for the risk we're taking?

      8 votes
      1. teaearlgraycold
        Link Parent
        Hmm, a paperclip maximizer but it’s tasked with draining the wealth of billionaires? Should I dare to dream?

        9 votes
      2. [4]
        Comment deleted by author
        Link Parent
        1. [3]
          fruitybrisket
          Link Parent
          The r/singularity cult over on reddit has a lot of folks who cannot wait to be literal pets. I found that quite disturbing. I want a purpose. I genuinely enjoy my job. I want to think and read, even if something else can do it for me.

          5 votes
          1. [2]
            teaearlgraycold
            Link Parent
            This is why the humans in Star Trek travel the galaxy.

            Also - humans are the best things suited for taking care of other humans. So I imagine aspects of care will never be automated. Who wants an android caretaker over a human one? Especially when there are plenty of unemployed people around.

            2 votes
            1. JackA
              Link Parent
              humans are the best things suited for taking care of other humans

              Don't get me wrong, I don't want to be a pet, but there's really no basis for this statement in the context of AI.

              Any animal thinks it's best at taking care of itself or its family until a human abducts it and provides endless food, luxury, and an extended lifespan. The same could be theorized to happen to humans after the singularity.

              If a superintelligent AI decided it wanted to, it could provide you everything you want better than any human can. Imagine a friend giving advice that's always right. Imagine it can also give complimentary advice to other people that it knows will lead to the best solution for both of you. If it thinks you won't take that advice directly, it can influence your searches and the content you consume to get you there by yourself.

              The aspects that "couldn't" be automated with other humans would amount to taking your dog to the park for socialization. We'd all still be on leashes, maybe not even physically when our brains can be manipulated so easily.

              Even if it were possible, you wouldn't want to break free from this. If "life is good" for you and everyone else, the AI has been useful and good to you your whole life, it helps you pursue goals of your choice, and it's been manipulating you to accept it in every facet of your life for years, there's no way you'd want to.

              Those humans would not even have the same principles we have that make this sound scary. We may invent god, it may finally give us the purpose we're always looking for, and that might just cure our human condition.

              Remember that if this level of AI is possible, this is the rare good ending. We should be terrified.

              3 votes
  5. [8]
    manosinistra
    Link
    Reading the link as well as some of the rabbit hole links found therein, I sense there is a lot of effort in trying to make these artificial intelligences “more” human and reflect some kind of ideals (whose ideals?).

    I wonder what would happen if we just “let AI emerge” somehow and let it take on whatever characteristics it develops on its own. Mind you, I don’t even know what this means, exactly, but maybe it’s why we fail to see the general utility of AGI.

    I mean, we don’t need computers to become more like us. We already have enough of us, and we can barely get along with each other as it is.

    I want to see an AGI or super intelligence that is NOT human-like. I mean, why is emulating a human even the goal, with all our limitations?

    4 votes
    1. [3]
      tesseractcat
      Link Parent
      To answer your question as to why: Humans have the benefit of millions of years of evolution. Modern AI doesn't have that, because we don't have the computational power to replicate it. So instead current AI is trained to imitate humans, bootstrapping off of our evolution, and will inherently be human-like to some extent.

      It would be very interesting to see a 'black-box' AI created (more accurately 'grown') separate from human influence or culture. The only method to do this I know of would be to create an artificial environment and simulate evolution. This probably won't see success within our lifetimes unless computers get way faster, so we're stuck picking and choosing which aspects of humanity we want to imitate instead.

      7 votes
      1. Adarain
        Link Parent
        Of course humans also just suck in some ways. We come prebuilt with all sorts of routines that made sense once but don't scale to our current society (e.g. a tendency to frame everything in us vs them terms) and I don't think we would want an AI to inherit those.

        1 vote
      2. lelio
        Link Parent
        That's a really interesting point. AI models will likely learn from human data, so they will intrinsically be somewhat human-based.

        As far as starting from scratch goes, is it really that hard computationally? Biological evolution is so slow. Human DNA needs 10-20 years to make one edit to its code. Can we design a system that gets that down to one second per iteration? Also, we can set up conditions that directly favor intelligence rather than just survival/reproduction, and make more purposeful edits compared to random mutations.

        It seems like it would be hard to design such a complicated system, and it would involve a lot of trial and error. But it's plausible that we may hit upon something that could generate intelligence much faster than biological evolution.
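
        As a rough sketch of what that kind of fast, directed evolution might look like in code, here's a minimal toy evolutionary loop; the bit-string genome and hand-written fitness target are placeholder assumptions, not a claim that this scales to intelligence:

        ```python
        import random

        # Toy directed evolution: generations tick in microseconds and the
        # selection pressure is an explicit fitness function, not survival.
        TARGET = [1] * 32  # placeholder stand-in for "desired behaviour"

        def fitness(genome):
            return sum(g == t for g, t in zip(genome, TARGET))

        def mutate(genome, rate=0.05):
            return [1 - g if random.random() < rate else g for g in genome]

        population = [[random.randint(0, 1) for _ in range(32)] for _ in range(100)]
        for generation in range(1_000):
            population.sort(key=fitness, reverse=True)
            if fitness(population[0]) == len(TARGET):
                print(f"target reached in {generation} generations")
                break
            parents = population[:20]  # explicit, purposeful selection
            population = [mutate(random.choice(parents)) for _ in range(100)]
        ```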

    2. Omnicrola
      Link Parent
      I mean, why is emulating a human even the goal, with all our limitations, etc.

      Well for starters, it's the only example we have. Your point is valid, humans have a lot of flaws built into them. However we're also the only example of intelligent life we know of [1], so if we're trying to build an artificial intelligence it makes sense to try and emulate the one we understand the best. At least as a starting point to know if we got it "correct".

      The other reason is the one I think is most commonly implied by phrases like "human-like AI": by saying they're trying to emulate a human, they imply a whole host of ethical and moral constraints. Such as "don't exterminate humanity" and "be nice to dogs". There are of course examples of terrible humans doing terrible unethical things, but I don't think that's what people are aspiring to when they're describing a "human-like AI".


      [1] Although that's debatable.

      4 votes
    3. [3]
      stevent
      Link Parent
      Though I partially disagree, I can see where you’re coming from here. I’m in a field that has overlap with AI alignment - ethical and responsible tech. Many of the folks in the field work in Trust & Safety (community moderation, CSAM removal, threat identification, fraud, etc.) so a chunk of the alignment discussion around “ideals” is because of collaboration with each other.

      I think the phrasing around “human-like” and “human values” is easily digestible, but a misnomer. A better way to phrase it would be creating parameters for prosocial behavior, otherwise described as actions that contribute rather than damage, or actions that are beneficial to others (or mutually beneficial). At the end of the day, it’s: how do we avoid going the way of Microsoft’s 4chan-esque pro-Hitler AI from a few years ago, or the way of far-future apocalyptic singularity science fiction?

      BUT, the thing I’m more interested in, and something you might be getting at with a self-contained experiment, is what happens if AI superintelligence (ASI) has agency? Or autonomy? What behaviors would develop without sociotechnological blindness and bias, and is building a model without that interference even possible if its dataset and parameters begin with the human experience?

      Because AI lacks the ability to biologically feel or experience emotion in the way we do, the consequences are higher - the concept of “punishment” or consequences if an AI behaves in a way contrary to prosocial behavior hasn’t been fleshed out yet. More important, to me personally, is whether what we determine defines consciousness also applies to ASI. Effectively we will have created artificial life, and with that, the ethical implications of AI rights become the most important question in the room.

      So before we get to ASI and agency, I think we need to define the basic hierarchy of needs AI will have, starting with hardware requirements, software maintenance, moving up to dataset accessibility, responsible usage, and finally communication and ai-to-ai or ai-to-human connection.

      I don’t have answers, most of us don’t, but the ethical and moral implications of what’s to come are on the horizon. We’ll have a reckoning unless this path changes or the limits of sentience are determined to end before ASI. You’re asking the big questions that many of us are staring down the road at. Keep asking questions!

      3 votes
      1. fyzzlefry
        Link Parent
        We should probably start having conversations on what rights a conscious AI has. It's becoming a When question more than an If.

        1 vote
      2. manosinistra
        Link Parent
        I appreciate this and the other responses, but I’ll reply here.

        If you indulge the vulgarities of my enthusiast layman’s understanding, I wonder if there’s still space to not try to impose an ethical or “good behaviour” filter on what we’re trying to make emerge.

        It’s like Conway’s Game of Life. I wonder what would happen if we just let AI “go” and not worry about the first or second (or n-1th) generations of what it creates. I feel like if we took today’s guiding frameworks and imposed them on the Game of Life, as soon as it generated something that looks like 666 or 88 we’d pull the plug, never letting it iterate into something that could surprise us, and maybe not even reflect us.

        Is this too simplistic? Others have said that because we are our own guideline for intelligence, we have to use ourselves as a starting point. But do we? I guess so if language is the basis, but even then maybe language could be used differently and “better”, and we might not even recognize it.

        “How to recognize AI that is intelligent on its own terms.” I think that’s what I want to see…
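
        For what it’s worth, the Game of Life itself is tiny to implement, which is part of why “just let it run” is such a tempting frame; a minimal sketch (grid size, seed density, and step count are arbitrary assumptions):

        ```python
        import random
        from collections import Counter

        # Minimal Conway's Game of Life, run with no filter on what emerges.
        SIZE, STEPS = 20, 50
        cells = {(x, y) for x in range(SIZE) for y in range(SIZE)
                 if random.random() < 0.3}

        def step(live):
            # Count each cell's live neighbours, then apply the standard rules:
            # birth on 3 neighbours, survival on 2 or 3.
            counts = Counter((x + dx, y + dy)
                             for x, y in live
                             for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                             if (dx, dy) != (0, 0))
            return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

        for _ in range(STEPS):
            cells = step(cells)  # no intervention between generations

        print(f"{len(cells)} live cells after {STEPS} generations")
        ```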

  6. [3]
    Handshape
    Link
    The announcement is full of lofty, laudable goals and some interesting ideas. But the eventual enshittification feels inevitable, and I worry this effort will be the first thing jettisoned when investors knock at the door demanding returns.

    3 votes
    1. ignorabimus
      Link Parent
      Hasn't this already happened? OpenAI is no longer a non-profit and no longer doing research (just making existing stuff like ChatGPT that others have discovered more widely available).

      2 votes
    2. fyzzlefry
      Link Parent
      But if the AI is super intelligent, can it figure these things out on its own?

  7. pete_the_paper_boat
    Link
    I feel like it would take a super intelligence to understand a super intelligence.

    Making an AI for aligning AI, wouldn't you just end up with super intelligent aligned AI?

    Who's gonna make sure the super intelligent alignment AI doesn't misalign in some super intelligent manner?

    3 votes
  8. bioemerl
    Link
    AI alignment should be to me, not to OpenAI or any other company.

    1 vote
  9. [2]
    the_man
    Link
    A religion is based on believing in a reality without demonstrating it with facts.
    We could validate (demonstrate) that AI is better by proving that it coherently explains the past, works in manipulating the present, and predicts or creates the future.
    If we do not validate it and keep questioning it, we will have an equivalent of our current misinformation campaigns, which we believed out of sheer defiance and kept going along with because they "are" likable.
    Placing our faith in AI and not humanly questioning it is the risk. The power of AI itself is secondary.
    If AI becomes another religion, it is our choice and a battle between humans.

    1 vote
    1. Leonidas
      Link Parent
      I believe that the people who blindly cheer for anything suggesting a superintelligent AGI is on the horizon and those who shout warnings about how this hypothetical intelligence could destroy the world are both falling prey to the same idea. They both erase the human element behind AI in order to praise or curse the "deus ex machina" that doesn't exist. AI alignment should be about stopping it from being used irresponsibly/ignorantly to replicate human biases, which is a problem that's currently causing harm, rather than fearmongering about a hypothetical "paper clip maximizer" situation that's irrelevant to actual, existing AI.

      3 votes