23 votes

Introducing Codex [OpenAI]

14 comments

  1. [10]
    donn
    Welp, enjoyed having a job y'all

    9 votes
    1. teaearlgraycold
      I know someone with access and they say it’s not so good.

      14 votes
    2. tauon
      My somewhat educated guess is that the job responsibilities will simply shift to include less lower levels of “programming” and more higher levels – which has seemingly been the trend for, uh, always in this profession?

      Follow-up question: Ever seen a non-technical person attempt to write a detailed specification of an existing, or better yet for an entirely/mostly new system? I believe it’ll be a fair while longer until we can truly, minimally describe a desired “output” state and have models/agents think about all internal decisions. And once we’re that far, most other office jobs feel obsolete-able too, so… UBI time?

      12 votes
    3. [7]
      slade
      I live in daily existential crisis. I'm 45. I have kids. I'm not sure what I'll do for a living if the one thing I'm really good at dries up.

      6 votes
      1. [5]
        Greg
        I don’t think it’s going to be an immediate world-changing issue for us, for similar reasons to what @tauon said above. But I know the feeling. Workloads creep up, compensation creeps down, numbers get squeezed as less of us have to do more. Even if tech workers stay at the relative top of the pile, we’re still watching our friends and family’s jobs getting automated away.

        Realistically, at some point, the only answer becomes: fight for recognition of the fact that we simply don’t live in a world that needs 40+ hours a week of labour from ~5 billion adults.

        Really it should be a cause for celebration - we’re so efficient and productive that we can live comfortably on a fraction of the work that was needed even 50 years ago. But history suggests the emphasis will be on “fight”.

        11 votes
        1. teaearlgraycold
          I see a future where we do full time work as a tour of duty. You work for 5-10 years and then retire. Or maybe you work for 2 and then have off for 3 years, then back to work.

          4 votes
        2. [3]
          stu2b50
          Well, let's not get ahead of ourselves. Productivity has been flat since the 2000s - the "computer paradox". "AI", too, has yet to show any productivity increases, and I don't think it's in any way a certainty that it will.

          That is to say, the entire invention of computers - that which modern programming jobs rely on - did not actually increase measured productivity. It's not always obvious what a technology will do economically.

          4 votes
          1. [2]
            Greg
            The data I’m seeing doesn’t reflect that, but honestly that’s beside the point in my mind.

            I’m looking at it more from a “bullshit jobs” perspective: strip away the GDP vs hours worked metric, and look at the labour actually required to contribute effectively to safe, happy, and comfortable lives. A job might be productive in the sense that it generates financial value, but that doesn’t make it necessary or beneficial; conversely, many jobs that have a deep importance to people’s wellbeing look not-great on productivity metrics because they’re hands on, 1:1 or small group endeavours that can’t effectively scale past a certain point. And that’s without even touching on the number of wasted hours in jobs that do have some benefit to society, because a 40-ish hour week is a necessary (but not sufficient) condition for a job to offer liveable pay so there’s a massive incentive against demonstrating that your work can be done in less time than that.

            In short, I think the entire premise of economic output as a proxy for meaningful human productivity is flawed, and I think that even within that metric there is a vast amount of inefficiency that’s missed because the majority of models simply build in assumptions like the 40 hour week or the 50-ish year working lifetime because there’s no viable counterfactual data there to work from.

            7 votes
            1. Lia
              “A job might be productive in the sense that it generates financial value, but that doesn’t make it necessary or beneficial; conversely, many jobs that have a deep importance to people’s wellbeing look not-great on productivity metrics because they’re hands on, 1:1 or small group endeavours that can’t effectively scale past a certain point.”

              Also, the product of tech jobs that raises GDP is sometimes (and increasingly) welfare-reducing rather than welfare-enhancing or neutral. It's easier to make a financial profit by blackmailing locked-in customers than to come up with and develop a new innovation that actually makes people's lives better.

              1 vote
      2. DistractionRectangle
        I wouldn't worry about it. In the short term there may be less demand for programmers, but there will always be programmers. Think about it: short of AGI, a human will always need to be in the loop, to prompt the AI and to audit its output (for accuracy, security, dependencies with conflicting licenses, and so on). That person will need to be at least as proficient as the AI in order to do these things, i.e. a senior developer. The need for senior developers begets the need for junior developers (because how else do you get senior developers?). It raises the skill floor on junior devs, and maybe shifts emphasis to pen testing, but there will always be developers.

        Imposter syndrome has always been a thing, and this does fan the flames, but you'll be fine. Just keep growing your skillset as a senior dev and play around with AI. At its core, I find it's like rubber ducking + google on steroids. It helps me explore topics and find blindspots (the "you don't know what you don't know" problem).

        6 votes
  2. DataWraith
    (edited )
    Two weeks ago I would have been very skeptical of this, but my experience with agentic coding since then makes me think that this would be a wonderful tool -- write a spec, fire off the prompt, and then wait a few minutes for it to open a Pull Request on GitHub that implements the feature you asked for, including tests and documentation.

    And while it's running, you can draft the next spec in parallel, fire off another agent, or work on the code yourself -- the latter is a bit of a pain point with my current setup, because Augment Code gets confused if you try to edit a file it wants to change itself.
    With the cloud-based setup, you also don't need to worry about an LLM going mad and deleting your home directory...

    Apart from data privacy, the most glaring downside is the price. If I'm parsing this right, it is only available on the $200/month tier for now, and they may charge extra after the initial testing period.

    Edit: Looks like it doesn't open PRs automatically, you have to click a button (for now). The recording of the livestream they have at the bottom of the page is interesting, and they've kept fluff to a minimum. Everyone is a bit awkward, but you get to see the system in action and they explain their vision for how this could evolve in the future.

    8 votes
  3. DawnPaladin
    Interesting. I tried Codex CLI recently and found it to be not as good as Claude Code (for my tasks, anyway). Now they're launching a new Codex LLM to go with it, plus a web UI. I will look forward to trying that out.

    To reduce the amount of time you spend waiting, Claude Code Best Practices recommends you set up multiple git worktrees so several Claudes can work on different areas of your code in parallel. This is a good idea but it's cumbersome to set up. If Codex can make it easier to dispatch and monitor agents working in parallel, that could be quite helpful and make it easier to experiment with different approaches.
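
A rough sketch of the worktree setup described above, assuming one branch and one sibling directory per task. The `agent/<task>` branch naming, the `<repo>-<task>` directory layout, and the helper names are made up for illustration; only `git worktree add -b <branch> <path> <base>` is the real git command.

```python
from pathlib import Path
import subprocess


def worktree_commands(repo_dir, tasks, base="main"):
    """Build one `git worktree add` command per task (nothing is executed here).

    Each task gets its own branch (agent/<task>, a hypothetical naming scheme)
    checked out in a sibling directory named <repo>-<task>.
    """
    repo = Path(repo_dir)
    commands = []
    for task in tasks:
        worktree_path = repo.parent / f"{repo.name}-{task}"
        commands.append([
            "git", "-C", str(repo),
            "worktree", "add",
            "-b", f"agent/{task}",   # new branch for this task
            str(worktree_path),      # where the extra checkout lives
            base,                    # commit-ish the new branch starts from
        ])
    return commands


def create_worktrees(repo_dir, tasks, base="main"):
    """Run the commands; raises CalledProcessError if git reports a failure."""
    for command in worktree_commands(repo_dir, tasks, base):
        subprocess.run(command, check=True)
```

Calling `create_worktrees("path/to/repo", ["auth", "docs"])` would leave two extra checkouts next to the repo, and you could then start one agent session in each directory. Cleanup is `git worktree remove <path>` when a task is done.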

    8 votes
  4. [2]
    DataWraith
    Looks like GitHub is going to release their own version of this within weeks.

    “Using state-of-the-art models, the agent excels at low-to-medium complexity tasks in well-tested codebases, from adding features and fixing bugs to extending tests, refactoring code, and improving documentation. You can hand off the time-consuming, but boring tasks to Copilot that will use pull requests, CI/CD, and all of your existing tooling while you focus on the interesting work.”

    Unlike Codex, the Copilot Agent can use Model Context Protocol servers and has access to the internet (managed using a whitelist).

    3 votes
    1. teaearlgraycold
      (edited )
      It sounds like the capabilities aren’t far off from what I’d ask Cursor to do, but with MCP support and automatic PRs upon task completion.

      We’re currently in the UX improvement part of the AI cycle. Model performance hasn’t changed dramatically since GPT-4 (I’d guess they’re out of good data and have been for years). But better tool integrations are going a long way towards making LLMs worth the downsides.

      Edit: Should probably point out that “in well-tested codebases” means... pretty much the upper 1% of codebases. Most are poorly tested or not at all.

      3 votes