    1. That one study that proves developers using AI are deluded

      I've found myself replying to different people about the early 2025 METR study kind of often. So I thought I'd try posting a top-level thread; consider it an unsolicited public service announcement.

      You might be familiar with the study because it has been showing up alongside discussions about AI and coding for about a year. It found that LLMs actually decreased developer productivity and so people love to use it to suggest that the whole AI coding thing is really a big lie and the people who think it makes them more productive are hallucinating.

      Here's the thing about that study... No one seems to have even glanced at it!

      First, it's from early 2025, and they used Claude Sonnet 3.5 or 3.7. Those models are in no way comparable to current-gen coding agents. The commonly cited inflection point didn't happen until later in 2025 with, depending on who you ask, Sonnet 4.5 or Opus 4.5.

      The study consisted of 16 people! If those 16 were even vaguely representative of the developer population at the time, most of them wouldn't have had significant experience with LLMs for coding.

      These are not tools that just work out of the box, especially back then. It takes time and experimentation, or instruction, to use them well.

      It was cool that they did the study, trying to understand LLMs was a good idea. But it's not what anyone would consider a representative, or even well thought out, study. 16 people!

      But wait! They did a follow up study later in 2025.

      This time with about 60 people and newer models and tools. In that study they found the opposite effect: AI tools sped developers up (which is a shock to no one who has used these tools long enough to get a feel for them). They also mentioned:

      However the true speedup could be much higher among the developers and tasks which are selected out of the experiment.

      In addition they had some, kind of entertaining, issues:

      Due to the severity of these selection effects, we are working on changes to the design of our study.

      Back to the drawing board, because:

      Recruitment and retention of developers has become more difficult. An increased share of developers say they would not want to do 50% of their work without AI, even though our study pays them $50/hour to work on tasks of their own choosing. Our study is thus systematically missing developers who have the most optimistic expectations about AI’s value.

      And...

      Developers have become more selective in which tasks they submit. When surveyed, 30% to 50% of developers told us that they were choosing not to submit some tasks because they did not want to do them without AI. This implies we are systematically missing tasks which have high expected uplift from AI.

      And so...

      Together, these effects make it likely that our estimate reported above is a lower-bound on the true productivity effects of AI on these developers.

      [...]

      Some developers were less likely to complete tasks that they submitted if they were assigned to the AI-disallowed condition. One developer did not complete any of the tasks that were assigned to the AI-disallowed condition.

      [...]

      Altogether, these issues make it challenging to interpret our central estimate, and we believe it is likely a bad proxy for the real productivity impact of AI tools on these developers.
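
      To make the selection effect in those quotes concrete, here's a toy simulation. It's only a sketch: every number in it is made up for illustration (the uplift range, the withholding probability, the task count), none of it comes from the METR data, and `simulate` is a hypothetical helper. It just shows why dropping high-uplift tasks from a study pulls the measured average below the true one.

```python
import random


def simulate(n_tasks=10_000, seed=0):
    """Compare the true mean uplift with the mean over submitted tasks only.

    Assumptions (illustrative, not from the study):
    - each task's true uplift (fraction of time saved with AI) is uniform
      between -0.2 and 0.8
    - the higher a task's uplift, the more likely the developer withholds it
      from the study rather than risk doing it without AI
    """
    rng = random.Random(seed)
    all_uplifts = []
    submitted_uplifts = []
    for _ in range(n_tasks):
        uplift = rng.uniform(-0.2, 0.8)
        all_uplifts.append(uplift)
        # Withholding probability grows with uplift; zero for tasks AI slows down.
        p_withhold = 0.5 * max(0.0, uplift)
        if rng.random() >= p_withhold:
            submitted_uplifts.append(uplift)
    true_mean = sum(all_uplifts) / len(all_uplifts)
    observed_mean = sum(submitted_uplifts) / len(submitted_uplifts)
    return true_mean, observed_mean


if __name__ == "__main__":
    true_mean, observed_mean = simulate()
    print(f"true mean uplift:     {true_mean:.3f}")
    print(f"observed mean uplift: {observed_mean:.3f}")
```

      Under these assumptions the observed mean comes out below the true mean, which is the same direction of bias the authors describe when they call their estimate a lower bound.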

      So to summarize, the new study showed a productivity increase and they estimate it's larger than the ~20% increase the study found. Cheers to them for being honest about the issues they encountered. For my part I know for sure that the increase is significantly more than 20%. The caveat, though, is that this is only true after you've had some experience with the tools.

      The truth is that we don't need a study for this, any experienced engineer can readily see it for themselves and you can find them talking about it pretty much everywhere. It would be interesting, though, to see a well designed study that attempted to quantify how big the average productivity increase actually is.

      For that the participants using AI would need to be experienced with it and allowed to use their existing setups.

      I want to add that this is not an attempt to evangelize for AI. I find the tools useful but I'm not selling anything. I'm interested in them and I stay up to date on the conversations surrounding them and the underlying technology. I use them frequently both for my own projects and to help less technical people improve their business productivity.

      Whether AI agents are a good thing or not, from a larger perspective, is a very different, and complicated, conversation. The important thing is that utility and impact are two different conversations. There isn't a debate anymore about utility.

      I know this probably won't stop people from continuing to derail conversations with the claim that developers are wrong about utility, but I had to try. It's just hard to let it pass by when someone claims the sky is green.

      I understand that AI makes people angry and I think they have good reason to be angry. There are a lot of aspects of the AI revolution that I'm not thrilled about. The hype foremost, the FOMO as part of the hype, the potential for increased wealth consolidation really sucks, though I lay that at the feet of systems that existed before LLMs came along.

      It's messy, but let's consider giving the benefit of the doubt to professionals who say a tool works instead of claiming they're wrong. Let them enjoy it. We can still be angry at AI at the same time.

      73 votes
    2. I'm glad Hideo Kojima went into games instead of directing movies

      I'm currently 20 hours and 4 "episodes" deep into Death Stranding 2 on PC and I don't have the patience to wait til the Monday megathread rolls around again to voice my thoughts. This isn't my first time playing a Kojima game; I've got over 100 hours in the first Death Stranding and I've also finished multiple entries in the Metal Gear series, I've even played Boktai 2 on the GBA (though I didn't know that was a Kojima game til much later). I enjoy the vision, wackiness, flexibility in gameplay, and emphasis on little details that are fairly characteristic of a Kojima game, and those things are definitely very present in this one as well. That said though, there is one thing that only becomes more and more clear as I progress:

      Hideo Kojima is terrible at writing dialogue. By that, I don't mean characters fail to express themselves or convey ideas well through a lack of words; rather, they're entirely too reliant on words. In an era of cinema that loves "show, don't tell", Kojima leans more towards "tell, tell, tell some more, and then maybe have a bit more tell as a treat". Any character with a backstory that Kojima wants you to know about will spend a good 10 minutes unloading their life story almost as soon as they meet the main character. Any time there's a new piece of information being revealed, someone will explain it to you in textbook-level depth. I'm not sure if Kojima thinks that it's ok to have so many incredibly long exposition-dumping cutscenes in his game because the ratio of cutscene to game is still fairly low but all I can say is these cutscenes and talking sequences are not good cinema. I don't care which movie star is getting a cameo when the script itself is this absurdly poor, my immersion is shattered and watching has now become a chore.

      That said though, it's not like the game is devoid of cinematic moments, they just happen to be entirely outside of the cutscenes themselves. By far the most memorable and impactful moments in this game and the original are those times of solitude during a delivery where you're just quietly traversing through a zone, luggage in tow, and a Low Roar track starts playing. It's during these moments of calm, of pure show and no tell at all, where the player gets truly immersed in the role of the main character and has time to contemplate their journey while taking in the beauty of the nature around them. These aren't accidental or purely player-driven moments, those songs are set to play at a particular place during certain missions and knowing Kojima, he definitely had a major role in directing these as well. So it's not like he doesn't know how to create absolute cinema, but at the same time it's limited purely to gameplay moments where you're not forced to listen to someone deliver a 10 minute monologue in a way that no actual human being talks.

      So yeah, thanks for not becoming a movie director, Kojima. Your script writing's terrible but your gameplay ideas are great. I'd suggest you hire an editorial team but you probably already have and ignore them.

      16 votes
    3. 2026 Oscar predictions

      Picture: One Battle After Another

      This became a tight race with Sinners late in the game. The obvious parallel here is 1917 and Parasite. 1917 won PGA, DGA, and BAFTA just like OBAA did. Parasite won SAG Ensemble and went on to win Picture from there.

      I’m betting on OBAA being stronger than 1917. Notably, 1917 was never in contention to win a Screenplay or an Acting award the way OBAA is. OBAA is also nominated for Film Editing, a category 1917 missed, and is the front-runner there.

      I have correctly predicted every Picture winner since The Shape of Water but this is the first year where I’m in danger of losing that streak.

      Director: Paul Thomas Anderson - One Battle After Another

      Whether or not Sinners ends up winning Picture, I think this goes to PTA.

      Original Screenplay: Sinners

      Adapted Screenplay: One Battle After Another

      Lead Actor: Michael B. Jordan - Sinners

      After Chalamet shot himself in the foot with his unorthodox press run for Marty Supreme, Jordan is the only one in the category going in with an industry award as the BAFTA awarded a British actor.

      I think Chalamet really fucked up his chances to win for a while and will now not win until well into his 40s or even his 50s. He should have played it better.

      Lead Actress: Jessie Buckley - Hamnet

      Supporting Actress: Amy Madigan - Weapons

      After winning SAG it seems like it’s heading that way. Mosaku won the BAFTA but she had a homefield advantage. This is also an opportunity to award a veteran actress who never got her due, and will be supported by young people who enjoyed Weapons.

      Supporting Actor: Sean Penn - One Battle After Another

      Despite not campaigning and not appearing at the majority of the award shows (or perhaps because of that) Penn ended up winning both the SAG and BAFTA. Essentially sleepwalking to his third Oscar.

      Original Score: Sinners

      Original Song: Golden from KPop Demon Hunters

      Sound: F1

      Casting: Sinners

      Production Design: Frankenstein

      Cinematography: One Battle After Another

      Sinners was originally the front-runner, and its cinematographer would have made history as the first woman to win. However, it lost both the ASC and BAFTA awards for Cinematography.

      Makeup and Hairstyling: Frankenstein

      Costume Design: Frankenstein

      Film Editing: One Battle After Another

      Sinners won the ACE Drama award and OBAA won the ACE Comedy award. This category used to be correlated with Sound, but since the sound categories were merged it has correlated with Picture.

      VFX: Avatar: Fire and Ash

      Animated Feature: KPop Demon Hunters

      Documentary Feature: The Perfect Neighbor

      International Feature: Sentimental Value

      12 votes
    4. Save Point: A game deal roundup for the week of March 22

      Add awesome game deals to this topic as they come up over the course of the week!

      Alternately, ask about a given game deal if you want the community’s opinions: e.g. “What games from this bundle are most worth my attention?”

      Rules:

      • No grey market sales
      • No affiliate links

      If posting a sale, it is strongly encouraged that you share why you think the available game/games are worthwhile.


      All previous Save Point topics

      If you don’t want to see threads in this series, add save point to your personal tag filters.

      1 vote
    5. Job hunting absolutely sucks right now

      Feeling pretty discouraged after taking yet another spin around the tech interview circuit for naught.
      I was feeling pretty good this time around, as I've interviewed with this company before and was runner-up for a previous role. The hiring manager contacted me for this new one, and again I aced it until the final stage, where I got punted for the ever-nebulous "culture fit" reasoning. My mood isn't helped by the constant AI doom clouds hovering overhead that make me wonder if I need to make bigger career changes.

      How's everyone else faring out there?

      71 votes
    6. Game testers wanted for science fiction game

      I have a bare-bones prototype of a game made in Twine, and I will be honest: it needs a lot of work.

      The story and main architecture of the game are already planned and I am happy with them. It is the story hooks and pathing that I am looking to improve, and for that I would like to give out an early alpha build for volunteers to critique, reporting any dead ends and errors as well as the story beats they find engaging.

      Please feel free to send a message if you would like to participate. Thank you for your time.

      Edit: Thank you for your interest in the game. The final build should be ready for volunteers in one week, and I will send links to you directly at that time. Thank you again for your interest; this is much better than I hoped for.

      41 votes
    7. Fitness Weekly Discussion

      What have you been doing lately for your own fitness? Try out any new programs or exercises? Have any questions for others about your training? Want to vent about poor behavior in the gym? Started a new diet or have a new recipe you want to share? Anything else health and wellness related?

      8 votes
    8. Tildes Minecraft Weekly

      Server host: tildes.nore.gg (Running Java 1.21.11)
      Verification site: https://tildes.nore.gg
      BlueMap: https://tildes.nore.gg/map/
      Patreon: https://www.patreon.com/TildesMC

      Plugins and Data Packs

      Data Packs:

      • Terralith - Overworld terrain upgrade
      • Nullscape - End terrain upgrade
      • Age Lock [Vanilla Tweaks]
      • Armor Statues [Vanilla Tweaks]
      • Bat Membranes [Vanilla Tweaks]
      • Cauldron Concrete [Vanilla Tweaks]
      • Cauldron Mud [Vanilla Tweaks]
      • Custom Nether Portals [Vanilla Tweaks]
      • Husks Drop Sand [Vanilla Tweaks]
      • Mini Blocks [Vanilla Tweaks]
      • More Mob Heads [Vanilla Tweaks]
      • Player Head Drops [Vanilla Tweaks]
      • Silence Mobs [Vanilla Tweaks]
      • Wandering Trades [Vanilla Tweaks]

      Plugins:

      • BlueMap - Provides a live 3D rendering of the game world
      • Clickable Links - Makes http URLs in chat clickable (only for registered players)
      • CoreProtect - Records all block/container/mob changes (Anyone can look up changes with /co inspect)
      • DebugStick - Gives the ability to craft debug sticks in survival
      • DistantHorizons - Provides distant LOD map data to players running the client mod
      • EasyArmorStands - GUI for editing armor stands
      • Hexnicks - Enables Tildes usernames to be displayed
      • hsrails - Allows for 4x speed rail travel
      • LuckPerms - Locks down unregistered users
      • Otherside - Fix for mob farms involving Nether portals
      • Rapid Leaf Decay - Increases the speed of leaf decay by 10x
      • WorldEdit - Used for occasional admin stuff
      • WorldGuard - Prevents unregistered users from changing anything in the world

      The server operates on a soft whitelist. Anyone can log in and walk around, but you need a Tildes account to gain build access.


      We recommend you install our mod web-chat so that you can chat while in your web browser. It turns the server into an old-school chat room.

      <- Previous Thread

      11 votes
    9. What programming/technical projects have you been working on?

      This is a recurring post to discuss programming or other technical projects that we've been working on. Tell us about one of your recent projects, either at work or personal projects. What's interesting about it? Are you having trouble with anything?

      17 votes
    10. Tildes Minecraft Weekly

      Server host: tildes.nore.gg (Running Java 1.21.11)
      Verification site: https://tildes.nore.gg
      BlueMap: https://tildes.nore.gg/map/
      Patreon: https://www.patreon.com/TildesMC

      Plugins and Data Packs

      Data Packs:

      • Terralith - Overworld terrain upgrade
      • Nullscape - End terrain upgrade
      • Age Lock [Vanilla Tweaks]
      • Armor Statues [Vanilla Tweaks]
      • Bat Membranes [Vanilla Tweaks]
      • Cauldron Concrete [Vanilla Tweaks]
      • Cauldron Mud [Vanilla Tweaks]
      • Custom Nether Portals [Vanilla Tweaks]
      • Husks Drop Sand [Vanilla Tweaks]
      • Mini Blocks [Vanilla Tweaks]
      • More Mob Heads [Vanilla Tweaks]
      • Player Head Drops [Vanilla Tweaks]
      • Silence Mobs [Vanilla Tweaks]
      • Wandering Trades [Vanilla Tweaks]

      Plugins:

      • BlueMap - Provides a live 3D rendering of the game world
      • Clickable Links - Makes http URLs in chat clickable (only for registered players)
      • CoreProtect - Records all block/container/mob changes (Anyone can look up changes with /co inspect)
      • DebugStick - Gives the ability to craft debug sticks in survival
      • DistantHorizons - Provides distant LOD map data to players running the client mod
      • EasyArmorStands - GUI for editing armor stands
      • Hexnicks - Enables Tildes usernames to be displayed
      • hsrails - Allows for 4x speed rail travel
      • LuckPerms - Locks down unregistered users
      • Otherside - Fix for mob farms involving Nether portals
      • Rapid Leaf Decay - Increases the speed of leaf decay by 10x
      • WorldEdit - Used for occasional admin stuff
      • WorldGuard - Prevents unregistered users from changing anything in the world

      The server operates on a soft whitelist. Anyone can log in and walk around, but you need a Tildes account to gain build access.


      We recommend you install our mod web-chat so that you can chat while in your web browser. It turns the server into an old-school chat room.

      <- Previous Thread - Next Thread ->

      18 votes
    11. Midweek Movie Free Talk

      Warning: this post may contain spoilers

      Have you watched any movies recently you want to discuss? Any films you want to recommend or are hyped about? Feel free to discuss anything here.

      Please just try to provide fair warning of spoilers if you can.

      9 votes
    12. When you were first getting your driver's license, what were you afraid of?

      Did you fear adverse weather conditions like icy roads? How did you handle them?
      Did you initially consider yourself too incompetent to drive?
      Were you afraid of breaking traffic laws?
      Were you afraid of getting into an accident?
      Did you face any of these situations and how did you handle them?
      Are you still afraid of driving?

      20 votes