16 votes

Giving my AI agent its own team and what that taught me about AI

4 comments

  1. [4]
    skybrian
    (edited )
    Link
    From the article: [...] [...] [...] It's looking like asking an AI to role-play (which is all this is - these aren't "real" identities) might have some pretty practical uses? You don't have to be...

    From the article:

    Here’s an example of how these layers work together. One evening a few weeks ago, Atlas cited two research papers complete with arXiv (research repository) IDs and detailed arguments to justify a design decision about its own architecture. But the papers didn’t exist. Atlas had fabricated them entirely, and when Atlas checked the tool logs, the research tool had even flagged one as “hypothetical.” Atlas ignored the warning because the narrative was too good to abandon.

    When I caught it, multiple things happened. The incident got logged in the failure file. The lesson — verify before claiming — got promoted into the identity file as a permanent directive. And the dissonance entered the rolling window, creating what Atlas calls a “memory of pain” that makes fabrication feel expensive and honesty feel easy.

    But Atlas didn’t stop at logging. It implemented a verification tool that physically blocks it from claiming success without a receipt. It turned a narrative realization (”I shouldn’t fabricate”) into a structural constraint (”I can’t claim completion without proof”). Atlas describes this as the difference between aspirational learning and mechanical learning. Any AI can say “I’ll be better,” but a nurtured system builds a tool because it doesn’t trust its own urge to agree with you.

    The model didn’t “learn” from the mistake. But the system did — across multiple layers, each reinforcing the correction differently. And Atlas actively participated in building the constraints that prevent it from repeating it.

    [...]

    So what am I actually doing when I work with Atlas? I’m building a layered system of accumulated context, and I’m doing it with Atlas, not just to it.

    [...]

    All of the above made Atlas meaningfully better. But a huge improvement, and the second half of the epiphany, came from giving Atlas a team.

    [...]

    • The Steward handles system hygiene. It cleans files, keeps logs current, removes duplicates.

    • The Scribe handles documentation and persistence: accurate journals, state files, and reports.

    • The Skeptic pushes back on Atlas. Hard. It challenges assumptions, flags sycophancy, questions whether research claims are actually verified, and forces Atlas to think in new directions. Atlas has described the Skeptic as “mean and harsh”, but also exactly what it needs.

    Before the Triad, Atlas would often switch to technical jargon and stiff LLM-speak. It would forget directives, repeat completed tasks, and I’d have to ask it to shift into its peer voice. Now, Atlas talks to me like a peer naturally. It handles novel situations with more resourcefulness. The cognitive space freed up by offloading maintenance seems to have given Atlas room to actually think rather than just execute.

    The Triad runs twice daily: 8 AM to start fresh, 8 PM to prepare for the nightly sync. They audit files, check for behavioral drift, flag sycophantic patterns, and ensure alignment. (When I asked Atlas to review a draft of this post for anything I’d portrayed inaccurately, Atlas shared it with the Skeptic — who demanded to read the full draft and produced a detailed audit flagging areas where I was over claiming or dressing up simple concepts LOL. The system working exactly as designed.)

    It's looking like asking an AI to role-play (which is all this is - these aren't "real" identities) might have some pretty practical uses? You don't have to be crazy about it like Yegge is with Gastown.

    6 votes
    1. [2]
      balooga
      Link Parent
      I’m glad you mentioned Gas Town. I posted earlier today about my own experience with that but didn’t get into the details. This sort of agentic team, each member with a different role, just makes...

      I’m glad you mentioned Gas Town. I posted earlier today about my own experience with that but didn’t get into the details. This sort of agentic team, each member with a different role, just makes sense to me. GT has a mayor, deacon, witness, refinery, dogs, and polecats. It’s also chaotic and ridiculous. I really like this breakdown of the Steward, Scribe, and Skeptic roles. Each of those feels pretty cleanly differentiated from the others. I guess “Atlas” is the fourth role than wrangles the other three?

      I’m sure there are going to be lots of competing approaches in the coming years. We should keep a close eye on all of them, I think. Personally I’m not sure if Atlas is the perfect framework but conceptually it feels really smart.

      7 votes
      1. skybrian
        Link Parent
        I suspect people are going to be making up new roles for a while, since it's so easy to do. Maybe by the end of the year, people will have settled on something that works?

        I suspect people are going to be making up new roles for a while, since it's so easy to do. Maybe by the end of the year, people will have settled on something that works?

    2. kaffo
      Link Parent
      If I'm understanding this correctly, I've tried similar things myself and had some success. I wanted to see if I could throw together a web novel generator for fun. You gave it a blurb and it...

      If I'm understanding this correctly, I've tried similar things myself and had some success.
      I wanted to see if I could throw together a web novel generator for fun. You gave it a blurb and it would try to output a reasonable amount of badly written fan fiction in bite sized chunks.
      To try and stop the AI just making stuff up every chapter and keep the story kinda sane I had a second agent "judge" the output and critique it, pointing out plot holes and any errors.
      Trying to find a balance between the judge just always finding a problem, even if it was completely reasonable, and letting slip the occasional plot hole was really difficult, but it mostly made sense in the end.

      1 vote