post_below's recent activity

  1. Comment on The founder of Craigslist has given away half a billion dollars. He fears for an America where generosity is trolled. in ~tech

    post_below
    Link Parent

    'The Farm Poor People for Longevity Blood Infusions for the Rich' Foundation just doesn't have mass appeal.

    5 votes
  2. Comment on Who was the first transgender person? in ~lgbt

    post_below
    Link Parent

    Thanks for posting, that was really interesting.

    I'd never heard about this

    In the fifth century B.C.E., two Greek authors – Herodotus, known as the father of history, and Hippocrates, the father of medicine – wrote about people they call Anarieis from Scythia, a vast ancient territory to the north and west of the Black Sea that today would be part of Ukraine and Russia. Their descriptions of the Anarieis’ gender are similar to the way many people describe trans women today. Their accounts are supported by what we know about Scythia and Anarieis from anthropologists and archaeologists today.

    Which led me to this. TLDR: Transgender prophet stoner priestesses. Which is almost definitely going to be the coolest thing I learn about today.

    7 votes
  3. Comment on The founder of Craigslist has given away half a billion dollars. He fears for an America where generosity is trolled. in ~tech

    post_below
    (edited)
    Link

    Cheers to Craig, and all the billionaires still signed on to The Giving Pledge.

    I don't know a lot about the guy aside from his philanthropy and his refusal to sell out craigslist over the years, but it's always good to hear about wealthy people who want to do something besides hoard and build space penises.

    The article has some interesting details about his un-lavish life (still takes public transportation) and ideals.

    22 votes
  4. Comment on What programming/technical projects have you been working on? in ~comp

    post_below
    Link Parent

    I can't deny the appeal of hard problems! I'll be curious about your solutions

  5. Comment on What programming/technical projects have you been working on? in ~comp

    post_below
    Link

    Like @edoceo, I've recently been working on a way to make bot protection suck less. My solution isn't particularly exciting: run extensive signal and behavior scoring to determine whether a request needs to get a CAPTCHA gate.

    The goal is for humans to never see a challenge, while challenging the majority of automated traffic reliably.

    So far it's working great, after a lot of iteration over time, but I haven't yet run it on a high traffic site.
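
    A rough sketch of what signal scoring in front of a CAPTCHA gate could look like. Every signal name, weight, and the threshold here is hypothetical and chosen only for illustration; it is not the actual implementation described above, which presumably uses far more signals.

```python
from dataclasses import dataclass


@dataclass
class RequestSignals:
    """Hypothetical per-request signals; a real system would track many more."""
    has_cookies: bool
    headless_user_agent: bool
    requests_last_minute: int
    pointer_events_seen: bool


# Assumed cutoff: scores at or above this get a challenge.
CHALLENGE_THRESHOLD = 0.6


def suspicion_score(s: RequestSignals) -> float:
    """Add up illustrative weights for each suspicious signal, capped at 1.0."""
    score = 0.0
    if not s.has_cookies:
        score += 0.2
    if s.headless_user_agent:
        score += 0.5
    if s.requests_last_minute > 60:
        score += 0.3
    if not s.pointer_events_seen:
        score += 0.2
    return min(score, 1.0)


def needs_captcha(s: RequestSignals) -> bool:
    """Only requests that look automated are sent to the CAPTCHA gate."""
    return suspicion_score(s) >= CHALLENGE_THRESHOLD


human = RequestSignals(True, False, 5, True)
bot = RequestSignals(False, True, 200, False)
print(needs_captcha(human), needs_captcha(bot))
```

    The point of the design is that an ordinary visitor scores low and never sees a challenge, while a request that trips several signals at once does.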

    3 votes
  6. Comment on What programming/technical projects have you been working on? in ~comp

    post_below
    Link Parent

    That's an ambitious project, not so much the tech, but trying to mix human scale and global network scale.

    If the goal is to replace CAPTCHA, wouldn't it need to fail requests by default? You could only pass requests where attestation already existed. And wouldn't that mean that it would always auto-fail some percentage of potentially valid requests even if you managed to drive widespread adoption?

    In addition, wouldn't you need a central attestation store because you couldn't rely on users being connected to the network at a given time? And in that case wouldn't it only work if there was already an attestation record for that request? Would it be IP based? Fingerprinting?

    Or am I misunderstanding and this is meant to replace CAPTCHA on just one particular site? In that case though it sounds like invite-only would be a lot easier and more reliable.

    Hopefully I'm missing something because better CAPTCHA is something we all want.

    1 vote
  7. Comment on Does generative AI have a natural limit without a major innovation? in ~comp

    post_below
    Link Parent

    This is entirely unsolicited, feel free to ignore it, and I don't intend to be condescending in any way.

    "But also I'm tired of the world completely changing around me every 15 years."

    Unless you're close to retirement (and maybe even if you are), be wary of this tendency! There isn't much that ages you faster than the disconnection that comes from feeling like the world has left you behind and doesn't make sense anymore.

    And the speed with which "everything changes" is likely to keep increasing provided some catastrophe doesn't send us back to the dark ages.

    2 votes
  8. Comment on Does generative AI have a natural limit without a major innovation? in ~comp

    post_below
    Link

    To clarify the vocab: Gen AI = LLM powered agents = LLM fine tuned for reasoning and tool use running in a harness that provides tools and other functionality.

    Boiling it down there are two steps:

    • Pre training. The giant dataset, tokenizing it (converting it into numbers) and generating embeddings (mathematical relationships between the tokens). This step is constrained by the available data like you said.
    • Post training (or fine tuning). This step turns the LLM, which can't really do anything except output plausible text in response to input, into a tool that can do useful work. It's where it learns to be an assistant, to use tools, do multi-step reasoning, write code that mostly works, develop an em-dash kink, etc.

    The above compresses a bunch of important sub steps for brevity.
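
    To make "converting it into numbers" concrete, here is a toy sketch of the tokenizing part of pre-training. It splits on whitespace purely for simplicity, which is an assumption for illustration; real LLM tokenizers typically use subword schemes, and the embedding step isn't shown. The function names are made up.

```python
def build_vocab(corpus: list[str]) -> dict[str, int]:
    """Assign each distinct lowercase word an integer ID in order of first appearance."""
    vocab: dict[str, int] = {}
    for text in corpus:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Turn text into a list of token IDs, skipping words the vocab has never seen."""
    return [vocab[word] for word in text.lower().split() if word in vocab]


vocab = build_vocab(["the cat sat", "the dog sat"])
print(vocab)
print(encode("the dog sat", vocab))
```

    Everything downstream, embeddings included, operates on these integer IDs rather than on the text itself.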

    Innovation can happen in various parts of both steps, so there's still a lot of room for improvement. There are undoubtedly better ways to do everything involved; much of it has been replaced with better methods multiple times already.

    Model size is likely to become a limiting factor, both because of the limit of what exists in terms of training data and because bigger models are more computationally expensive to train and to run. But that's assuming better ways of getting, vetting and tagging pre-training data aren't discovered. I'd assume that, yes, eventually there will be a ceiling. In terms of compute, the tech is going to keep getting more efficient and the hardware will keep getting better so likely any limits imposed by compute will be temporary.

    Will recursive self improvement hit an event horizon where LLMs will start improving themselves so fast they start rocketing towards AGI? Probably not with the current state of the art. When models generate their own training data they end up entrenching and exaggerating their flaws, and there are a lot of flaws. Some amount of artificial training data is fine (especially if it comes from a better model), but 100% artificial training isn't viable at this point.

    Even if LLMs were to achieve the ability to recursively self improve without ensloppifying themselves, there's no room in the math for the kind of awareness or understanding we'd associate with AGI. The models don't have a conceptual understanding of reality, they only appear to. They would need to invent new technology to get there, not just iterate on existing LLM tech.

    However, will LLM tools contribute to whatever sort of AGI is someday created? It's hard to imagine they won't.

    I can imagine a future world model with pre-training on a much wider dataset that strives to tokenize reality, as opposed to just language and other creative output, having a more realistic path to AGI. Especially if it was fine tuned with some sort of feedback mechanism that could approximate real world cause and effect. Maybe you'd need sensory feedback. But that's speculating on technology that doesn't exist yet. Right now world models are mostly focused on improving robotics. As far as I know, no one has tried to make a super-sized general world model. It would take the resources of one of the frontier labs to attempt it.

    My perspective is that AGI is still roughly comparable to stable fusion power. There's no reason to believe it can't be done, but it will most likely be "just around the corner" for years and years.

    8 votes
  9. Comment on Any fellow software engineers using paid GitHub copilot? in ~comp

    post_below
    Link Parent

    You're right, I shouldn't have included enterprise... The team plans offer subscription billing (as opposed to API prices)

    1 vote
  10. Comment on Any fellow software engineers using paid GitHub copilot? in ~comp

    post_below
    Link Parent

    Meaning that Opus is dramatically better than Deepseek for complex coding tasks, but if you include cost in the calculation, Deepseek looks a lot better.

    3 votes
  11. Comment on Any fellow software engineers using paid GitHub copilot? in ~comp

    post_below
    Link Parent

    Is that 20€ in API credits or a 20€ subscription? The latter is quite a bit more usage and the limits reset in 5 hour windows. Both options are available in enterprise and team setups.

    An alternative you might suggest is OpenAI at 20/month subscriptions. For Claude you need 100/month subscriptions for serious usage but with GPT 5.4 you can get a lot farther on 20/month.

    But no matter how much they pay, artisanal code is only mostly dead. Miracle Max's pill will come in the form of everyone realizing that having no one left that can actually code isn't working out as planned!

    1 vote
  12. Comment on Any fellow software engineers using paid GitHub copilot? in ~comp

    post_below
    Link Parent

    Putting aside the benchmarks, since they vary widely depending on which ones you look at, Deepseek is most definitely not on par with Opus 4.6.

    Unless you factor cost in, then Deepseek is lightyears ahead of Opus.

  13. Comment on Claude Fable 5 and Claude Mythos 5 in ~tech

    post_below
    Link Parent

    I was thinking in terms of something homegrown. Anyone could of course use an open model as a starting point, but they'd then be reliant on that provider since they wouldn't have their own training pipeline. For a government or academic solution they'd ideally start from scratch.

    But yeah, the Chinese open weights models are pretty good and it's great that they exist for all sorts of reasons.

    3 votes
  14. Comment on Claude Fable 5 and Claude Mythos 5 in ~tech

    post_below
    Link Parent

    A public LLM of some kind would be amazing, not just for science, for the whole range of applications.

    Don't underestimate the size of the task though. It would need nation state level funding, and success would hinge on convincing the right experts to get on board.

    The EU could maybe pull it off; it would make a lot of sense for them given their recent distaste for US tech.

    I imagine various people in government and academia have at least talked about it by now.

    3 votes
  15. Comment on What do you think is the best sandwich? in ~food

    post_below
    Link Parent

    I've had a surprising number of conversations, at various times in my life, about cooking turkeys outside of the holidays. Normally they happen around Thanksgiving. Sometimes they result in plans, sometimes they even acknowledge the historically low ratio of plans to non-holiday turkeys. Rarely do they result in actual turkey.

    Mostly because I'm ok with just annual turkey sandwiches.

  16. Comment on What do you think is the best sandwich? in ~food

    post_below
    Link

    Leftover Thanksgiving turkey sandwich. Because it's the only time shredded turkey is available. It wouldn't be as good otherwise.

    If you put cranberry sauce on it, I accept that, but take it somewhere else so I don't have to watch.

    3 votes
  17. Comment on Claude Fable 5 and Claude Mythos 5 in ~tech

    post_below
    Link Parent

    Now that Mythos is public in the security-hobbled form of Fable, we don't have to speculate. In my testing yesterday Fable found two legitimate vulnerabilities that previous models (and I) had missed. And that was in non-security-focused scans (because most security-related prompts currently get downgraded to Opus 4.8). In both cases they were subtle issues that were easy to miss.

    It's true that models like Opus 4.8 and GPT 5.5 can be wrangled to find a lot of security issues. In the hands of a decent engineer, with a good harness, you can use either of those models to find and patch or exploit all sorts of vulnerabilities. It's an iterative process though. According to Anthropic, the reason for the controlled release was because Mythos is better at chaining vulnerabilities into working exploits on its own. It would allow anyone, including non-engineers, to find and exploit holes in widely used software. Glasswing gave those companies a chance to patch many of the holes in advance.

    I don't have access to unrestricted Mythos, just Fable, so I can't test the full extent of the capabilities Anthropic is claiming. But seeing Fable's capabilities in other areas of coding I have no doubt they're telling the truth. It's significantly better at putting pieces together into a working thesis and then following it through to a given conclusion, which would definitely generalize into security research.

    That said, I don't think Project Glasswing was wholly altruistic. When they released Mythos they didn't have anywhere near enough available compute to handle a wide release. So while the safety angle was legitimate, it also served their purposes to do a limited release while they scrambled to find the compute for a full scale release. And yeah, the hype didn't hurt either. But the aforementioned hot takes that it was all smoke and mirrors are now demonstrably false.

    13 votes
  18. Comment on Claude Fable 5 and Claude Mythos 5 in ~tech

    post_below
    Link Parent

    Yes, thanks for adding that. Currently the moat is velocity: none of the cheap/open alternatives have been able to get close enough to the frontier to tempt the majority of users. And velocity is expensive... but it can work as a moat as long as there's plenty of frontier available. If they hit a plateau everyone will probably catch up.

    I've been wondering if Anthropic started their IPO process when they did in order to go public when they had a clear lead. Opus 4.8 put them pretty far ahead, Fable/Mythos just makes it undeniable. For now at least.

    12 votes
  19. Comment on Claude Fable 5 and Claude Mythos 5 in ~tech

    post_below
    Link Parent

    snort that's great. It's relevant too: in my experience so far, Fable (I'm having a hard time getting used to that name for some reason) is significantly better at extrapolating intent with less prompting and then making mostly reasonable calls on how to proceed without handholding.

    It looks like there was a heavy focus on long running autonomous tasks during fine tuning. I have mixed feelings about this.

    7 votes