Greg's recent activity

  1. Comment on GPT-5 has come a long way in mathematics in ~tech

    This is an interesting one, actually! I’m not fully up to date on it (did a bit of work with a team who were competing for the AIMO prize, but that was last year, so basically a lifetime ago in this field), but the balance between tool calling and LLM parsing actually skews (or did skew, at least) pretty heavily towards the latter for mathematical problems.

    Most successful approaches absolutely did use code generation and a Python interpreter for the actual arithmetic - but basic arithmetic, and even not-too-complicated symbolic algebra, are pretty much solved problems for computers. For a problem to be challenging at all to an LLM with tool-calling abilities, it inherently has to test the model’s capacity to parse, “understand”, and reformulate the conceptual problem in order to use those tools effectively.
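
    (To make that concrete, here’s a minimal sketch of the pattern - the example problem and function names are just mine for illustration, not any particular team’s pipeline. The solver call is the trivial part; everything interesting is in producing the formalisation:)

        from sympy import symbols, Eq, solve

        # The "tool" side: trivial for a computer once the problem is formalised.
        def run_solver_tool(equation, variable):
            return solve(equation, variable)

        # The hard part - the bit the LLM is actually being tested on - is turning
        # "the sum of three consecutive integers is 96" into this formalisation:
        n = symbols("n", integer=True)
        formalised = Eq(n + (n + 1) + (n + 2), 96)

        print(run_solver_tool(formalised, n))  # [31] -> the integers are 31, 32, 33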

    It’s similar to allowing calculators in an exam: we could spend time doing long division on paper, and similarly we could spend time training LLMs to do accurate arithmetic and symbolic manipulation internally, but for most real-world tests it’s fair to assume that tools will be available to assist. The questions are then formulated to test understanding rather than mechanical skills: did you, or did the LLM, select the right numbers to put into the calculator (if there are numbers in the answer at all)? The only way to get that right is to interpret the question correctly, which puts the onus on the human or the LLM rather than on the calculator or Python runtime.

    One of the unexpectedly tricky bits is actually getting decent questions to benchmark with, though! LLMs generally have an extremely good ability to recall things from their training data, and it’s natural to train a mathematically-focused LLM on any question-answer pairs you can find - but that means if you’re testing at the high school or early university difficulty level, you’ll need to write and test with completely new questions that have never been published on the internet if you want a baseline of how well the model can actually generalise concepts. If you don’t do that, you’re likely to end up testing recall more than generalisation - which is worthwhile in itself, as long as you’re aware that’s what you’re doing, but will fall off a cliff once the model hits something it doesn’t have a close example for encoded in the training data.
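
    (For a rough idea of what guarding against that looks like, here’s a toy contamination check - an n-gram overlap heuristic I’ve made up for illustration; real decontamination pipelines are considerably more sophisticated:)

        import re

        def ngrams(text, n=8):
            """Normalise text and return its set of word n-grams."""
            words = re.findall(r"[a-z0-9]+", text.lower())
            return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

        def likely_seen_in_training(question, corpus_docs, threshold=0.5):
            """Flag a benchmark question if a large fraction of its 8-grams
            appears verbatim in any training document."""
            q = ngrams(question)
            return bool(q) and any(
                len(q & ngrams(doc)) / len(q) >= threshold for doc in corpus_docs
            )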

    2 votes
  2. Comment on Google must double AI serving capacity every six months to meet demand in ~tech

    Ah, I see where you’re coming from - I still think of NVIDIA hardware as all being GPUs of one type or another (or NICs, but that’s a whole other thing), because they’re still a bit more general-purpose than pure coprocessor boards, but it’s fair to say that the B200/B300 are a lot heavier on the tensor cores than anything else. Blackwell in general is the architecture, so the 50-series gaming cards are included there too, but in fairness even those are a bit more tensor-focused than previous gaming cards - hence the somewhat unpopular push for DLSS as a more important part of the graphics pipeline, because DLSS benefits more from what NVIDIA are really working on now.

    What I will say is you might be surprised how much power you need to dump even into inference-focused cards. A full rack of L40s draws somewhere around 30kW - a good bit less than some of the full-on training setups I’ve seen marketed, but that one rack still uses more power in a year than your house is likely to use in a decade.
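
    (Back-of-envelope, assuming a household at ~10,000kWh/year - that figure varies a lot by country, so treat it as illustrative:)

        # Rack vs house, in kWh. The 10,000 kWh/yr household is an assumption.
        rack_kw = 30
        rack_kwh_per_year = rack_kw * 24 * 365           # 262,800 kWh

        house_kwh_per_decade = 10_000 * 10               # 100,000 kWh

        print(rack_kwh_per_year / house_kwh_per_decade)  # ~2.6 decades of house usage per rack-year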

    3 votes
  3. Comment on Google must double AI serving capacity every six months to meet demand in ~tech

    Any LLM you can run locally is making massive compromises to allow you to do so - the fact they still work after dropping 90% of the parameters and quantising the rest by a factor of four is actually super impressive to me, but if you want to see the compute it takes to run a full-fat model, just try running flagship DeepSeek unquantised in bf16. And DeepSeek’s market-shaking breakthrough was its efficiency, too; the broad understanding, as far as I’m aware, is that the major closed-source models are even more resource-intensive in the interest of getting better results.
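
    (The weights-only arithmetic makes the gap concrete - the ~671B parameter count is DeepSeek V3’s published figure, and the 10%/4-bit local variant is a hypothetical matching the compromises above:)

        params = 671e9                        # DeepSeek V3 total parameters

        full_bf16_gb = params * 2 / 1e9       # bf16 = 2 bytes/param -> ~1342 GB of weights alone
        local_gb = params * 0.10 * 0.5 / 1e9  # keep 10%, quantise to 4 bits -> ~34 GB

        print(f"{full_bf16_gb:.0f} GB vs {local_gb:.0f} GB")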

    As I understand it, more or less everyone is still running NVIDIA chips, too? I know Google’s been pushing the TPUs for a long while, and we’re getting general tensor coprocessors in more hardware as time goes on, but there’s a reason NVIDIA’s datacenter division is basically driving the entire S&P 500 nowadays.

    4 votes
  4. Comment on Google must double AI serving capacity every six months to meet demand in ~tech

    Overall I agree with where you’re coming from, but I was surprised when I saw some OpenAI cost estimates recently: they’re scaling the user base to the extent that training spend is only a bit more than double the inference spend. So still a good amount more, for sure, but less of a gap than I expected.

    10 votes
  5. Comment on Google must double AI serving capacity every six months to meet demand in ~tech

    How much of this is user demand, I wonder, and how much is “we decided on your behalf to put an LLM summary at the top of every search”?

    66 votes
  6. Comment on US President Donald Trump signs order to remove tariffs from Brazilian beef, coffee, other food items in ~society

    That’s all fair, and to me at least, the “if demand truly outstrips supply” part pretty much inherently prevents it from being real greedflation. For what it’s worth, I’m still in favour of market systems as a distribution mechanism in general, I just don’t like seeing them treated as inherently positive, inherently fair, or inherently natural: they’re a dangerous tool, to be managed carefully and with respect.

    Balancing an actual demand increase with genuinely limited supply is a great example of where a price increase works well (mostly… the ethics get trickier if we’re talking about things like basic sustenance or medical care, or if that demand increase is going to wealthy hoarders as a luxury at the expense of people in genuine need). The “greed” part comes in if the supply is being artificially limited by a cartel, for example, or if the price is being kept high by monopolistic practices. Or, as in our starting example, where external circumstances (e.g. a genuine increase in costs) allow for a price increase that then sticks after the costs drop back down, because it’s “re-trained” the customer to expect a new price point.

    I’d prefer to see cost-plus pricing as the default starting point that regulation and incentives funnel businesses towards where possible. I think that’s how a lot of people already envisage the economy - they have an ingrained idea of business in terms of covering costs and then adding a reasonable margin for profit, rather than maximising profits regardless of any other concerns. And I’d prefer to see “what the market will bear” used as a genuine mechanism to balance supply and demand where necessary, with careful guardrails to handle situations like the one you described, rather than used to mean “companies are better off buying up competitors, undermining scientists, corrupting democracy, and spreading misinformation to customers, because all of those have a better ROI than improving the product”.

    4 votes
  7. Comment on US President Donald Trump signs order to remove tariffs from Brazilian beef, coffee, other food items in ~society

    I mean, the economic systems in the Western world are normally pitched as a balancing of interests between market participants, managed and guided by regulation.

    I think you’re closer to the truth on how it actually works; I’d debate whether that’s how it’s intended to work, in the sense of “matches what most people expected, based on what was publicly promised by the people making the decisions to manage the economy” - and I’m not even talking about the totally fictional economic policy we’re seeing now, I’m talking about the broad “truths” perpetuated over the last 50 years or so.

    Honestly I don’t even think “greedflation” is necessarily a bad term just because profit maximisation is expected behaviour for companies. If you’re operating on the assumption that they are pure profit maximisers with no other considerations, it’s probably helpful to remind the consumer that the relationship is purely an adversarial one, and that any justification the company gives other than greed shouldn’t be trusted.

    Or, y’know, we could question that assumption that “what the market will bear” is the only thing that should drive behaviour, and maybe try to construct incentives, regulations, and cultural pressures that balance it with the greater good.

    6 votes
  8. Comment on US President Donald Trump signs order to remove tariffs from Brazilian beef, coffee, other food items in ~society

    Which may not, in fact, be superior for all participants in all situations, hence the wide array of different philosophies and implementations various governments use to guide or manage the free market.

    6 votes
  9. Comment on By administratively redefining 'who is a foreigner,' the Lai government is turning the constitutional 'One China' framework into a dead letter in ~society

    This was an interesting read as someone who only has a surface level understanding of the political situation there.

    My first thought was that creating a legal paradox for Taiwanese residents born on the mainland is a shitty thing to do, but the underlying point of no longer pretending that Taiwan has any realistic chance of controlling the mainland seems sensible.

    The bit I didn’t think of - perhaps foolishly - is China being more pissed off about Taiwan declaring them legally separate than about Taiwan claiming ownership over the mainland. In my mind, it would be a good thing from China’s perspective for the Taiwanese government to stop treating Beijing as illegitimate, but in retrospect it’s obvious that moving to disagreement over what the country even is cuts more fundamentally than both sides agreeing on the borders while disagreeing on who should be running it.

    10 votes
  10. Comment on New ‘Stargate’ TV series ordered at Amazon from ‘Blindspot’ creator Martin Gero in ~tv

    That’s a hell of a question, actually! TNG at its best beats Stargate at its best, but if I were to pick five random episodes of each I might actually bet on Stargate winning overall…

    2 votes
  11. Comment on New ‘Stargate’ TV series ordered at Amazon from ‘Blindspot’ creator Martin Gero in ~tv

    The choice to officially announce it with a Zoom call between the creators and a couple of fan site admins is what gives me real hope for this one.

    There would’ve been nothing wrong with a produced 30-second teaser, as long as they got the tone right - but the janky, very human approach they took, with Joseph Mallozzi in the corner grinning like a kid at Christmas the whole time, just felt genuine in a way that so much current media doesn’t.

    3 votes
  12. Comment on A Cloudflare outage is taking down large parts of the internet - X, ChatGPT and more affected in ~tech

    Yeah that was kind of my meaning - in the “supply chain” for delivering content over the internet, you’ve basically got AWS handling the servers and Cloudflare handling the networking (although the lines are actually getting fuzzier as they both try to expand into new offerings and eat each other’s lunch a bit), and they each own an extremely significant share of their respective markets (outside China, which is important, but its own thing in a lot of ways).

    What I meant was that other players don’t come close to the dominance of those two - that’s why I mentioned a lot of sites using both in a non-redundant way. We’re a lot closer to eggs in one basket - or two baskets, tied together in a way that means you can’t get to the remaining eggs anyway if either breaks - than it’d look just from seeing how many companies operate in the space.

    5 votes
  13. Comment on A Cloudflare outage is taking down large parts of the internet - X, ChatGPT and more affected in ~tech

    I mostly agree with your broader point - I think the overall uptime is pretty damn good, and the impact of an outage is actually mostly the fault of the companies using these services without creating a fallback or failover on a second platform - but by market share we basically have two players: AWS and Cloudflare.

    And a lot of sites will be using both in the worst-case way, with different components from each, so a failure of either one takes the site down - rather than as redundancy, where only a simultaneous failure of both would be a problem.
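
    (With made-up but plausible availability numbers, the gap between the two arrangements is stark:)

        # Assume each provider independently hits 99.9% availability (illustrative).
        aws, cloudflare = 0.999, 0.999

        in_series = aws * cloudflare                  # site needs BOTH up
        redundant = 1 - (1 - aws) * (1 - cloudflare)  # site needs EITHER up

        hours = 24 * 365
        print(f"series:    {(1 - in_series) * hours:.1f} h downtime/yr")   # ~17.5 h
        print(f"redundant: {(1 - redundant) * hours:.4f} h downtime/yr")   # ~0.0088 h (~32 s)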

    8 votes
  14. Comment on AGI and Fermi's Paradox in ~science

    If you haven’t read Asimov’s short story The Last Question, I think you’d enjoy it! Also, to echo @tauon’s point a little, They’re Made Out of Meat comes to mind too.

    I think if we are ascribing human-ish motivations to the hypothetical AGI - because yeah, we don’t really have another frame of reference for sapience to work from - I’d question the assumption about desiring true immortality. Plenty of people are happy enough to close out their life’s work over the century, more or less, that we’re given. Plenty more desire another century, or maybe a millennium, but I haven’t seen a lot of people who’ve really thought about it in depth say they’d want 10,000 years, or 100,000.

    Maybe AGI sees timelines an order of magnitude or two longer than that, but a million years is still an unfathomably long time - more than enough for even an artificial life form to be thinking of that as its “natural” lifespan, as limited by things like radioactive decay, likelihood of planetary cataclysm, or the physical limits of data storage (all electrons used within a range reasonable for sublight communication, for example). And if I’m off by an order of magnitude, or perhaps even two, above and beyond that million-year baseline, we’re still well within the boundaries of a single planet or solar system’s “working lifespan”.

    I think it’s at least reasonable to entertain the possibility that an artificial life form could find contentment, enlightenment, purpose, nontrivial achievement, or similar without ever wanting or attempting to reach galactic-scale, near-eternal scope, and choose to see its own existence as bounded (looking at you, Mr Data). I also think that even for artificial life, “indefinite” is actually a very big concept, and I’m inclined to believe that physical limitations still kick in to give some expected boundaries, even if they’re much, much longer ones that could theoretically be overcome. As the boundaries of organic life theoretically could be, for that matter.

    4 votes
  15. Comment on How a flawed idea is teaching millions of kids to be poor readers in ~science

    This is fascinating - I can’t conceptualise coming across an unknown word in text and trying to parse it without being able to hear it in my mind.

    From everything you’ve said, I wonder if the greater effectiveness of phonics is partly because it makes it harder to hide an inability to read? If the teacher is focusing on phonics, the “bone doesn’t start with an M” conversation kind of has to happen - it can’t just be left to slide. Tie that in with the issues around class sizes, lack of resources, etc. etc. and I wouldn’t be surprised if “more effective” often translates to “makes mistakes easier to spot” rather than “makes learning easier” per se.

    17 votes
  16. Comment on Matching mouse dpi and acceleration across Mac and Linux? in ~tech

    This is a bit oblique, so feel free to ignore it if anyone has a more direct answer, but there's a decent amount of info about normalising HID drivers between OSes in these build videos for a desktop smooth scroll wheel (second, third). They focus on scroll behaviour rather than tracking, for obvious reasons, but it's also clear the creator has done the kind of near-obsessive deep dive that it takes to get these things right, so the libraries and drivers he's using might be helpful to you?

    Side note: this is what I love about open source! A profit motive will rarely get you better than "eh, good enough" - it takes someone who cares about the work they're doing to spend that much time getting it absolutely dead on, and I'm always happy to see more of that in the world.

    8 votes
  17. Comment on An AI-generated country song is topping a Billboard chart, and that should infuriate us all in ~music

    Nah, I see where you’re coming from, and I don’t entirely disagree. For what it’s worth I’m very much not saying commercial incentive diminishes genuine art, either - I don’t necessarily think that all commercial content rises to the level of being considered art at all, but a huge amount of it does on its own merits and the payment doesn’t negate that. Hell, sometimes commercial constraints genuinely act as part of the art - I don’t think Clerks would be the movie it is with a higher budget, for example.

    I’ve just also seen enough genuinely wonderful and moving work put out into the world from people who make their living elsewhere that I don’t really worry about that ever going away.

    I think I’ve kind of skipped ahead to the part where the economy is so laughably fucked that I don’t see any link between something being worth doing (in the sense that the world and the people need and want it), and that thing actually making a living - I don’t worry about it being taken away because I’m already operating in a world where that kind of sensible reward structure is dead. Either we fundamentally restructure the economy to recognise a century or so of exponentially increasing productivity per worker and an exponentially increasing number of workers on top of that, in which case the “making a living” part is moot, or we’re heading for a technofeudalist hellscape in which we’re all doomed, in which case the “making a living” part is moot.

    5 votes
  18. Comment on How has AI positively impacted your life? in ~tech

    Short answer: no, the fundamentals of how neural networks operate right down at the base level haven’t changed. But the ways we build them, the tools we have available, and the ways we run them have advanced a lot, and that does change what’s plausible to achieve - think of it a bit like the difference between what modern programmers can achieve and what you could do writing BASIC on a Commodore 64, even though both are Turing complete.

    The massive jumps I see have more often come from the first time someone figures out a viable way to use a transformer or convnet or SSM in a space that was previously using an entirely different approach. It’s often less about, say, a big change in the weather prediction field coming from newer neural net models displacing older ones (although that does happen to an extent nowadays too), and more a step change as our understanding matures to the point that someone writes the first viable weather prediction transformer model and it blows right past what previous analytical approaches were capable of. We’ve just hit a point in time where hardware speed, software stack, and state of knowledge are all converging to let us replace a whole lot of techniques in a whole lot of fields with fundamentally more efficient and capable ones.

    If you’re interested, I’d say Two Minute Papers is a great place to see some examples of how things are progressing! He tends to stack outputs from the last few years prior to whatever piece of research he’s talking about, so it’s really clear how things have changed, and he covers a fairly wide array of topics (although with a lean towards visual models and computer graphics, since that’s his field of research).

    8 votes
  19. Comment on An AI-generated country song is topping a Billboard chart, and that should infuriate us all in ~music

    I’m incredibly biased, I’ll be the first to admit that. But the technology wasn’t built for music - diffusion models were literally that: scientific models to study the process of diffusion. That’s the problem, and I think why the backlash bothers me so much: because the tech exists for a reason, a damn good one in my opinion, and the fact that it’s been relatively easy to port over to other purposes doesn’t undo that. It’s a problem of shitty incentives, and as far as I’m concerned only a problem of shitty incentives.
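
    (If you want to see how literal that is: the forward half of a DDPM-style diffusion model is nothing but data diffusing into Gaussian noise. This sketch is the textbook formulation, not any particular music model’s code:)

        import numpy as np

        def forward_diffuse(x0, t, betas, rng=np.random.default_rng(0)):
            """Sample q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I):
            the data gradually diffusing into noise. The generative model is
            then trained to run this process in reverse."""
            alpha_bar = np.prod(1.0 - betas[: t + 1])
            noise = rng.standard_normal(np.shape(x0))
            return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

        betas = np.linspace(1e-4, 0.02, 1000)  # the common DDPM noise schedule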

    I do also get frustrated about the whole “all training is stealing” argument; not just because consuming and analysing data is quite different to copying it wholesale, but because it’s a fundamental misunderstanding of what’s possible with the technology as well. I’d actually love to create a model that makes work truly and unequivocally its own, without guidance from existing human work, simply because I find that a fascinating artistic concept in and of itself. But I have neither the time nor money to spend on that right now, and it would be a tricky and expensive way to create something artistically interesting that I’d never see commercial return from - again, incentive problem, not technical problem.

    And for what it’s worth, I agree wholeheartedly about art being about communication, conversation between artist and audience. I’ve actually said that in almost the exact same words myself before, but with the exact opposite conclusion: machines can’t take art away from us, because that desire to communicate will always exist.

    5 votes
  20. Comment on How has AI positively impacted your life? in ~tech

    I seem to have fallen into a bit of a career niche as "guy who knows how to make GPU code run well" in recent years, so I've been fortunate enough to work with a few different teams on a few different projects in quite diverse fields recently! This is probably going to sound overly enthusiastic to some people, especially because there are a lot of legitimate concerns about the broader impact big corporate players in "AI" are going to have with the user-facing tech, but from the scientific side a lot of the advancements really are just unequivocally fucking awesome.

    One of the biggest shifts I've seen over time is weather and climate simulation, and that's what was at the top of my mind when I was talking about going from an entire cluster down to a single workstation. Anything even vaguely fluid-dynamics-adjacent is pretty much known as a computational black hole, with a ton of shortcuts and approximations needed to run in reasonable time - but simulating something as wildly complex and interconnected as the atmosphere is so sensitive that it's the literal origin of the term "butterfly effect".
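
    (That's not a figure of speech, either - the term comes straight out of Lorenz's toy convection model, and you can watch the sensitivity happen in a dozen lines. Step size and iteration count here are just demo values:)

        import numpy as np

        def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
            """One forward-Euler step of Lorenz's 1963 system."""
            x, y, z = state
            return state + dt * np.array([sigma * (y - x),
                                          x * (rho - z) - y,
                                          x * y - beta * z])

        a = np.array([1.0, 1.0, 1.0])
        b = a + np.array([1e-8, 0.0, 0.0])  # perturb one coordinate by 1e-8
        for _ in range(5000):               # ~50 time units of simulation
            a, b = lorenz_step(a), lorenz_step(b)

        print(np.linalg.norm(a - b))  # O(10): the trajectories have fully decorrelated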

    We're hitting a point now where neural net models are essentially converging on their own mathematical shortcuts - necessarily so, for them to operate within the architectural constraints we place on them - and the approximations they land on are far more efficient and stable than the ones we've been figuring out for ourselves. Which in some ways I guess shouldn't be a surprise, because I'm certainly not capable of ingesting 50TB of numerical data and understanding remotely enough about it to spot patterns, let alone spotting the subtle metapatterns that allow for good approximation without sacrificing accuracy!

    This expands to pattern prediction of almost any kind, as well: financial forecasting, traffic planning, disaster response modelling, and energy demand prediction are all the same core technology, and the accuracy a well-trained model can hit now is almost uncanny. The same goes for the analytical side - that's not simulation, but parsing 2D or 3D images, and really n-dimensional data of any kind, can now be done at a quality equal to or far beyond human level, and many orders of magnitude faster, across medical imaging, astronomy, particle detection, search and rescue, and a whole stack of other fields I'm sure I just haven't come across yet.

    Beyond that, the other example I always come back to (not one that I worked on, but one that still just stuns me by how much it changed the landscape) is AlphaFold. The shift from understanding the structure of thousands of proteins to understanding literally all of them (hundreds of millions) is mind-bogglingly vast, and the fact it took a single organisation a couple of years, rather than the decades that most in the field expected for humanity as a whole to chip away at the problem, is what really sums up to me just how powerful neural nets are.

    In almost every case I've mentioned above, it's been either a matter of taking something that was previously constrained to supercomputers and bringing similar result quality into the realms of things an individual researcher can test and prototype for themselves, or taking something that wasn't thought to be possible at all and bringing it down to a few months or years of supercomputer time. And this isn't just the standard Moore's law type of hardware progression - some of these things have gone from supercomputer territory to running on a laptop in the time I've been using the same laptop.

    11 votes