Asynchronous IO: the next billion-dollar mistake? - ~comp

[7]

whbboyd

September 7, 2024

Fundamentally, "async" versus "OS threads" isn't about… whatever it is you think it's about. It's not about OS threads, or an async runtime, or performance, or ease of use, or any of that.

Fundamentally, the distinction is between cooperative versus preemptive multitasking.

And when you put it like that, it gets real easy to roll your eyes at "async" concurrency. After all, we all rolled our eyes at classic Mac OS, and the cooperative multitasking was a rich source of issues and beachballs of death. And at the OS level—for scheduling distinct, independent, possibly not mutually-trusted processes—cooperative multitasking is definitely not the way to go, and there's a reason Apple dropped it unceremoniously and nobody else has attempted to pick it up.

But within a single application, there are real tradeoffs. For one thing, if the application wants to block itself, it can just block itself, and there's nothing the OS scheduler can do about it (although Unix signals sure do try!). So one of the major detriments of cooperative scheduling at the OS level, getting blocked by something out of your control, simply doesn't apply.

And because OS threads are preemptive, they may be interrupted at literally any time, literally between any two machine instructions, because that's the difference between cooperative and preemptive multitasking. But that imposes real constraints on how OS threads can behave and be represented. For example, every OS thread must always maintain a full callstack, because otherwise it could get interrupted at a bad time and lose track of important local state. "Async" threads can usually get away with storing much less data, because they know exactly when they yield control to the scheduler and exactly which data they need on hand to resume.
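
To make the state argument concrete, here is a minimal sketch that uses TypeScript generators as a stand-in for "async" tasks. The names `Task`, `counter`, and `runRoundRobin` are made up for illustration and do not come from any real runtime. A task can only lose control at a `yield`, so the only state that has to survive a switch is the generator's own locals.

```typescript
// A task is a generator: each `yield` is an explicit point where it hands
// control back to the scheduler. Nothing can interrupt it between yields.
type Task = Generator<void, void, void>;

function* counter(name: string, steps: number, log: string[]): Task {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name}:${i}`); // cannot be interleaved with another task
    yield;                    // the only place a task switch can happen
  }
}

// Hypothetical round-robin scheduler: resume each task in turn until all finish.
function runRoundRobin(tasks: Task[]): void {
  const queue = [...tasks];
  while (queue.length > 0) {
    const task = queue.shift()!;
    const result = task.next(); // run the task up to its next yield
    if (!result.done) queue.push(task);
  }
}

const log: string[] = [];
runRoundRobin([counter("a", 2, log), counter("b", 3, log)]);
console.log(log.join(" ")); // a:1 b:1 a:2 b:2 b:3
```

A preemptive scheduler has no such guarantee: it would have to be able to save and restore a task at any instruction, which is why it needs the full stack.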

On the other hand, while getting blocked by someone else is not a concern within a single application, getting blocked by yourself still is! The constant exhortations to "not block the executor" in Javascript and async Rust pretty clearly demonstrate the issue. (I have bad news, by the way: every operation except a yield is "blocking the executor".)
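
A small illustration of what "blocking the executor" looks like, assuming a JavaScript-style single-threaded event loop (Node.js or a browser). `busyWork` and `demo` are hypothetical names. The first timer is due immediately, yet it cannot fire until the synchronous loop has finished and the code reaches an `await`.

```typescript
// Hypothetical CPU-bound work: synchronous, contains no `await`, so the
// event loop cannot run anything else until it returns.
function busyWork(iterations: number): number {
  let sum = 0;
  for (let i = 0; i < iterations; i++) sum += i;
  return sum;
}

async function demo(): Promise<string[]> {
  const events: string[] = [];

  // Due "immediately", but it can only run once the current code yields.
  setTimeout(() => events.push("timer fired"), 0);

  busyWork(1e6);
  events.push("busy work done");

  // Awaiting a timer-backed promise is a real yield back to the event loop.
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  events.push("after yield");
  return events;
}

demo().then((events) => console.log(events));
```

The recorded order is "busy work done", then "timer fired", then "after yield": the pending timer had to wait for the loop no matter how short its delay was.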

And on the gripping hand, the async programming experience in a lot of languages is… bad. It definitely is in Rust (hello Send, my new frenemy), and Javascript, and Kotlin coroutines are no party, either. This is (to my very limited, entirely secondhand understanding) the one thing that Go definitely gets right: a goroutine and an OS thread are indistinguishable to the programmer. (I mean, other than the totally divergent semantics and behaviors which become immediately obvious under observation. But, like, you don't have to sprinkle Send + Sync + 'static all over like salt on a bad meal, and you can call map on your iterables without contorting yourself ridiculously. Well, you could if Go had map on iterables. I do mean the one thing it gets right.)

So: yeah, tradeoffs. For raw performance, there are structural reasons preemptive multitasking cannot match cooperative; it just has to do more work. For reliability and ease of use—well, people write stuff in Python, so obviously "raw performance" is not the sole guiding light of modern software. Maybe some day we'll come up with some better abstraction over concurrency and roll our eyes at the silly things we did back in the bad old days. But null pointers aren't a billion dollar mistake because we can now roll our eyes at poor old Tony Hoare; they're a billion dollar mistake because there are no tradeoffs. There's literally no upside to having every single reference type in your application potentially be a bomb.

26 votes

[6]
skybrian
September 7, 2024
I’m pretty happy with async-await in JavaScript. My one complaint is that it’s too easy to forget an await. A required keyword for spawning a new “thread” of execution (like ‘go’ in Go) would be nice.
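
A short sketch of this failure mode, written in TypeScript; `loadCount` and `demo` are hypothetical. Leaving out `await` is not a syntax error: the call simply hands back a promise instead of the value, and the work carries on in the background.

```typescript
// Hypothetical async lookup, standing in for any promise-returning call.
async function loadCount(): Promise<number> {
  return 41;
}

async function demo(): Promise<[boolean, number]> {
  const forgotten = loadCount();     // no `await`: this is a Promise<number>
  const awaited = await loadCount(); // with `await`: this is a number

  // `forgotten + 1` would be a compile error in TypeScript, since a
  // Promise<number> is not a number. Plain JavaScript would accept it and
  // produce the string "[object Promise]1".

  // The `void` operator evaluates the call and discards the promise,
  // which makes a deliberate fire-and-forget visible in the source.
  void loadCount();

  return [forgotten instanceof Promise, awaited + 1];
}

demo().then((result) => console.log(result)); // [ true, 42 ]
```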

8 votes
1. [3]
  brogeroni
  September 7, 2024
  For Javascript I'm pretty sure there's an eslint rule that flags promises that aren't awaited or don't have void in front (e.g. void fetch(Foo)). It has definitely helped in this regard.
  
  6 votes
  1. Weldawadyathink
    September 7, 2024
    That definitely helps. Also using typescript helps sometimes. When your editor throws an error that Promise<something> doesn’t have the member you want, you know you need to throw some awaits in somewhere.
    
    4 votes
  2. skybrian
    September 7, 2024
    Yeah, this is using a lint to make up for a deficiency in the language. But good to know. Thanks!
    
    1 vote
2. [2]
  vord
  September 7, 2024
  The problem I've noticed with async/await in Javascript especially is that it becomes trivially easy to write code that doesn't function the way you think it does.
  
  I'm reminded of the await event horizon in particular.
  
  2 votes
  1. skybrian
    September 7, 2024 (edited September 7, 2024)
    Good article, thanks!
    
    It’s generally true that most programming environments have nothing to prevent a function from hanging in an infinite or very long loop. Any time a function takes a callback and calls it, there’s a danger that the callback will never return. It can also happen at the OS level, maybe because the user pressed Control-Z to suspend the process, or just because the system is overloaded and swapping. Promises that never settle are the async version of that.
    
    Async code does make it easier to screw up. It’s also harder to test. But I still think it’s a whole lot easier to get right than other common forms of concurrency. Preemptive concurrency using real threads and shared memory is a bigger minefield, and when programming in JavaScript, I’m happy to only have to think about what other code might do at await keywords rather than everywhere.
    
    3 votes
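
One common guard against promises that never settle is to race them against a timer. The sketch below is only an illustration of that idea: `withTimeout`, `neverSettles`, and `demo` are hypothetical names, and the 20 ms limit is arbitrary.

```typescript
// Hypothetical helper: reject if `promise` has not settled within `ms`.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    // Whichever happens first wins; clear the timer once the promise settles.
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// A promise that never settles: the async version of a callback that never returns.
const neverSettles = new Promise<string>(() => {});

async function demo(): Promise<string> {
  try {
    return await withTimeout(neverSettles, 20);
  } catch (err) {
    return err instanceof Error ? err.message : String(err);
  }
}

demo().then((message) => console.log(message)); // timed out after 20 ms
```

Note that this only stops the caller from waiting. It does not cancel whatever the original promise was doing, so the underlying work may still be stuck.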

[2]

stu2b50

September 6, 2024

It's not like async-await is the only model here; you also have green threads. While they're not available to some languages because of their requirements (e.g., Rust), they are a model nonetheless, and one used in some very popular languages.

The billion-dollar-mistake part of null pointers or C strings is just how simple it would have been to not have them. "Just make OS threads fast" is not a simple swap. It is, in fact, a very challenging problem that many smart people have spent a lot of time working on, and it's absolutely not obvious that if you simply spent more resources, you would have gotten more out of it.

18 votes

scherlock
September 6, 2024
Yeah, also, just spawning more threads doesn't really make working with IO any easier. Every async pattern tried to date is really focused on making something that is disjointed look like it's behaving in a serial manner.
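
As a rough illustration of "disjointed" versus "serial-looking", here is the same two-step computation written twice in TypeScript. `fetchValue` is a hypothetical callback-style API invented for this sketch; the other function names are made up as well.

```typescript
// Hypothetical callback-style API: the result arrives later, via `done`.
function fetchValue(key: string, done: (value: number) => void): void {
  setTimeout(() => done(key.length), 0);
}

// Callback style: the control flow is split across nested closures.
function sumWithCallbacks(done: (total: number) => void): void {
  fetchValue("alpha", (a) => {
    fetchValue("be", (b) => {
      done(a + b);
    });
  });
}

// Wrap the same API in a promise so it can be awaited.
function fetchValueAsync(key: string): Promise<number> {
  return new Promise<number>((resolve) => fetchValue(key, resolve));
}

// async/await style: the same two steps read top to bottom.
async function sumWithAwait(): Promise<number> {
  const a = await fetchValueAsync("alpha");
  const b = await fetchValueAsync("be");
  return a + b;
}

sumWithCallbacks((total) => console.log(total)); // 7
sumWithAwait().then((total) => console.log(total)); // 7
```

Both versions do the same work in the same order. The second one only hides the continuation-passing behind `await`.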

3 votes

[3]

unkz

September 6, 2024

"More specifically, what if instead of spending 20 years developing various approaches to dealing with asynchronous IO (e.g. async/await), we had instead spent that time making OS threads more efficient, such that one wouldn't need asynchronous IO in the first place?"

I kind of suspect that we have in fact been dedicating a ton of effort into making threads more efficient, but it’s fundamentally more difficult.

12 votes

[2]
yorickpeterse (OP)
September 6, 2024
It's true that time and effort have been spent on optimizing threads, and they're certainly in a better state than they were, say, 20 years ago. But my thought is exactly "what if we'd done a better job", followed by "what if we can/could do better?".

That's the point ultimately: we are where we are today, but I'm curious where we could've been if we had taken a different path that focused on optimizing threads further, rather than the path that involves asynchronous IO.
1. unkz
  September 6, 2024
  I think we have taken that path though, and I don’t really believe putting even more resources into the problem is going to lead to running millions of parallel threads on equivalent-cost hardware the way the async model can.
  
  The only path I can see towards even theoretically doing that would involve transformational changes in hardware-level threading support, requiring vast amounts of expensive hardware that would be entirely wasted in every application that isn’t a web-scale server.
  
  3 votes