Game simulation programming: Continuous time

  1. [7]
    Moonchild

    somewhat of a follow-on to lerp smoothing is broken—which i have not yet had the chance to watch, but the comments reminded me of it. the presentation here is a bit slow and frustrating, but the content is good—much more illuminating than most of the mainstream resources i have seen on the topic

    1 vote
    1. [6]
      TangibleLight

      I've added it to watch later, but I learned my lesson in the other thread and will not speculate on what I expect the video to be about.

      I am hoping he touches on parallelism. Like I mentioned in the other thread, I really don't know how that works in practice. My own attempts in toy projects usually go poorly.

      1 vote
      1. [5]
        Moonchild

        he does not talk about parallelism, no. it is more conceptual models+basic building blocks for thinking about time than nitty-gritty implementation

        you may be interested in this idea that i had a couple of years ago. i think it's a good idea, even if i lost interest in video games before developing it past a barely functioning prototype. it's a frame that putatively has a lot of advantages (easy to parallelise; easy to reason about game logic and interactions; easy to do fun things like time travel and rollback multiplayer), but it's also solidly in the discrete-time paradigm sean shits on (i hadn't seen that video yet). also, apparently john carmack already tried it 10 years ago

        4 votes
        1. [4]
          TangibleLight

          I'm suspicious of the continuous time paradigm that Sean is advocating. In cases where accuracy is paramount and you're reaching for any bit of optimization you can find, it makes sense to look at analytical representations and try to take advantage of them. Real-time simulation is not such a case. The more important thing there is determinism (reproducibility/consistency) and I think a fixed time step accomplishes that much better in most cases.

          For example, he spends some time around 26:00 trying to come up with a case where a parabolic motion should collide with something but, due to the discrete timestep, incorrectly tunnels through the object. He struggles to find a good example. Mathematically, I would expect him to struggle simply because of inertia. High-curvature motion tends to be the slowest part of a trajectory, since accelerating through that curvature at speed takes more force. Unless there's something to provide that force, all the speed happens on straighter sections of the trajectory. So either the object moves slowly through the high-curvature part, and naturally gets more samples from the discrete timesteps, or it moves quickly along the low-curvature part, and the linear approximation is fine.
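
          To make that inertia argument concrete, here's a rough numerical sketch (mine, not from the video; the launch velocity and timestep are arbitrary) that steps a projectile with a fixed timestep and prints how far it travels per step next to the local curvature. The step length is smallest near the apex, which is exactly where the curvature peaks, so the polyline approximation is densest where the path bends the most.

          ```python
          # Sketch: distance covered per fixed timestep along a projectile arc.
          # The per-step distance is smallest near the apex, which is also where
          # curvature is highest, so samples are densest where the path bends most.
          import math

          g = -9.81              # gravity (m/s^2)
          vx, vy = 4.0, 18.0     # steep launch to exaggerate the effect
          x, y = 0.0, 0.0
          dt = 0.05              # fixed timestep (s)

          while y >= 0.0:
              x_new = x + vx * dt
              vy_new = vy + g * dt
              y_new = y + vy_new * dt                                 # semi-implicit Euler step
              step_len = math.hypot(x_new - x, y_new - y)
              curvature = abs(g * vx) / (vx * vx + vy * vy) ** 1.5    # |x'y'' - y'x''| / speed^3
              print(f"y={y:6.2f} m  step={step_len:5.3f} m  curvature={curvature:7.5f}")
              x, y, vy = x_new, y_new, vy_new
          ```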

          What he's looking for is a situation where curvature and velocity are correlated. One example that does demonstrate this is orbital mechanics; think n-body problem or solar system simulation. Objects have the highest kinetic energy at closest approach, and that's always the point of highest curvature on the trajectory (highest gravitational force). If you don't account for the errors, you get inconsistent orbital precession and energy non-conservation that look very wrong after only a few orbits. And orbital simulators do tend to use the continuous-time strategies that Sean talks about. "On-rails" planetary motion is probably closest to what Sean discusses. You can also compute error terms and correct for them down to some upper bound with more advanced integrators. Or you can break each timestep into stages that are individually energy-conserving to ensure errors don't accumulate and blow up over time. None of those require a fixed timestep, but determinism still does.
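
          To make the energy-drift point concrete, here's a minimal sketch (my own toy setup, nothing from the video) comparing explicit Euler against semi-implicit (symplectic) Euler on a two-body orbit with the same fixed timestep. The explicit integrator steadily gains energy over tens of orbits, while the symplectic one keeps the energy error bounded; neither needs a variable timestep.

          ```python
          # Sketch: energy drift of a two-body orbit under explicit Euler vs
          # semi-implicit (symplectic) Euler, both with the same fixed timestep.
          import math

          def accel(x, y, mu=1.0):
              r3 = (x * x + y * y) ** 1.5
              return -mu * x / r3, -mu * y / r3

          def energy(x, y, vx, vy, mu=1.0):
              return 0.5 * (vx * vx + vy * vy) - mu / math.hypot(x, y)

          def simulate(symplectic, steps=20000, dt=0.01):
              x, y, vx, vy = 1.0, 0.0, 0.0, 0.8             # an eccentric orbit around mu = 1
              for _ in range(steps):
                  ax, ay = accel(x, y)
                  if symplectic:
                      vx, vy = vx + ax * dt, vy + ay * dt   # velocity first...
                      x, y = x + vx * dt, y + vy * dt       # ...then position with the new velocity
                  else:
                      x, y = x + vx * dt, y + vy * dt       # explicit Euler: position with old velocity
                      vx, vy = vx + ax * dt, vy + ay * dt
              return energy(x, y, vx, vy)

          print("initial energy      ", energy(1.0, 0.0, 0.0, 0.8))
          print("explicit Euler      ", simulate(symplectic=False))
          print("semi-implicit Euler ", simulate(symplectic=True))
          ```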

          I'm really interested in what he talks about around 19:49:

          [...] a classic way to do that is you look at all the objects and you find the time of their first collision [...] without doing any more simulation of them [...] you take the lowest one, and you simulate all objects up to that point, because up to that point, nothing else can happen [...]

          [...] that can be expensive [...] but you still have all the old collision times from the original thing so say you had a hundred objects and two of them collide at this point. You can now just take those two objects and say when do those two objects collide with anything else including the other objects. And you can't miss anything because all the other objects, if they interact with each other, they're not affected by these two objects until these two objects collide with another object [...]

          You can think of all the collisions as nodes in a graph, where edges correspond to uninterrupted motion of a particular object. He's really just saying that interactions can't propagate backward in time - it's a DAG - and if you look at a particular point in time you'll have disconnected regions with no data dependencies between them.
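
          Here is roughly how I picture that scheme, as a toy sketch (mine, not Sean's code) for equal-mass point balls on a line, where only neighbouring pairs can ever collide. Predicted collisions go into a priority queue; we jump straight to the earliest one, resolve it, and then re-predict only the pairs involving the two balls that just collided, using per-ball counters to discard stale predictions.

          ```python
          # Event-driven sketch: advance everything to the earliest predicted collision,
          # resolve it, then re-predict only the pairs involving the two balls involved.
          import heapq

          pos = [0.0, 3.0, 5.0, 9.0]       # left-to-right positions
          vel = [2.0, 0.5, -1.0, -3.0]     # velocities
          bumps = [0, 0, 0, 0]             # per-ball collision counters, used to spot stale events
          now = 0.0
          queue = []                       # entries: (time, left ball index, bump snapshots)

          def predict(i):
              """Predict when ball i hits ball i+1, given current positions and velocities."""
              if i < 0 or i + 1 >= len(pos):
                  return
              dv = vel[i] - vel[i + 1]
              if dv <= 0:                                   # not approaching each other
                  return
              t = now + (pos[i + 1] - pos[i]) / dv
              heapq.heappush(queue, (t, i, bumps[i], bumps[i + 1]))

          for i in range(len(pos) - 1):
              predict(i)

          while queue:
              t, i, bi, bj = heapq.heappop(queue)
              if bi != bumps[i] or bj != bumps[i + 1]:      # a later collision invalidated this
                  continue
              for k in range(len(pos)):                     # advance *everything* to time t
                  pos[k] += vel[k] * (t - now)
              now = t
              vel[i], vel[i + 1] = vel[i + 1], vel[i]       # equal-mass elastic collision: swap
              bumps[i] += 1
              bumps[i + 1] += 1
              print(f"t={now:5.2f}  balls {i} and {i + 1} collide at x={pos[i]:.2f}")
              for k in (i - 1, i, i + 1):                   # re-predict only the affected pairs
                  predict(k)
          ```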

          This starts to smell like a task scheduling problem to me.

          I wonder what use a tool like Bend might have here... surely Bend itself, with its performance constraints, is not suitable for real-time simulation, but the general paradigm of how it parallelizes graph traversal might be relevant?

          This then starts to smell like what I understand of your linked comment. "Interactions" are where data dependencies come from, so make these first-class and parallelize across distinct cliques of interactions.

          I'm still suspicious. Timestamp in Carmack's keynote:

          [...] Since everyone's looking at the previous frame's rendering: two people approach a narrow hallway, they both say they want to go into that hallway, they both think it's clear, so they both go into it. Well, what do you do next frame? They both say, well, I'm here, but two people wound up at the same place. So there's a little bit of resolving that has to be done there [...]

          An adage I like, though I'm not sure who it's attributed to: complexity is always conserved.

          Yes, writing the thing with barriers and locks is a nightmare. But those are the data dependencies. You can't just get rid of them. You can reframe the problem with immutability and create a bunch of parallelism in part of the process, but then you've traded the complexity of all the barriers and locks for the complexity of this conflict resolution. I suspect that this conflict resolution in general is more complex than writing the locks and barriers.
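
          To illustrate what that trade looks like, here's a tiny sketch (a hypothetical grid world; all names are made up) of the conflict-resolution pass the immutable approach pushes you into: every agent proposes a move against the previous frame, then a sequential resolver arbitrates between agents that claimed the same cell. Even this toy resolver can create follow-on conflicts, since the loser stays put and may now block someone else, which is exactly the cascading resolution I'm suspicious of.

          ```python
          # Propose-then-resolve sketch for the "two agents, one hallway cell" case.
          from collections import defaultdict

          def propose_moves(old_positions, desired):
              """Pure function of the previous frame: each agent says where it wants to be."""
              return {agent: desired.get(agent, pos) for agent, pos in old_positions.items()}

          def resolve(old_positions, proposals):
              """Sequential pass: at most one agent wins each contested cell; losers stay put."""
              claims = defaultdict(list)
              for agent, target in proposals.items():
                  claims[target].append(agent)
              new_positions = {}
              for target, agents in claims.items():
                  winner = min(agents)                     # deterministic tie-break: lowest id wins
                  for agent in agents:
                      new_positions[agent] = target if agent == winner else old_positions[agent]
              return new_positions

          old = {"a": (0, 0), "b": (2, 0)}
          wants = {"a": (1, 0), "b": (1, 0)}               # both want the same hallway cell
          print(resolve(old, propose_moves(old, wants)))   # {'a': (1, 0), 'b': (2, 0)}
          ```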

          I suspect an easier way forward is some functional paradigm and some compile-time analysis that identifies as many data dependencies as it can and injects the locks and barriers for you. This is why I mentioned Bend. I don't think Bend itself is a solution, but something like it seems more on-track.

          1 vote
          1. Moonchild

            i don't currently have the wherewithal to engage properly with most of what you say, but i will paste this section from my unpublished paper on software transactional memory. it is 'future work', so speculative, but may be of interest. (let me add that i think there is general merit to the idea of generating concurrent algorithms automatically. and on von neumann machines, which is currently all of them, all performance optimisation is scheduling)

            \subsection{Transactions as dynamically expanding DAGs}\label{dag}

            The concept of representing a computation as a DAG (directed acyclic graph) in order to incrementally recompute it when part of the input changes is well known, with obvious applications to build systems and spreadsheets. There is a class of transactions that can be straightforwardly expressed as DAGs, with a given node representing a read, a computation, or a write; a given node can depend on any node other than a write, so long as there is no cycle (in particular, a read can depend on another read). There are two important advantages to representing a transaction in this way:

            \begin{enumerate}
            \item When a conflict is detected, rather than aborting the transaction completely, we can simply dirty the relevant read nodes and recompute only what is downstream of them, rather than throwing out the \textit{entire} transaction state and starting over.
            \item When multiple threads are helping a particular transaction, they can increase concurrency and reduce contention by picking independent portions of the DAG to execute.
            \end{enumerate}

            There is another advantage, pertinent to our helping mechanism.

            [snip]

            In fact, we can represent \textit{every} transaction as a kind of DAG. But it must be dynamically expandable, not static; the `downstream' structure must be able to depend on the values read `upstream'. Suppose a transaction iterates over an array of objects, accessing each; in the source code, there is a cycle, but in any given execution, only a finite number of elements will actually be read, and we can have one node per read object.

            In practice, it seems difficult to come up with a good compilation strategy for turning general-purpose code into dynamically expanding DAGs and efficiently executing them.

            One point of difficulty is determining exactly when two computations are independent. Consider a transaction which iterates over an array of objects and mutates each one. It is highly likely that the objects are unique, and therefore that the iterations are independent, but absent a sophisticated type system or global analysis, it is unlikely that it will be possible to prove this. One likely approach is to employ speculative parallelisation, similar to what most out-of-order CPUs do. We may speculate that the elements of the array are distinct, and operate on them in parallel, but check afterwards that they were in fact distinct.

            We can build an interference graph dynamically: when initially iterating over the array, keep track of which elements alias each other, adding the appropriate program-order dependency edges between iterations corresponding to aliasing elements. If there are conflicting accesses to the objects, but the array itself remains unchanged, then this dependency structure remains intact and can continue to be used; if there are conflicting accesses to the array, then it must be updated accordingly.
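
            as a toy illustration of the bookkeeping in that last paragraph (my own sketch, not from the paper): while iterating, record which iterations touched the same underlying object and add program-order dependency edges between them; iterations with no edge between them are the candidates for running in parallel

            ```python
            # Dynamic alias check: chain dependency edges between iterations that
            # touch the same underlying object; everything else is independent.
            def alias_dependencies(array):
                """Return (earlier, later) iteration pairs that must stay ordered."""
                last_seen = {}        # object id -> most recent iteration that touched it
                edges = []
                for i, obj in enumerate(array):
                    oid = id(obj)
                    if oid in last_seen:
                        edges.append((last_seen[oid], i))
                    last_seen[oid] = i
                return edges

            shared = {"hp": 10}
            arr = [{"hp": 3}, shared, {"hp": 7}, shared]      # elements 1 and 3 alias the same object
            print(alias_dependencies(arr))                    # [(1, 3)]
            ```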

            2 votes
          2. [2]
            Moonchild

            ever so slightly more sane now

            complexity is always conserved

            essential complexity is essential. inessential complexity can be removed. this seems tantamount to saying 'nothing can be simplified', which is empirically wrong

            functional paradigm and some compile-time analysis that identifies as many data dependencies as it can and injects the locks and barriers for you

            functional parallelism is fork-join—importantly, deterministic. locks are nondeterministic: multiple concurrent acquires can happen in any order. it's a question whether you need determinism, but i think it's only worth it if it significantly changes the effort:performance curve

            those are the data dependencies. You can't just get rid of them. You can reframe the problem with immutability and create a bunch of parallelism in part of the process, but then you've traded the complexity of all the barriers and locks for the complexity of this conflict resolution. I suspect that this conflict resolution in general is more complex than writing the locks and barriers.

            in the case of collisions, the classic approach (accelerate the overlapping objects out of each other) works and is pretty easy; can't speak for quality, but i'm told super mario bros has good game feel. another example that comes to mind—two different actors try to pick up the same object—is easy to resolve: let the actors simply declare intent to pick up the object, and let the object decide whom to be picked up by. in the latter case, there is a fairly straightforward linearity constraint (an object has at most one holder) built into the gameplay. in the physics case, there is emphatically not: suppose there is a line of blocks touching each other; if i push the first one, i expect the whole line to move. (well, in real life the impulse travels at most at the speed of light, but that's basically instant at frametime-scales.) to be honest, i don't know how you solve this in a sequential setting either
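
            to be concrete about the pickup case, something like the following is all i mean by 'the object decides' (the nearest-wins rule and the names are just placeholders; any deterministic rule works):

            ```python
            # The object collects declared intents and deterministically picks one holder.
            def resolve_pickup(obj_pos, intents):
                """intents: {actor_id: actor_pos} for every actor that declared intent."""
                if not intents:
                    return None
                def key(item):
                    actor_id, (ax, ay) = item
                    ox, oy = obj_pos
                    return ((ax - ox) ** 2 + (ay - oy) ** 2, actor_id)   # nearest wins, ties by id
                return min(intents.items(), key=key)[0]

            print(resolve_pickup((0, 0), {"bob": (2, 0), "alice": (1, 1)}))   # alice
            ```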

            taking a step back, i think there are basically two problems.

            the first is that classic approaches (a monolithic update(state)) necessarily lend themselves to ad-hoc dependencies threaded through the code, expressed implicitly—sometimes by accident—in the order in which various operations are done or game entities are processed. it seems to me that any process by which these can be made explicit is a win for clarity and maintainability, independent of performance

            the second is: we can imagine constructing a graph of connected components and parallelising processing across components, with serial processing within each component. but i don't think that that graph actually exists. i think in general every game entity will have some radius of other entities around it with which it could interact, and that entities will generally be dense enough that every entity will be connected to every other entity, even though the actual realised interaction graph will have quite a lot of parallelism. not sure where to go from there, but it seems to require optimism and therefore conflict resolution. i'm not sure conflict resolution necessarily has to be complicated in that case, though; it could just be 'these entities form a conflict set; please write some sequential code updating them as a group'. then you write exactly the same code you'd write in a totally sequential setting, and the tricky thing is conflict detection
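
            for concreteness, this is roughly the construction i have in mind (and am sceptical of in dense scenes): connect any two entities within interaction range, take connected components, and hand each component to ordinary sequential update code, with distinct components free to run in parallel. the radius and coordinates here are placeholders

            ```python
            # Conflict sets via union-find over an interaction-radius graph.
            import math

            def conflict_sets(entities, radius):
                """entities: {id: (x, y)}. Returns groups of ids that must be updated together."""
                ids = list(entities)
                parent = {i: i for i in ids}

                def find(i):
                    while parent[i] != i:
                        parent[i] = parent[parent[i]]         # path halving
                        i = parent[i]
                    return i

                def union(a, b):
                    parent[find(a)] = find(b)

                for n, a in enumerate(ids):                   # naive O(n^2) broad phase
                    for b in ids[n + 1:]:
                        if math.dist(entities[a], entities[b]) <= radius:
                            union(a, b)

                groups = {}
                for i in ids:
                    groups.setdefault(find(i), set()).add(i)
                return list(groups.values())

            world = {"a": (0, 0), "b": (1, 0), "c": (1.5, 0.5), "d": (10, 10)}
            print(conflict_sets(world, radius=2.0))           # e.g. [{'a', 'b', 'c'}, {'d'}]
            ```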

            The more important thing there is determinism (reproducibility/consistency) and I think a fixed time step accomplishes that much better in most cases.

            why is it better at determinism? i think they both suffice to achieve it, except that with a fixed timestep, you also need to interpolate to match the display frequency, and you have to trade off input latency vs. update frequency vs. extrapolation hacks
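
            (to be clear, the interpolation i mean is the usual 'fix your timestep' accumulator loop: simulate in fixed ticks, then blend the last two simulated states to match whatever the display is doing. rough sketch with arbitrary numbers:)

            ```python
            # Fixed-timestep accumulator loop with interpolation for rendering.
            def run(frame_times, dt=1.0 / 60.0):
                prev_state, state = 0.0, 0.0          # stand-in 'state': a single position
                velocity = 1.0
                accumulator = 0.0
                for frame_dt in frame_times:          # real time spent on each (uneven) display frame
                    accumulator += frame_dt
                    while accumulator >= dt:
                        prev_state, state = state, state + velocity * dt   # one fixed-dt tick
                        accumulator -= dt
                    alpha = accumulator / dt          # how far we are between the last two ticks
                    rendered = prev_state * (1 - alpha) + state * alpha
                    print(f"render at {rendered:.4f} (alpha={alpha:.2f})")

            run([0.020, 0.013, 0.031])                # three display frames of uneven length
            ```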

            the one thing i will say in favour of fixed timesteps is that in the context of, say, a precision platformer or a fighting game where you are literally counting frames, it's nice (not just because you have ticks—you could do that with continuous time too, as sean shows—but because if display frequency matches update frequency then you can ensure the two are aligned)

            also, btw, re my other comment: i was informed that it's been tried before and was sent a couple of mediocre papers (my version is obviously better) and one interesting one (not read yet—too much substance to skim), which may be of interest: https://arxiv.org/pdf/1403.5645

            1. TangibleLight

              essential complexity is essential. inessential complexity can be removed. this seems tantamount to saying 'nothing can be simplified', which is empirically wrong

              I think most one-line adages like that, if you take them to their extreme conclusion, are empirically wrong. Here I was trying to talk about the essential complexity in the problem.

              If two entities in the physical simulation cannot interact, there trivially is parallelism there. If two entities are currently interacting, there trivially isn't. If they might be interacting, the problem is harder. And that is the problem. You can shuffle how you frame it, look at different heuristics or binning strategies, look at graph analysis to try to prove which entities can't interact... but ultimately that's the thing it seems like we're trying to address.
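
              As an example of the binning strategies I mean (my own sketch; the cell size and coordinates are made up), a spatial hash can conservatively rule out pairs that are too far apart to interact this step, leaving only the "might be interacting" pairs to examine in detail:

              ```python
              # Spatial-hash broad phase: only entities in the same or neighbouring
              # cells are kept as "might be interacting" candidate pairs.
              from collections import defaultdict
              from itertools import combinations

              def candidate_pairs(entities, cell_size):
                  """entities: {id: (x, y)}. Returns pairs sharing a grid cell or a neighbouring one."""
                  grid = defaultdict(list)
                  for eid, (x, y) in entities.items():
                      grid[(int(x // cell_size), int(y // cell_size))].append(eid)
                  pairs = set()
                  for (cx, cy), members in grid.items():
                      for a, b in combinations(members, 2):               # same cell
                          pairs.add(tuple(sorted((a, b))))
                      for dx in (-1, 0, 1):                               # the 8 neighbouring cells
                          for dy in (-1, 0, 1):
                              if (dx, dy) == (0, 0):
                                  continue
                              for a in members:
                                  for b in grid.get((cx + dx, cy + dy), []):
                                      pairs.add(tuple(sorted((a, b))))
                  return pairs

              world = {"a": (0.2, 0.3), "b": (0.9, 0.1), "c": (5.0, 5.0)}
              print(candidate_pairs(world, cell_size=1.0))                # {('a', 'b')}: c is too far away
              ```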

              in the case of collisions, the classic approach (accelerate the overlapping objects out of each other) works and is pretty easy; can't speak for quality, but i'm told super mario bros has good game feel.

              Fair enough. I was focused on collision since that covers most of the examples Sean used in the linked video, but I think there's obviously some more interesting problem in there about looking at the causality in the system and seeing which parts of it you can actually run in parallel.

              why is it better at determinism? i think they both suffice to achieve it, except that with a fixed timestep, you also need to interpolate to match the display frequency, and you have to trade off input latency vs. update frequency vs. extrapolation hacks

              I think I had numerical error in mind. If you fix the timestep you can guarantee bitwise reproducibility. Even if the simulation misbehaves, it will misbehave in a consistent way that you can work around as a designer. You may or may not be able to do that if you allow a variable timestep... For a given piece of the piecewise function, sure, but once some interaction occurs, you need to determine the next piece. I guess if the solver you use to find the next piece of the function is deterministic, then the whole system would be too. You just need to be careful never to use the particular frame time anywhere in that solver, and I don't know how that would generalize.

              Whether that matters in practice... I have no real empirical evidence one way or the other in general. It seems likely that in most cases it doesn't really matter, but there are definitely cases where it's critical. Anything that's meant to support TAS runs (mostly platformers, probably?). Old Line Rider tracks come to mind. Engineering/factory games where production rates need to all match up correctly. But, again, for more ephemeral things it probably doesn't really matter.

              functional parallelism is fork-join—importantly, deterministic. locks are nondeterministic: multiple concurrent acquires can happen in any order. it's a question whether you need determinism, but i think it's only worth it if it significantly changes the effort:performance curve

              in the physics case, there is emphatically not: suppose there is a line of blocks touching each other; if i push the first one, i expect the whole line to move. (well, in real life the impulse travels at most at the speed of light, but that's basically instant at frametime-scales.) to be honest, i don't know how you solve this in a sequential setting either

              but i don't think that that graph actually exists. i think in general every game entity will have some radius of other entities around it with which it could interact, and that entities will generally be dense enough that every entity will be connected to every other entity, even though the actual realised interaction graph will have quite a lot of parallelism. not sure where to go from there, but it seems to require optimism and therefore conflict resolution. i'm not sure conflict resolution necessarily has to be complicated in that case, though; it could just be 'these entities form a conflict set; please write some sequential code updating them as a group'. then you write exactly the same code you'd write in a totally sequential setting, and the tricky thing is conflict detection

              also, btw, re my other comment: i was informed that it's been tried before and was sent a couple of mediocre papers (my version is obviously better) and one interesting one (not read yet—too much substance to skim), which may be of interest: https://arxiv.org/pdf/1403.5645

              Gonna need to chew on those papers for a while. These four points are very interesting to me, but I don't have a good response right now. I'll revisit this later tonight or tomorrow and follow up after I can think about it a bit.