How do we, as devs, discuss probability with the community?
Consider
- A game system with a random success rate of 1% (a) (like a loot drop in an mmo)
- A game system with a random success rate of 1%, pity=100 (b) (pity in this context means success is guaranteed on your 100th attempt, after 99 successive failures)
How long would it take a player to earn their success, given 1 attempt/minute?
The answer for (a) is "infinity", which the community rarely accepts: it is possible (though unlikely) for someone to fail forever. The answer for (b) is at most 100 attempts (100 minutes).
Developers can describe (a) as "the median player succeeds after a little over an hour (~69 attempts)". However, the 99th percentile takes about 7.5 hours (~458 attempts)... and the unlucky 1%? Longer. 1 hour and 7.5 hours aren't in the same ballpark.
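To reproduce those numbers, here's a quick sketch of the geometric-distribution math (1% per attempt, one attempt per minute):

```python
import math

p = 0.01  # per-attempt success chance for system (a)

def attempts_for_quantile(q, p):
    """Smallest n such that P(success within n attempts) = 1 - (1-p)**n >= q."""
    return math.ceil(math.log(1 - q) / math.log(1 - p))

print(attempts_for_quantile(0.50, p))  # 69  -> the median, ~1h 9min
print(attempts_for_quantile(0.99, p))  # 459 -> the 99th percentile, ~7.6 hours
# (the mean is 1/p = 100 attempts, i.e. 1h 40min)
```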
Anyone have a solution to cut through the mathplexity? Decisions from your own game design, or approaches you've seen others take? I simply add pity when the odds are worse than 1 in 5, or relegate (a)-style probabilities to combat systems (non-reward).
(disclaimer: I'm not a game dev specifically, but I run TTRPG campaigns)
Sounds like there are three goals you're trying to balance:
1. Independence of outcomes. This is the MMO-style constant probability. This gives some degree of realism or agency, as opposed to railroading where someone is watching you and tuning things behind the scenes.
2. Short probability tails. Like you describe, it's kind of sad to have lots of people take 10x longer than other people.
3. Explainability. Like you said, it's difficult to describe the expected value of system (a) in a way that communicates the long tails.
As you've pointed out, goals (1) and (2) are definitely incompatible, just because (1) fully specifies the probability distribution, and that distribution has a long tail.
I personally dislike the pity solution. Option (b) feels unsatisfying to me: a good fraction of players (0.99^99 ≈ 37%) will need the full 100 rounds, and hitting the absolute theoretical maximum of rounds feels like a failure to me, especially if this number is well documented. It feels unearned and unrewarding in my opinion.
I think a generalisation of the pity mechanic might work better though. If you're able to count the number of attempts for the pity mechanic each round, then you are able to implement whatever distribution you like. Just choose some distribution which has a short tail satisfying (2), and which can be easily described. For example, some Gaussian- or Poisson-looking distribution would upper-bound the number of tries required with high probability, and be very accurate when you say "the expected number of tries is X, with Y amount of variation". The downside is that you lose realism -- there's no physical reason to expect the 100-th monster to have a higher chance of the item than the 1st one.
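A minimal sketch of that generalisation, assuming you can count attempts: given any bounded time-to-success distribution f(k), the per-attempt success chance falls out as f(k) divided by the probability mass still remaining (the hazard rate).

```python
def per_attempt_odds(f):
    """Turn a time-to-success PMF f (values summing to 1) into the
    per-attempt success probability p[k] = f[k] / P(still trying at attempt k)."""
    remaining, p = 1.0, []
    for fk in f:
        p.append(min(fk / remaining, 1.0) if remaining > 1e-12 else 1.0)
        remaining -= fk
    return p

# Example: a uniform time-to-success over 10 attempts reproduces
# deck-of-cards odds: 1/10, 1/9, ..., 1/2, 1/1
print(per_attempt_odds([0.1] * 10))
```

Any short-tailed f you can describe to players plugs straight in.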
I've seen one solution which I really like though. In Maplestory, there is a notoriously difficult jump quest which needs a lot of manual dexterity and platforming skill, but is required to access one of the important bosses. Most people spend hours on it, but some take days, and some people are just completely unable to do it for various reasons. One private server implemented an alternate quest, which is more of a "fetch 2000 bear asses" kind of thing, which is very tedious, but anyone at all should be able to complete it in a predictable (but large) amount of time. People who got tired of the jump quest could switch over to the alternate quest and gain access that way.
Generalising a little, I think the best solution is to provide multiple pathways to success --- (A) one with a lower expected time to completion and a high variance, and another (B) with a higher expected time to completion, but with low or no variance. I think this lets you achieve all three goals: independence of outcomes aka physical realism via (A), a bounded time-to-completion since people can swap over to (B), and ease of communication because both paths are individually easy to explain. As a bonus, I think both paths make a success feel earned, avoiding the problem that I feel exists with pity successes.
Something somewhat similar, but not exactly the same, is (or was?) Rainbow 6 Siege's item drops. You would start with a low chance of getting a pack, and if you didn't get one, your probability of success went up. It doesn't separate the high-variance, low-time-to-completion path from the low-variance, high-time-to-completion one, but instead combines them. The UX is the same at the end of the day, though.
I think a good compromise for the three constraints is sampling without replacement. Sure, you eliminate independence, but it can actually still be decently realistic, and it doesn't railroad too hard.
So you could have a lottery of 199 duds and one prize. Every minute, every player gets to draw one sample, which will then be eliminated from the lottery. After 200 minutes, you're definitely done. But the "without replacement" part means that we're ramping up the chances slowly as time goes on, meaning most players won't make it to the very last one, and there is almost always an actual element of chance involved.
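A sketch of that lottery, if anyone wants to play with it (the 1-prize/199-dud numbers are from above):

```python
import random

def draws_until_prize(n_duds=199):
    """Shuffle 1 prize into n_duds duds; count draws without replacement."""
    lottery = ["prize"] + ["dud"] * n_duds
    random.shuffle(lottery)
    return lottery.index("prize") + 1  # the draw on which the prize appears

samples = [draws_until_prize() for _ in range(100_000)]
print(min(samples), max(samples), sum(samples) / len(samples))  # ~1, ~200, ~100.5
```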
There might be ways to further tweak this. If you increase the tail length you'd accept, you can make the chance stay almost(TM) independent for longer, as the impact on the number of duds will stay low for a while. Make it 1 prize and 11999 duds, sampling once per second, and you're not only smoothing out the curve a bit - my intuition tells me you'd actually meaningfully change the curve, but I'd have to do some simulations to know for sure.
And if you want to keep a tail with progressively lower odds in place, nothing stops you from occasionally adding duds back into the lottery.
That said, personally I'm a fan of axioms one and three. Achieve explainability by weaving the chance elements into a larger system, and let players manipulate the system to increase their odds. But that goes more into game design philosophy than actual statistics.
Agreed! I hadn't thought of sampling without replacement until I read yours and @skybrian's comments, but I think I like that the best now. It is "fair", completely bounded and very simple to communicate.
This can be expressed in the same framework where probability is a function of the trial number, specifically P(success on trial X) = 1/(200 - X). (I'm using 200 as the number of "shuffled cards", and using X starting at 0, for simplicity.)

This distribution has the very, very nice property that the chance you've failed on all trials prior to X is (200 - X)/200, so the chance of the first success landing on exactly trial X is ((200 - X)/200) · (1/(200 - X)) = 1/200, the same for every X.

That is to say, the distribution of time-to-success is a uniform distribution over the entire range, which is intuitively correct given that this method is the same as shuffling a deck of cards.
Unfortunately, since this distribution is uniform, I think this means that doing the "1 prize and 11999 duds" thing probably won't have an effect on the time-to-success distribution.
But I think this 1/(k - X) distribution is a very good base to work off of, to achieve a distribution which has a tail but drops off quickly. I can imagine just capping the drop probability at some number (e.g. 0.1), which would be like drawing from a deck of cards but topping it up to 10 cards whenever it goes below 10. This has the original kind of independent-outcomes long tail, but with a half-life that you can choose, rather than a half-life tied to the expected value you chose.

Huh, that does make an awful lot of sense. My intuition lied to me! How dare it!
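Going back to the capped deck: a minimal sketch (the 200-card deck and the floor of 10 are just the example numbers from above):

```python
import random

def attempts_with_floor(deck_size=200, floor=10):
    """Draw for 1 prize without replacement, but top the deck back up to
    `floor` cards whenever it would shrink below it, capping the per-draw
    chance at 1/floor."""
    attempts = 0
    while True:
        attempts += 1
        if random.random() < 1 / deck_size:
            return attempts
        deck_size = max(deck_size - 1, floor)
        # once capped at p = 0.1, the tail is geometric with a half-life
        # of log(0.5)/log(0.9) ~= 6.6 attempts
```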
Oooh, here's another interesting distribution: stick with the lottery-without-replacement idea, but replace drawn duds with objects sampled from another lottery. For example: L1 is the lottery the player samples from, starting out at 1-in-200 odds. Draw a dud, and it's now a 1-in-199 lottery. But we want to keep it at 200 elements, so we draw a sample from lottery L2 to put into L1. L2 could just be "1 in 100, with replacement" for simplicity. As time goes on, L1 will get backfilled with more and more positive samples, meaning the tail will stay long but will contain much less probability mass. Or you could make L2 another lottery without replacement, at which point I'm at a loss for how this will play out.
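Since you say you're at a loss: a quick Monte Carlo sketch of the simpler variant (L2 = "1 in 100, with replacement"), for anyone who wants to see how it plays out:

```python
import random

def attempts_backfilled(n=200, p2=0.01):
    """L1 starts as 1 prize + (n-1) duds, drawn without replacement; each
    dud drawn is replaced by a draw from L2 (prize with chance p2), so L1
    stays at n elements while its prize count drifts upward."""
    attempts, prizes = 0, 1
    while True:
        attempts += 1
        if random.random() < prizes / n:   # draw from L1
            return attempts
        if random.random() < p2:           # backfill the removed dud from L2
            prizes += 1

samples = sorted(attempts_backfilled() for _ in range(100_000))
print(samples[len(samples) // 2], samples[-len(samples) // 100])  # median, ~99th pct
```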
It's kinda crazy how much room there is for design here. Not usually my cup of tea; I much prefer system-driven games where such complex chance-based systems emergently arise from the world. Cool stuff regardless, it just sits in a different part of my interests, if that makes sense?
It is also possible to rephrase the question which is being answered by the random variable from "is this specific attempt successful" to "which specific attempt will be the one to first be successful". In other words, replacing the repeated random events (where a success indicates the goal state is achieved) with a single random event where the outcome is the amount of time before success is achieved.
This is at least possible for any distribution which has a bounded domain (i.e. we expect to have at least 1 attempt, and at most K attempts).
Talking in the scope of a series of discrete attempts, the probability mass function (i.e. the function f(k) whose value is the proportion of attempt series that would first succeed on attempt k) can be converted to a cumulative probability mass function (i.e. the function F(k) whose value is the proportion of attempt series that would first succeed on or before attempt k).

Selecting your distribution so that this has a bounded domain (i.e. it is only non-zero for 0 ≤ k < K, where K is the upper bound on the number of attempts and k is 0-indexed so that k=0 is the first attempt) means we have a finite number of possible outcomes with respect to how many attempts are required.

From this it is possible, instead of performing multiple separate random samplings (to see whether a specific attempt succeeds), to calculate ahead of time which specific attempt will be successful, without affecting the probability distribution of the outcomes.
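As a sketch of that ahead-of-time calculation (inverse-CDF sampling, in the usual terminology): draw one uniform number, then find the first attempt whose cumulative mass reaches it.

```python
import random
from itertools import accumulate

def presample_success_attempt(f):
    """Given a bounded time-to-success PMF f[0..K-1], decide up front which
    (0-indexed) attempt will be the successful one."""
    u = random.random()
    for k, Fk in enumerate(accumulate(f)):  # Fk = F(k), the cumulative mass
        if u < Fk:
            return k
    return len(f) - 1  # guard against floating-point round-off

# e.g. uniform over K=100 attempts: each is equally likely to be the winner
print(presample_success_attempt([1 / 100] * 100))
```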
If a priori we want each attempt to be as likely to be the last as any other (within the bounded number of attempts, no more than K), then the probability mass function is the uniform distribution f(k) = 1/K. The cumulative probability mass function is then the cumulative sum, a linear function F(k) = (k+1)/K. At k=0 (the first attempt), (0+1)/K = 1/K of the attempt series have been successful. At k=1 (the second attempt), (1+1)/K = 2/K of the attempt series have been successful; and so on, until at k=K-1 (the K'th attempt), (K-1+1)/K = K/K = 1 = 100% of the attempt series have been successful.

This specific case (with the probability mass function being the uniform distribution) is equivalent to sampling from a bag without replacement, looking for a red ball, with 1 red ball and K-1 black balls.
In tabletop games, computational restrictions are likely highly relevant to how such an approach can be formulated, but as a very simple example you could have a system where checks like this are made by rolling two six-sided dice and summing the result. If the sum meets or exceeds a target number of 10, the first attempt is successful. If the sum is less than the target number, one additional attempt is required for every point below it.
1/6 of rolls would get 10 or more, meaning success on the first attempt. 1/36 of rolls would get 2 (eight points below the target), meaning success on the 9th attempt. For the full probability distribution from this example, see program 30144 on anydice.com.
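If you'd rather enumerate that distribution in code than on anydice, it's small enough to brute-force (a quick sketch):

```python
from collections import Counter

TARGET = 10
dist = Counter()
for d1 in range(1, 7):
    for d2 in range(1, 7):
        # attempts required = 1, plus one per point below the target
        dist[1 + max(0, TARGET - (d1 + d2))] += 1

for attempts in sorted(dist):
    print(f"{attempts:2d} attempts: {dist[attempts]:2d}/36")
```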
Very nice, I hadn't noticed that this all could be simplified into a single distribution, which would make it much easier to implement both in a video game and a tabletop game setting.
I guess I missed that because I didn't think to cross the gap between the probability from the game's perspective and the player's perspective. The players still see the same distribution either way, but in your approach, their future has already been pre-determined.
I mean, this makes sense because probability is a function of how much knowledge you have, and players, missing the behind-the-scenes info, can only arrive at the same distribution regardless of the means of deriving it. But for some reason the invisible pre-determination is a somewhat disturbing thought to me, haha. There are probably some implications for free will or whatever.
Thanks for adding the extra perspective!
Careful with this approach, depending on some boundary conditions. For example, you'd have to store the state ("remaining draws until success"), which means it doesn't change when saving/reloading unless you manually mess with it. Furthermore, if you're using this for "safety-critical" things, I'm not sure storing the state is a good idea. Also, I'm sure some lawyer has made the argument that if you're using this implementation for loot boxes, any box you open where the software already knows you're not going to get anything isn't even gambling, it's a straight up scam. It's probably fine for most use cases though.
There's a cool way you can shape distributions nicely: shape the CDF (cumulative distribution function) as you want, then sample from its derivative, which is the PDF (probability density function). Let me explain: you use whatever tools you want to build a CDF. It starts at zero, converges to 1 as time goes on, and can only ever increase. That describes your distribution now. Think of it in terms of "at time X, what is my chance of having gotten the McGuffin?". To eliminate the tail, you want this CDF to converge to 1 faster (or reach it outright at some finite time).
Ok, now you have your CDF, what do we do with it? Well, the simplest approach is to sample a uniform random number between 0 and 1, and look up when the CDF crosses that number. That time is when you get the McGuffin.

But we can also make this computation continuous-time stateless if we want: You store up to which part of the CDF we've previously gotten (for discrete time, this is just the last time step). Let's call that x. Start with x = 0, obviously. At any time t, check the difference between CDF(t) and x. Build a lottery from it: McGuffin gets a weight of CDF(t) - x; dud gets a weight of 1 - CDF(t). If you draw the dud, then set x = CDF(t), as previously described.

What we're doing here is, every iteration, sampling from the distribution of "the McGuffin is later", aka "dud", where the weight is proportional to the probability mass that is in the future. The probability mass for "we get it this turn" is CDF(t) - x, because that describes the part of the CDF where we hadn't already had it last turn (x), but expect to have it by this turn (CDF(t)).
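Here's a sketch of that loop; the particular CDF is my own made-up example (an exponential ramp clipped so it reaches 1 at t = 300), not anything from the thread:

```python
import math
import random

def cdf(t):
    """Example only: P(McGuffin obtained by time t); monotone, hits 1 at t=300."""
    return min(1.0, (1 - math.exp(-t / 100)) / (1 - math.exp(-3)))

def time_of_mcguffin(dt=1.0):
    x, t = 0.0, 0.0  # x = CDF value at the last (failed) check
    while True:
        t += dt
        c = cdf(t)
        # lottery weights: McGuffin = c - x, dud = 1 - c; total mass = 1 - x
        if random.uniform(0, 1 - x) < c - x:
            return t
        x = c  # drew the dud; remember how far along the CDF we are
```

Each step's conditional success chance works out to (CDF(t) - x)/(1 - x), i.e. the probability of succeeding by t given we hadn't succeeded at the last check.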
The way I'm thinking about PDFs and CDFs here is thus: by definition, the CDF is the integral of the PDF. But that also means that the PDF is the derivative of the CDF. Which means we can shape our CDF to be what we want, and then convert it to something we can sample from. If you squint a little, that CDF(t) - x is a finite-differences gradient of the CDF.
That make sense? It's a bit discombobulated, but I hope it gives you an idea of the ways you can manipulate distributions. I think it might need some illustrations. If you got questions, ask away.
That's a good point. It does make sense to delay calculating future failures if you want it to be random when you replay from a saved game state.
But if you want the probability to change depending on how many failures the player already had, I think you need to remember past failures somehow? Otherwise you're limited to stateless functions.
There are some games where this would be counter-intuitive for the player. Should buried treasure move if you replay a saved game, or should you be able to go directly to wherever you remember it was from the previous game? Maybe it's more fun to be able to use previous knowledge.
When realism matters, anything that the player thinks of as having been determined before saving the game should remain in the same place.
For some kinds of events, maybe it would be fun if the randomness came from the player's actions somehow, so if you do the same things then it will happen the same way, but if you deviate from that path, the alternate history diverges due to a mysterious butterfly effect.
How is it stateless if you're storing how far along the process you are?
As for designing the distribution, just as you can design the cumulative mass / density functions, you can also design the probability mass / density function, as long as you are able to normalize it so that the sum / integral is 1.
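For instance, a minimal sketch of that normalization step (the triangular shape is just a made-up example):

```python
def normalize(weights):
    """Turn any non-negative weight shape into a PMF summing to 1."""
    total = sum(weights)
    return [w / total for w in weights]

# e.g. a triangular "bulk in the middle" shape over outcomes 0..10
pmf = normalize([min(k, 10 - k) + 1 for k in range(11)])
print(pmf[5], sum(pmf))  # mode at 5; sums to 1
```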
I deal almost exclusively with discrete distributions (I like thinking about TTRPG and board game mechanics), and for that I find my intuition much more strongly connected with the probability functions rather than the cumulative functions. I generally have design consideration such as "what is the most common result", "what is the average result", "where will the bulk of results be", and these are (to me) more readily available in the mass / density function.
For illustration, take the distribution of 3d6 (the sum of three six-sided dice), as shown in anydice program 1. Selecting the "graph" view and "normal" data shows the probability mass function. Selecting the "graph" view and "at most" data shows the cumulative mass function.
Good question, one that I anticipated while writing this but ultimately forgot to preempt. cc @skybrian. I'm using the term "stateless" here perhaps a bit loosely, to refer to the fact that future rolls aren't precomputed, in a way. It does hold some state about how far along the process you are - i.e. the time, as well as the last time the distribution was sampled. In a game engine, these are global variables that are always available: your timestep and the current time. They're not in any way specific to the distribution. It's as stateless as you can make this distribution, but ultimately different from actually stateless distributions. Those are always independent and thus have long tails.
The reason I'm showing off CDF notation here is because IMO it's easier to build a function that converges to 1 than one that integrates to 1. Normalization of unusual PDFs can be a pain. It also crystallizes the feature OP is interested in: how important is that tail? For a PDF, you can't tell at a glance whether that non-zero tail contributes meaningfully to the attempts the player needs; you'd need to integrate a possibly infinitely narrow and infinitely long tail. The CDF does that for you, you just compare it to 1: 1 - CDF(t) is the probability that the McGuffin drops at time t or later.
That said, for discrete distributions, CDFs are often unnecessary.
OT, but I thought economists were the only people who habitually flipped X and Y axis. What's up with those flipped bar charts in anydice? </rant>
Definitely going on a tangent here, but it's at least purposeful. (I made this a separate reply to keep it from the more pertinent discussion.)
The default is technically tabulated data, not a graph, but yes, in that mode it is presented effectively rotated 90 degrees clockwise relative to a regular graph. I assume this is because it was a simple way to ensure you can easily fit any number of data points without having to scroll sideways, while keeping every data point explicitly legible. Having used that site to calculate discrete distributions with hundreds of outcomes, I would say it certainly has its uses.
Thank you! I see the point you are making and I don't know any appropriate terminology for the distinction either, so I understand the choice of words. Maybe "ongoing stochastic process", as opposed to "initially stochastic process", but that's two mouthfuls.
I also appreciate the context of (computer) game engines. It is absolutely a distinction to be made from (tabletop) game systems, which operate under very different circumstances and common practices. Designing for the medium you are working with is a very important aspect of game design.
Your ClapBearCheeks quest had me rethink some objectives, and I'm grateful for your post. Pity systems worked great for matching design objectives (quests should take no longer than X) while allowing some randomness. They have also been shown to work well for lootboxes (probably legally required at this point).
The high-variance, low-success-chance problem doesn't appear to have any perfect solution. With a pity system on top of a ClapBearCheeks quest, a player may lose the ability to measure/perceive progress, even if the max required kills is capped (by the pity) at exactly the same requirement (2000 in your case).
I think what you're asking is how can you make sure a player always succeeds by a fixed deadline (say, two hours). The answer is that eventually the probability of success has to be 100%.
Consider someone who's been playing for one hour and 59 minutes. The only way to guarantee that they get it in two hours is if they win on the last attempt.
Another way to do it: imagine you had a deck of cards where one card is the winner. To guarantee a win in two hours, you have 120 cards (one per minute). If you play 119 times then the last card must be the winner, because you've eliminated all the others.
You could just pick a random number in advance for the number of failed attempts there will be before they succeed.
This is also how many Tetris implementations handle it, where it is known as the 'bag' randomizer, because of the obvious analogy of blindly drawing pieces from a bag.
Since Tetris is unwinnable if you get unlucky and only get S or Z pieces for a while or the I piece never shows up, putting all seven pieces into a bag and then randomly drawing from it bounds the time between I pieces, for example, making it less frustrating for the player.
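For reference, a sketch of the classic 7-bag (the letters are the standard tetromino names):

```python
import random

PIECES = "IJLOSTZ"

def seven_bag():
    """Endless piece stream where every run of 7 contains each piece once,
    so e.g. at most 12 pieces can ever separate two I pieces."""
    while True:
        bag = list(PIECES)
        random.shuffle(bag)
        yield from bag

stream = seven_bag()
print([next(stream) for _ in range(14)])  # two full bags
```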
If you have a bag (or deck of cards) with just success and failure pieces, you can use the hypergeometric distribution to calculate the probability of success within [1, 2, 3, ..., n] draws. Since the randomizer uses a bag, you're always guaranteed to succeed once all failures in the bag have been drawn, and the success probability reaches 100% at that point.
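A sketch of that calculation, using a 1-success/199-dud bag to match the lottery upthread; the complement of "all n draws are failures" gives the cumulative success chance:

```python
from math import comb

def p_success_within(n, successes=1, failures=199):
    """P(at least one success in the first n draws without replacement)."""
    total = successes + failures
    return 1 - comb(failures, n) / comb(total, n)  # comb(failures, n) == 0 once n > failures

for n in (50, 100, 150, 200):
    print(n, round(p_success_within(n), 3))  # 0.25, 0.5, 0.75, 1.0 (with one success it's n/200)
```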
Warframe has introduced a pity system that is surfaced to the player by turning it into a currency. The basic idea is that if you have a loot table for killing the boss at the end of a dungeon, you also have a dungeon-specific currency that can be spent to buy loot from that table. You can then reward the currency in the dungeon (drops, gathering, side objectives etc.), including a minimum amount on completion. That minimum and the item costs place an easily understandable upper bound on the number of completions it takes to get any item or set of items.