Probability of sum of N dice being above certain value X
In Dnd, sometimes you have to roll n, m-sided dice (say 5, d20s) and have the sum be greater than or equal to a certain value, x (say 90).
This is easy for me to calculate by brute force, for most typical examples. I simply take the total number of possibilities that meet the criteria, and divide by the total number of possibilities, by running through each combination of dice rolls. My result for the above example is 3003 dice combinations that sum to > 90, out of 3200000 combinations in total p = 0.009384375 chance of getting 90 or over.
Is there a way (e.g. an equation) to reach this value directly?
probability dice
add a comment |
In Dnd, sometimes you have to roll n, m-sided dice (say 5, d20s) and have the sum be greater than or equal to a certain value, x (say 90).
This is easy for me to calculate by brute force, for most typical examples. I simply take the total number of possibilities that meet the criteria, and divide by the total number of possibilities, by running through each combination of dice rolls. My result for the above example is 3003 dice combinations that sum to > 90, out of 3200000 combinations in total p = 0.009384375 chance of getting 90 or over.
Is there a way (e.g. an equation) to reach this value directly?
probability dice
add a comment |
In Dnd, sometimes you have to roll n, m-sided dice (say 5, d20s) and have the sum be greater than or equal to a certain value, x (say 90).
This is easy for me to calculate by brute force, for most typical examples. I simply take the total number of possibilities that meet the criteria, and divide by the total number of possibilities, by running through each combination of dice rolls. My result for the above example is 3003 dice combinations that sum to > 90, out of 3200000 combinations in total p = 0.009384375 chance of getting 90 or over.
Is there a way (e.g. an equation) to reach this value directly?
probability dice
In Dnd, sometimes you have to roll n, m-sided dice (say 5, d20s) and have the sum be greater than or equal to a certain value, x (say 90).
This is easy for me to calculate by brute force, for most typical examples. I simply take the total number of possibilities that meet the criteria, and divide by the total number of possibilities, by running through each combination of dice rolls. My result for the above example is 3003 dice combinations that sum to > 90, out of 3200000 combinations in total p = 0.009384375 chance of getting 90 or over.
Is there a way (e.g. an equation) to reach this value directly?
probability dice
probability dice
edited Nov 29 '18 at 11:54
GNUSupporter 8964民主女神 地下教會
12.8k72445
12.8k72445
asked Nov 29 '18 at 11:40
james_alvarez
1266
1266
add a comment |
add a comment |
4 Answers
4
active
oldest
votes
Concerning a "closed" ( = finite summation) formula, start from
$$
eqalign{
& N_b (s - n,m - 1,n) = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
{rm 1} le {rm integer};x_{,j} le m hfill cr
x_{,1} + x_{,2} + ; cdots ; + x_{,n} = s hfill cr} right. = cr
& = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
0 le {rm integer};y_{,j} le m - 1 hfill cr
y_{,1} + y_{,2} + ; cdots ; + y_{,n} = s - n hfill cr} right. cr}
$$
where $N_b$ is given by
$$
N_b (s,r,m)quad left| {;0 leqslant text{integers }s,m,r} right.quad =
sumlimits_{left( {0, leqslant } right),,k,,left( { leqslant ,frac{s}{r+1}, leqslant ,m} right)}
{left( { - 1} right)^k binom{m}{k}
binom
{ s + m - 1 - kleft( {r + 1} right) }
{ s - kleft( {r + 1} right)} }
$$
as explained in this related post.
Then the number of ways to obtain a sum $x le s$ is given by
$$
eqalign{
& M(x,m,n) = sumlimits_{x, le ,s,left( { le ,m,n} right),} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - 1,} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {N_b (s,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)}
{left( { - 1} right)^k left( matrix{ n hfill cr
k hfill cr} right)left( matrix{
s + n - 1 - k,m cr
s - k,m cr} right)} } = cr
& = m^{,n} - sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)} {left( { - 1} right)^k left( matrix{
n hfill cr
k hfill cr} right)left( matrix{
x - 1 - k,m cr
x - n - 1 - k,m cr} right)} cr}
$$
and in fact $M(90,20,5)=3003$.
Note that, as explained in the cited related post, the problem has the geometric equivalent of finding:
the number of integral points on the diagonal plane $y_1, cdots, y_n=s-n$, intercepted by a $n$-D cube
with side $[0,m-1]$.
The formula for $N_b$ corresponds to calculating the points on the whole plane ($k=0$) and subtracting those
pertaining to the surrounding cubes.
The geometric analogy clearly shows that $N_b(nm-s,m,n)=N_b(s,m,n)$.
add a comment |
You could try to solve this using the Chebychev inequality. This should at least give you an estimate.
nice idea (+1), but same comment as to the answer above.
– G Cab
Nov 30 '18 at 15:40
add a comment |
G Cab gave a good closed-form answer, and it may well be what you're looking for.
Another approach that works decently well -- particularly if the number of dice is large at all, or if you're looking for events that are closer to the mean results -- is to use a Central Limit Theorem approximation. Here's how that would play out for your example in the problem text:
First: we're going to use something continuous to approximate something discrete, so let's do it carefully. If $S$ is the sum, we want to know about $mathbb P(S > 90) = mathbb P (S geq 91)$, so we'll split the difference and consider $S = 90.5$. (This is the so-called "continuity correction.")
Asking whether the sum of five dice will be at least $90.5$ is equivalent to asking whether the average of five dice will exceed $90.5/5 = 18.1$. For one single die, the expected die roll is $10.5$ and the standard deviation of the die roll is $sqrt{133}/2$. If you average $n$ of these (independent) rolls together, the average will be an approximately normal random variable with mean $10.5$ and standard deviation $sqrt{133}/(2 sqrt n)$; in the case of $n = 5$, this is roughly $2.579$. The question then becomes: "How often does a normal random variable with those properties exceed $18.1$?" Use your favorite calculator, or a $Z$-score technique and a table of known values about the standard normal distribution, to get an answer of about $0.0016$.
Then, stop and appreciate the fact that our answer wasn't that great! It was off by a factor of almost 10. So, what happened? Well, two things: (1) the approximation would have worked better if we were using more dice, and (2) the approximation would have worked better if we were considering numbers closer to the expected average of $10.5$. For instance, let's redo the problem with asking about the sum of five dice being larger than $60$ (again, we'll use $60.5$): it can be shown (click the "at least" button on the link) that the true probability is about $0.2730$, and repeating the above procedure gives an answer of $0.2804$ as our approximation.
So, why is this worth doing? For five dice, maybe it's not. But for more dice, this gives a really efficient way of getting a good approximation (that, again, gets better and better the more dice you use) that can save you from some truly obnoxious combinatorics and enormous numbers. As another example of the value of this tactic: the true probability that the sum of $10$ dice exceeds $100$ is $0.5961$, and the aforementioned tactic gives an estimate of $0.5975$. This approach scales really well with the number of dice in a way that doing actual counting does not.
1
Your "asymptotics" analysis is appreciable (+1). But in fact, the difficulty to apply it is to estimate in advance the error we are going to have. The matter is that actually we have the sum of variables uniformly distributed over a given range, which is limited, while the gaussian is unlimited and not enough "flat" in the central part to approximate the "box" function of the uniform.
– G Cab
Nov 30 '18 at 15:38
1
@GCab Right. As I mentioned, this approach works very well for the "bulk" of the distribution, but will consistently have a high relative error near the tails of the distribution. Notably, the absolute error of these kinds of estimates will still tend to $0$ as the number of dice increases, but that's mostly because both the true values and estimates will both tend to $0$. And estimating the error remains a challenge, yes.
– Aaron Montgomery
Nov 30 '18 at 16:07
add a comment |
I found a simple answer based on a similar but different question:
Where $n$ = number of dice, $m$ = sides of dice, $x$ = target score,
$$ p = frac{nm - x + nchoose n} {m^n} $$
But this answer only works when $(nm - x) < m$
apart from the denominator, which is the total n. of throws, for the numerator consider that $N_b(s,m,r)=N_b(mr-s,m,r)$ ($N_b$ being the number given in my answer). So your answer, missing a $-1$, corresponds to the term with $k=0$ of $N_b$. In the other post I give as reference it is explained the geometric analogy with determining the number of points on the diagonal plane of a hyper-cube ... Pls. read that and you will understand that your answer is in general wrong.
– G Cab
Nov 30 '18 at 14:37
Sorry your answer is too far above my ability level, I don't know where to start :( - other than that my formula produces the same as my brute force calculations. e.g. for n = 5, m = 20, x = 90, nominator produces 3003. If you care to explain it would be much appreciated!
– james_alvarez
Nov 30 '18 at 15:09
I forgot to mention that the reference you provided didn't seem to work...
– james_alvarez
Nov 30 '18 at 15:16
yes, sorry, I did not properly link to it: corrected.
– G Cab
Nov 30 '18 at 15:26
More testing reveals that indeed this simple formula above is indeed wrong, so I will have to trust yours as I can't figure out how to actually calculate it given the three parameters. I do think the above is correct for when (nm-x) < m. But you mention a missing -1? Where should that go? It seems ok when I'm comparing the result with my counting method in python.
– james_alvarez
Nov 30 '18 at 15:59
|
show 1 more comment
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "69"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3018518%2fprobability-of-sum-of-n-dice-being-above-certain-value-x%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
4 Answers
4
active
oldest
votes
4 Answers
4
active
oldest
votes
active
oldest
votes
active
oldest
votes
Concerning a "closed" ( = finite summation) formula, start from
$$
eqalign{
& N_b (s - n,m - 1,n) = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
{rm 1} le {rm integer};x_{,j} le m hfill cr
x_{,1} + x_{,2} + ; cdots ; + x_{,n} = s hfill cr} right. = cr
& = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
0 le {rm integer};y_{,j} le m - 1 hfill cr
y_{,1} + y_{,2} + ; cdots ; + y_{,n} = s - n hfill cr} right. cr}
$$
where $N_b$ is given by
$$
N_b (s,r,m)quad left| {;0 leqslant text{integers }s,m,r} right.quad =
sumlimits_{left( {0, leqslant } right),,k,,left( { leqslant ,frac{s}{r+1}, leqslant ,m} right)}
{left( { - 1} right)^k binom{m}{k}
binom
{ s + m - 1 - kleft( {r + 1} right) }
{ s - kleft( {r + 1} right)} }
$$
as explained in this related post.
Then the number of ways to obtain a sum $x le s$ is given by
$$
eqalign{
& M(x,m,n) = sumlimits_{x, le ,s,left( { le ,m,n} right),} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - 1,} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {N_b (s,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)}
{left( { - 1} right)^k left( matrix{ n hfill cr
k hfill cr} right)left( matrix{
s + n - 1 - k,m cr
s - k,m cr} right)} } = cr
& = m^{,n} - sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)} {left( { - 1} right)^k left( matrix{
n hfill cr
k hfill cr} right)left( matrix{
x - 1 - k,m cr
x - n - 1 - k,m cr} right)} cr}
$$
and in fact $M(90,20,5)=3003$.
Note that, as explained in the cited related post, the problem has the geometric equivalent of finding:
the number of integral points on the diagonal plane $y_1, cdots, y_n=s-n$, intercepted by a $n$-D cube
with side $[0,m-1]$.
The formula for $N_b$ corresponds to calculating the points on the whole plane ($k=0$) and subtracting those
pertaining to the surrounding cubes.
The geometric analogy clearly shows that $N_b(nm-s,m,n)=N_b(s,m,n)$.
add a comment |
Concerning a "closed" ( = finite summation) formula, start from
$$
eqalign{
& N_b (s - n,m - 1,n) = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
{rm 1} le {rm integer};x_{,j} le m hfill cr
x_{,1} + x_{,2} + ; cdots ; + x_{,n} = s hfill cr} right. = cr
& = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
0 le {rm integer};y_{,j} le m - 1 hfill cr
y_{,1} + y_{,2} + ; cdots ; + y_{,n} = s - n hfill cr} right. cr}
$$
where $N_b$ is given by
$$
N_b (s,r,m)quad left| {;0 leqslant text{integers }s,m,r} right.quad =
sumlimits_{left( {0, leqslant } right),,k,,left( { leqslant ,frac{s}{r+1}, leqslant ,m} right)}
{left( { - 1} right)^k binom{m}{k}
binom
{ s + m - 1 - kleft( {r + 1} right) }
{ s - kleft( {r + 1} right)} }
$$
as explained in this related post.
Then the number of ways to obtain a sum $x le s$ is given by
$$
eqalign{
& M(x,m,n) = sumlimits_{x, le ,s,left( { le ,m,n} right),} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - 1,} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {N_b (s,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)}
{left( { - 1} right)^k left( matrix{ n hfill cr
k hfill cr} right)left( matrix{
s + n - 1 - k,m cr
s - k,m cr} right)} } = cr
& = m^{,n} - sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)} {left( { - 1} right)^k left( matrix{
n hfill cr
k hfill cr} right)left( matrix{
x - 1 - k,m cr
x - n - 1 - k,m cr} right)} cr}
$$
and in fact $M(90,20,5)=3003$.
Note that, as explained in the cited related post, the problem has the geometric equivalent of finding:
the number of integral points on the diagonal plane $y_1, cdots, y_n=s-n$, intercepted by a $n$-D cube
with side $[0,m-1]$.
The formula for $N_b$ corresponds to calculating the points on the whole plane ($k=0$) and subtracting those
pertaining to the surrounding cubes.
The geometric analogy clearly shows that $N_b(nm-s,m,n)=N_b(s,m,n)$.
add a comment |
Concerning a "closed" ( = finite summation) formula, start from
$$
eqalign{
& N_b (s - n,m - 1,n) = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
{rm 1} le {rm integer};x_{,j} le m hfill cr
x_{,1} + x_{,2} + ; cdots ; + x_{,n} = s hfill cr} right. = cr
& = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
0 le {rm integer};y_{,j} le m - 1 hfill cr
y_{,1} + y_{,2} + ; cdots ; + y_{,n} = s - n hfill cr} right. cr}
$$
where $N_b$ is given by
$$
N_b (s,r,m)quad left| {;0 leqslant text{integers }s,m,r} right.quad =
sumlimits_{left( {0, leqslant } right),,k,,left( { leqslant ,frac{s}{r+1}, leqslant ,m} right)}
{left( { - 1} right)^k binom{m}{k}
binom
{ s + m - 1 - kleft( {r + 1} right) }
{ s - kleft( {r + 1} right)} }
$$
as explained in this related post.
Then the number of ways to obtain a sum $x le s$ is given by
$$
eqalign{
& M(x,m,n) = sumlimits_{x, le ,s,left( { le ,m,n} right),} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - 1,} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {N_b (s,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)}
{left( { - 1} right)^k left( matrix{ n hfill cr
k hfill cr} right)left( matrix{
s + n - 1 - k,m cr
s - k,m cr} right)} } = cr
& = m^{,n} - sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)} {left( { - 1} right)^k left( matrix{
n hfill cr
k hfill cr} right)left( matrix{
x - 1 - k,m cr
x - n - 1 - k,m cr} right)} cr}
$$
and in fact $M(90,20,5)=3003$.
Note that, as explained in the cited related post, the problem has the geometric equivalent of finding:
the number of integral points on the diagonal plane $y_1, cdots, y_n=s-n$, intercepted by a $n$-D cube
with side $[0,m-1]$.
The formula for $N_b$ corresponds to calculating the points on the whole plane ($k=0$) and subtracting those
pertaining to the surrounding cubes.
The geometric analogy clearly shows that $N_b(nm-s,m,n)=N_b(s,m,n)$.
Concerning a "closed" ( = finite summation) formula, start from
$$
eqalign{
& N_b (s - n,m - 1,n) = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
{rm 1} le {rm integer};x_{,j} le m hfill cr
x_{,1} + x_{,2} + ; cdots ; + x_{,n} = s hfill cr} right. = cr
& = {rm No}{rm .},{rm of},{rm solutions},{rm to};left{ matrix{
0 le {rm integer};y_{,j} le m - 1 hfill cr
y_{,1} + y_{,2} + ; cdots ; + y_{,n} = s - n hfill cr} right. cr}
$$
where $N_b$ is given by
$$
N_b (s,r,m)quad left| {;0 leqslant text{integers }s,m,r} right.quad =
sumlimits_{left( {0, leqslant } right),,k,,left( { leqslant ,frac{s}{r+1}, leqslant ,m} right)}
{left( { - 1} right)^k binom{m}{k}
binom
{ s + m - 1 - kleft( {r + 1} right) }
{ s - kleft( {r + 1} right)} }
$$
as explained in this related post.
Then the number of ways to obtain a sum $x le s$ is given by
$$
eqalign{
& M(x,m,n) = sumlimits_{x, le ,s,left( { le ,m,n} right),} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - 1,} {N_b (s - n,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {N_b (s,m - 1,n)} = cr
& = m^{,n} - sumlimits_{0, le ,s, le ,x - n - 1,} {sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)}
{left( { - 1} right)^k left( matrix{ n hfill cr
k hfill cr} right)left( matrix{
s + n - 1 - k,m cr
s - k,m cr} right)} } = cr
& = m^{,n} - sumlimits_{left( {0, le } right),,k,,left( { le ,{s over m}, le ,n} right)} {left( { - 1} right)^k left( matrix{
n hfill cr
k hfill cr} right)left( matrix{
x - 1 - k,m cr
x - n - 1 - k,m cr} right)} cr}
$$
and in fact $M(90,20,5)=3003$.
Note that, as explained in the cited related post, the problem has the geometric equivalent of finding:
the number of integral points on the diagonal plane $y_1, cdots, y_n=s-n$, intercepted by a $n$-D cube
with side $[0,m-1]$.
The formula for $N_b$ corresponds to calculating the points on the whole plane ($k=0$) and subtracting those
pertaining to the surrounding cubes.
The geometric analogy clearly shows that $N_b(nm-s,m,n)=N_b(s,m,n)$.
edited Nov 30 '18 at 15:23
answered Nov 29 '18 at 12:25
G Cab
18k31237
18k31237
add a comment |
add a comment |
You could try to solve this using the Chebychev inequality. This should at least give you an estimate.
nice idea (+1), but same comment as to the answer above.
– G Cab
Nov 30 '18 at 15:40
add a comment |
You could try to solve this using the Chebychev inequality. This should at least give you an estimate.
nice idea (+1), but same comment as to the answer above.
– G Cab
Nov 30 '18 at 15:40
add a comment |
You could try to solve this using the Chebychev inequality. This should at least give you an estimate.
You could try to solve this using the Chebychev inequality. This should at least give you an estimate.
answered Nov 29 '18 at 11:57
Thomas Lang
1624
1624
nice idea (+1), but same comment as to the answer above.
– G Cab
Nov 30 '18 at 15:40
add a comment |
nice idea (+1), but same comment as to the answer above.
– G Cab
Nov 30 '18 at 15:40
nice idea (+1), but same comment as to the answer above.
– G Cab
Nov 30 '18 at 15:40
nice idea (+1), but same comment as to the answer above.
– G Cab
Nov 30 '18 at 15:40
add a comment |
G Cab gave a good closed-form answer, and it may well be what you're looking for.
Another approach that works decently well -- particularly if the number of dice is large at all, or if you're looking for events that are closer to the mean results -- is to use a Central Limit Theorem approximation. Here's how that would play out for your example in the problem text:
First: we're going to use something continuous to approximate something discrete, so let's do it carefully. If $S$ is the sum, we want to know about $mathbb P(S > 90) = mathbb P (S geq 91)$, so we'll split the difference and consider $S = 90.5$. (This is the so-called "continuity correction.")
Asking whether the sum of five dice will be at least $90.5$ is equivalent to asking whether the average of five dice will exceed $90.5/5 = 18.1$. For one single die, the expected die roll is $10.5$ and the standard deviation of the die roll is $sqrt{133}/2$. If you average $n$ of these (independent) rolls together, the average will be an approximately normal random variable with mean $10.5$ and standard deviation $sqrt{133}/(2 sqrt n)$; in the case of $n = 5$, this is roughly $2.579$. The question then becomes: "How often does a normal random variable with those properties exceed $18.1$?" Use your favorite calculator, or a $Z$-score technique and a table of known values about the standard normal distribution, to get an answer of about $0.0016$.
Then, stop and appreciate the fact that our answer wasn't that great! It was off by a factor of almost 10. So, what happened? Well, two things: (1) the approximation would have worked better if we were using more dice, and (2) the approximation would have worked better if we were considering numbers closer to the expected average of $10.5$. For instance, let's redo the problem with asking about the sum of five dice being larger than $60$ (again, we'll use $60.5$): it can be shown (click the "at least" button on the link) that the true probability is about $0.2730$, and repeating the above procedure gives an answer of $0.2804$ as our approximation.
So, why is this worth doing? For five dice, maybe it's not. But for more dice, this gives a really efficient way of getting a good approximation (that, again, gets better and better the more dice you use) that can save you from some truly obnoxious combinatorics and enormous numbers. As another example of the value of this tactic: the true probability that the sum of $10$ dice exceeds $100$ is $0.5961$, and the aforementioned tactic gives an estimate of $0.5975$. This approach scales really well with the number of dice in a way that doing actual counting does not.
1
Your "asymptotics" analysis is appreciable (+1). But in fact, the difficulty to apply it is to estimate in advance the error we are going to have. The matter is that actually we have the sum of variables uniformly distributed over a given range, which is limited, while the gaussian is unlimited and not enough "flat" in the central part to approximate the "box" function of the uniform.
– G Cab
Nov 30 '18 at 15:38
1
@GCab Right. As I mentioned, this approach works very well for the "bulk" of the distribution, but will consistently have a high relative error near the tails of the distribution. Notably, the absolute error of these kinds of estimates will still tend to $0$ as the number of dice increases, but that's mostly because both the true values and estimates will both tend to $0$. And estimating the error remains a challenge, yes.
– Aaron Montgomery
Nov 30 '18 at 16:07
add a comment |
G Cab gave a good closed-form answer, and it may well be what you're looking for.
Another approach that works decently well -- particularly if the number of dice is large at all, or if you're looking for events that are closer to the mean results -- is to use a Central Limit Theorem approximation. Here's how that would play out for your example in the problem text:
First: we're going to use something continuous to approximate something discrete, so let's do it carefully. If $S$ is the sum, we want to know about $mathbb P(S > 90) = mathbb P (S geq 91)$, so we'll split the difference and consider $S = 90.5$. (This is the so-called "continuity correction.")
Asking whether the sum of five dice will be at least $90.5$ is equivalent to asking whether the average of five dice will exceed $90.5/5 = 18.1$. For one single die, the expected die roll is $10.5$ and the standard deviation of the die roll is $sqrt{133}/2$. If you average $n$ of these (independent) rolls together, the average will be an approximately normal random variable with mean $10.5$ and standard deviation $sqrt{133}/(2 sqrt n)$; in the case of $n = 5$, this is roughly $2.579$. The question then becomes: "How often does a normal random variable with those properties exceed $18.1$?" Use your favorite calculator, or a $Z$-score technique and a table of known values about the standard normal distribution, to get an answer of about $0.0016$.
Then, stop and appreciate the fact that our answer wasn't that great! It was off by a factor of almost 10. So, what happened? Well, two things: (1) the approximation would have worked better if we were using more dice, and (2) the approximation would have worked better if we were considering numbers closer to the expected average of $10.5$. For instance, let's redo the problem with asking about the sum of five dice being larger than $60$ (again, we'll use $60.5$): it can be shown (click the "at least" button on the link) that the true probability is about $0.2730$, and repeating the above procedure gives an answer of $0.2804$ as our approximation.
So, why is this worth doing? For five dice, maybe it's not. But for more dice, this gives a really efficient way of getting a good approximation (that, again, gets better and better the more dice you use) that can save you from some truly obnoxious combinatorics and enormous numbers. As another example of the value of this tactic: the true probability that the sum of $10$ dice exceeds $100$ is $0.5961$, and the aforementioned tactic gives an estimate of $0.5975$. This approach scales really well with the number of dice in a way that doing actual counting does not.
1
Your "asymptotics" analysis is appreciable (+1). But in fact, the difficulty to apply it is to estimate in advance the error we are going to have. The matter is that actually we have the sum of variables uniformly distributed over a given range, which is limited, while the gaussian is unlimited and not enough "flat" in the central part to approximate the "box" function of the uniform.
– G Cab
Nov 30 '18 at 15:38
1
@GCab Right. As I mentioned, this approach works very well for the "bulk" of the distribution, but will consistently have a high relative error near the tails of the distribution. Notably, the absolute error of these kinds of estimates will still tend to $0$ as the number of dice increases, but that's mostly because both the true values and estimates will both tend to $0$. And estimating the error remains a challenge, yes.
– Aaron Montgomery
Nov 30 '18 at 16:07
add a comment |
G Cab gave a good closed-form answer, and it may well be what you're looking for.
Another approach that works decently well -- particularly if the number of dice is large at all, or if you're looking for events that are closer to the mean results -- is to use a Central Limit Theorem approximation. Here's how that would play out for your example in the problem text:
First: we're going to use something continuous to approximate something discrete, so let's do it carefully. If $S$ is the sum, we want to know about $mathbb P(S > 90) = mathbb P (S geq 91)$, so we'll split the difference and consider $S = 90.5$. (This is the so-called "continuity correction.")
Asking whether the sum of five dice will be at least $90.5$ is equivalent to asking whether the average of five dice will exceed $90.5/5 = 18.1$. For one single die, the expected die roll is $10.5$ and the standard deviation of the die roll is $sqrt{133}/2$. If you average $n$ of these (independent) rolls together, the average will be an approximately normal random variable with mean $10.5$ and standard deviation $sqrt{133}/(2 sqrt n)$; in the case of $n = 5$, this is roughly $2.579$. The question then becomes: "How often does a normal random variable with those properties exceed $18.1$?" Use your favorite calculator, or a $Z$-score technique and a table of known values about the standard normal distribution, to get an answer of about $0.0016$.
Then, stop and appreciate the fact that our answer wasn't that great! It was off by a factor of almost 10. So, what happened? Well, two things: (1) the approximation would have worked better if we were using more dice, and (2) the approximation would have worked better if we were considering numbers closer to the expected average of $10.5$. For instance, let's redo the problem with asking about the sum of five dice being larger than $60$ (again, we'll use $60.5$): it can be shown (click the "at least" button on the link) that the true probability is about $0.2730$, and repeating the above procedure gives an answer of $0.2804$ as our approximation.
So, why is this worth doing? For five dice, maybe it's not. But for more dice, this gives a really efficient way of getting a good approximation (that, again, gets better and better the more dice you use) that can save you from some truly obnoxious combinatorics and enormous numbers. As another example of the value of this tactic: the true probability that the sum of $10$ dice exceeds $100$ is $0.5961$, and the aforementioned tactic gives an estimate of $0.5975$. This approach scales really well with the number of dice in a way that doing actual counting does not.
G Cab gave a good closed-form answer, and it may well be what you're looking for.
Another approach that works decently well -- particularly if the number of dice is large at all, or if you're looking for events that are closer to the mean results -- is to use a Central Limit Theorem approximation. Here's how that would play out for your example in the problem text:
First: we're going to use something continuous to approximate something discrete, so let's do it carefully. If $S$ is the sum, we want to know about $mathbb P(S > 90) = mathbb P (S geq 91)$, so we'll split the difference and consider $S = 90.5$. (This is the so-called "continuity correction.")
Asking whether the sum of five dice will be at least $90.5$ is equivalent to asking whether the average of five dice will exceed $90.5/5 = 18.1$. For one single die, the expected die roll is $10.5$ and the standard deviation of the die roll is $sqrt{133}/2$. If you average $n$ of these (independent) rolls together, the average will be an approximately normal random variable with mean $10.5$ and standard deviation $sqrt{133}/(2 sqrt n)$; in the case of $n = 5$, this is roughly $2.579$. The question then becomes: "How often does a normal random variable with those properties exceed $18.1$?" Use your favorite calculator, or a $Z$-score technique and a table of known values about the standard normal distribution, to get an answer of about $0.0016$.
Then, stop and appreciate the fact that our answer wasn't that great! It was off by a factor of almost 10. So, what happened? Well, two things: (1) the approximation would have worked better if we were using more dice, and (2) the approximation would have worked better if we were considering numbers closer to the expected average of $10.5$. For instance, let's redo the problem with asking about the sum of five dice being larger than $60$ (again, we'll use $60.5$): it can be shown (click the "at least" button on the link) that the true probability is about $0.2730$, and repeating the above procedure gives an answer of $0.2804$ as our approximation.
So, why is this worth doing? For five dice, maybe it's not. But for more dice, this gives a really efficient way of getting a good approximation (that, again, gets better and better the more dice you use) that can save you from some truly obnoxious combinatorics and enormous numbers. As another example of the value of this tactic: the true probability that the sum of $10$ dice exceeds $100$ is $0.5961$, and the aforementioned tactic gives an estimate of $0.5975$. This approach scales really well with the number of dice in a way that doing actual counting does not.
answered Nov 29 '18 at 17:06
Aaron Montgomery
4,712523
4,712523
1
Your "asymptotics" analysis is appreciable (+1). But in fact, the difficulty to apply it is to estimate in advance the error we are going to have. The matter is that actually we have the sum of variables uniformly distributed over a given range, which is limited, while the gaussian is unlimited and not enough "flat" in the central part to approximate the "box" function of the uniform.
– G Cab
Nov 30 '18 at 15:38
1
@GCab Right. As I mentioned, this approach works very well for the "bulk" of the distribution, but will consistently have a high relative error near the tails of the distribution. Notably, the absolute error of these kinds of estimates will still tend to $0$ as the number of dice increases, but that's mostly because both the true values and estimates will both tend to $0$. And estimating the error remains a challenge, yes.
– Aaron Montgomery
Nov 30 '18 at 16:07
add a comment |
1
Your "asymptotics" analysis is appreciable (+1). But in fact, the difficulty to apply it is to estimate in advance the error we are going to have. The matter is that actually we have the sum of variables uniformly distributed over a given range, which is limited, while the gaussian is unlimited and not enough "flat" in the central part to approximate the "box" function of the uniform.
– G Cab
Nov 30 '18 at 15:38
1
@GCab Right. As I mentioned, this approach works very well for the "bulk" of the distribution, but will consistently have a high relative error near the tails of the distribution. Notably, the absolute error of these kinds of estimates will still tend to $0$ as the number of dice increases, but that's mostly because both the true values and estimates will both tend to $0$. And estimating the error remains a challenge, yes.
– Aaron Montgomery
Nov 30 '18 at 16:07
1
1
Your "asymptotics" analysis is appreciable (+1). But in fact, the difficulty to apply it is to estimate in advance the error we are going to have. The matter is that actually we have the sum of variables uniformly distributed over a given range, which is limited, while the gaussian is unlimited and not enough "flat" in the central part to approximate the "box" function of the uniform.
– G Cab
Nov 30 '18 at 15:38
Your "asymptotics" analysis is appreciable (+1). But in fact, the difficulty to apply it is to estimate in advance the error we are going to have. The matter is that actually we have the sum of variables uniformly distributed over a given range, which is limited, while the gaussian is unlimited and not enough "flat" in the central part to approximate the "box" function of the uniform.
– G Cab
Nov 30 '18 at 15:38
1
1
@GCab Right. As I mentioned, this approach works very well for the "bulk" of the distribution, but will consistently have a high relative error near the tails of the distribution. Notably, the absolute error of these kinds of estimates will still tend to $0$ as the number of dice increases, but that's mostly because both the true values and estimates will both tend to $0$. And estimating the error remains a challenge, yes.
– Aaron Montgomery
Nov 30 '18 at 16:07
@GCab Right. As I mentioned, this approach works very well for the "bulk" of the distribution, but will consistently have a high relative error near the tails of the distribution. Notably, the absolute error of these kinds of estimates will still tend to $0$ as the number of dice increases, but that's mostly because both the true values and estimates will both tend to $0$. And estimating the error remains a challenge, yes.
– Aaron Montgomery
Nov 30 '18 at 16:07
add a comment |
I found a simple answer based on a similar but different question:
Where $n$ = number of dice, $m$ = sides of dice, $x$ = target score,
$$ p = frac{nm - x + nchoose n} {m^n} $$
But this answer only works when $(nm - x) < m$
apart from the denominator, which is the total n. of throws, for the numerator consider that $N_b(s,m,r)=N_b(mr-s,m,r)$ ($N_b$ being the number given in my answer). So your answer, missing a $-1$, corresponds to the term with $k=0$ of $N_b$. In the other post I give as reference it is explained the geometric analogy with determining the number of points on the diagonal plane of a hyper-cube ... Pls. read that and you will understand that your answer is in general wrong.
– G Cab
Nov 30 '18 at 14:37
Sorry your answer is too far above my ability level, I don't know where to start :( - other than that my formula produces the same as my brute force calculations. e.g. for n = 5, m = 20, x = 90, nominator produces 3003. If you care to explain it would be much appreciated!
– james_alvarez
Nov 30 '18 at 15:09
I forgot to mention that the reference you provided didn't seem to work...
– james_alvarez
Nov 30 '18 at 15:16
yes, sorry, I did not properly link to it: corrected.
– G Cab
Nov 30 '18 at 15:26
More testing reveals that indeed this simple formula above is indeed wrong, so I will have to trust yours as I can't figure out how to actually calculate it given the three parameters. I do think the above is correct for when (nm-x) < m. But you mention a missing -1? Where should that go? It seems ok when I'm comparing the result with my counting method in python.
– james_alvarez
Nov 30 '18 at 15:59
|
show 1 more comment
I found a simple answer based on a similar but different question:
Where $n$ = number of dice, $m$ = sides of dice, $x$ = target score,
$$ p = frac{nm - x + nchoose n} {m^n} $$
But this answer only works when $(nm - x) < m$
apart from the denominator, which is the total n. of throws, for the numerator consider that $N_b(s,m,r)=N_b(mr-s,m,r)$ ($N_b$ being the number given in my answer). So your answer, missing a $-1$, corresponds to the term with $k=0$ of $N_b$. In the other post I give as reference it is explained the geometric analogy with determining the number of points on the diagonal plane of a hyper-cube ... Pls. read that and you will understand that your answer is in general wrong.
– G Cab
Nov 30 '18 at 14:37
Sorry your answer is too far above my ability level, I don't know where to start :( - other than that my formula produces the same as my brute force calculations. e.g. for n = 5, m = 20, x = 90, nominator produces 3003. If you care to explain it would be much appreciated!
– james_alvarez
Nov 30 '18 at 15:09
I forgot to mention that the reference you provided didn't seem to work...
– james_alvarez
Nov 30 '18 at 15:16
yes, sorry, I did not properly link to it: corrected.
– G Cab
Nov 30 '18 at 15:26
More testing reveals that indeed this simple formula above is indeed wrong, so I will have to trust yours as I can't figure out how to actually calculate it given the three parameters. I do think the above is correct for when (nm-x) < m. But you mention a missing -1? Where should that go? It seems ok when I'm comparing the result with my counting method in python.
– james_alvarez
Nov 30 '18 at 15:59
|
show 1 more comment
I found a simple answer based on a similar but different question:
Where $n$ = number of dice, $m$ = sides of dice, $x$ = target score,
$$ p = frac{nm - x + nchoose n} {m^n} $$
But this answer only works when $(nm - x) < m$
I found a simple answer based on a similar but different question:
Where $n$ = number of dice, $m$ = sides of dice, $x$ = target score,
$$ p = frac{nm - x + nchoose n} {m^n} $$
But this answer only works when $(nm - x) < m$
edited Nov 30 '18 at 15:54
answered Nov 30 '18 at 9:34
james_alvarez
1266
1266
apart from the denominator, which is the total n. of throws, for the numerator consider that $N_b(s,m,r)=N_b(mr-s,m,r)$ ($N_b$ being the number given in my answer). So your answer, missing a $-1$, corresponds to the term with $k=0$ of $N_b$. In the other post I give as reference it is explained the geometric analogy with determining the number of points on the diagonal plane of a hyper-cube ... Pls. read that and you will understand that your answer is in general wrong.
– G Cab
Nov 30 '18 at 14:37
Sorry your answer is too far above my ability level, I don't know where to start :( - other than that my formula produces the same as my brute force calculations. e.g. for n = 5, m = 20, x = 90, nominator produces 3003. If you care to explain it would be much appreciated!
– james_alvarez
Nov 30 '18 at 15:09
I forgot to mention that the reference you provided didn't seem to work...
– james_alvarez
Nov 30 '18 at 15:16
yes, sorry, I did not properly link to it: corrected.
– G Cab
Nov 30 '18 at 15:26
More testing reveals that indeed this simple formula above is indeed wrong, so I will have to trust yours as I can't figure out how to actually calculate it given the three parameters. I do think the above is correct for when (nm-x) < m. But you mention a missing -1? Where should that go? It seems ok when I'm comparing the result with my counting method in python.
– james_alvarez
Nov 30 '18 at 15:59
|
show 1 more comment
apart from the denominator, which is the total n. of throws, for the numerator consider that $N_b(s,m,r)=N_b(mr-s,m,r)$ ($N_b$ being the number given in my answer). So your answer, missing a $-1$, corresponds to the term with $k=0$ of $N_b$. In the other post I give as reference it is explained the geometric analogy with determining the number of points on the diagonal plane of a hyper-cube ... Pls. read that and you will understand that your answer is in general wrong.
– G Cab
Nov 30 '18 at 14:37
Sorry your answer is too far above my ability level, I don't know where to start :( - other than that my formula produces the same as my brute force calculations. e.g. for n = 5, m = 20, x = 90, nominator produces 3003. If you care to explain it would be much appreciated!
– james_alvarez
Nov 30 '18 at 15:09
I forgot to mention that the reference you provided didn't seem to work...
– james_alvarez
Nov 30 '18 at 15:16
yes, sorry, I did not properly link to it: corrected.
– G Cab
Nov 30 '18 at 15:26
More testing reveals that indeed this simple formula above is indeed wrong, so I will have to trust yours as I can't figure out how to actually calculate it given the three parameters. I do think the above is correct for when (nm-x) < m. But you mention a missing -1? Where should that go? It seems ok when I'm comparing the result with my counting method in python.
– james_alvarez
Nov 30 '18 at 15:59
apart from the denominator, which is the total n. of throws, for the numerator consider that $N_b(s,m,r)=N_b(mr-s,m,r)$ ($N_b$ being the number given in my answer). So your answer, missing a $-1$, corresponds to the term with $k=0$ of $N_b$. In the other post I give as reference it is explained the geometric analogy with determining the number of points on the diagonal plane of a hyper-cube ... Pls. read that and you will understand that your answer is in general wrong.
– G Cab
Nov 30 '18 at 14:37
apart from the denominator, which is the total n. of throws, for the numerator consider that $N_b(s,m,r)=N_b(mr-s,m,r)$ ($N_b$ being the number given in my answer). So your answer, missing a $-1$, corresponds to the term with $k=0$ of $N_b$. In the other post I give as reference it is explained the geometric analogy with determining the number of points on the diagonal plane of a hyper-cube ... Pls. read that and you will understand that your answer is in general wrong.
– G Cab
Nov 30 '18 at 14:37
Sorry your answer is too far above my ability level, I don't know where to start :( - other than that my formula produces the same as my brute force calculations. e.g. for n = 5, m = 20, x = 90, nominator produces 3003. If you care to explain it would be much appreciated!
– james_alvarez
Nov 30 '18 at 15:09
Sorry your answer is too far above my ability level, I don't know where to start :( - other than that my formula produces the same as my brute force calculations. e.g. for n = 5, m = 20, x = 90, nominator produces 3003. If you care to explain it would be much appreciated!
– james_alvarez
Nov 30 '18 at 15:09
I forgot to mention that the reference you provided didn't seem to work...
– james_alvarez
Nov 30 '18 at 15:16
I forgot to mention that the reference you provided didn't seem to work...
– james_alvarez
Nov 30 '18 at 15:16
yes, sorry, I did not properly link to it: corrected.
– G Cab
Nov 30 '18 at 15:26
yes, sorry, I did not properly link to it: corrected.
– G Cab
Nov 30 '18 at 15:26
More testing reveals that indeed this simple formula above is indeed wrong, so I will have to trust yours as I can't figure out how to actually calculate it given the three parameters. I do think the above is correct for when (nm-x) < m. But you mention a missing -1? Where should that go? It seems ok when I'm comparing the result with my counting method in python.
– james_alvarez
Nov 30 '18 at 15:59
More testing reveals that indeed this simple formula above is indeed wrong, so I will have to trust yours as I can't figure out how to actually calculate it given the three parameters. I do think the above is correct for when (nm-x) < m. But you mention a missing -1? Where should that go? It seems ok when I'm comparing the result with my counting method in python.
– james_alvarez
Nov 30 '18 at 15:59
|
show 1 more comment
Thanks for contributing an answer to Mathematics Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f3018518%2fprobability-of-sum-of-n-dice-being-above-certain-value-x%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown