Tail of sum of discrete random variables
I'm trying to derive a tail bound for $\sum_{i\in[n]}Z_i$, where the $Z_i$ are independent with
$$Z_i=\begin{cases}(1-p)^2, & \text{w.p. } p\\ p^2, & \text{w.p. } 1-p\end{cases}$$
and $np\geq\log n$. The most straightforward approach is the Chernoff method:
$$\mathbb{P}\Big(\sum_{i\in[n]}Z_i\geq t\Big)\leq e^{-\lambda t}\prod_{i\in[n]}\mathbb{E}\big[e^{\lambda Z_i}\big]=e^{-\lambda t}\Big[pe^{\lambda(1-p)^2}+(1-p)e^{\lambda p^2}\Big]^n.$$
However, I'm stuck at this point: I don't see how to simplify the right-hand side so that I can choose a good $\lambda$ and get a tight tail bound. Is there any intuition for how to simplify this bound?
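As a sanity check, here is a minimal Python sketch that minimizes the right-hand side over $\lambda$ numerically (the values of $n$, $p$, $t$ and the $\lambda$ grid are illustrative choices, not part of the problem):

```python
import numpy as np

# Illustrative parameters; chosen so that n*p >= log(n).
n, p = 1000, 0.05
t = 2 * n * p * (1 - p)  # a threshold somewhat above the mean n*p*(1-p)

def log_chernoff(lam):
    """Log of the Chernoff bound: -lam*t + n*log E[exp(lam*Z_1)]."""
    mgf = p * np.exp(lam * (1 - p) ** 2) + (1 - p) * np.exp(lam * p ** 2)
    return -lam * t + n * np.log(mgf)

# Crude grid search over lambda > 0 for the tightest exponent.
lams = np.linspace(1e-3, 50, 10_000)
vals = log_chernoff(lams)
i = int(np.argmin(vals))
print(f"best lambda ~ {lams[i]:.3f}, bound ~ exp({vals[i]:.2f}) = {np.exp(vals[i]):.3e}")
```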
Tags: probability, probability-distributions
asked Jan 7 at 22:10 by sdoov
How good is the bound from Markov's inequality? The mean of $Z_i$ has a nice form, $\mathbb{E}[Z_i]=p(1-p)$, so the bound has a nice closed form. But I am not sure about its tightness. The same consideration applies to Chebyshev's inequality.
– Aditya Dua, Jan 7 at 22:21
Thanks for the feedback. Actually, I'm looking for a sub-exponential-type bound; an ideal bound would look like $\mathbb{P}(\sum_i Z_i\geq t)\leq\exp(-C(p)\,t)$ where $C(p)$ is a function of $p$. So the Markov bound is clearly not tight enough for that.
– sdoov, Jan 7 at 22:34
2 Answers
You could try Hoeffding's inequality: the centered random variables $Z_i-\mathbb{E}[Z_i]=Z_i-p(1-p)$ are clearly bounded (e.g. $-1\leq Z_i-\mathbb{E}[Z_i]\leq 1$ if you are not interested in good constants; the best bound should be $-1/4\leq Z_i-\mathbb{E}[Z_i]\leq 1$). Of course you'd have to recenter the sum. Hoeffding's inequality gives
$$\mathbb{P}\Big(\sum_{i=1}^n Z_i\geq t\Big)=\mathbb{P}\Big(\sum_{i=1}^n Z_i-np(1-p)\geq t-np(1-p)\Big)\leq 2\exp\left(-\frac{(t-np(1-p))^2}{2n}\right).$$
What happens next depends strongly on which asymptotics you're interested in. For example, expanding the square,
$$\frac{(t-np(1-p))^2}{2n}=\frac{t^2}{2n}-tp(1-p)+\frac{n}{2}p^2(1-p)^2,$$
and dropping the $t^2/(2n)$ term, which vanishes as $n\to\infty$ for fixed $t$, you can show
$$\mathbb{P}\left(\sum_{i=1}^n Z_i\geq t\right)\leq 2\exp\left(tp(1-p)-\frac{n}{2}p^2(1-p)^2\right).$$
If you do not care about good constants, you can also throw away the first term in the exponent, since it is bounded (for fixed $t$, that is). However, I do not believe you can find a good bound of the form $\exp(-C(p)\,t)$: the expectation of your sum is $np(1-p)$, and you should study the fluctuations around this point rather than around $0$.
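As an aside, here is a small Python sketch (the parameter values are illustrative choices of mine) that evaluates this Hoeffding bound against a Monte Carlo estimate of the tail. It exploits the fact that the sum depends only on the number of $Z_i$ equal to $(1-p)^2$, which is $\mathrm{Binomial}(n,p)$; the comparison also shows how loose Hoeffding can be here, since it ignores the small variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters; chosen so that n*p >= log(n).
n, p = 1000, 0.05
mean = n * p * (1 - p)   # expectation of the sum
t = mean + 25            # threshold a few standard deviations above the mean

# Hoeffding bound with the crude range |Z_i - E[Z_i]| <= 1.
hoeffding = 2 * np.exp(-((t - mean) ** 2) / (2 * n))

# Monte Carlo: the sum equals K*(1-p)^2 + (n-K)*p^2 with K ~ Binomial(n, p).
K = rng.binomial(n, p, size=1_000_000)
sums = K * (1 - p) ** 2 + (n - K) * p ** 2
empirical = (sums >= t).mean()

# At these parameters the Hoeffding bound exceeds 1 (vacuous): it ignores
# the small variance, which motivates variance-sensitive bounds like
# Bernstein's in the other answer.
print(f"Hoeffding bound: {hoeffding:.3e}, empirical tail: {empirical:.3e}")
```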
– answered Jan 8 at 20:57 by Arthur Sinulis
Very intuitive feedback that pointed me in the right direction!
We can check that $\mathbb{E}[Z_i]=p(1-p)$ and $\mathrm{Var}(Z_i)=p(1-p)(1-2p)^2$. Using Bernstein's inequality (with $|Z_i-\mathbb{E}[Z_i]|\leq 1$), we get
$$\mathbb{P}\Big(\Big|\sum_{i=1}^n Z_i-np(1-p)\Big|\geq t\Big)\leq 2\exp\Big(-\frac{t^2/2}{np(1-p)(1-2p)^2+t/3}\Big)\leq 2\exp\Big(-\frac{t^2/2}{np+t/3}\Big).$$
This is exactly the kind of bound I was looking for. In fact, if we let $Y=\sum_{i=1}^n Z_i$, one can show (I omit the proof) that $\mathbb{E}[\max_{j\in[n]}Y_j]\leq 20np/3$ for independent copies $Y_j\stackrel{d}{=}Y$. Compared with $\mathbb{E}[Y_j]=np(1-p)$, we only pay a small constant factor $20/3$ for the uniform bound when $p$ is small, $n$ is large, and $np\geq\log n$. By comparison, the corresponding factor in the independent Gaussian case is $\sqrt{\log n}$.
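For concreteness, here is a minimal Python sketch (again with illustrative parameter choices) evaluating this Bernstein bound next to a Monte Carlo estimate of the two-sided tail:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters; chosen so that n*p >= log(n).
n, p = 1000, 0.05
mean = p * (1 - p)
var = p * (1 - p) * (1 - 2 * p) ** 2
t = 25.0  # deviation from the mean of the sum, n*p*(1-p)

# Bernstein bound, using |Z_i - E[Z_i]| <= 1.
bernstein = 2 * np.exp(-(t ** 2 / 2) / (n * var + t / 3))

# Monte Carlo: the sum equals K*(1-p)^2 + (n-K)*p^2 with K ~ Binomial(n, p).
K = rng.binomial(n, p, size=1_000_000)
sums = K * (1 - p) ** 2 + (n - K) * p ** 2
empirical = (np.abs(sums - n * mean) >= t).mean()

# Unlike Hoeffding's bound at these parameters, this one is nontrivial,
# because it uses the (small) variance of the summands.
print(f"Bernstein bound: {bernstein:.3e}, empirical: {empirical:.3e}")
```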
– answered Jan 8 at 23:02 by sdoov