Empirical distribution of sorted Gaussian numbers
I wrote a small program that does the following:

- Pick $N$ independent standard Gaussian numbers (expected value $0$, standard deviation $1$). Call that list $L=\{y_1, \ldots, y_N\}$.
- Sort that list in increasing order: $\tilde{L}=\mathrm{sort}(L)$.
- Plot the sorted list against a regularly spaced grid on $[-1,1]$, namely $x_i=-1+\frac{2i}{N-1}$ with $i=0,\ldots,N-1$.

I found that the plot was similar to that of the inverse error function, differing only by a multiplicative factor $a>0$. A linear regression gives the approximate value $a \approx 1.42104$; the two functions are very close for $N=10^5$.
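The experiment described above can be sketched as follows (a minimal reproduction using NumPy/SciPy, not the original program; the least-squares fit stands in for the linear regression):

```python
import numpy as np
from scipy.special import erfinv

rng = np.random.default_rng(0)
N = 10**5
x = np.linspace(-1.0, 1.0, N)        # grid x_i = -1 + 2i/(N-1)
y = np.sort(rng.standard_normal(N))  # sorted standard Gaussian sample

# Fit y ≈ a * erfinv(x) by least squares, skipping the two endpoints
# where erfinv(±1) = ±inf.
f = erfinv(x[1:-1])
a = np.dot(y[1:-1], f) / np.dot(f, f)
print(a)  # close to 1.42
```

Running this yields a value of $a$ near $1.414$, consistent with the regression estimate above.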
I have two questions:

- What is the exact value of $a$?
- How does one prove that the limit function is indeed $a\cdot\operatorname{inverf}$ as $N\to\infty$?
probability normal-distribution probability-limit-theorems sorting
What about $\sqrt 2$?
– Claude Leibovici, Jan 4 at 9:07
asked Jan 3 at 10:32, edited Jan 3 at 13:22 by Florian Omnès
1 Answer
What you observe is the convergence of the empirical distribution function to the cumulative distribution function of the sample, or, more accurately, of the empirical quantiles to the theoretical quantiles (the values of the inverse cumulative distribution function).

Specifically, for a continuous increasing cdf, the theoretical quantiles are given by
$$
x_q = F^{-1}(q) = \sup\{x\in\mathbb{R}: F(x)<q\}, \quad q\in(0,1).
$$
The definition of empirical quantiles varies. For a sample $X_1,\dots,X_n$ of iid variables they can, for example, be defined by
$$
\hat x_q = X_{(\lfloor nq\rfloor +1)}, \quad q\in (0,1),
$$
where $X_{(1)}\le \dots\le X_{(n)}$ is the sorted sample. It is known that whenever the cdf $F$ is strictly increasing, $\hat x_q \to x_q$ for all $q\in (0,1)$ with probability $1$ as the sample size $n\to\infty$.
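This almost-sure convergence is easy to see numerically (a sketch using NumPy/SciPy, not part of the original answer; the quantile level $q=0.9$ is an arbitrary choice):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
q = 0.9
for n in (10**2, 10**4, 10**6):
    sample = np.sort(rng.standard_normal(n))
    emp = sample[int(np.floor(n * q))]  # X_(floor(nq)+1), 0-based indexing
    # empirical quantile vs theoretical quantile Phi^{-1}(q)
    print(n, emp, norm.ppf(q))
```

As $n$ grows, the empirical quantile settles onto $\Phi^{-1}(0.9) \approx 1.2816$.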
In your case, $F(x) = \Phi(x)$ is the standard normal cdf, which is related to the error function by
$$\Phi(x) = \frac{1+\operatorname{Erf}(x/\sqrt{2})}{2},$$
so
$$
\Phi^{-1}(y) = \sqrt{2}\operatorname{Erf}^{-1}(2y-1), \quad y\in (0,1).
$$
What you are doing is applying a similar transformation to the empirical quantiles, so the convergence to $\sqrt{2}\operatorname{Erf}^{-1} \approx 1.4142 \operatorname{Erf}^{-1}$ is not surprising.
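The identity $\Phi^{-1}(y) = \sqrt{2}\operatorname{Erf}^{-1}(2y-1)$ can be checked numerically (a quick sanity check with SciPy, not part of the original answer):

```python
import numpy as np
from scipy.special import erfinv
from scipy.stats import norm

y = np.linspace(0.01, 0.99, 99)
# Phi^{-1}(y) should equal sqrt(2) * Erf^{-1}(2y - 1) on (0, 1)
assert np.allclose(norm.ppf(y), np.sqrt(2) * erfinv(2 * y - 1))
print(np.sqrt(2))  # the factor a ≈ 1.41421
```

So the exact value of the factor is $a = \sqrt{2}$.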
answered Jan 4 at 9:22 by zhoraster