Are confidence intervals useful?
In frequentist statistics, a 95% confidence interval is produced by a procedure that, if repeated an infinite number of times, would yield intervals containing the true parameter 95% of the time. Why is this useful?
Confidence intervals are often misunderstood. They are not an interval that we can be 95% certain the parameter is in (unless you are using the similar Bayesian credibility interval). Confidence intervals feel like a bait-and-switch to me.
The one use case I can think of is to provide the range of parameter values for which we could not reject the null hypothesis that the parameter equals that value. Wouldn't p-values provide this information, only better, and without being so misleading?
In short: Why do we need confidence intervals? How are they, when correctly interpreted, useful?
hypothesis-testing bayesian mathematical-statistics confidence-interval frequentist
Related: "From a Bayesian probability perspective, why doesn't a 95% confidence interval contain the true parameter with 95% probability?".
– Nat
Jan 31 at 11:47
Neither is the Bayesian credibility interval an interval that we can be 95% certain the parameter is in.
– Martijn Weterings
Feb 1 at 12:32
@MartijnWeterings: unless you are 100% certain of your prior.
– Xi'an
Feb 4 at 6:55
@Xi'an that works when a parameter $\theta$ is 100% certain to be reasonably considered a random variable and an experiment is like sampling from a joint frequency distribution $P(\theta, x)$, i.e. you use Bayes' rule as $P(\theta \mid x) = P(\theta, x)/P(x)$ without an explicit 'prior'. It is not the same for a parameter that is considered fixed: then the posterior beliefs would also require you to 'update' the old joint frequency distribution of $X$ and $\theta$. It is a bit absurd to claim to be updating 'prior beliefs' of which you were 100% sure.
– Martijn Weterings
Feb 4 at 7:27
5 Answers
So long as the confidence interval is treated as random (i.e., looked at from the perspective of treating the data as a set of random variables that we have not seen yet) then we can indeed make useful probability statements about it. Specifically, suppose you have a confidence interval at level $1-\alpha$ for the parameter $\theta$, and the interval has bounds $L(\mathbf{x}) \leqslant U(\mathbf{x})$. Then we can say that:
$$\mathbb{P}(L(\mathbf{X}) \leqslant \theta \leqslant U(\mathbf{X}) \mid \theta) = 1-\alpha \quad \quad \quad \text{for all } \theta \in \Theta.$$
Moving outside the frequentist paradigm and marginalising over $\theta$ for any prior distribution gives the corresponding marginal probability result:
$$\mathbb{P}(L(\mathbf{X}) \leqslant \theta \leqslant U(\mathbf{X})) = 1-\alpha.$$
Once we fix the bounds of the confidence interval by fixing the data to $\mathbf{X} = \mathbf{x}$, we no longer appeal to this probability statement, because we now have fixed the data. However, if the confidence interval is treated as a random interval then we can indeed make this probability statement --- i.e., with probability $1-\alpha$ the parameter $\theta$ will fall within the (random) interval.
Within frequentist statistics, probability statements are statements about relative frequencies over infinitely repeated trials. But that is true of every probability statement in the frequentist paradigm, so if your objection is to relative frequency statements, that is not an objection that is specific to confidence intervals. If we move outside the frequentist paradigm then we can legitimately say that a confidence interval contains its target parameter with the desired probability, so long as we make this probability statement marginally (i.e., not conditional on the data) and we thus treat the confidence interval in its random sense.
I don't know about others, but that seems to me to be a pretty powerful probability result, and a reasonable justification for this form of interval. I am more partial to Bayesian methods myself, but the probability results backing confidence intervals (in their random sense) are powerful results that are not to be sniffed at.
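The coverage statement above is also easy to check by simulation. Here is a minimal sketch (the parameter values are made up for illustration), assuming a normal model with known $\sigma$:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n = 3.0, 2.0, 25          # true parameter and sampling setup
z = 1.959963984540054                   # standard-normal 97.5% quantile

trials = 100_000
covered = 0
for _ in range(trials):
    x = rng.normal(theta, sigma, size=n)
    half_width = z * sigma / np.sqrt(n)  # known-variance z-interval
    covered += (x.mean() - half_width <= theta <= x.mean() + half_width)

print(covered / trials)  # approaches 1 - alpha = 0.95 as trials grows
```

Each repetition draws fresh data, so the interval here is genuinely random; it is only after fixing one realized interval that the probability statement ceases to apply.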
"Moving outside the frequentist paradigm" isn't that exactly the problem? In general we want an interval that contains the true value of a parameter of interest with some probability. No frequentist analysis can give us that, and implicitly re-interpreting it as a Bayesian analysis leads to misunderstandings. Better to answer the question directly via a Bayesian credible interval. There are uses for confidence intervals where you are repeatedly performing "experiments", e.g. quality control.
– Dikran Marsupial
Jan 31 at 8:18
It is not a matter of implicitly reinterpreting as Bayesian (the latter would condition on the data to get a posterior). The answer is merely showing the OP that we can make useful probability statements about the confidence interval. As to more general objections to the frequentist paradigm, those are well and good, but they are not objections specific to confidence intervals.
– Ben
Jan 31 at 8:57
As you can see from the above probability statements, we can guarantee that the CI contains the parameter with some probability, so long as we look at this a priori.
– Ben
Jan 31 at 9:09
If you have moved out of the frequentist paradigm, but are not moving to a Bayesian framework, what framework is it? I wasn't expressing an objection to frequentism; I believe you should use the framework that most directly answers the question you actually want to pose. Confidence and credible intervals answer different questions.
– Dikran Marsupial
Jan 31 at 11:11
@Dikran: The probability statement stands as written, and is a pure mathematical statement. I really don't see how you can reasonably object to this.
– Ben
Feb 5 at 8:20
I agree with @Ben above, and I thought I would provide a simple example of where a Bayesian versus a Frequentist interval would be of value in the same circumstance.
Imagine a factory with parallel assembly lines. It is costly to stop a line, and at the same time, they want to produce quality products. They are concerned about both false positives and false negatives over time. To the factory, it is an averaging process: both power and guaranteed protection against false positives matter. Confidence intervals, as well as tolerance intervals, matter to the factory. Nonetheless, machines will go out of alignment, that is $\theta \ne \Theta$, and detection gear will observe spurious events. The average outcome matters while the specific outcome is an operational detail.
On the opposite side of this is a single customer purchasing a single product or a single lot of products. They do not care about the repetition properties of the assembly line. They care about the one product that they purchased. Let us imagine the customer is NASA and they need the product to meet a specification, say $\gamma \le \Gamma$. They do not care about the quality of the parts they did not purchase. They need a Bayesian interval of some form.
Furthermore, a single failure could kill many astronauts and cost billions of dollars. They need to know that every single part purchased meets specifications. Averaging would be deadly. For a Saturn V rocket, a one percent defect rate would have implied 10,000 defective parts during the Apollo flights. They required 0% defects on all missions.
You worry about having a confidence interval when you are working in the sample space as a factory is doing. It is creating the sample space. You worry about credible intervals when you are working in the parameter space, as a customer would be doing. If you do not care about the observations outside yours, then you are Bayesian. If you do care about the samples that were not seen, but could have been seen, then you are a Frequentist.
Are you concerned with long-run averaging or the specific event?
Does NASA actually purchase parts based on Bayesian intervals? I understand your point, but do they actually do it?
– Aksakal
Jan 31 at 17:55
@Aksakal I do not know. Juran, of course, wrote a wonderful work on quality assurance at NASA, but I cannot remember at all if the testing process was discussed, as it has been more than a decade since I read it. I know that W. Edwards Deming was opposed to confidence intervals in favor of credible intervals, but again, that doesn't directly pertain. My guess, and I do know people who would know but it is inconvenient to ask at the moment, is that they use Frequentist methods because that is what most people are trained in. You use the hammer you have.
– Dave Harris
Jan 31 at 17:58
Is it the case of "a hammer" though? Maybe it has something to do with the way things are in engineering?
– Aksakal
Jan 31 at 18:09
@Aksakal I am not qualified to opine on that.
– Dave Harris
Jan 31 at 19:56
Say a company makes $n$ parts, and with an $\alpha$-level composite hypothesis test of $H_0: \gamma > \Gamma$ you have them tested for defects: $x$ of them pass without defects and $y$ of them fail. You can give NASA a reasonable guarantee. The maximum number of products that can accidentally pass the test (wrongly considered free of defects) is $n\alpha$. Knowing that you sold $x$ items, you can compute a maximum probability that a sold part is actually not in accordance with the alternative hypothesis $\gamma \leq \Gamma$.
– Martijn Weterings
Feb 4 at 9:01
Confidence intervals are not only useful, but essential in some fields, such as physics. Unfortunately, most of the noise regarding CIs comes from Bayesians caught up in fake debates with Frequentists, usually in the context of the social "sciences" and other science-like disciplines.
Suppose that I measure a quantity in physics, such as electric charge. I would always supply it with a measure of the uncertainty of the value, which is usually a standard deviation. Since errors in physics are often Gaussian, this translates directly into a CI. However, when the errors are not Gaussian, it gets a bit more complicated: some integrals need to be evaluated, etc. Nothing too esoteric, though, usually.
Here's a brief presentation on CIs in particle physics, and the definition:
"a quantitative statement about the fraction of times that such an interval would contain the true value of the parameter in a large number of repeated experiments"
Note that in physics, "repeated experiments" often has a literal meaning: it is assumed that you can actually repeat the experiments in the paper, and that you would actually observe that fraction. So the CI has an almost literal meaning to you, and is just a way to express the information about the uncertainty of the measurement. It's not a thought experiment, not a subjective opinion, not your or my feelings about likelihoods, etc. It's what you were able to derive from experiments, and what I should be able to observe when reproducing your experiment.
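As a minimal sketch of the Gaussian case described above (the measurement values are made up for illustration), the reported value and its approximate 95% CI come directly from the sample mean and standard error:

```python
import numpy as np

# Hypothetical repeated measurements of the same quantity (made-up numbers).
measurements = np.array([1.602, 1.598, 1.605, 1.601, 1.599, 1.603])

mean = measurements.mean()
# Standard error of the mean, from the sample standard deviation.
sem = measurements.std(ddof=1) / np.sqrt(len(measurements))

# Under Gaussian errors, mean +/- 1.96 * SEM is an approximate 95% CI.
print(f"measured value: {mean:.4f} +/- {1.96 * sem:.4f} (95% CI)")
```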
Note that, by the strict definition of a confidence interval, it is possible for one to be completely meaningless, i.e., not informative about the parameter of interest. In practice, however, they are generally very meaningful.
As an example of a meaningless confidence interval, suppose I have a procedure that 95% of the time produces $[0,1]$, and 5% of the time produces $[U_{min}, U_{max}]$, where $U_{min}, U_{max}$ are any pair of random variables such that $U_{min} < U_{max}$. This procedure captures any probability $p$ at least 95% of the time, so it is technically a valid 95% confidence interval procedure for $p$. Yet if I told you that the interval produced by this procedure was $[0.01, 0.011]$ for a given $p$, you should realize that you have really learned nothing about $p$.
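To see that this pathological procedure really does achieve nominal coverage, here is a minimal simulation sketch (the true $p$ and the 5%-branch endpoints are arbitrary, made-up choices):

```python
import numpy as np

rng = np.random.default_rng(3)
p_true, trials = 0.42, 100_000

covered = 0
for _ in range(trials):
    if rng.random() < 0.95:
        lo, hi = 0.0, 1.0                # [0, 1] always contains p_true
    else:
        lo, hi = np.sort(rng.random(2))  # arbitrary U_min < U_max
    covered += (lo <= p_true <= hi)

print(covered / trials)  # at least 0.95, yet [0, 1] is uninformative about p
```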
On the other hand, most confidence intervals are constructed in a more useful fashion. For example, if I told you it was created using a Wald interval procedure, then we know that
$$\hat{p} \; \dot{\sim} \; N(p, s_e),$$
where $s_e$ is the standard error and $\dot{\sim}$ means "approximately distributed as". This is a very meaningful statement about how $\hat{p}$ relates to $p$. Turning this into a confidence interval is simply an attempt to simplify the result for someone who is not so familiar with normal distributions. That's not to say it's only a tool for people who don't know about normal distributions; for example, the percentile bootstrap is a tool for summarizing the error between the estimator and the true parameter when the distribution of that error may be non-Gaussian.
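For concreteness, here is a minimal sketch (with made-up $p$ and $n$) that checks the coverage of the resulting Wald interval by simulation; for moderate $n$ and non-extreme $p$, the empirical coverage comes out close to the nominal 95%:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n, trials = 0.3, 200, 50_000
z = 1.959963984540054  # standard-normal 97.5% quantile

covered = 0
for _ in range(trials):
    p_hat = rng.binomial(n, p) / n
    se = np.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    covered += (p_hat - z * se <= p <= p_hat + z * se)

print(covered / trials)  # close to 0.95 in this regime
```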
This thread has devolved quickly into the Frequentist vs Bayesian debate, and that is not easily resolvable. The math in both approaches is solid, so it always comes down to philosophical preferences. The frequentist interpretation of probability as the limit of an event's relative frequency is justified by the strong law of large numbers; regardless of your preferred interpretation of probability, an event's relative frequency will converge to its probability with probability 1.
Frequentist confidence intervals are indeed trickier to interpret than Bayesian credible intervals. By treating an unknown quantity as a random variable, Bayesians can assert that one interval contains that quantity with some probability. Frequentists refuse to treat some quantities as random variables, and any equation containing only constants can only be true or false. So when estimating an unknown constant, frequentists must bound it with a RANDOM interval to involve probability at all. Rather than one interval containing a random variable with some probability, a frequentist method generates many different possible intervals, some of which contain the unknown constant. If the coverage probability is reasonably high, it's a reasonable leap of faith to assert that a particular interval contains the unknown constant (note, not "with some probability").
A Bayesian would balk at such a leap of faith as much as a Frequentist balks at treating any unknown quantity as a random variable. The frequentist Neyman construction method in fact exposed an embarrassing issue with such leaps of faith. Without actively preventing it (see Feldman and Cousins, 1997 for one approach), rare outcomes may generate EMPTY confidence intervals for a distribution parameter. Such a leap of faith would be very unreasonable! I've seen a few Bayesians using that example to mock frequentist methods, while frequentists typically respond with "well I still get a correct interval most of the time, and without making false assumptions." I'll point out that the Bayesian/frequentist impasse is not important to most who apply their methods. Even people who are committed to coverage probability will often use Bayesian methods if the methods are shown to have good coverage probability in simulations.
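To illustrate that last point, here is a minimal sketch, assuming a conjugate Beta-Binomial model with a uniform prior, that checks the frequentist coverage of a 95% equal-tailed Bayesian credible interval by simulation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
p_true, n, trials = 0.3, 50, 20_000

covered = 0
for _ in range(trials):
    k = rng.binomial(n, p_true)
    # Posterior under a uniform Beta(1, 1) prior is Beta(k + 1, n - k + 1).
    lo, hi = stats.beta.ppf([0.025, 0.975], k + 1, n - k + 1)
    covered += (lo <= p_true <= hi)

print(covered / trials)  # empirical frequentist coverage of the credible interval
```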
$endgroup$
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "65"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f390093%2fare-confidence-intervals-useful%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
5 Answers
5
active
oldest
votes
5 Answers
5
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
So long as the confidence interval is treated as random (i.e., looked at from the perspective of treating the data as a set of random variables that we have not seen yet) then we can indeed make useful probability statements about it. Specifically, suppose you have a confidence interval at level $1-alpha$ for the parameter $theta$, and the interval has bounds $L(mathbf{x}) leqslant U(mathbf{x})$. Then we can say that:
$$mathbb{P}(L(mathbf{X}) leqslant theta leqslant U(mathbf{X}) | theta) = 1-alpha
quad quad quad text{for all } theta in Theta.$$
Moving outside the frequentist paradigm and marginalising over $theta$ for any prior distribution gives the corresponding marginal probability result:
$$mathbb{P}(L(mathbf{X}) leqslant theta leqslant U(mathbf{X})) = 1-alpha.$$
Once we fix the bounds of the confidence interval by fixing the data to $mathbf{X} = mathbb{x}$, we no longer appeal to this probability statement, because we now have fixed the data. However, if the confidence interval is treated as a random interval then we can indeed make this probability statement --- i.e., with probability $1-alpha$ the parameter $theta$ will fall within the (random) interval.
Within frequentist statistics, probability statements are statements about relative frequencies over infinitely repeated trials. But that is true of every probability statement in the frequentist paradigm, so if your objection is to relative frequency statements, that is not an objection that is specific to confidence intervals. If we move outside the frequentist paradigm then we can legitimately say that a confidence interval contains its target parameter with the desired probability, so long as we make this probability statement marginally (i.e., not conditional on the data) and we thus treat the confidence interval in its random sense.
I don't know about others, but that seems to me to be a pretty powerful probability result, and a reasonable justification for this form of interval. I am more partial to Bayesian methods myself, but the probability results backing confidence intervals (in their random sense) are powerful results that are not to be sniffed at.
$endgroup$
1
$begingroup$
"Moving outside the frequentist paradigm" isn't that exactly the problem? In general we want an interval that contains the true value of a parameter of interest with some probability. No frequentist analysis can give us that, and implicitly re-interpreting it as a Bayesian analysis leads to misunderstandings. Better to answer the question directly via a Bayesian credible interval. There are uses for confidence intervals where you are repeatedly performing "experiments", e.g. quality control.
$endgroup$
– Dikran Marsupial
Jan 31 at 8:18
$begingroup$
It is not a matter of implicitly reinterpreting as Bayesian (the latter would condition on the data to get a posterior). The answer is merely showing the OP that we can make useful probability statements about the confidence interval. As to more general objections to the frequentist paradigm, those are well and good, but they are not objections specific to confidence intervals.
$endgroup$
– Ben
Jan 31 at 8:57
1
$begingroup$
As you can see from the above probability statements, we can guarantee that the CI contains the parameter with some probability, so long as we look at this a priori.
$endgroup$
– Ben
Jan 31 at 9:09
1
$begingroup$
If you have moved out of the frequentist paradigm, but are not moving to a Bayesian framework, what framework is it? I wasn't expressing an objection to frequentism, I believe you should use the framework that most directly answers the question you actually want to pose. Confidence and credible intervals answer different questions.
$endgroup$
– Dikran Marsupial
Jan 31 at 11:11
1
$begingroup$
@Dikran: The probability statement stands as written, and is a pure mathematical statement. I really don't see how you can reasonably object to this.
$endgroup$
– Ben
Feb 5 at 8:20
|
show 4 more comments
$begingroup$
So long as the confidence interval is treated as random (i.e., looked at from the perspective of treating the data as a set of random variables that we have not seen yet) then we can indeed make useful probability statements about it. Specifically, suppose you have a confidence interval at level $1-alpha$ for the parameter $theta$, and the interval has bounds $L(mathbf{x}) leqslant U(mathbf{x})$. Then we can say that:
$$mathbb{P}(L(mathbf{X}) leqslant theta leqslant U(mathbf{X}) | theta) = 1-alpha
quad quad quad text{for all } theta in Theta.$$
Moving outside the frequentist paradigm and marginalising over $theta$ for any prior distribution gives the corresponding marginal probability result:
$$mathbb{P}(L(mathbf{X}) leqslant theta leqslant U(mathbf{X})) = 1-alpha.$$
Once we fix the bounds of the confidence interval by fixing the data to $mathbf{X} = mathbb{x}$, we no longer appeal to this probability statement, because we now have fixed the data. However, if the confidence interval is treated as a random interval then we can indeed make this probability statement --- i.e., with probability $1-alpha$ the parameter $theta$ will fall within the (random) interval.
Within frequentist statistics, probability statements are statements about relative frequencies over infinitely repeated trials. But that is true of every probability statement in the frequentist paradigm, so if your objection is to relative frequency statements, that is not an objection that is specific to confidence intervals. If we move outside the frequentist paradigm then we can legitimately say that a confidence interval contains its target parameter with the desired probability, so long as we make this probability statement marginally (i.e., not conditional on the data) and we thus treat the confidence interval in its random sense.
I don't know about others, but that seems to me to be a pretty powerful probability result, and a reasonable justification for this form of interval. I am more partial to Bayesian methods myself, but the probability results backing confidence intervals (in their random sense) are powerful results that are not to be sniffed at.
$endgroup$
1
$begingroup$
"Moving outside the frequentist paradigm" isn't that exactly the problem? In general we want an interval that contains the true value of a parameter of interest with some probability. No frequentist analysis can give us that, and implicitly re-interpreting it as a Bayesian analysis leads to misunderstandings. Better to answer the question directly via a Bayesian credible interval. There are uses for confidence intervals where you are repeatedly performing "experiments", e.g. quality control.
$endgroup$
– Dikran Marsupial
Jan 31 at 8:18
$begingroup$
It is not a matter of implicitly reinterpreting as Bayesian (the latter would condition on the data to get a posterior). The answer is merely showing the OP that we can make useful probability statements about the confidence interval. As to more general objections to the frequentist paradigm, those are well and good, but they are not objections specific to confidence intervals.
$endgroup$
– Ben
Jan 31 at 8:57
1
$begingroup$
As you can see from the above probability statements, we can guarantee that the CI contains the parameter with some probability, so long as we look at this a priori.
$endgroup$
– Ben
Jan 31 at 9:09
1
$begingroup$
If you have moved out of the frequentist paradigm, but are not moving to a Bayesian framework, what framework is it? I wasn't expressing an objection to frequentism, I believe you should use the framework that most directly answers the question you actually want to pose. Confidence and credible intervals answer different questions.
$endgroup$
– Dikran Marsupial
Jan 31 at 11:11
1
$begingroup$
@Dikran: The probability statement stands as written, and is a pure mathematical statement. I really don't see how you can reasonably object to this.
$endgroup$
– Ben
Feb 5 at 8:20
|
show 4 more comments
$begingroup$
So long as the confidence interval is treated as random (i.e., looked at from the perspective of treating the data as a set of random variables that we have not seen yet) then we can indeed make useful probability statements about it. Specifically, suppose you have a confidence interval at level $1-alpha$ for the parameter $theta$, and the interval has bounds $L(mathbf{x}) leqslant U(mathbf{x})$. Then we can say that:
$$mathbb{P}(L(mathbf{X}) leqslant theta leqslant U(mathbf{X}) | theta) = 1-alpha
quad quad quad text{for all } theta in Theta.$$
Moving outside the frequentist paradigm and marginalising over $theta$ for any prior distribution gives the corresponding marginal probability result:
$$mathbb{P}(L(mathbf{X}) leqslant theta leqslant U(mathbf{X})) = 1-alpha.$$
Once we fix the bounds of the confidence interval by fixing the data to $mathbf{X} = mathbb{x}$, we no longer appeal to this probability statement, because we now have fixed the data. However, if the confidence interval is treated as a random interval then we can indeed make this probability statement --- i.e., with probability $1-alpha$ the parameter $theta$ will fall within the (random) interval.
Within frequentist statistics, probability statements are statements about relative frequencies over infinitely repeated trials. But that is true of every probability statement in the frequentist paradigm, so if your objection is to relative frequency statements, that is not an objection that is specific to confidence intervals. If we move outside the frequentist paradigm then we can legitimately say that a confidence interval contains its target parameter with the desired probability, so long as we make this probability statement marginally (i.e., not conditional on the data) and we thus treat the confidence interval in its random sense.
I don't know about others, but that seems to me to be a pretty powerful probability result, and a reasonable justification for this form of interval. I am more partial to Bayesian methods myself, but the probability results backing confidence intervals (in their random sense) are powerful results that are not to be sniffed at.
$endgroup$
So long as the confidence interval is treated as random (i.e., looked at from the perspective of treating the data as a set of random variables that we have not seen yet) then we can indeed make useful probability statements about it. Specifically, suppose you have a confidence interval at level $1-alpha$ for the parameter $theta$, and the interval has bounds $L(mathbf{x}) leqslant U(mathbf{x})$. Then we can say that:
$$mathbb{P}(L(mathbf{X}) leqslant theta leqslant U(mathbf{X}) | theta) = 1-alpha
quad quad quad text{for all } theta in Theta.$$
Moving outside the frequentist paradigm and marginalising over $theta$ for any prior distribution gives the corresponding marginal probability result:
$$mathbb{P}(L(mathbf{X}) leqslant theta leqslant U(mathbf{X})) = 1-alpha.$$
Once we fix the bounds of the confidence interval by fixing the data to $mathbf{X} = mathbb{x}$, we no longer appeal to this probability statement, because we now have fixed the data. However, if the confidence interval is treated as a random interval then we can indeed make this probability statement --- i.e., with probability $1-alpha$ the parameter $theta$ will fall within the (random) interval.
Within frequentist statistics, probability statements are statements about relative frequencies over infinitely repeated trials. But that is true of every probability statement in the frequentist paradigm, so if your objection is to relative frequency statements, that is not an objection that is specific to confidence intervals. If we move outside the frequentist paradigm then we can legitimately say that a confidence interval contains its target parameter with the desired probability, so long as we make this probability statement marginally (i.e., not conditional on the data) and we thus treat the confidence interval in its random sense.
I don't know about others, but that seems to me to be a pretty powerful probability result, and a reasonable justification for this form of interval. I am more partial to Bayesian methods myself, but the probability results backing confidence intervals (in their random sense) are powerful results that are not to be sniffed at.
answered Jan 31 at 5:05
BenBen
25.3k227119
25.3k227119
1
$begingroup$
"Moving outside the frequentist paradigm" isn't that exactly the problem? In general we want an interval that contains the true value of a parameter of interest with some probability. No frequentist analysis can give us that, and implicitly re-interpreting it as a Bayesian analysis leads to misunderstandings. Better to answer the question directly via a Bayesian credible interval. There are uses for confidence intervals where you are repeatedly performing "experiments", e.g. quality control.
$endgroup$
– Dikran Marsupial
Jan 31 at 8:18
$begingroup$
It is not a matter of implicitly reinterpreting as Bayesian (the latter would condition on the data to get a posterior). The answer is merely showing the OP that we can make useful probability statements about the confidence interval. As to more general objections to the frequentist paradigm, those are well and good, but they are not objections specific to confidence intervals.
$endgroup$
– Ben
Jan 31 at 8:57
1
$begingroup$
As you can see from the above probability statements, we can guarantee that the CI contains the parameter with some probability, so long as we look at this a priori.
$endgroup$
– Ben
Jan 31 at 9:09
1
$begingroup$
If you have moved out of the frequentist paradigm, but are not moving to a Bayesian framework, what framework is it? I wasn't expressing an objection to frequentism, I believe you should use the framework that most directly answers the question you actually want to pose. Confidence and credible intervals answer different questions.
$endgroup$
– Dikran Marsupial
Jan 31 at 11:11
1
$begingroup$
@Dikran: The probability statement stands as written, and is a pure mathematical statement. I really don't see how you can reasonably object to this.
$endgroup$
– Ben
Feb 5 at 8:20
|
show 4 more comments
1
$begingroup$
"Moving outside the frequentist paradigm" isn't that exactly the problem? In general we want an interval that contains the true value of a parameter of interest with some probability. No frequentist analysis can give us that, and implicitly re-interpreting it as a Bayesian analysis leads to misunderstandings. Better to answer the question directly via a Bayesian credible interval. There are uses for confidence intervals where you are repeatedly performing "experiments", e.g. quality control.
$endgroup$
– Dikran Marsupial
Jan 31 at 8:18
$begingroup$
It is not a matter of implicitly reinterpreting as Bayesian (the latter would condition on the data to get a posterior). The answer is merely showing the OP that we can make useful probability statements about the confidence interval. As to more general objections to the frequentist paradigm, those are well and good, but they are not objections specific to confidence intervals.
$endgroup$
– Ben
Jan 31 at 8:57
1
$begingroup$
As you can see from the above probability statements, we can guarantee that the CI contains the parameter with some probability, so long as we look at this a priori.
$endgroup$
– Ben
Jan 31 at 9:09
1
$begingroup$
If you have moved out of the frequentist paradigm, but are not moving to a Bayesian framework, what framework is it? I wasn't expressing an objection to frequentism, I believe you should use the framework that most directly answers the question you actually want to pose. Confidence and credible intervals answer different questions.
$endgroup$
– Dikran Marsupial
Jan 31 at 11:11
1
$begingroup$
@Dikran: The probability statement stands as written, and is a pure mathematical statement. I really don't see how you can reasonably object to this.
$endgroup$
– Ben
Feb 5 at 8:20
1
1
$begingroup$
"Moving outside the frequentist paradigm" isn't that exactly the problem? In general we want an interval that contains the true value of a parameter of interest with some probability. No frequentist analysis can give us that, and implicitly re-interpreting it as a Bayesian analysis leads to misunderstandings. Better to answer the question directly via a Bayesian credible interval. There are uses for confidence intervals where you are repeatedly performing "experiments", e.g. quality control.
$endgroup$
– Dikran Marsupial
Jan 31 at 8:18
$begingroup$
"Moving outside the frequentist paradigm" isn't that exactly the problem? In general we want an interval that contains the true value of a parameter of interest with some probability. No frequentist analysis can give us that, and implicitly re-interpreting it as a Bayesian analysis leads to misunderstandings. Better to answer the question directly via a Bayesian credible interval. There are uses for confidence intervals where you are repeatedly performing "experiments", e.g. quality control.
$endgroup$
– Dikran Marsupial
Jan 31 at 8:18
$begingroup$
It is not a matter of implicitly reinterpreting as Bayesian (the latter would condition on the data to get a posterior). The answer is merely showing the OP that we can make useful probability statements about the confidence interval. As to more general objections to the frequentist paradigm, those are well and good, but they are not objections specific to confidence intervals.
$endgroup$
– Ben
Jan 31 at 8:57
$begingroup$
It is not a matter of implicitly reinterpreting as Bayesian (the latter would condition on the data to get a posterior). The answer is merely showing the OP that we can make useful probability statements about the confidence interval. As to more general objections to the frequentist paradigm, those are well and good, but they are not objections specific to confidence intervals.
$endgroup$
– Ben
Jan 31 at 8:57
1
1
$begingroup$
As you can see from the above probability statements, we can guarantee that the CI contains the parameter with some probability, so long as we look at this a priori.
$endgroup$
– Ben
Jan 31 at 9:09
$begingroup$
As you can see from the above probability statements, we can guarantee that the CI contains the parameter with some probability, so long as we look at this a priori.
$endgroup$
– Ben
Jan 31 at 9:09
1
1
$begingroup$
If you have moved out of the frequentist paradigm, but are not moving to a Bayesian framework, what framework is it? I wasn't expressing an objection to frequentism, I believe you should use the framework that most directly answers the question you actually want to pose. Confidence and credible intervals answer different questions.
$endgroup$
– Dikran Marsupial
Jan 31 at 11:11
$begingroup$
If you have moved out of the frequentist paradigm, but are not moving to a Bayesian framework, what framework is it? I wasn't expressing an objection to frequentism, I believe you should use the framework that most directly answers the question you actually want to pose. Confidence and credible intervals answer different questions.
$endgroup$
– Dikran Marsupial
Jan 31 at 11:11
1
1
$begingroup$
@Dikran: The probability statement stands as written, and is a pure mathematical statement. I really don't see how you can reasonably object to this.
$endgroup$
– Ben
Feb 5 at 8:20
$begingroup$
@Dikran: The probability statement stands as written, and is a pure mathematical statement. I really don't see how you can reasonably object to this.
$endgroup$
– Ben
Feb 5 at 8:20
|
show 4 more comments
$begingroup$
I agree with @Ben above, and I thought I would provide a simple example of where a Bayesian versus a Frequentist interval would be of value in the same circumstance.
Imagine a factory with parallel assembly lines. It is costly to stop a line, and at the same time, they want to produce quality products. They are concerned about both false positives and false negatives over time. To the factory, it is an averaging process: both power and guaranteed protection against false positives matter. Confidence intervals, as well as tolerance intervals, matter to the factory. Nonetheless, machines will go out of alignment, that is $thetaneTheta$, and detection gear will observe spurious events. The average outcome matters while the specific outcome is an operational detail.
On the opposite side of this is a single customer purchasing a single product or a single lot of products. They do not care about the repetition properties of the assembly line. They care about the one product that they purchased. Let us imagine the customer is NASA and they need the product to meet a specification, say $gammaleGamma.$ They do not care about the quality of the parts they did not purchase. They need a Bayesian interval of some form.
Furthermore, a single failure could kill many astronauts and cost billions of dollars. They need to know that every single part purchased meets specifications. Averaging would be deadly. For a Saturn V rocket, a one percent defect rate would have implied 10,000 defective parts during the Apollo flights. They required 0% defects on all missions.
You worry about having a confidence interval when you are working in the sample space as a factory is doing. It is creating the sample space. You worry about credible intervals when you are working in the parameter space, as a customer would be doing. If you do not care about the observations outside yours, then you are Bayesian. If you do care about the samples that were not seen, but could have been seen, then you are a Frequentist.
Are you concerned with long-run averaging or the specific event?
$endgroup$
$begingroup$
Does NASA actually purchase parts based on Bayesian intervals? I understand your point, but do they actually do it?
$endgroup$
– Aksakal
Jan 31 at 17:55
$begingroup$
@Aksakal I do not know. Juran, of course, wrote a wonderful work on quality assurance at NASA, but I cannot remember at all if the testing process was discussed as it has been more than a decade since I read it. I know that W Edwards Deming was opposed to confidence intervals in favor of credible intervals, but again, that doesn't directly pertain. My guess, and I do know people who would know but it is inconvenient to ask at the moment, is that they use Frequentist methods because that is what most people are trained in. You use the hammer you have.
$endgroup$
– Dave Harris
Jan 31 at 17:58
$begingroup$
Is it the case of "a hammer" though? Maybe it has something to do with the way things are in engineering?
$endgroup$
– Aksakal
Jan 31 at 18:09
$begingroup$
@Aksakal I am not qualified to opine on that.
$endgroup$
– Dave Harris
Jan 31 at 19:56
$begingroup$
Say a company makes $n$ parts, with a $alpha$ level composite hypothesis test $H_0: gamma > Gamma$ you have them tested for mistakes: $x$ of them pass without mistakes and $y$ of them fail. You can give NASA a reasonable guarantee. The maximum amount of products that can accidentally pass the test (wrongly considered without error) is $nalpha$. Knowing that you sold $x$ items you can compute a maximum probability that a sold part is actually not in accordance with the alternative hypothesis $gamma leq Gamma$.
$endgroup$
– Martijn Weterings
Feb 4 at 9:01
|
show 1 more comment
$begingroup$
I agree with @Ben above, and I thought I would provide a simple example of where a Bayesian versus a Frequentist interval would be of value in the same circumstance.
Imagine a factory with parallel assembly lines. It is costly to stop a line, and at the same time, they want to produce quality products. They are concerned about both false positives and false negatives over time. To the factory, it is an averaging process: both power and guaranteed protection against false positives matter. Confidence intervals, as well as tolerance intervals, matter to the factory. Nonetheless, machines will go out of alignment, that is $thetaneTheta$, and detection gear will observe spurious events. The average outcome matters while the specific outcome is an operational detail.
On the opposite side of this is a single customer purchasing a single product or a single lot of products. They do not care about the repetition properties of the assembly line. They care about the one product that they purchased. Let us imagine the customer is NASA and they need the product to meet a specification, say $gammaleGamma.$ They do not care about the quality of the parts they did not purchase. They need a Bayesian interval of some form.
Furthermore, a single failure could kill many astronauts and cost billions of dollars. They need to know that every single part purchased meets specifications. Averaging would be deadly. For a Saturn V rocket, a one percent defect rate would have implied 10,000 defective parts during the Apollo flights. They required 0% defects on all missions.
You worry about having a confidence interval when you are working in the sample space as a factory is doing. It is creating the sample space. You worry about credible intervals when you are working in the parameter space, as a customer would be doing. If you do not care about the observations outside yours, then you are Bayesian. If you do care about the samples that were not seen, but could have been seen, then you are a Frequentist.
Are you concerned with long-run averaging or the specific event?
$endgroup$
$begingroup$
Does NASA actually purchase parts based on Bayesian intervals? I understand your point, but do they actually do it?
$endgroup$
– Aksakal
Jan 31 at 17:55
$begingroup$
@Aksakal I do not know. Juran, of course, wrote a wonderful work on quality assurance at NASA, but I cannot remember at all if the testing process was discussed as it has been more than a decade since I read it. I know that W Edwards Deming was opposed to confidence intervals in favor of credible intervals, but again, that doesn't directly pertain. My guess, and I do know people who would know but it is inconvenient to ask at the moment, is that they use Frequentist methods because that is what most people are trained in. You use the hammer you have.
$endgroup$
– Dave Harris
Jan 31 at 17:58
$begingroup$
Is it the case of "a hammer" though? Maybe it has something to do with the way things are in engineering?
$endgroup$
– Aksakal
Jan 31 at 18:09
$begingroup$
@Aksakal I am not qualified to opine on that.
$endgroup$
– Dave Harris
Jan 31 at 19:56
$begingroup$
Say a company makes $n$ parts, with a $alpha$ level composite hypothesis test $H_0: gamma > Gamma$ you have them tested for mistakes: $x$ of them pass without mistakes and $y$ of them fail. You can give NASA a reasonable guarantee. The maximum amount of products that can accidentally pass the test (wrongly considered without error) is $nalpha$. Knowing that you sold $x$ items you can compute a maximum probability that a sold part is actually not in accordance with the alternative hypothesis $gamma leq Gamma$.
$endgroup$
– Martijn Weterings
Feb 4 at 9:01
|
show 1 more comment
$begingroup$
I agree with @Ben above, and I thought I would provide a simple example of where a Bayesian versus a Frequentist interval would be of value in the same circumstance.
Imagine a factory with parallel assembly lines. It is costly to stop a line, and at the same time, they want to produce quality products. They are concerned about both false positives and false negatives over time. To the factory, it is an averaging process: both power and guaranteed protection against false positives matter. Confidence intervals, as well as tolerance intervals, matter to the factory. Nonetheless, machines will go out of alignment, that is $thetaneTheta$, and detection gear will observe spurious events. The average outcome matters while the specific outcome is an operational detail.
On the opposite side of this is a single customer purchasing a single product or a single lot of products. They do not care about the repetition properties of the assembly line. They care about the one product that they purchased. Let us imagine the customer is NASA and they need the product to meet a specification, say $gammaleGamma.$ They do not care about the quality of the parts they did not purchase. They need a Bayesian interval of some form.
Furthermore, a single failure could kill many astronauts and cost billions of dollars. They need to know that every single part purchased meets specifications. Averaging would be deadly. For a Saturn V rocket, a one percent defect rate would have implied 10,000 defective parts during the Apollo flights. They required 0% defects on all missions.
You worry about having a confidence interval when you are working in the sample space as a factory is doing. It is creating the sample space. You worry about credible intervals when you are working in the parameter space, as a customer would be doing. If you do not care about the observations outside yours, then you are Bayesian. If you do care about the samples that were not seen, but could have been seen, then you are a Frequentist.
Are you concerned with long-run averaging or the specific event?
$endgroup$
I agree with @Ben above, and I thought I would provide a simple example of where a Bayesian versus a Frequentist interval would be of value in the same circumstance.
Imagine a factory with parallel assembly lines. It is costly to stop a line, and at the same time, they want to produce quality products. They are concerned about both false positives and false negatives over time. To the factory, it is an averaging process: both power and guaranteed protection against false positives matter. Confidence intervals, as well as tolerance intervals, matter to the factory. Nonetheless, machines will go out of alignment, that is $thetaneTheta$, and detection gear will observe spurious events. The average outcome matters while the specific outcome is an operational detail.
On the opposite side of this is a single customer purchasing a single product or a single lot of products. They do not care about the repetition properties of the assembly line. They care about the one product that they purchased. Let us imagine the customer is NASA and they need the product to meet a specification, say $gammaleGamma.$ They do not care about the quality of the parts they did not purchase. They need a Bayesian interval of some form.
Furthermore, a single failure could kill many astronauts and cost billions of dollars. They need to know that every single part purchased meets specifications. Averaging would be deadly. For a Saturn V rocket, a one percent defect rate would have implied 10,000 defective parts during the Apollo flights. They required 0% defects on all missions.
You worry about having a confidence interval when you are working in the sample space as a factory is doing. It is creating the sample space. You worry about credible intervals when you are working in the parameter space, as a customer would be doing. If you do not care about the observations outside yours, then you are Bayesian. If you do care about the samples that were not seen, but could have been seen, then you are a Frequentist.
Are you concerned with long-run averaging or the specific event?
answered Jan 31 at 17:36
Dave HarrisDave Harris
3,767515
3,767515
$begingroup$
Does NASA actually purchase parts based on Bayesian intervals? I understand your point, but do they actually do it?
$endgroup$
– Aksakal
Jan 31 at 17:55
$begingroup$
@Aksakal I do not know. Juran, of course, wrote a wonderful work on quality assurance at NASA, but I cannot remember at all if the testing process was discussed as it has been more than a decade since I read it. I know that W Edwards Deming was opposed to confidence intervals in favor of credible intervals, but again, that doesn't directly pertain. My guess, and I do know people who would know but it is inconvenient to ask at the moment, is that they use Frequentist methods because that is what most people are trained in. You use the hammer you have.
$endgroup$
– Dave Harris
Jan 31 at 17:58
$begingroup$
Is it the case of "a hammer" though? Maybe it has something to do with the way things are in engineering?
$endgroup$
– Aksakal
Jan 31 at 18:09
$begingroup$
@Aksakal I am not qualified to opine on that.
$endgroup$
– Dave Harris
Jan 31 at 19:56
$begingroup$
Say a company makes $n$ parts, with a $alpha$ level composite hypothesis test $H_0: gamma > Gamma$ you have them tested for mistakes: $x$ of them pass without mistakes and $y$ of them fail. You can give NASA a reasonable guarantee. The maximum amount of products that can accidentally pass the test (wrongly considered without error) is $nalpha$. Knowing that you sold $x$ items you can compute a maximum probability that a sold part is actually not in accordance with the alternative hypothesis $gamma leq Gamma$.
$endgroup$
– Martijn Weterings
Feb 4 at 9:01
$begingroup$
Confidence intervals are not only useful but essential in some fields, such as physics. Unfortunately, most of the noise regarding CIs comes from Bayesians caught up in fake debates with Frequentists, usually in the context of the social "sciences" and other science-like disciplines.
Suppose that I measure a quantity in physics, such as electric charge. I would always supply it with a measure of the uncertainty of the value, which is usually a standard deviation. Since errors in physics are often Gaussian, this translates directly into a CI. When the errors are not Gaussian, it gets a bit more complicated, as some integrals need to be evaluated, but usually nothing too esoteric.
Here's a brief presentation on the CI in particle physics, and the definition:
quantitative statement about the fraction of times that such an interval would contain the true value of the parameter in a large number of repeated experiments
Note that in physics, "repeated experiments" often has a literal meaning: it's assumed that you can actually repeat the experiments in the paper and would actually observe that fraction. So the CI has an almost literal meaning to you, and is just a way to express information about the uncertainty of the measurement. It's not a thought experiment, not a subjective opinion, not your or my feelings about likelihoods, etc. It's what you were able to derive from your experiments, and what I should be able to observe when reproducing your experiment.
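As a minimal sketch of this Gaussian case (my own illustration, not from the answer; the measurement values are made up), the reported value and its standard error translate directly into a 95% CI:
import numpy as np
# Hypothetical repeated measurements of one physical quantity (arbitrary units)
measurements = np.array([1.6021, 1.6019, 1.6023, 1.6020, 1.6022])
mean = measurements.mean()
se = measurements.std(ddof=1) / np.sqrt(len(measurements))  # standard error
# With Gaussian errors, mean +/- 1.96*se is the 95% CI
lo, hi = mean - 1.96 * se, mean + 1.96 * se
print(f"{mean:.5f} +/- {1.96 * se:.5f}  ->  95% CI [{lo:.5f}, {hi:.5f}]")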
$endgroup$
answered Jan 31 at 18:05
Aksakal
$begingroup$
Note that under the strict definition of a confidence interval, it is possible for one to be completely meaningless, i.e., not informative about the parameter of interest. In practice, however, they are generally very meaningful.
As an example of a meaningless confidence interval, suppose I have a procedure that 95% of the time produces $[0,1]$, and 5% of the time produces $[U_{min}, U_{max}]$, where $U_{min}, U_{max}$ are any pair of random variables such that $U_{min} < U_{max}$. This procedure captures any probability at least 95% of the time, so it is technically a valid confidence interval procedure for any probability. Yet if I told you that the interval it produced was $[0.01, 0.011]$ for a given $p$, you should realize that you have really learned nothing about $p$.
On the other hand, most confidence intervals are constructed in a more useful fashion. For example, if I told you it was created using a Wald Interval procedure, then we know that
$\hat{p} \,\dot\sim\, N(p, s_e)$
where $s_e$ is the standard error. This is a very meaningful statement about how $\hat{p}$ relates to $p$. Turning it into a confidence interval is simply an attempt to convey this result to someone who is not so familiar with normal distributions. That's not to say it is only a tool for people who don't know about normal distributions; the percentile bootstrap, for example, is a tool for summarizing the error between the estimator and the true parameter when the distribution of that error may be non-Gaussian.
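A quick simulation sketch (my own illustration of the procedure described above; drawing the junk intervals as sorted uniform pairs is an assumption) shows that the pathological procedure attains at least nominal coverage for any $p$, even though its 5% of junk intervals carry no information:
import numpy as np
rng = np.random.default_rng(0)
p, reps = 0.37, 100_000  # any true value in [0, 1] works
covered = 0
for _ in range(reps):
    if rng.random() < 0.95:
        lo, hi = 0.0, 1.0                # [0, 1] trivially covers every p
    else:
        lo, hi = np.sort(rng.random(2))  # arbitrary junk interval
    covered += (lo <= p <= hi)
print(f"coverage: {covered / reps:.3f}")  # >= 0.95, yet the junk intervals say nothing about p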
$endgroup$
edited Feb 1 at 1:17
answered Jan 31 at 18:01
Cliff AB
$begingroup$
This thread has devolved quickly into the Frequentist vs Bayesian debate, and that is not easily resolvable. The math in both approaches is solid, so it always comes down to philosophical preferences. The frequentist interpretation of probability as the limit of an event's relative frequency is justified by the strong law of large numbers; regardless of your preferred interpretation of probability, an event's relative frequency will converge to its probability with probability 1.
Frequentist confidence intervals are indeed trickier to interpret than Bayesian credible intervals. By treating an unknown quantity as a random variable, Bayesians can assert that one interval contains that quantity with some probability. Frequentists refuse to treat some quantities as random variables, and an equation containing only constants can only be true or false. So when estimating an unknown constant, frequentists must bound it with a RANDOM interval for probability to enter the picture at all. Rather than one interval containing a random variable with some probability, a frequentist method generates many different possible intervals, some of which contain the unknown constant. If the coverage probability is reasonably high, it's a reasonable leap of faith to assert that a particular interval contains the unknown constant (note: not "with some probability").
A Bayesian would balk at such a leap of faith as much as a Frequentist balks at treating any unknown quantity as a random variable. The frequentist Neyman construction method in fact exposed an embarrassing issue with such leaps of faith. Without actively preventing it (see Feldman and Cousins, 1997 for one approach), rare outcomes may generate EMPTY confidence intervals for a distribution parameter. Such a leap of faith would be very unreasonable! I've seen a few Bayesians using that example to mock frequentist methods, while frequentists typically respond with "well I still get a correct interval most of the time, and without making false assumptions." I'll point out that the Bayesian/frequentist impasse is not important to most who apply their methods. Even people who are committed to coverage probability will often use Bayesian methods if the methods are shown to have good coverage probability in simulations.
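As a sketch of how empty intervals can arise (my own illustration of the bounded-parameter situation that Feldman and Cousins address, not their construction): measure $x \sim N(\mu, 1)$ with the physical constraint $\mu \ge 0$, and intersect the standard 95% interval with the allowed region:
import numpy as np
rng = np.random.default_rng(1)
mu, reps = 0.0, 100_000  # true mean sits on the physical boundary mu >= 0
empty = 0
for _ in range(reps):
    x = rng.normal(mu, 1.0)
    lo, hi = x - 1.96, x + 1.96  # standard 95% interval for an unbounded mean
    # Intersecting [lo, hi] with the allowed region [0, inf) leaves nothing
    # whenever the whole interval lies below zero:
    empty += (hi < 0.0)
print(f"fraction of empty intervals: {empty / reps:.4f}")  # about 0.025 at mu = 0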
$endgroup$
edited Jan 31 at 20:20
answered Jan 31 at 18:45
BatWannaBe