This post reviews the hypothesis test (significance test), clarifying the topic in two parts: the logic of the test and the concepts behind it.
1.Logic of Hypothesis Test
0) What is a hypothesis test?
If we doubt an existing hypothesis or assumption, we can run a hypothesis test to check whether the sample data give enough evidence to support our doubt, that is, to reject the original hypothesis.
For example:
Next, we introduce some concepts about hypotheses:
- $H_0$ (Null hypothesis): The existing claim that we doubt.
- $H_a$ (Alternative hypothesis): Our guess, that is, the new hypothesis.
- Single-sided and double-sided hypotheses (here $\theta$ is the population parameter and $\theta_0$ is the value stated in $H_0$):
- If our new hypothesis has the form $H_a: \theta \neq \theta_0$, it is a double-sided hypothesis;
- If our new hypothesis has the form $H_a: \theta > \theta_0$ or $H_a: \theta < \theta_0$, it is a single-sided hypothesis.
PAY ATTENTION: Every hypothesis is a statement about a population parameter, not about a sample statistic.
1) Set up Hypothesis
To set up the null hypothesis, we just need to identify the claim in question. A characteristic of the null hypothesis is that it is the "no news" statement: if it is actually true, nothing new has happened.
To set up the alternative hypothesis, we take the parameter value from the null hypothesis and then choose a single-sided or double-sided form.
2) Set up Significance level
PAY ATTENTION: We must set the significance level before carrying out the calculation. Choosing a significance level after the fact to suit the result and produce an attractive conclusion is an ethical problem.
3) Take Sample and Calculate
4) Make Conclusion
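The four steps above can be sketched end to end. This is a minimal sketch, assuming a one-sample z-test for a proportion with made-up numbers (60 successes in 100 trials, $H_0: p = 0.5$ vs. $H_a: p > 0.5$); the function name `one_proportion_z_test` and the data are illustrative and do not come from the original example.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05, alternative="two-sided"):
    """Return (z, p_value, reject) for H0: p = p0, using the normal approximation."""
    p_hat = successes / n
    # The standard error uses p0 because the p-value is computed assuming H0 is true.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: p > p0 (single-sided)
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":         # Ha: p < p0 (single-sided)
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":    # Ha: p != p0 (double-sided)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value, p_value < alpha


# Step 1: set up the hypotheses (hypothetical): H0: p = 0.5 vs Ha: p > 0.5
# Step 2: fix the significance level before looking at the data
alpha = 0.05
# Step 3: take a sample (hypothetical data) and calculate
z, p_value, reject = one_proportion_z_test(60, 100, p0=0.5, alpha=alpha,
                                           alternative="greater")
# Step 4: make the conclusion
decision = "reject H0" if reject else "fail to reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

With these made-up numbers the p-value is about 0.023, which is below $\alpha = 0.05$, so the sketch rejects $H_0$. The normal approximation is only one possible choice; an exact binomial test would be an equally valid way to carry out step 3.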
2.Concepts of Hypothesis Test
1) What are the p-value and the significance level?
- $p\text{-value}$: The probability, assuming $H_0$ is true, of getting a sample statistic at least as extreme as the one observed.
- $\alpha$: We call it the significance level. It is a threshold that quantifies the word “extreme”. In other words, it is a relatively small probability that indicates whether the $p\text{-value}$ is small enough to shake our belief in $H_0$.
- $power$: The probability of not making a Type II error.
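The three quantities above can be computed exactly in a small discrete setting. The following is a sketch under assumed numbers: a coin flipped 100 times, $H_0: p = 0.5$ vs. $H_a: p > 0.5$, 60 heads observed, and a true value of $p = 0.6$ when computing power. All of these values and the helper names `binom_pmf` and `upper_tail` are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


# Hypothetical setting: H0: p = 0.5 vs Ha: p > 0.5
n, p0, alpha = 100, 0.5, 0.05
observed = 60

# p-value: probability, assuming H0, of a count at least as extreme as observed.
p_value = upper_tail(observed, n, p0)

# alpha defines "extreme": the rejection region starts at the smallest count k
# whose upper-tail probability under H0 does not exceed alpha.
critical = next(k for k in range(n + 1) if upper_tail(k, n, p0) <= alpha)

# power: probability of landing in the rejection region when H0 is false
# (here assuming the true p is 0.6), i.e. of NOT making a Type II error.
power = upper_tail(critical, n, 0.6)

print(f"p-value = {p_value:.4f}")
print(f"reject H0 when heads >= {critical}")
print(f"power at p = 0.6: {power:.3f}")
```

Note that the p-value depends on the observed data, while $\alpha$ and the rejection region are fixed beforehand, and power additionally depends on which alternative value is assumed to be true.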
2) Type I Error & Type II Error
i) Understanding the concepts
Meaning of Type I Error: The original hypothesis ($H_0$) is actually true, but because an extreme event happens ($p\text{-value} < \alpha$), we conclude that the hypothesis might be wrong and reject it. Clearly, if $\alpha$ is too large, we can easily make a Type I error.
Meaning of Type II Error: The original hypothesis ($H_0$) is actually false, but nothing unusual seems to happen ($p\text{-value} \geq \alpha$), so we conclude that the hypothesis may be true and fail to reject it. Clearly, if $\alpha$ is too small, we can easily make a Type II error. Another way to think about a Type II error is to start from the concept of $power$: the probability of a Type II error is $1 - power$.
Trade-off problems: There is a trade-off between Type I and Type II errors, which means we need to set an appropriate $\alpha$ to “balance” the probabilities of these two types of error. Here is an example of the trade-off problem:
Employees at a health club do a daily water quality test in the club’s swimming pool. If the level of contaminants are too high, then they temporarily close the pool to perform a water treatment.
We can state the hypotheses for their test as $H_0$: The water quality is acceptable vs. $H_a$: The water quality is not acceptable. Consider the following two questions:
- In terms of safety, which error has the more dangerous consequences in this setting?
- What significance level should they use to reduce the probability of the more dangerous error?
FROM Khan Academy
What affects the error probabilities:
- Significance level $\alpha$: If $\alpha \uparrow$, then $P(\text{Type I error}) \uparrow$ and $power \uparrow$.
- Sample size $n$: If $n \uparrow$, then $power \uparrow$. But it does not affect the likelihood of a Type I error. Larger samples are still preferred since they produce less variable results, but we will still reject a true $H_0$ at a rate equal to the significance level $\alpha$.
- Statistic variability: If the variability $\downarrow$, then $power \uparrow$. This matches intuition: when the variability of the statistic is low, an unusual result is easier to spot, so we have a higher probability of rejecting a false $H_0$. But we cannot control this variable.
- Distance from the true parameter to the value in $H_0$: If the distance $\uparrow$, then $power \uparrow$. This matches intuition: when the true value is far from the value in $H_0$, we have a higher probability of rejecting $H_0$. But we cannot control this variable.
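The four factors in the list above can be checked numerically. This is a minimal sketch, assuming a one-sided z-test of $H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$ with known population standard deviation, for which power has the closed form $1 - \Phi(z_{1-\alpha} - \delta\sqrt{n}/\sigma)$. The function name `z_test_power`, the symbols `sigma` and `delta`, and all numbers are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_test_power(alpha, n, sigma, delta):
    """Power of a one-sided z-test of H0: mu = mu0 vs Ha: mu > mu0.

    sigma: known population standard deviation (assumed).
    delta: true mean minus mu0, i.e. the distance from the true parameter to H0.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_crit - delta * sqrt(n) / sigma)


# Hypothetical baseline, then one factor changed at a time.
base = dict(alpha=0.05, n=25, sigma=10.0, delta=4.0)
scenarios = {
    "baseline": base,
    "larger alpha": {**base, "alpha": 0.10},
    "larger n": {**base, "n": 100},
    "smaller variability": {**base, "sigma": 5.0},
    "larger distance": {**base, "delta": 8.0},
}
for name, kwargs in scenarios.items():
    print(f"{name:>20}: power = {z_test_power(**kwargs):.3f}")

# With delta = 0 the null hypothesis is true, so the rejection rate is the
# Type I error rate; it stays at alpha no matter how large n is.
for n in (25, 400):
    print(f"n = {n:>3}: P(Type I error) = {z_test_power(0.05, n, 10.0, 0.0):.3f}")
```

Each changed scenario should print a larger power than the baseline, while the last loop illustrates that sample size leaves the Type I error rate at $\alpha$.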
ii) Some associations
- When reading papers you will often come across the following table, which originates from statistics:
Table of error types | $H_0$ is true (in reality) | $H_0$ is false (in reality) |
---|---|---|
Fail to reject $H_0$ | Correct inference | Type II error (False Negative) |
Reject $H_0$ | Type I error (False Positive) | Correct inference |
- ROC curve
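Both associations can be illustrated by simulation. The sketch below assumes a one-sided z-test on normally distributed data with known standard deviation; the function name `rejection_rate`, the true means 0.0 and 0.5, and the trial counts are hypothetical choices. It estimates the two error cells of the table, then sweeps $\alpha$ to obtain (false positive rate, true positive rate) pairs, which are the points of an ROC curve.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=25, alpha=0.05,
                   trials=4000, seed=0):
    """Fraction of simulated samples in which H0: mu = mu0 is rejected
    in favour of Ha: mu > mu0 (one-sided z-test, sigma assumed known)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        rejections += z > z_crit
    return rejections / trials


# First column of the table: H0 is true, so every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)
# Second column: H0 is false (true mean assumed to be 0.5), so every
# failure to reject is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mean=0.5, seed=1)
print(f"estimated P(Type I error)  = {type_i_rate:.3f}")
print(f"estimated P(Type II error) = {type_ii_rate:.3f}")

# Sweeping alpha gives (false positive rate, true positive rate) pairs.
roc_points = [
    (rejection_rate(0.0, alpha=a), rejection_rate(0.5, alpha=a, seed=1))
    for a in (0.01, 0.05, 0.10, 0.20)
]
for fpr, tpr in roc_points:
    print(f"FPR = {fpr:.3f}, TPR = {tpr:.3f}")
```

The estimated Type I error rate should land close to the chosen $\alpha$, and both rates in the sweep rise together as $\alpha$ grows, which is the trade-off discussed earlier seen from the ROC point of view.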