Hypothesis Test Overview

曾修真
2023-12-01

$H_0$

  • Null hypothesis: our default assumption about the value of the population parameter of interest. For example, in simple linear regression (SLR), the null hypothesis is that the true model’s slope, $\beta_1$, equals $0$. Note that in this example $H_0$ says nothing about the estimator of the true slope, $\hat{\beta}_1$, which is a random variable, not a constant. Until your sample provides enough evidence to reject this hypothesis, it is assumed to be true.

$H_a$

  • Alternative hypothesis: also a hypothesis about the population parameter. This is the hypothesis we want to find evidence for. For example, in SLR we want to test whether the slope is significant, i.e. non-zero, so our $H_a$ is $\beta_1 \neq 0$. Notice that $H_0$ and $H_a$ together do not have to enumerate all possible behaviors of the parameter. For example, in SLR you can have $H_0: \beta_1 = 0$; $H_a: \beta_1 > 0$, i.e. your test can be two-tailed or one-tailed. Once you reject the null hypothesis, you can conclude that the sample supports the alternative hypothesis, but this does not mean that the alternative hypothesis is TRUE, since we can never know the truth about the population.

Test statistic

  • A test statistic is a statistic (a quantity derived from the sample) used in statistical hypothesis testing (from Wikipedia).
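As a concrete illustration of a test statistic, here is a minimal sketch of a two-sample $t$ statistic. It assumes the pooled-variance (equal population variances) form, which the text above does not specify; the function name `pooled_t_statistic` and the data are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample1, sample2):
    """Two-sample t statistic, assuming equal population variances (pooled form)."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled sample variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    # Standard error of the difference in sample means.
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / standard_error


# Hypothetical data, for illustration only.
y1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y2 = [2.0, 3.0, 4.0, 5.0, 6.0]
t0 = pooled_t_statistic(y1, y2)
print(t0)  # difference of means is -1 and the standard error is 1, so t0 is -1.0
```

The statistic is a single number computed from the samples alone; everything else in a test (critical value, $p$-value) is about where this number falls in its sampling distribution under $H_0$.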

Significance Level $\alpha$

  • See this post for intuitive understanding: https://blog.csdn.net/Bill_Wang_01/article/details/115673440
  • This is the probability of committing a Type I error, i.e. the probability of rejecting the null hypothesis when it is actually true (whereas a Type II error is failing to reject the null hypothesis when it is actually false). This value is usually set to $5\%$ for t-tests or z-tests, meaning that you accept in advance a $5\%$ probability of rejecting the null hypothesis even though it is true. To avoid committing a Type I error, you do not reject the default hypothesis until your sample presents enough evidence. In other words, you will not be convinced that the default hypothesis is false until the estimate calculated from your sample is so extreme, compared with the value in the null hypothesis, that holding on to the default hypothesis clearly contradicts the evidence presented by the sample.
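The meaning of $\alpha$ as a Type I error rate can be checked with a small simulation. The sketch below is not from the original post: it assumes a two-tailed z-test with known standard deviation $1$, draws many samples for which $H_0$ (mean equals $0$) is actually true, and counts how often the test rejects anyway. The function name and all parameter values are illustrative.

```python
import random
from statistics import NormalDist, mean


def type_one_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Fraction of samples drawn under a true H0 that a two-tailed z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the population mean really is 0, with sd 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n ** 0.5)  # z statistic with known sd = 1
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials


rate = type_one_error_rate()
print(rate)  # should land close to alpha = 0.05
```

With more trials the observed rejection rate settles closer to $\alpha$, which is exactly the preset probability of rejecting a true null hypothesis.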

Critical value of the test statistic given the significance level, $t_{\alpha,n}$

  • This is the value of the test statistic that corresponds to the given significance level. You often look it up in a t table or z table for t-tests or z-tests.
  • For example, for a two-tailed $t$-test with a significance level of $5\%$ and sample size $n$, the critical $t$ value is denoted $t_{\alpha/2,n-1}$.
  • The $t$ distribution is a family of sampling distributions, one for each number of degrees of freedom ($n-1$ in the example above), so you must specify the sample size to pick the correct $t$ distribution to approximate the sampling distribution of your statistic, usually the difference in sample means.
  • Notice that when you are using two samples, the
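Looking up a critical value amounts to inverting a cumulative distribution function. The sketch below uses the standard normal distribution from Python's standard library, so it gives z critical values only; it is a stand-in for the table lookup, not the $t$ lookup itself. A $t$ critical value additionally needs the degrees of freedom and a $t$-distribution quantile function (for example `scipy.stats.t.ppf`, which is not used here). The helper name `z_critical_value` is hypothetical.

```python
from statistics import NormalDist


def z_critical_value(alpha, two_tailed=True):
    """Positive z critical value for significance level alpha."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


z_two = z_critical_value(0.05, two_tailed=True)   # about 1.96
z_one = z_critical_value(0.05, two_tailed=False)  # about 1.645
print(round(z_two, 3), round(z_one, 3))
```

The one-tailed critical value is smaller because the whole $5\%$ sits in a single tail instead of being split in two.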

$p$-value

  • This is the probability of getting a sample at least as extreme as yours, assuming the null hypothesis is true. If it is no larger than the significance level, you should reject the null hypothesis: the evidence shows that, if the null hypothesis were true, drawing an at-least-as-extreme sample would be very unlikely, so the default hypothesis is hard to reconcile with the data. However, notice that even if we conclude that the null hypothesis is implausible, we cannot know the truth. For the word “extreme”: (a) if your test is two-tailed, it means “will produce a test statistic whose absolute value is larger than or equal to that of yours”; (b) if your test is one-tailed, it means “will produce a test statistic whose value is larger than or equal to that of your sample (when $H_a$ points to the upper tail) or smaller than or equal to that of your sample (when $H_a$ points to the lower tail)”.
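The two meanings of “extreme” translate directly into how the tail area is computed. The following sketch assumes a z statistic (standard normal under $H_0$) so that it runs with the standard library alone; for a $t$ statistic the same logic applies with the $t$ distribution's CDF. The function name `p_value` and the labels `"two-sided"`, `"greater"`, and `"less"` are illustrative choices, not part of the original post.

```python
from statistics import NormalDist


def p_value(z0, alternative="two-sided"):
    """Tail probability of a z statistic at least as extreme as z0 under H0."""
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        # Extreme means |Z| >= |z0|: add up both tails.
        return 2 * (1 - cdf(abs(z0)))
    if alternative == "greater":
        # Extreme means Z >= z0: upper tail only.
        return 1 - cdf(z0)
    if alternative == "less":
        # Extreme means Z <= z0: lower tail only.
        return cdf(z0)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


print(round(p_value(1.96), 3))              # roughly 0.05
print(round(p_value(-2.0, "greater"), 3))   # large: a negative z0 is not extreme for an upper-tail test
```

Note how the same statistic gives very different $p$-values depending on the direction of the alternative hypothesis.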

Two Ways to Draw a Conclusion in a Two-Sample t-test

Using the $p$-value: Compare $p$ to $\alpha$

  • $p \le \alpha$: reject $H_0$
  • Otherwise: fail to reject $H_0$

Using the Critical Region: Compare the test statistic you got to the critical value

  • The critical region is the set of values of the test statistic for which the null hypothesis will be rejected.

Caveat

  • For two-sample t-tests, when your test is one-tailed, you cannot simply compare $|t_0|$ with $t^*$. For example, suppose your alternative hypothesis is $\mu_1 > \mu_2$ and your statistic is based on $\bar{y}_1 - \bar{y}_2$, but you got an extremely negative $t_0$ value. In this case you should compare $t_0$ with $t^*$ (positive), and you should not reject the null hypothesis. If you compared $|t_0|$ with $t^*$ instead, you might end up rejecting the null hypothesis.
  • Pay attention to where the “tail” is when the test is one-tailed!
  • Two-tailed
    • $|t_0| \ge t^*$: reject $H_0$
    • $|t_0| < t^*$: fail to reject $H_0$
  • One-tailed
    • $H_a: \mu_1 < \mu_2$
      • $t_0 \le -t^*$: reject $H_0$
      • $t_0 > -t^*$: fail to reject $H_0$
    • $H_a: \mu_1 > \mu_2$
      • $t_0 \ge t^*$: reject $H_0$
      • $t_0 < t^*$: fail to reject $H_0$
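The decision rules above can be written as one small function. This is a minimal sketch: the name `reject_null` and the labels `"two-sided"`, `"less"` ($H_a: \mu_1 < \mu_2$), and `"greater"` ($H_a: \mu_1 > \mu_2$) are hypothetical, and `t_crit` is assumed to be the positive critical value $t^*$ already looked up for the chosen $\alpha$ and tail.

```python
def reject_null(t0, t_crit, alternative="two-sided"):
    """Return True if H0 is rejected, given statistic t0 and positive critical value t_crit."""
    if t_crit <= 0:
        raise ValueError("t_crit must be the positive critical value t*")
    if alternative == "two-sided":
        return abs(t0) >= t_crit
    if alternative == "less":      # H_a: mu_1 < mu_2, critical region is the lower tail
        return t0 <= -t_crit
    if alternative == "greater":   # H_a: mu_1 > mu_2, critical region is the upper tail
        return t0 >= t_crit
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")


# The caveat in action: an extremely negative statistic with H_a: mu_1 > mu_2.
print(reject_null(-3.0, 2.0, "greater"))    # False: the evidence points the wrong way
print(reject_null(-3.0, 2.0, "two-sided"))  # True: comparing |t0| would reject
```

Keeping the direction of $H_a$ as an explicit argument makes it harder to fall into the mistake of comparing $|t_0|$ with $t^*$ in a one-tailed test.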