We often hear companies advertise dandruff shampoos that they claim are 100% effective. Are such results realistically possible?
Hypothesis tests give us a way to use samples to test whether statistical claims about a population are likely to be true.
What is Hypothesis Testing?
Hypothesis testing is a statistical method for testing an assumption about a population parameter. The technique used depends on the type of data and the purpose of the analysis.
Why is it done?
Hypothesis testing is used when we want to assess, with the help of sample data, whether an assumption or hypothesis is plausible. This data can be either generated or extracted from a larger population.
Types of Hypothesis
Null hypothesis: This is the default assumption being tested. It is presumed true and is retained unless the sample gives enough evidence to reject it at some level of significance. It is usually the most conservative possibility: any observed difference in proportions is attributed simply to chance. It is denoted by H0.
Alternative hypothesis: This is the claim we accept when the null hypothesis is rejected at some level of significance. It is a less conservative possibility and the one we are actually interested in: any observed difference in proportions is not due to chance alone. It is denoted by H1 or HA.
One-tailed and Two-tailed Hypothesis Testing
A two-tailed hypothesis test refers to a situation where the alternative hypothesis (H1) allows for a difference in both directions (less than and greater than) from the value of the parameter specified in the null hypothesis (H0).
Eg: The Garnier Men Power White hypothesis. This required a two-tailed test.
A one-tailed hypothesis test refers to a situation where the alternative hypothesis (H1) allows for a difference in only one direction (either less than or greater than, but not both) from the value of the parameter specified in the null hypothesis (H0).
Eg: The coin is biased towards heads with a probability of 0.8. This would require a one-tailed test.
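The practical difference between the two kinds of test shows up in how the p-value is computed from the same test statistic. The following is a minimal sketch, assuming the test statistic follows a standard normal distribution under H0. The function name `p_value_from_z` and the value z = 1.8 are illustrative assumptions, not taken from the examples above.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two-sided"):
    """Return the p-value for a z statistic under a standard normal null.

    tail: "two-sided", "greater" (H1: parameter > null value),
          or "less" (H1: parameter < null value).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "greater":
        return 1.0 - std_normal.cdf(z)
    if tail == "less":
        return std_normal.cdf(z)
    if tail == "two-sided":
        # Count values at least this extreme in both directions.
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown tail: {tail!r}")


z = 1.8  # hypothetical test statistic
one_sided = p_value_from_z(z, "greater")
two_sided = p_value_from_z(z, "two-sided")
print(f"one-tailed p = {one_sided:.4f}, two-tailed p = {two_sided:.4f}")
```

Because the normal distribution is symmetric, the two-tailed p-value is twice the one-tailed one. With this hypothetical z, the result would be significant at the 0.05 level in a one-tailed test but not in a two-tailed test.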
Process of Hypothesis Testing
Step 1: Decide on the hypothesis and choose the test statistic (z, t, F, or others)
We always start with the assumption that the null hypothesis is true. If the evidence leads us to reject it, we accept the alternative hypothesis, H1, in its place.
The test statistic measures how far an observation is from what we would expect to see if the null hypothesis were true. Its value is calculated on the assumption that the distribution of the statistic is known.
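As an illustration of a test statistic, here is a rough sketch of a z statistic for a sample proportion. It assumes the normal approximation to the binomial is acceptable, and the numbers (62 heads in 100 coin flips, tested against a fair coin) are made up for the example.

```python
import math


def z_statistic_proportion(successes, n, p0):
    """z statistic for a sample proportion against a null proportion p0.

    Relies on the normal approximation to the binomial, which is
    reasonable when n * p0 and n * (1 - p0) are both fairly large.
    """
    p_hat = successes / n
    # Standard error is computed under H0, i.e. using p0 rather than p_hat.
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    return (p_hat - p0) / standard_error


# Hypothetical data: 62 heads in 100 flips, H0: the coin is fair (p0 = 0.5).
z = z_statistic_proportion(successes=62, n=100, p0=0.5)
print(f"z = {z:.2f}")  # prints: z = 2.40
```

The statistic says the observed proportion sits about 2.4 standard errors above what a fair coin would be expected to produce.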
Step 2: Specify the critical region (a certain level of certainty or standard needs to be set)
We have to specify the significance level, α. It is a measure of how unlikely you want the results of the sample to be before you reject the null hypothesis, H0.
The critical region is the set of sample values improbable enough under the null hypothesis to justify rejecting it.
At a 95% confidence level, the significance level is 0.05.
At a 99% confidence level, the significance level is 0.01.
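For a z test, the critical region can be written down as cut-off values of the test statistic. The sketch below assumes a standard normal test statistic; `critical_values` and `in_critical_region` are hypothetical helper names chosen for this example.

```python
from statistics import NormalDist


def critical_values(alpha, tail="two-sided"):
    """Return (lower, upper) z cut-offs; None means that side is unbounded."""
    std_normal = NormalDist()
    if tail == "two-sided":
        # Split alpha evenly between the two tails.
        upper = std_normal.inv_cdf(1.0 - alpha / 2.0)
        return (-upper, upper)
    if tail == "greater":
        return (None, std_normal.inv_cdf(1.0 - alpha))
    if tail == "less":
        return (std_normal.inv_cdf(alpha), None)
    raise ValueError(f"unknown tail: {tail!r}")


def in_critical_region(z, alpha, tail="two-sided"):
    """True if z is more extreme than the cut-off(s) for this alpha."""
    lower, upper = critical_values(alpha, tail)
    return (lower is not None and z < lower) or (upper is not None and z > upper)


for alpha in (0.05, 0.01):
    lower, upper = critical_values(alpha)
    print(f"alpha = {alpha}: reject H0 if z < {lower:.3f} or z > {upper:.3f}")
```

A smaller significance level pushes the cut-offs further out, so the same test statistic can fall inside the critical region at α = 0.05 and outside it at α = 0.01.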
Step 3: Find the p-value (it helps us understand how rare the results are, assuming that the null hypothesis is true)
The p-value is the probability of getting, by chance alone, a value at least as extreme as the one in the sample, under the assumption that the null hypothesis is true.
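For small samples the p-value can be computed exactly instead of through a normal approximation. Here is a minimal sketch for a one-tailed coin test, assuming H0 is "the coin is fair" and H1 is "the coin favours heads". The data (9 heads in 10 flips) are invented for illustration.

```python
from math import comb


def binomial_upper_tail(k, n, p0):
    """P(X >= k) for X ~ Binomial(n, p0): an exact one-tailed p-value."""
    return sum(
        comb(n, i) * p0**i * (1.0 - p0) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical data: 9 heads in 10 flips, H0: p = 0.5, H1: p > 0.5.
p_value = binomial_upper_tail(k=9, n=10, p0=0.5)
print(f"p-value = {p_value:.4f}")  # prints: p-value = 0.0107
```

"At least as extreme" here means 9 or 10 heads, so the p-value is (10 + 1) / 1024, or about 1%. A fair coin would rarely produce a result this lopsided.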
Step 4: Is the sample result in the critical region?
- Is the calculated value of the test statistic inside the critical region or outside it?
- Is the p-value less than the significance level or greater than it?
Step 5: Make your decision
Based on the evidence found, the decision to reject or fail to reject the null hypothesis is made. If there isn’t sufficient evidence to reject the null hypothesis, the claims of the company are “accepted” (more precisely, not rejected); if there is sufficient evidence, they are rejected.
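The five steps can be strung together in a few lines. The sketch below is one possible arrangement, not a definitive implementation: it assumes a one-proportion z test with the normal approximation, and the scenario (a company claims its shampoo works for 90% of users, and 168 of 200 sampled users report improvement) is entirely hypothetical.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05, tail="two-sided"):
    """Run a one-proportion z test and return the statistic, p-value and decision."""
    # Step 1: test statistic, computed assuming H0 (proportion = p0) is true.
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1.0 - p0) / n)

    # Steps 2 and 3: alpha defines the critical region; find the p-value.
    std_normal = NormalDist()
    if tail == "less":
        p_value = std_normal.cdf(z)
    elif tail == "greater":
        p_value = 1.0 - std_normal.cdf(z)
    elif tail == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    else:
        raise ValueError(f"unknown tail: {tail!r}")

    # Steps 4 and 5: the result is in the critical region iff p-value < alpha.
    return {"z": z, "p_value": p_value, "reject_null": p_value < alpha}


# Hypothetical claim: H0: p = 0.90, H1: p < 0.90 (the shampoo works less often).
result = one_proportion_z_test(successes=168, n=200, p0=0.90, tail="less")
print(result)
```

In this made-up scenario the p-value is well below 0.05, so the null hypothesis (the company's claim) would be rejected. Comparing the p-value with α gives the same decision as checking whether the test statistic lies in the critical region.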
Type I and Type II Error
Type I error: This is a false positive, i.e. a true null hypothesis is rejected. Its probability is the significance level, α.
Type II error: This is a false negative, i.e. a false null hypothesis is not rejected.
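Both error rates can be estimated by simulation. The following rough sketch assumes a two-tailed z test on the mean of normally distributed data with known standard deviation. The function name `rejection_rate`, the sample size, the effect size of 0.5, and the seed are all arbitrary choices for the illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, null_mean=0.0, sigma=1.0, n=30,
                   alpha=0.05, trials=5000, seed=0):
    """Fraction of simulated samples in which H0: mean = null_mean is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - null_mean) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# H0 is true (true mean equals null mean): every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0)

# H0 is false (true mean is 0.5): every non-rejection is a Type II error.
type_ii_rate = 1.0 - rejection_rate(true_mean=0.5)

print(f"Type I rate ~ {type_i_rate:.3f}, Type II rate ~ {type_ii_rate:.3f}")
```

The estimated Type I rate should land close to α, while the Type II rate depends on the sample size and on how far the true mean is from the null value.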
Real-World Example of Hypothesis Testing
Smoking During Pregnancy and Child’s IQ Study
It investigated the impact of maternal smoking during pregnancy on the IQ of children at ages 1, 2, 3, and 4 years.
Null hypothesis: Mean IQ scores for children whose mothers smoke 10 or more cigarettes a day during pregnancy are the same as the mean for those whose mothers do not smoke, in populations similar to the one from which this sample was drawn.
Alternative hypothesis: Mean IQ scores for children whose mothers smoke 10 or more cigarettes a day during pregnancy are not the same as the mean for those whose mothers do not smoke, in populations similar to the one from which this sample was drawn.
The researchers conducted two-tailed tests to allow for the possibility that the mean IQ score could actually be higher for children whose mothers smoke. The p-value simply tells us that there is a statistically significant difference; the confidence interval (CI) provides evidence of the direction in which the difference falls.
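To see how a CI shows direction while a two-tailed p-value does not, consider the sketch below. It assumes large samples, so a normal approximation is used rather than a t distribution. The summary statistics are invented for illustration and are not the figures from the study.

```python
import math
from statistics import NormalDist


def diff_in_means(mean_a, sd_a, n_a, mean_b, sd_b, n_b, confidence=0.95):
    """Return (ci_low, ci_high, p_value) for mean_a - mean_b.

    Large-sample normal approximation; the p-value is two-tailed.
    """
    std_normal = NormalDist()
    diff = mean_a - mean_b
    standard_error = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z_crit = std_normal.inv_cdf(0.5 + confidence / 2.0)
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(diff / standard_error)))
    return diff - z_crit * standard_error, diff + z_crit * standard_error, p_value


# Hypothetical summary statistics (NOT from the study):
# group A = mothers who smoked, group B = mothers who did not.
low, high, p_value = diff_in_means(
    mean_a=96.0, sd_a=15.0, n_a=100,
    mean_b=101.0, sd_b=15.0, n_b=100,
)
print(f"95% CI for A - B: ({low:.2f}, {high:.2f}), p = {p_value:.4f}")
```

With these made-up numbers the whole interval lies below zero, which indicates that group A's mean is lower. The two-tailed p-value only signals that the difference is unlikely to be zero, without saying which way it goes.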