Tech Notes

My notes on Statistics, Big Data, Cloud Computing, Cyber Security

Hypothesis Testing

Hypothesis testing is based on the idea that we can draw conclusions about a population from a sample taken from it.

5 Steps

  1. Hypothesis
  2. Significance
  3. Sample
  4. P-Value
  5. Decide

Inferential Statistics is based on the premise that you cannot prove something to be true, but you can disprove something by finding an exception.

You decide what you want to find evidence for (H1 – there is an effect), i.e. the alternative hypothesis, then set up the null hypothesis (H0 – there is no effect) and look for evidence to reject it.

This is a statistical method for testing whether the factor under study has any effect on our observations.

In other words, this helps us decide if

  • We should believe that the relationship we found in our sample is the same as the relationship we would find if we tested the population
  • OR We should believe that the relationship we found in our sample is a coincidence due to sampling error


General Procedure :

  1. Based on the research question, develop the first statistical hypothesis, called the null hypothesis.
  2. Develop another hypothesis, called the alternative hypothesis.
  3. Decide the level of statistical significance (usually 0.05), which is also the probability of Type I error.
  4. Run the null hypothesis significance test (NHST) and determine the P-Value under the null hypothesis. Reject the null hypothesis if the P-Value is smaller than the level of statistical significance you decided on.
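
The four steps above can be sketched in code. This is only a minimal illustration, not a general recipe: it assumes a one-sided z-test with a known population standard deviation, and the sample values, mu0 and sigma are made-up numbers. The function name z_test_greater is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test_greater(sample, mu0, sigma, alpha=0.05):
    """One-sided z-test of H0: mu <= mu0 against H1: mu > mu0.

    Assumes the population standard deviation sigma is known.
    """
    n = len(sample)
    # Test statistic: how many standard errors the sample mean lies above mu0.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # P-Value: probability of a z at least this large if H0 is true.
    p_value = 1.0 - NormalDist().cdf(z)
    # Decision: reject H0 when the P-Value is below the significance level.
    return z, p_value, p_value < alpha

# Made-up measurements, for illustration only.
sample = [52.1, 49.8, 53.4, 51.0, 50.7, 54.2, 52.9, 51.5]
z, p_value, reject = z_test_greater(sample, mu0=50.0, sigma=2.0)
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

The significance level is fixed before looking at the data (step 3), which is why alpha is a parameter of the function and not something derived from the sample.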

Steps


Hypothesis testing asks the question: is the observed difference between the sample statistic (real world) and the population value assumed under the null hypothesis (theoretical world) just due to chance, or is it a significant difference?

Null hypothesis says

  • “Nothing is happening”
  • “Status quo”
  • “There is no relationship”

Alternatives could be

  • One sided (greater than or less than) (One tailed)
  • Two sided (not equal) (Two tailed)

How do we identify what goes under the null hypothesis and what goes under the alternative hypothesis?

The statement that contains “equal to” goes under the null hypothesis (H0), and the statement with “not equal to”, “less than” or “greater than” goes under the alternative hypothesis (H1).

Example :

Claim : The mean amount of waste recycled per day is more than 1 pound per person (over the population)
Sample – 12 people, found to be recycling an average of 1.46 pounds per day with SD = 0.58. Alpha = 0.05

  • Hypothesis
    • H0 – Mean is less than or equal to 1 pound per day
    • H1 – Mean is greater than 1 pound per day

Why does the claim go under H1 and not under H0? That's because H0 always contains the “equal to” case, and the claim here (“more than 1 pound”) is a strict inequality.
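
The recycling example can be worked through numerically. The sketch below assumes a one-sample, one-sided t-test is the appropriate test here (the notes do not say which test was used) and uses only the Python standard library. The helper names t_pdf and t_upper_tail are hypothetical, and the tail probability is approximated by numerical integration rather than read from a t-table.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The distribution is symmetric, so P(T >= 0) = 0.5.
    return 0.5 - total * h / 3

# Numbers taken from the example above.
n, sample_mean, sample_sd = 12, 1.46, 0.58
mu0, alpha = 1.0, 0.05

t_stat = (sample_mean - mu0) / (sample_sd / sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)  # right tail, because H1 is "greater than"
reject_h0 = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

With these numbers the t statistic comes out at about 2.75 with 11 degrees of freedom, and the P-Value is below 0.05, so H0 is rejected and the sample supports the claim.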


Test Statistic

A test statistic is a quantity calculated from our sample of data. Its value is used to decide whether or not the null hypothesis should be rejected in our hypothesis test.
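
As an illustration, assuming a one-sample t-test is used for the recycling example (sample mean 1.46, SD 0.58, n = 12, hypothesised mean 1), the test statistic would be:

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{1.46 - 1}{0.58 / \sqrt{12}}
  \approx 2.75,
\qquad \text{df} = n - 1 = 11
```

Here \bar{x} is the sample mean, \mu_0 the mean under the null hypothesis, s the sample standard deviation and n the sample size.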


If the P-Value is small (below the chosen significance level), reject the null hypothesis in favour of the alternative hypothesis. That means that “nothing going on” or “just by chance” is judged implausible and the result is statistically significant. The smaller the P-Value, the stronger the evidence against the null hypothesis.

A t-test or an ANOVA for comparing means is a typical example of NHST.

One-sided Test (aka One tailed test)
A one-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0 are located entirely in one tail of the probability distribution.

Eg : H1: µ < 50 would be a left tailed test and H1: µ > 50 would be a right tailed test

Two-Sided Test

A two-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H0 are located in both tails of the probability distribution.

H1: µ not equal to 50
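
To see how the choice of tail changes the P-Value, here is a small sketch. It assumes the test statistic is a z value that follows a standard normal distribution under H0, and the value 1.8 is a made-up example. The function name p_values is hypothetical.

```python
from statistics import NormalDist

def p_values(z):
    """P-Values for the same z statistic under each kind of alternative."""
    std_normal = NormalDist()
    left = std_normal.cdf(z)             # H1: mu < mu0 (left tailed)
    right = 1.0 - left                   # H1: mu > mu0 (right tailed)
    two_sided = 2.0 * min(left, right)   # H1: mu != mu0 (two tailed)
    return {"left": left, "right": right, "two_sided": two_sided}

result = p_values(1.8)
for name, p in result.items():
    print(f"{name:>9}: p = {p:.4f}")
```

For z = 1.8 the right tailed P-Value is below 0.05 but the two tailed P-Value is not, so the same sample can be significant under a one-sided alternative and not under a two-sided one. This is why the alternative has to be chosen before looking at the data.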

Type I Error
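In a hypothesis test, a type I error occurs when the null hypothesis is rejected when it is in fact true; that is, H0 is wrongly rejected. This is often considered to be more serious than a type II error. The probability of a type I error is the significance level chosen for the test.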

Type II Error
In a hypothesis test, a type II error occurs when the null hypothesis H0 is not rejected when it is in fact false. Its probability is usually denoted β.
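
Both kinds of error can be observed in a small simulation. This is a rough sketch under stated assumptions: a two-sided z-test with known sigma, normally distributed data, and made-up values for mu0, sigma, n and the true means. The function name rejection_rate is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def rejection_rate(true_mean, mu0=50.0, sigma=10.0, n=30, alpha=0.05,
                   trials=2000, seed=0):
    """Fraction of simulated samples in which H0: mu = mu0 is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a type I error.
type_1_rate = rejection_rate(true_mean=50.0)
# H0 is false: every failure to reject is a type II error.
type_2_rate = 1 - rejection_rate(true_mean=55.0)
print(f"type I error rate:  {type_1_rate:.3f}")
print(f"type II error rate: {type_2_rate:.3f}")
```

When H0 is true, the rejection rate should land close to the significance level. The type II rate depends on how far the true mean is from mu0 and on the sample size.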

Power
The power of a statistical hypothesis test measures the test’s ability to reject the null hypothesis when it is actually false – that is, to make a correct decision.

In other words, the power of a hypothesis test is the probability of not committing a type II error, i.e. 1 − β, where β is the probability of a type II error.
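
For a simple case, power can be computed directly. The sketch below assumes a one-sided z-test (H0: µ ≤ µ0 against H1: µ > µ0) with known sigma, and all numbers are made up for illustration. The function name power_one_sided_z is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the z-test of H0: mu <= mu0 vs H1: mu > mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)    # reject H0 when z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))   # mean of z when the true mean is mu1
    return 1.0 - std_normal.cdf(z_crit - shift)

# Power grows with the sample size for a fixed true difference.
for n in (10, 30, 100):
    print(f"n = {n:3d}: power = {power_one_sided_z(50, 53, 10, n):.3f}")
```

When the true mean equals mu0 the function returns alpha, which matches the definition of the type I error rate.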

Disclaimer : These are my study notes – online – instead of on paper so that others can benefit. In the process I have used some pictures / content from other original authors. All sources / original content publishers are listed below and they deserve credit for their work. No copyright violation intended.

References for these notes :
