Week 2 – Sept 22: T-Test

T Test

Today, Im exploring T-test and its significance. T test is a type of hypothesis testing and its a very important tool in data science. A hypothesis is any testable assupmtion about the data set and hypothesis testing allows us to validate these assumptions

T-test is predominantly used to understand whether the differrence in means of two datasets have any statistical significance. For T test to provide any meaningful insights, the datasets has to satisfy the following conditions

      • The data sets must be normally distributed, i.e, the shape must resemble a bell curve to an extent
      • are independent and continuous, i.e., the measurement scale for data should follow a continuous pattern.
      • Variance of data in both sample groups is similar, i.e., samples have almost equal standard deviation

Hypotheses:

    • H0: There is a significant differrence between the means of the data sets
    • H1: There is no significant differrence between the means of the data sets

T Test code

Results:

Reject the null hypothesis. There is a significant difference between the datasets.
T-statistic: -8.586734600367794
P-value: 1.960253729590773e-17

Note:

This T-test does not provide any meaningful insights as two of the requisite conditions are violates

    1. The datasets are not normally distributes
    2. the variances of the datasets are not quite similar

Leave a Reply

Your email address will not be published. Required fields are marked *