# Week 4: 2 Oct – Understanding T-test II


In the last post, we saw what the T distribution and the normal distribution are. Now, let us look at some key features of the normal distribution and why they matter for the T-test.

In a normal distribution, data tends to cluster around the mean: about 68% of the data lies within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3. As one moves farther away from the mean, the frequency of data points falls off rapidly (in the tails, faster than exponentially). This implies that how likely a value is depends closely on its proximity to the mean. This relationship is important because it is what makes the mean such an effective descriptor of the distribution.
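The percentages above can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `fraction_within` is made up for this illustration. It relies on the identity P(|Z| < k) = erf(k / √2) for a standard normal variable Z.

```python
import math

def fraction_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of its mean (hypothetical helper)."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Because the result depends only on the number of standard deviations, the same figures hold for any normal distribution, whatever its mean and spread.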

Understanding the mean value provides valuable insights into the population and its behavior. This is precisely why normality is crucial for conducting a T-test. When a population does not exhibit a normal distribution, there is no assurance that the population mean carries inherent significance on its own, so knowing the mean may provide little to no meaningful information about the dataset. In such cases, conducting a T-test becomes a futile exercise: determining whether a difference in means is statistically significant offers no meaningful insight when the means themselves lack significance.

Central Limit Theorem

The central limit theorem states that as we sample data from a population, regardless of the population's distribution (provided its variance is finite), the distribution of the sample means tends towards a normal distribution as the sample size increases. In other words, given a sufficiently large sample size from any such distribution, the sample means will be approximately normally distributed.

The Central Limit Theorem plays a pivotal role in the widespread application of T-tests. As previously discussed, T-tests are most effective when applied to populations that exhibit a normal distribution. However, according to the Central Limit Theorem, if we repeatedly draw random samples of a sufficiently large size from any given population, the distribution of the sample means tends to follow a normal distribution. This allows us to apply T-tests to the sample means, even when the original population is not normally distributed.
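A small simulation can make this concrete. The sketch below is illustrative only: the exponential population, the sample sizes, the number of repetitions, and the helper names `sample_means` and `skewness` are all assumptions chosen for the example, not part of the original discussion. It draws many samples from a strongly right-skewed population and measures how skewed the resulting sample means are; a normal distribution has skewness 0.

```python
import random
import statistics

def sample_means(sample_size, n_samples=5000, seed=0):
    """Draw n_samples samples of the given size from an exponential
    population (rate 1, an assumed example of a non-normal population)
    and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

def skewness(values):
    """Population skewness: the average cubed z-score."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

for n in (1, 5, 50):
    means = sample_means(n)
    print(f"sample size {n:>2}: skewness of sample means = {skewness(means):.2f}")
```

With a sample size of 1 the "sample means" are just the raw exponential draws, so their skewness should be large; as the sample size grows, the skewness should shrink toward 0, which is the behaviour the theorem predicts.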