[MUSIC PLAYING] - Scientists aim to accomplish two goals with their data. First, they want to organize and describe the data in meaningful ways. Second, they want to use the data to make inferences or predictions about a population of interest. To achieve the first goal, scientists rely on descriptive statistics. 

Descriptive statistics summarize a data set. They most often involve summarizing the data as a measure of central tendency, a single number that represents the entire data set. Common ones are the mean or the average, the median or the middle score, and the mode or the most common score. 

The mean or the average is calculated by adding all of the values in a data set and then dividing the sum by the total number of values. So if we had these five exam scores, 67, 72, 72, 91, and 77, we would add them all up to get 379 and then divide that sum by 5. And the mean would be 75.8. 
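
To make that arithmetic concrete, here is a minimal sketch in Python. The variable names are illustrative and not part of the original explanation.

```python
# Exam scores from the transcript.
scores = [67, 72, 72, 91, 77]

# Add all of the values, then divide the sum by how many values there are.
total = sum(scores)
mean = total / len(scores)

print(f"sum = {total}, n = {len(scores)}, mean = {mean:.1f}")
# prints: sum = 379, n = 5, mean = 75.8
```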

The median is the value in the data set where half of the values fall below that value and half of them are above it. To calculate the median, values need to be put in order, which here gives 67, 72, 72, 77, and 91. In this data set, the median is 72 because it is the middle value, with two exam scores on either side of it. 
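
The same steps can be sketched in Python. This is one simple way to do it for an odd number of values, with illustrative variable names; the `statistics.median` call from the standard library is shown only as a cross-check.

```python
import statistics

# Exam scores from the transcript.
scores = [67, 72, 72, 91, 77]

# Put the values in order; ascending or descending gives the same middle.
ordered = sorted(scores)

# With an odd number of values, the median is the single middle value.
middle_index = len(ordered) // 2
median = ordered[middle_index]

print(ordered)  # [67, 72, 72, 77, 91]
print(median)   # 72

# The standard library agrees, and it also handles an even number of
# values by averaging the two middle ones.
library_median = statistics.median(scores)
```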

The last measure of central tendency is the mode, which is the most frequently occurring value in the data set. In this case, the mode is also 72 because it is the only score that occurs twice in this data set. In addition to using measures of central tendency as a descriptive score, we can also summarize our data using variability, or how similar or different the values are from one another. 
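
Going back to the mode for a moment, finding it is just a matter of counting how often each score appears. A minimal Python sketch, assuming the same five exam scores and using illustrative names:

```python
from collections import Counter

# Exam scores from the transcript.
scores = [67, 72, 72, 91, 77]

# Count how many times each score occurs.
counts = Counter(scores)

# most_common(1) returns a list holding the single (value, count) pair
# with the highest count.
mode, frequency = counts.most_common(1)[0]

print(mode, frequency)  # 72 2
```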

One measure of variability is the range, the numerical difference between the highest and lowest values in a data set. A more complex measure of variability is the standard deviation, a measure of how much values typically differ from the mean. Variability is often pictured with a normal distribution, a symmetrical, bell-shaped distribution in which most values fall near the mean. In a normal distribution, about 68% of values are within one standard deviation of the mean. 
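
As a rough sketch, the range and the standard deviation of the five exam scores can be computed in Python, and the 68% figure can be checked against the standard normal curve. The choice of the sample standard deviation (dividing by n - 1) is an assumption here, since the explanation above does not say which version is meant.

```python
import statistics
from statistics import NormalDist

# Exam scores from the transcript.
scores = [67, 72, 72, 91, 77]

# Range: highest value minus lowest value.
value_range = max(scores) - min(scores)
print(value_range)  # 24

# Sample standard deviation (divides by n - 1). Use statistics.pstdev
# instead if the five scores are treated as the whole population.
sample_sd = statistics.stdev(scores)
print(round(sample_sd, 2))

# Share of a normal distribution within one standard deviation of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
share_within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
print(round(share_within_one_sd, 4))  # 0.6827
```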

For example, grades generally follow a bell-shaped distribution. Some grades are very high, and others are very low. But most are around the middle of the range. 

Using descriptive statistics is a good way for us to summarize the data. But if we want to make predictions based on our data, we need to use inferential statistics to generalize conclusions from the sample to a larger population. Different types of inferential statistics are used for different purposes. For example, the correlation coefficient measures the strength and direction of the relationship between two variables, while t-tests assess whether the means of two groups differ. How confident we are in the inferences that we make is determined by calculating statistical significance, which is based on the probability of getting a result at least as extreme as the one observed if chance alone were at work. 
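
To give a feel for these two statistics, here is a minimal Python sketch. The hours studied and the second group's scores are hypothetical numbers invented for illustration, and the helper names `pearson_r` and `welch_t` are our own. The t statistic shown is the Welch version, which is one common form of the t-test; turning it into a p-value would require a t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return co_deviation / (spread_x * spread_y)

def welch_t(group_a, group_b):
    """Welch's t statistic: mean difference divided by its standard error."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / standard_error

# Hypothetical pairing: hours studied alongside the transcript's exam scores.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [67, 72, 72, 77, 91]
r = pearson_r(hours_studied, exam_scores)
print(round(r, 2))

# Hypothetical second class to compare against the transcript's scores.
class_a = [67, 72, 72, 91, 77]
class_b = [80, 85, 78, 92, 88]
t = welch_t(class_a, class_b)
print(round(t, 2))
```

A value of r near +1 or -1 indicates a strong linear relationship, and a t statistic further from zero indicates a group difference that is large relative to the noise in the data.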

As a general rule, a p-value, or calculated probability, below 0.05 is the cutoff for a statistically significant result. That 0.05 means that, if there were really no effect, a result at least this extreme would turn up by chance less than 5% of the time. It is not the same as being 95% confident that the result is real. And just because a result is statistically significant does not mean that it's practically significant. Practical significance, or whether the result is useful in the real world, is determined by effect size, a measure of the magnitude of the findings. In large samples, findings can be statistically significant without representing a large enough effect to have practical significance. 
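
The point about large samples can be illustrated with a small Python sketch. Everything here is a simplifying assumption: the half-point difference and the standard deviation of 10 are hypothetical, the p-value comes from a two-sided z-test that treats the standard deviation as known (a stand-in for a t-test that works well for large samples), and effect size is expressed as Cohen's d, one common measure among several.

```python
from math import sqrt
from statistics import NormalDist

def cohens_d(mean_difference, sd):
    """Effect size: the mean difference expressed in standard deviations."""
    return mean_difference / sd

def two_sided_p(mean_difference, sd, n_per_group):
    """Two-sided p-value from a z-test for two equal-sized groups."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical scenario: two groups differ by half a point on an exam
# whose scores have a standard deviation of 10 points.
difference, sd = 0.5, 10.0

d = cohens_d(difference, sd)
p_small_sample = two_sided_p(difference, sd, n_per_group=50)
p_large_sample = two_sided_p(difference, sd, n_per_group=10_000)

print(f"effect size d = {d:.2f}")
print(f"p with 50 per group     = {p_small_sample:.3f}")
print(f"p with 10,000 per group = {p_large_sample:.4f}")
```

The effect size is the same tiny value in both cases, but only the very large sample pushes the p-value under 0.05, which is exactly the situation where a result is statistically significant without being practically significant.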

[MUSIC PLAYING]