Sampling Methods

Instructor: So far, I've talked to you about populations and samples. Now, we're going to talk about how we choose what goes into our samples. And that is the question, how do we select from the population? What goes into our sample? 

And now, before I start talking about different sampling methods, I'm going to talk about how we write population and sample sizes. Basically, if you see a capital N, that means I'm talking about the size of a population. And if you see a lowercase n, that means I'm talking about the size of a sample. And I'm going to give you an example of that now just so we don't get confused in the future. 

So let's say you want to find the average GPA of a student at your university. Your university has 20,000 students, and you select 100 of those students to ask them about their GPAs. What are capital N and lowercase n, and what are the population and sample sizes? 

In this case, capital N, the size of the population you're drawing from, is 20,000 because I'm talking about all 20,000 students. And lowercase n is 100 because we're taking a sample of 100 students. So that is the distinction between the capital and lowercase n. 
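
If it helps to see that in code, here is a minimal Python sketch of the same example. The GPAs are made up, the names population_gpas and sample_gpas are just mine for illustration, and random.sample is only a stand-in for "pick 100 students somehow."

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: one made-up GPA for each of the 20,000 students.
population_gpas = [round(random.uniform(2.0, 4.0), 2) for _ in range(20_000)]

# Pick 100 of them; the selection method here is only a placeholder.
sample_gpas = random.sample(population_gpas, 100)

N = len(population_gpas)  # capital N: population size
n = len(sample_gpas)      # lowercase n: sample size
print(N, n)  # 20000 100
```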

Now, the goal of sampling is to create a sample that is representative of the population it is being drawn from. Ideally, we would just do tests and collect data from the whole population, but the problem with that is populations can be so large, millions or even billions of people, that measuring everyone might be impossible. And that's why we take samples. 

Now, the most basic sampling method is called "simple random sampling." You've probably heard of it already. When performing simple random sampling, every member of the population has an equal chance of being selected for your sample. It's arguably the best sampling method because your sample is very likely to be representative of the population, but it's often not used because it's impractical. I mean, in order to do this properly, you have to know who every single person in your population is. And that's often impossible. 
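
Here is a rough sketch of simple random sampling in Python, assuming you actually do have a complete list of the population. The function name simple_random_sample and the student labels are hypothetical; the drawing itself is done by the standard library's random.Random.sample, which picks without replacement.

```python
import random

def simple_random_sample(population_list, n, seed=None):
    """Draw n distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population_list, n)

# Hypothetical list of every student in the population (N = 20,000).
students = [f"student_{i}" for i in range(20_000)]

sample = simple_random_sample(students, 100, seed=42)
print(len(sample))  # 100
```

Notice that the very first thing the code needs is the full list of students, which is exactly the part that is hard to get in practice.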

So we might do something called "stratified sampling." When performing stratified sampling, the population is split into two or more non-overlapping groups called "strata." And then simple random sampling is done on each group to form a sample. So for example, you might split a population of students into men and women and then sample from each of those two groups. This might allow us to collect the same amount of information as simple random sampling but use fewer people because we've already made the distinction between men and women. 
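
A minimal Python sketch of that idea follows. One thing I'm assuming that the lecture doesn't specify: each stratum contributes in proportion to its share of the population (so the rounded pieces may not always add up to exactly n). The function name stratified_sample, the stratum_of argument, and the 12,000/8,000 split are all made up for illustration.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(population, stratum_of, n, seed=None):
    """Split population into strata, then randomly sample within each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)

    N = len(population)
    sample = []
    for group in strata.values():
        # ASSUMPTION: proportional allocation, rounded to a whole number.
        k = round(n * len(group) / N)
        sample.extend(rng.sample(group, k))  # simple random sample per stratum
    return sample

# Hypothetical population: 12,000 women and 8,000 men.
population = [("woman", i) for i in range(12_000)] + [("man", i) for i in range(8_000)]

sample = stratified_sample(population, lambda member: member[0], 100, seed=1)
counts = Counter(stratum for stratum, _ in sample)
print(counts["woman"], counts["man"])  # 60 40
```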

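There's also systematic sampling. When performing systematic sampling, every kth individual from the population is placed into the sample, where k is just some fixed number you pick (I'm using k so we don't mix it up with lowercase n, the sample size). And that might sound kind of weird, so let me show you. 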

For example, if you add every seventh person who walks out of a supermarket to your sample, you are performing systematic sampling. That way, you don't need to know everyone in the population of supermarket shoppers. You can just add every seventh person, and then you're done. And that's a lot easier. This is convenient when you cannot obtain a frame, which is a list of everyone in the population. 
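
The supermarket example can be sketched in Python like this. I'm calling the step size k (here 7), and I'm assuming we simply start counting from the first shopper we see; the function name systematic_sample and the shopper labels are hypothetical.

```python
def systematic_sample(arrivals, k):
    """Yield every k-th individual from a stream, starting with the k-th."""
    for position, person in enumerate(arrivals, start=1):
        if position % k == 0:
            yield person

# Hypothetical stream of 50 shoppers leaving the store, in order.
shoppers = (f"shopper_{i}" for i in range(1, 51))

sample = list(systematic_sample(shoppers, 7))
print(sample)
# ['shopper_7', 'shopper_14', 'shopper_21', 'shopper_28',
#  'shopper_35', 'shopper_42', 'shopper_49']
```

Because the function only ever looks at one shopper at a time, it never needs the full list of the population up front.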

And also, there is convenience sampling: the easiest, laziest, and also kind of the worst way of sampling. When performing convenience sampling, easily obtained individuals are placed into the sample. Simply put, in this kind of sampling, you pick the easiest way of getting people into your sample. A common form of this is called "voluntary response sampling," because individuals decide for themselves whether to be a part of it, like when you call them on the phone or send them a letter or something like that. But this could also be a problem, because there may be a difference between people who choose to participate and people who choose not to. 
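
To see why that difference matters, here is a small simulation sketch in Python. Everything in it is invented for illustration: the GPAs are made up, and I'm assuming, purely for the sake of the example, that students with higher GPAs are more willing to answer a GPA survey.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 20,000 made-up GPAs between 2.0 and 4.0.
population = [rng.uniform(2.0, 4.0) for _ in range(20_000)]

def responds(gpa):
    # ASSUMPTION for illustration only: the chance of volunteering rises
    # from 0% at a 2.0 GPA to 10% at a 4.0 GPA.
    return rng.random() < 0.10 * (gpa - 2.0) / 2.0

# Voluntary response sample: whoever chooses to answer.
volunteers = [gpa for gpa in population if responds(gpa)]

population_mean = mean(population)
volunteer_mean = mean(volunteers)
print(f"population mean GPA: {population_mean:.2f}")
print(f"volunteer mean GPA:  {volunteer_mean:.2f}")
```

Under that assumption, the volunteers' average comes out higher than the population's, so the sample is not representative even though it was easy to collect.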

Now, remember, the goal of sampling is to create a sample that is representative of the population it is being drawn from. That's all. 

And that's it for sampling. You should know the difference between capital and lowercase n's. They represent population and sample sizes. You should also understand the distinction between the four different kinds of sampling: simple random, stratified, systematic, and convenience.