Welcome to chapter 2 of the National Social Norms Center’s informational series on sampling for your social norms campaign! In this chapter, you’ll learn how to draw a convenience sample from your survey data in the SPSS data analysis software. Because of time constraints, cost, or complexity, we often can’t collect data from every single person in a population. Instead, we survey a sample and use it to develop representative estimates of what is true in the population. In this exercise, Dr. Hembroff will guide us through drawing convenience samples and help us understand how well convenience samples of different sizes estimate what is true in a given population.

Watch: Comparing Different Types of Samples (13 minutes). Questions? E-mail us at nsnc@msu.edu.

 

To prepare for this chapter in the series, copy and paste the data table you generated in chapter 1 into a new blank document. One great feature of SPSS is that it lets you look at a subset of your entire data file, or draw a sample from it. To explore this feature, we will create a convenience sample of 1,000 individuals from our population of 250,000 and see how well that sample estimates what we know is true.

  • Step 1.1 – Steps to Select your Convenience Sample:
    • Select ‘data’ from the toolbar
    • From the drop-down bar, click ‘select cases’
    • Select ‘Based on time or case range’
    • Select ‘range’ and under ‘first case’ type 1, under ‘last case’ type 1,000.  Then click ‘continue’
    • Select ‘filter out unselected cases’
    • Select ‘ok’

From here, SPSS will show you which cases went unselected by drawing a diagonal line through their case numbers. You now have a sample of 1,000 cases.
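For readers who want to see the logic of a case-range selection outside the SPSS menus, here is a minimal Python sketch. It is not SPSS syntax and not part of Dr. Hembroff's exercise; the function name `select_case_range` and the stand-in list of case numbers are assumptions made for illustration only.

```python
def select_case_range(cases, first_case, last_case):
    """Return cases first_case..last_case, counting from 1 and including both ends."""
    if first_case < 1 or last_case < first_case:
        raise ValueError("need 1 <= first_case <= last_case")
    # Python slices are 0-based and exclude the stop index,
    # so only the start position is shifted by one.
    return cases[first_case - 1:last_case]


# Hypothetical stand-in for the data file: case numbers 1 to 250,000.
population = list(range(1, 250_001))

sample = select_case_range(population, 1, 1_000)
print(len(sample), sample[0], sample[-1])  # 1000 1 1000
```

As in the 'range' dialog, both the first and the last case are included, so cases 1 through 1,000 give exactly 1,000 cases.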

  • Step 1.2 – Analyzing Your Sample:
    • Select ‘analyze’ from the toolbar
    • From the drop-down menu, click ‘descriptive statistics,’ then ‘frequencies’
    • Make sure you have the same variable from chapter 1 selected, then select ‘ok’

You should now see a newly generated table in your ‘output’ window. The results from our convenience sample are quite different from what we know is true of our population. This convenience sample overestimated the share of respondents who said they never refuse, and underestimated the share who answered that they refuse one or more times. In this case, a convenience sample of 1,000 was inaccurate and wouldn’t be useful for making claims about the population or learning something from it.
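The comparison between a sample's frequency table and the population's can be sketched in Python as follows. The data here are invented so that the first 1,000 cases lean toward "Never"; they are not the chapter's data file, and the percentages are not the results you will see in SPSS. The helper name `frequency_table` is also an assumption.

```python
from collections import Counter


def frequency_table(responses):
    """Return {category: (count, percent)} for a list of responses."""
    counts = Counter(responses)
    total = len(responses)
    return {category: (n, 100.0 * n / total) for category, n in counts.items()}


# Made-up data, NOT the chapter's file.
# Cases 1 to 1,000 lean toward "Never" (70%).
population = ["Never"] * 700 + ["One or more times"] * 300
# The remaining 249,000 cases alternate evenly between the two answers.
population += ["Never", "One or more times"] * 124_500

sample = population[:1000]  # convenience sample: the first 1,000 cases
pop_table = frequency_table(population)
sample_table = frequency_table(sample)

for category in pop_table:
    sample_pct = sample_table[category][1]
    pop_pct = pop_table[category][1]
    print(f"{category}: sample {sample_pct:.1f}%, population {pop_pct:.1f}%")
# Never: sample 70.0%, population 50.1%
# One or more times: sample 30.0%, population 49.9%
```

Because the two percentages in each table add up to 100, overestimating one answer necessarily means underestimating the other.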

Next, let’s see what happens when we choose a larger convenience sample.

  • Step 2.1 – Next Steps – Making a Bigger Sample:
    • Choose ‘data’ from the toolbar, and click ‘select cases’ from the drop-down menu
    • Make sure ‘Based on time or case range’ is still selected, then click ‘range’
    • Change your ‘last case’ from 1,000 to 10,000, then click ‘continue’
    • Make sure ‘filter out unselected cases’ is still selected, then select ‘ok’
  • Step 2.2 – Analyzing your Bigger Sample:
    • Select ‘analyze’ from the toolbar, then under ‘descriptive statistics’ select ‘frequencies’
    • Make sure that the selected variable remains the same as the one you used in all of the previous steps, then select ‘ok’

Once again, we see that the results differ greatly from what we know is true of our population. So what’s the problem? Some would say that even a sample of 10,000 can be expected to be inaccurate, because it is still only a small proportion (4%) of the total population of 250,000 it is intended to represent.

Now, let’s try an even larger sample!

  • Step 3.1 – Selecting an even larger sample:
    • Repeat the steps from Step 2.1, except under ‘last case’ type 100,000 instead of 10,000
  • Step 3.2 – Frequency Distribution on 100,000 Sample:
    • Repeat the analysis steps from Step 2.2

Although these new results aren’t as drastically different from the population as those from the last two samples, they still aren’t representative enough of what we know is true.

What happens if we select the same sample size from a different part of the data set?

  • Step 4.1 – Selecting a Same-Size Sample from a Different Chunk of the Data Set:
    • Repeat the steps from Step 1.1, except under ‘first case’ type 100,001 and under ‘last case’ type 200,000
  • Step 4.2 – Frequency Distribution:
    • Repeat the analysis steps from Step 2.2
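To see all four case ranges side by side, the Python sketch below repeats the selection and frequency check on each one. The population is deliberately contrived so that early cases over-represent "Never"; it is an assumption for illustration, not the chapter's data, and the helper names `block` and `percent_never` are hypothetical.

```python
def percent_never(responses):
    """Percentage of responses equal to 'Never'."""
    return 100.0 * responses.count("Never") / len(responses)


def block(size, never_share):
    """Build a block of `size` cases in which `never_share` answered 'Never'."""
    n_never = round(size * never_share)
    return ["Never"] * n_never + ["One or more times"] * (size - n_never)


# Contrived population of 250,000 cases (NOT the chapter's data):
# the share answering "Never" drops as the case number rises.
population = (
    block(1_000, 0.80)      # cases 1 to 1,000
    + block(9_000, 0.70)    # cases 1,001 to 10,000
    + block(90_000, 0.60)   # cases 10,001 to 100,000
    + block(100_000, 0.50)  # cases 100,001 to 200,000
    + block(50_000, 0.30)   # cases 200,001 to 250,000
)

true_value = percent_never(population)
ranges = [(1, 1_000), (1, 10_000), (1, 100_000), (100_001, 200_000)]

errors = {}
for first_case, last_case in ranges:
    # Case numbers start at 1 and include both ends.
    sample = population[first_case - 1:last_case]
    estimate = percent_never(sample)
    errors[(first_case, last_case)] = estimate - true_value
    print(f"cases {first_case:,} to {last_case:,}: "
          f"{estimate:.1f}% never, population {true_value:.1f}%")
```

With this made-up ordering, the first three samples come out roughly 30, 21, and 11 percentage points too high, while cases 100,001 to 200,000 land within half a point of the population value. The pattern depends entirely on how the cases happen to be ordered in the file, which is something you usually cannot know in advance.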

Although our sample isn’t perfect, we are now beginning to get closer to answers that are representative of the population.  Sometimes we don’t know exactly where in the population to look in order to find a representative sample of people, which is why many prefer to use random samples.  In chapter 3 of this series, you’ll explore using SPSS to generate random samples from your total population.  You’ll also learn how to analyze your results to find out just how representative they are of your population!