In the final chapter of our Sampling Demonstration Series, Dr. Larry Hembroff explains how to handle the daunting scenario of sampled individuals choosing not to respond. Response rates have declined over the years. Can we only get accurate data from large samples? What do we do when we have a low response rate? Will the data from the responders accurately represent the non-responders? Can we be confident in our data in non-response situations? Find the answers to these questions, and learn how to work with this data, in the conclusion of our Sampling Demonstration Series: Non-Response is a Random Process.

Questions? E-mail us at nsnc@msu.edu.  (18 minutes)

 

Welcome to Chapter 4 of the National Social Norms Center’s informational series on sampling for your social norms campaign!  In the previous chapter, you learned how to use SPSS to generate random samples for the purpose of seeing how representative they were of what you already knew was true of your population.  When you analyzed your random samples, you most likely found that they were much more representative of your population than convenience samples. 

In this chapter, we’ll touch on non-response in your research.  In any study, it’s pretty rare that 100% of the people in your sample will respond.  The percentage of people selected for your sample who actually respond is called your response rate.  The lower your response rate, the greater the possibility that differences between respondents and non-respondents will make your respondents unrepresentative of the population.  When you begin a study, you typically have a target number of responses in mind for your data set. Because response rates are often low, many researchers start with substantially larger samples to compensate.
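The arithmetic behind response rates and inflated sample sizes can be sketched in a few lines of Python. This is only an illustration: the function names `response_rate` and `sample_size_needed` are made up for this example, and the figures (1,000 responses, a 20% rate) are the ones used later in this chapter.

```python
import math

def response_rate(responses: int, sample_size: int) -> float:
    """Share of the people selected for the sample who actually responded."""
    return responses / sample_size

def sample_size_needed(target_responses: int, expected_rate: float) -> int:
    """Smallest sample expected to yield at least target_responses."""
    if not 0 < expected_rate <= 1:
        raise ValueError("expected_rate must be greater than 0 and at most 1")
    return math.ceil(target_responses / expected_rate)

# 1,000 responses out of 5,000 people sampled.
print(response_rate(1000, 5000))

# How many people to sample if we want 1,000 responses and expect 20% to respond.
print(sample_size_needed(1000, 0.20))
```

Rounding up with `math.ceil` is a deliberate choice here: a fractional person cannot be sampled, and rounding down would leave you short of your target.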

In SPSS, let’s see how accurate a random sample of 5,000 that gets a 20% response rate might be for our population.  You’ll want to have a blank output document open for your results to export to.

  • Step 1.1 – Selecting Your Sample
    • Under “data” in the toolbar, click “select cases”
    • Select “random sample of cases” and click “exactly”
    • Fill in the blanks as “5000 cases from the first 250000 cases,” then hit “continue”
    • Hit “ok” to save your responses and exit the data tool
  • Step 1.2 – Analyzing Your Variable
    • Under ‘analyze’ in the toolbar, select ‘descriptive statistics’ then ‘frequencies’
    • Make sure the variable selected is the same as what you have used in all the previous steps and chapters
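For readers who want to see the same idea outside of SPSS, the steps above can be roughly sketched in Python. This is not what SPSS does internally. The population here is simulated, the categories "A", "B", and "C" and their weights are invented placeholders for your own variable, and the helper name `frequencies` is hypothetical.

```python
import random
from collections import Counter

def frequencies(values):
    """Return {category: percent} for a list of values, like a frequency table."""
    counts = Counter(values)
    total = len(values)
    return {category: 100 * n / total for category, n in sorted(counts.items())}

rng = random.Random(42)  # fixed seed so the run is repeatable

# Simulated population of 250,000 cases for one categorical variable (made-up data).
population = rng.choices(["A", "B", "C"], weights=[30, 45, 25], k=250_000)

# "Exactly 5000 cases from the first 250000 cases": a simple random sample
# drawn without replacement.
sample_5000 = rng.sample(population, 5000)

print("Population:", frequencies(population))
print("Sample:    ", frequencies(sample_5000))
```

Comparing the two printed tables plays the same role as comparing your SPSS frequencies output with your reference table.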

You should now see a frequency table for your variable and random sample in the blank output document you created.  From this table, you’ll see that the results from the random sample of 5,000 individuals were closely representative of what was known to be true of the population.

Now, let’s simulate a 20% response rate for this sample.  To do this, we first change our Select Cases settings so that the selected cases are copied to a new data set named “Sample5000” instead of simply being filtered.

  • Step 2.1 – Creating a New File for Selected Cases
    • Under “data” in the toolbar, click “select cases”
    • Select “random sample of cases” and click “exactly”
    • Make sure your blanks still say “5000 cases from the first 250000 cases,” then hit “continue”
    • Then, under “output,” select “copy selected cases to a new data set” and fill in the name as “Sample5000”
    • Hit “ok” to save your changes and exit the tool

Now that we have our new file of 5,000 participants, we can simulate a response rate of 20% by selecting a new random sample of 1,000 participants from this randomly sampled data set.

  • Step 3.1 – A Random Sample From a Random Sample
    • Under “data” in the toolbar, click “select cases”
    • Select “random sample of cases” and click “exactly”
    • Fill in the blanks as “1000 cases from the first 5000 cases,” then hit “continue”
    • Make sure “filter out the unselected cases” is selected
    • Hit “ok” to save your changes and exit the tool
  • Step 3.2 – Analyzing Your Variable
    • Under ‘analyze’ in the toolbar, select ‘descriptive statistics’ then ‘frequencies’
    • Make sure the variable selected is the same as what you have used in all the previous steps and chapters
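The two-stage selection in Steps 2.1 through 3.2 can be sketched in Python as well. The key assumption, which is the point of this chapter, is that the 1,000 "respondents" are chosen at random from the 5,000 people sampled. The population is simulated, and the categories, weights, and the helper name `percent_table` are invented for illustration.

```python
import random
from collections import Counter

def percent_table(values):
    """Return {category: percent} rounded to one decimal place."""
    counts = Counter(values)
    return {k: round(100 * n / len(values), 1) for k, n in sorted(counts.items())}

rng = random.Random(7)  # fixed seed so the run is repeatable

# Simulated population of 250,000 cases (made-up data).
population = rng.choices(["A", "B", "C"], weights=[30, 45, 25], k=250_000)

# Stage 1: the 5,000 people we sampled (the "Sample5000" data set).
sample_5000 = rng.sample(population, 5000)

# Stage 2: the 1,000 who "responded", assumed here to be a random subset.
respondents = rng.sample(sample_5000, 1000)

rate = len(respondents) / len(sample_5000)
print("Response rate:", rate)
print("Population: ", percent_table(population))
print("Respondents:", percent_table(respondents))
```

If non-response were not random (for example, if one category were much less likely to respond), this sketch would no longer apply and the respondents could drift away from the population.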

You should now see a new frequency table for your variable and new random sample in your output document.  From this table, you’ll see that the results from a random sample of 1,000 individuals, drawn from a random sample of 5,000 participants out of your population of 250,000, were closely representative of what was known to be true of that population.

Not only should a good sample represent the population when it comes to a characteristic we know (the variable we’ve used throughout the series), it should be representative as well when it comes to something we don’t know.  This is the reason we conduct sample surveys. If we already knew what was true in the population, there would be no need for a survey.  

Let’s take a look at what happens when we look at the results of a low response rate sample for multiple variables. 

  • Step 4.1 – Selecting Your Variables
    • Under ‘analyze’ in the toolbar, select ‘descriptive statistics’ then ‘frequencies’
    • Select three additional variables (VAR1, VAR3, and VAR7)
    • Hit “ok” to save your changes and exit the tool

In your output document, you should now see separate tables for each of your variables.  

  • Step 4.2 – Selecting Your Cases
    • Return to the file that contains your entire population
    • Under “data” in the toolbar, click “select cases”
    • Hit “reset” at the bottom of the tool
    • Select “ok” to save your changes and exit the tool
  • Step 4.3 – Analyzing Your Variables
    • Under ‘analyze’ in the toolbar, select ‘descriptive statistics’ then ‘frequencies’
    • Select three additional variables (VAR1, VAR3, and VAR7)
    • Hit “ok” to save your changes and exit the tool
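The comparison in Steps 4.1 through 4.3 can also be sketched in Python. The variable names VAR1, VAR3, and VAR7 come from the steps above, but everything else is assumed for illustration: the population is simulated, the coded values are invented, and `percent_table` and `largest_gap` are hypothetical helper names.

```python
import random
from collections import Counter

VARIABLES = ["VAR1", "VAR3", "VAR7"]  # names from the steps; values below are made up

def percent_table(records, variable):
    """Return {value: percent} for one variable across a list of records."""
    counts = Counter(record[variable] for record in records)
    return {k: 100 * n / len(records) for k, n in sorted(counts.items())}

def largest_gap(sample, population, variable):
    """Largest percentage-point difference between sample and population."""
    s = percent_table(sample, variable)
    p = percent_table(population, variable)
    return max(abs(s.get(k, 0.0) - p[k]) for k in p)

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Simulated population of 250,000 records with three coded variables.
population = [
    {"VAR1": rng.choice([1, 2]),
     "VAR3": rng.choice([1, 2, 3]),
     "VAR7": rng.choice([1, 2, 3, 4])}
    for _ in range(250_000)
]

# 5,000 sampled, then 1,000 random "respondents" (a 20% response rate).
respondents = rng.sample(rng.sample(population, 5000), 1000)

for variable in VARIABLES:
    gap = largest_gap(respondents, population, variable)
    print(variable, "largest gap (percentage points):", round(gap, 2))
```

Reporting a single largest gap per variable is just one convenient summary; pasting the full tables side by side, as described in this chapter, shows the same thing in more detail.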

In order to make it easier to compare your results, copy and paste all of the tables in your output document into the document with your reference table. 

Amazingly, you should see that even when you look at multiple variables in a random sample with a low response rate, your results are closely representative of what you know is true of the population!