SPSS eTutor: Cleaning and Checking Your SPSS Database

A Brief Guidebook for using SPSS at Empire State College

Cleaning and Checking Your SPSS Database

Once you have entered your data, you need to check for errors. Run a frequency distribution on each of your variables. Do all of the values fall within the expected range? For example, if you have a variable with a Likert scale ranging from 1 to 5, every value should fall in this range. Do they?
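The logic of this range check can also be sketched outside SPSS. The following is only a minimal illustration in Python, assuming a hypothetical list of responses to a 1 to 5 Likert item; the values are invented for the example and do not come from any survey.

```python
# Hypothetical responses to a 1-5 Likert item (invented for illustration).
# The 7 is a deliberate data-entry error.
responses = [3, 5, 1, 4, 7, 2, 5]

LOW, HIGH = 1, 5  # expected range of the scale

# Collect (case number, value) pairs for every value outside the expected range.
out_of_range = [
    (case, value)
    for case, value in enumerate(responses, start=1)
    if not LOW <= value <= HIGH
]

print(out_of_range)  # [(5, 7)]
```

An empty list would mean that every value falls within the expected range.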

To run a frequency distribution, click Analyze, then Descriptive Statistics, then Frequencies. Click on the name of the variable that you are checking and move it to the Variable(s) box. For this example, I am checking the variable “happy” from the General Social Survey. Your screen should look like this:

Click Statistics, then check Minimum and Maximum. Click Continue, then OK. Your screen should look like this:

This variable asks the respondent’s general level of happiness:

Your data should include only the values 0, 1, 2, 3, 8, and 9. I have altered the database so that it contains errors. The frequency table shows two mistakes: two values that are not valid codes.
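To make the check concrete, here is a rough Python sketch of the same idea: build a frequency table, report the minimum and maximum, and flag any value that is not a valid code. The list of values is hypothetical (it is not the General Social Survey data); the 10 and the 4 simply stand in for the two planted errors.

```python
from collections import Counter

# Valid codes for "happy", as listed in the text above.
VALID_CODES = {0, 1, 2, 3, 8, 9}

# Hypothetical column of values, invented for illustration.
happy = [1, 2, 2, 3, 1, 10, 2, 8, 4, 1, 9, 2]

# Frequency distribution: how many times each value occurs.
frequencies = Counter(happy)
for value in sorted(frequencies):
    flag = "" if value in VALID_CODES else "  <-- not a valid code"
    print(f"{value:>3}: {frequencies[value]}{flag}")

# Minimum and maximum, the two statistics requested in the dialog.
print("min:", min(happy), "max:", max(happy))

# Values that appear in the data but are not valid codes.
invalid = sorted(v for v in frequencies if v not in VALID_CODES)
print(invalid)  # [4, 10]
```

Note that the maximum exposes the 10, but the 4 lies between the smallest and largest valid codes, so only the frequency table reveals it.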

How do you find the mistakes? You can sort your cases in either ascending or descending order. Click Data, then Sort Cases. Click the name of the variable that you know has an error (“happy”) and move it to the Sort by box. Since the erroneous values are toward the high end of the expected range, I have decided to sort in descending order. Your screen should look like this:

Click OK. Make sure that you are still in the Data View tab (you don’t want to be looking at the output). Your cases with errors are near the top of the list (the “10” and the “4”).
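The effect of a descending sort can be illustrated with a small Python sketch. The case IDs and values below are hypothetical, made up only to show how sorting brings the largest values to the top.

```python
# Hypothetical (case ID, happy) pairs; both IDs and values are invented.
cases = [(101, 2), (102, 10), (103, 1), (104, 8), (105, 4), (106, 3)]

# Sort by the "happy" value, largest first, like choosing Descending.
sorted_cases = sorted(cases, key=lambda case: case[1], reverse=True)

# Show the first few cases after sorting.
for case_id, value in sorted_cases[:3]:
    print(case_id, value)
```

In this sketch the original list is left unchanged and a sorted copy is produced, whereas sorting cases in SPSS reorders the rows of the data file itself.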

If this were your own database, you would look up the case in your original records and correct the error. If you do not have the information necessary to identify the correct value, delete the value and SPSS will treat it as a missing value.
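As a last illustration, the idea of blanking out values that cannot be corrected might look like this in Python. This is a sketch under assumptions: the data are hypothetical, and `None` is used here to play the role of an empty (missing) cell.

```python
# Valid codes for "happy", as listed earlier in the text.
VALID_CODES = {0, 1, 2, 3, 8, 9}

# Hypothetical values, invented for illustration; 10 and 4 are the errors.
happy = [1, 2, 10, 3, 4, 8]

# Replace every invalid value with None, which stands in for a missing value.
cleaned = [v if v in VALID_CODES else None for v in happy]
print(cleaned)  # [1, 2, None, 3, None, 8]

# Count the cases that still have a usable value.
valid_n = sum(v is not None for v in cleaned)
print(valid_n)  # 4
```

Blanking a value reduces the number of valid cases for that variable, so it is better to correct the value from the original records whenever you can.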

SPSS eTutor by Dee Britton is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.