SPSS for Your Dissertation: A Practical Guide for Graduate Students

SPSS (Statistical Package for the Social Sciences) remains one of the most widely used statistical software tools in dissertation research, particularly in education, psychology, public health, nursing, and the social sciences. If your dissertation involves quantitative data analysis, there is a strong chance your committee expects you to use SPSS – or at least to understand the output it produces.

This guide walks you through the practical aspects of using SPSS for your dissertation, from setting up your data file to running common analyses and reporting results in APA format. This is not a statistics textbook. It is a field guide for the doctoral student sitting in front of SPSS for the first time with real data and a committee expecting results.

Setting Up Your Data File

Before running any analysis, your data must be properly structured in SPSS. Poor data setup is one of the most common sources of errors in dissertation research, and cleaning up a poorly structured data file after the fact is far more time-consuming than setting it up correctly from the start.

Variable View: Defining Your Variables

Every column in your SPSS data file represents a variable. Before entering data, switch to Variable View and define each variable:

Name. Use short, descriptive names without spaces. “Age” is better than “Participant_Age_In_Years.” SPSS limits variable names to 64 characters, but shorter is better for readability.

Type. Numeric for quantitative data (age, scores, Likert responses). String for text data (open-ended responses, participant IDs that contain letters).

Label. Use the Label field for the full, descriptive name of the variable. “Self-Efficacy Pre-Test Score” goes in the Label field while “SE_Pre” goes in the Name field. Labels appear in your output and make results much easier to interpret.

Values. For categorical and ordinal variables, assign numeric codes and labels. For a gender variable coded 1 and 2, assign Value Labels: 1 = Male, 2 = Female. For Likert scale items, define each point: 1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly Agree.

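Measure. Specify the level of measurement: Scale (interval or ratio data), Ordinal (ranked categories), or Nominal (unranked categories). Most procedures will run regardless of this setting, but some dialogs and the Chart Builder use it to decide how a variable can be used, so set it correctly from the start.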

Data View: Entering and Importing Data

Each row in Data View represents one case (usually one participant). If you collected data through an online survey tool like Qualtrics or Google Forms, export it as a CSV file and import it into SPSS using File then Open then Data (newer versions also offer File then Import Data then CSV Data). SPSS will walk you through the import wizard.

After importing, always check:

  • The correct number of cases imported (compare to your survey response count)
  • Variables are assigned the correct type and measure
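  • Missing data is coded consistently (SPSS recognizes system-missing values automatically for numeric variables, displayed as a period; if your export uses a placeholder code for missing answers, declare that code in the Missing column of Variable View so it is not treated as a real value)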
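The first and third checks can also be run on the CSV before you open it in SPSS. The following is a minimal sketch in Python, not an SPSS feature; the sample data, the column names, and the function name summarize_import are all hypothetical.

```python
import csv
import io

# Hypothetical survey export; in practice you would read your own CSV file.
SAMPLE_CSV = """id,age,se_pre
P01,34,4
P02,,5
P03,41,
P04,29,3
"""

def summarize_import(csv_text, expected_cases):
    """Return the case count, whether it matches, and blank cells per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    missing = {name: 0 for name in rows[0]} if rows else {}
    for row in rows:
        for name, value in row.items():
            # An empty cell is what SPSS would import as system-missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {
        "n_cases": len(rows),
        "matches_expected": len(rows) == expected_cases,
        "missing_per_variable": missing,
    }

report = summarize_import(SAMPLE_CSV, expected_cases=4)
print(report)
```

If the case count here differs from the response count in your survey tool, resolve that before importing anything.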

Data Cleaning: The Step Most Students Rush

Data cleaning is not glamorous, but it is where the integrity of your entire analysis is established. Skipping or rushing this step is one of the most common mistakes in dissertation research.

Check for Out-of-Range Values

Run Descriptive Statistics (Analyze then Descriptive Statistics then Descriptives) on all continuous variables. Check minimums and maximums. If your Likert scale runs 1 to 5 and you see a minimum of 0 or a maximum of 55, you have a data entry error.

For categorical variables, run Frequencies (Analyze then Descriptive Statistics then Frequencies) and check for unexpected categories. If your gender variable should have values of 1 and 2 but also shows a 3 or a 9, investigate.
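The logic of a range check is simple enough to sketch outside SPSS. This Python example is illustrative only; the item responses and the function name find_out_of_range are made up, and None stands in for a missing value.

```python
def find_out_of_range(values, low, high):
    """Return (row_number, value) pairs that fall outside [low, high]."""
    return [
        (row, value)
        for row, value in enumerate(values, start=1)
        if value is not None and not (low <= value <= high)
    ]

# Hypothetical responses to one 5-point Likert item; 55 and 0 are entry errors.
item_1 = [4, 5, 55, 3, None, 0, 2]
problems = find_out_of_range(item_1, low=1, high=5)
print(problems)  # [(3, 55), (6, 0)]
```

The row numbers tell you which cases to look up in the original survey responses.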

Handle Missing Data

Decide on your approach to missing data before running any analyses, and document your decision in your methodology chapter. Common approaches:

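Listwise deletion. Exclude any case with missing data on any variable in the analysis. This is the default in some SPSS procedures, such as Linear Regression and Reliability Analysis, while others (t tests, for example) default to excluding cases analysis by analysis, so check the Options dialog for each procedure. It is simple but can dramatically reduce your sample size.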

Pairwise deletion. Exclude cases only for the specific analysis where data is missing. Preserves more data but can produce inconsistent sample sizes across analyses.

Multiple imputation. SPSS can estimate missing values based on patterns in your existing data (Analyze then Multiple Imputation then Impute Missing Data Values, available if your license includes the Missing Values module). This is generally the most statistically sound approach for data that is missing at random, but it adds complexity that your committee may or may not require.
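To see why the choice between listwise and pairwise deletion matters, it helps to count the usable cases under each rule. The sketch below uses a made-up five-case data set with hypothetical variable names; None marks a missing value.

```python
from itertools import combinations

# Hypothetical data: three variables, None marks a missing value.
cases = [
    {"age": 34, "se_pre": 4.0, "se_post": 4.5},
    {"age": None, "se_pre": 3.5, "se_post": 4.0},
    {"age": 41, "se_pre": None, "se_post": 3.0},
    {"age": 29, "se_pre": 3.0, "se_post": None},
    {"age": 52, "se_pre": 4.5, "se_post": 5.0},
]

def listwise_n(rows, variables):
    """Count cases with no missing value on any listed variable."""
    return sum(all(row[v] is not None for v in variables) for row in rows)

def pairwise_n(rows, variables):
    """Count usable cases separately for each pair of variables."""
    return {
        (a, b): sum(row[a] is not None and row[b] is not None for row in rows)
        for a, b in combinations(variables, 2)
    }

variables = ["age", "se_pre", "se_post"]
n_listwise = listwise_n(cases, variables)
n_pairwise = pairwise_n(cases, variables)
print(n_listwise)   # 2
print(n_pairwise)   # every pair keeps 3 cases
```

Listwise deletion keeps two of the five cases here, while each pairwise analysis keeps three, but not the same three. That is the inconsistency across analyses you would need to explain to your committee.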

Check Reliability of Scales

If you used established survey instruments, run Cronbach’s alpha (Analyze then Scale then Reliability Analysis) for each scale and subscale. Report these values in your results chapter. A Cronbach’s alpha of 0.70 or above is generally considered acceptable, though the threshold varies by discipline.
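If you want to understand what the alpha value summarizes, the standard formula is short: the number of items, the variance of each item, and the variance of the total score. The Python sketch below applies that formula to three made-up items answered by five respondents; the data and the function name cronbach_alpha are hypothetical, and SPSS remains the source of the value you report.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical 3-item scale, 5 respondents.
item_a = [2, 3, 4, 4, 5]
item_b = [1, 3, 3, 4, 5]
item_c = [2, 2, 4, 5, 5]
alpha = cronbach_alpha([item_a, item_b, item_c])
print(round(alpha, 3))  # 0.949
```

Alpha rises when items vary together, because the variance of the total score then grows faster than the sum of the item variances.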

Common Dissertation Analyses in SPSS

The following sections cover the analyses most frequently used in quantitative dissertations. For each, I include the SPSS menu path, the key assumptions to check, and the results you need to report.

Descriptive Statistics

Nearly every quantitative dissertation begins with descriptive statistics summarizing the sample and key variables.

Menu path: Analyze then Descriptive Statistics then Descriptives (for continuous variables) or Frequencies (for categorical variables).

What to report: Mean, standard deviation, and range for continuous variables. Frequency counts and percentages for categorical variables. Present these in a table formatted according to your style guide (APA, Chicago, or your university’s requirements).
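As a cross-check on a small sample, the same summary values can be computed by hand. This is an illustrative Python sketch with made-up ages and program labels; the function names are hypothetical, and the standard deviation uses the n - 1 denominator, which is what SPSS reports.

```python
from collections import Counter
from statistics import mean, stdev

def describe_continuous(values):
    """Mean, sample SD (n - 1 denominator), and range for one variable."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),
        "min": min(values),
        "max": max(values),
    }

def describe_categorical(values):
    """Map each category to (count, percentage of all cases)."""
    n = len(values)
    return {cat: (count, 100 * count / n) for cat, count in Counter(values).items()}

# Hypothetical sample of five participants.
ages = [24, 31, 28, 45, 37]
programs = ["EdD", "PhD", "PhD", "EdD", "PhD"]

age_summary = describe_continuous(ages)
program_summary = describe_categorical(programs)
print(age_summary)
print(program_summary)
```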

Independent Samples t-Test

Used to compare means between two groups (e.g., treatment vs. control, male vs. female).

Menu path: Analyze then Compare Means then Independent-Samples T Test.

Assumptions to check:

  • Continuous dependent variable
  • Two independent groups
  • Normal distribution of the dependent variable in each group (check with Shapiro-Wilk test or visual inspection of histograms)
  • Homogeneity of variance (SPSS reports Levene’s test automatically – use the “Equal variances assumed” row if Levene’s test is not significant, “Equal variances not assumed” if it is)

What to report: Group means and standard deviations, t-statistic, degrees of freedom, p-value, and effect size (Cohen’s d). SPSS 27 and later print Cohen’s d in an effect sizes table alongside the t test; in earlier versions, compute it manually from the group means and the pooled standard deviation.
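If your output does not include Cohen’s d, or you want to verify it, the calculation needs only the two sets of scores. The following is a minimal Python sketch of the pooled-standard-deviation version of d; the scores and the function name cohens_d are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

# Hypothetical post-test scores.
treatment = [78, 85, 90, 72, 88]
control = [70, 75, 80, 68, 77]
d = cohens_d(treatment, control)
print(round(d, 2))  # 1.36
```

The sign of d depends only on which group you list first, so state the direction of the difference in your write-up.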

One-Way ANOVA

Used to compare means across three or more groups.

Menu path: Analyze then Compare Means then One-Way ANOVA. Click “Post Hoc” to add pairwise comparison tests (Tukey HSD is the most common).

Assumptions to check:

  • Continuous dependent variable
  • Three or more independent groups
  • Normal distribution within each group
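  • Homogeneity of variance (check Levene’s test – if violated, use Welch’s ANOVA instead, available under Options, and a post hoc test that does not assume equal variances, such as Games-Howell, in place of Tukey HSD)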

What to report: F-statistic, degrees of freedom (between and within groups), p-value, effect size (eta-squared or partial eta-squared), and post hoc comparison results if the overall F is significant.
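Eta-squared is the between-groups sum of squares divided by the total sum of squares, both of which appear in the SPSS ANOVA table. The Python sketch below works through the arithmetic on three made-up groups; the data and the function name one_way_anova are hypothetical, and it does not compute a p-value.

```python
from statistics import mean

def one_way_anova(groups):
    """Sums of squares, degrees of freedom, F, and eta-squared."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return {
        "df_between": df_between,
        "df_within": df_within,
        "F": (ss_between / df_between) / (ss_within / df_within),
        "eta_squared": ss_between / (ss_between + ss_within),
    }

# Hypothetical scores for three independent groups.
anova = one_way_anova([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(anova)  # F = 12.0 on (2, 6) df, eta-squared = 0.8
```

In a one-way design with a single factor, eta-squared and partial eta-squared are the same number, so either label is acceptable as long as you are consistent.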

Chi-Square Test of Independence

Used to test the association between two categorical variables.

Menu path: Analyze then Descriptive Statistics then Crosstabs. Click “Statistics” and check “Chi-square” and, for the effect size, “Phi and Cramer’s V.”

Assumptions to check:

  • Both variables are categorical
  • Expected cell frequencies are 5 or greater in at least 80 percent of cells (SPSS reports this – if violated, consider Fisher’s Exact Test or collapsing categories)

What to report: Chi-square statistic, degrees of freedom, p-value, and effect size (Cramer’s V for tables larger than 2x2, Phi for 2x2 tables).
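The expected-count assumption and the effect size both come straight from the crosstab. The following Python sketch computes the Pearson chi-square without continuity correction, the share of cells with expected counts of 5 or more, and Cramer’s V for a made-up 2x2 table; the counts and the function name chi_square_summary are hypothetical, and no p-value is computed.

```python
from math import sqrt

def chi_square_summary(table):
    """table: list of rows of observed counts for a two-way crosstab."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    flat = [e for row in expected for e in row]
    smaller_dim = min(len(row_totals), len(col_totals))
    return {
        "chi2": chi2,
        "df": (len(row_totals) - 1) * (len(col_totals) - 1),
        "share_expected_ge_5": sum(e >= 5 for e in flat) / len(flat),
        "cramers_v": sqrt(chi2 / (n * (smaller_dim - 1))),
    }

# Hypothetical 2x2 crosstab (rows: group, columns: outcome).
observed = [[20, 10], [10, 20]]
result = chi_square_summary(observed)
print(result)
```

For a 2x2 table, Cramer’s V equals the absolute value of Phi, which is why the two share one checkbox in SPSS.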

Correlation

Used to measure the strength and direction of the linear relationship between two continuous variables.

Menu path: Analyze then Correlate then Bivariate.

What to report: Pearson’s r (or Spearman’s rho for ordinal data), p-value, and sample size. Present correlation matrices in a table when reporting relationships among multiple variables.
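Pearson’s r is the covariation of two variables scaled by their individual variation. This is a short Python sketch of that definition on made-up paired scores; the data and the function name pearson_r are hypothetical. Spearman’s rho is the same calculation applied to ranks instead of raw scores.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation for two equal-length lists of paired scores."""
    mean_x, mean_y = mean(x), mean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)

# Hypothetical paired scores for five participants.
hours_studied = [1, 2, 3, 4, 5]
exam_rating = [2, 4, 5, 4, 5]
r = pearson_r(hours_studied, exam_rating)
print(round(r, 3))  # 0.775
```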

Multiple Linear Regression

Used to predict a continuous outcome from multiple predictor variables.

Menu path: Analyze then Regression then Linear.

Assumptions to check:

  • Linear relationship between each predictor and the outcome
  • Independence of residuals (Durbin-Watson statistic – not reported by default, so check “Durbin-Watson” under Statistics; values near 2 suggest uncorrelated residuals)
  • Homoscedasticity (plot standardized residuals against predicted values – look for a random scatter)
  • No multicollinearity (check “Collinearity diagnostics” under Statistics to get VIF values – each should be below 10, ideally below 5)
  • Normally distributed residuals (check histogram and P-P plot of residuals)
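The VIF thresholds are easier to judge once you see where the number comes from. For each predictor, VIF is 1 / (1 - R-squared), where R-squared comes from regressing that predictor on all the other predictors. The Python sketch below shows the simplest case, assuming only two predictors, where that R-squared is just their squared correlation; the correlation value and the function name are hypothetical.

```python
def vif_from_r_squared(r_squared):
    """VIF for one predictor, given R-squared from regressing it on the others."""
    if not 0 <= r_squared < 1:
        raise ValueError("R-squared must be at least 0 and below 1")
    return 1 / (1 - r_squared)

# Hypothetical two-predictor model: with only two predictors, R-squared is
# the squared correlation between them.
r_between_predictors = 0.9
vif = vif_from_r_squared(r_between_predictors ** 2)
print(round(vif, 2))  # 5.26
```

Read this way, a VIF of 5 means 80 percent of a predictor’s variance is shared with the other predictors, and a VIF of 10 means 90 percent.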

What to report: R-squared and adjusted R-squared, F-statistic for the overall model, and for each predictor: unstandardized coefficient (B), standard error, standardized coefficient (Beta), t-statistic, and p-value. Present in a regression table.
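Adjusted R-squared penalizes R-squared for the number of predictors relative to the sample size, which is why committees often ask for both. This is a small Python sketch of the standard adjustment; the input values and the function name are made up for illustration.

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Standard adjustment: 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    df_residual = n_cases - n_predictors - 1
    if df_residual <= 0:
        raise ValueError("Need more cases than predictors plus one")
    return 1 - (1 - r_squared) * (n_cases - 1) / df_residual

# Hypothetical model: R-squared of .40 with 3 predictors and 100 cases.
adj = adjusted_r_squared(0.40, n_cases=100, n_predictors=3)
print(round(adj, 3))  # 0.381
```

With the same R-squared and only 20 cases, the adjusted value drops further, which is one reason small samples with many predictors draw scrutiny.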

Reporting SPSS Results in APA Format

Your committee expects results reported in a specific format. APA 7th edition is the most common in social science dissertations. Key formatting conventions:

Statistical symbols in italics. Italicize statistical symbols in text: M, SD, t, F, p, r, R-squared. Do not italicize Greek letters (alpha, beta, chi-square).

Exact p-values. Report exact p-values to two or three decimal places (p = .034, not p < .05), unless the value is less than .001, in which case report p < .001.
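If you are formatting many values, the rule above is easy to encode. The following is a hypothetical Python helper, not part of SPSS, that drops the leading zero and switches to p < .001 for very small values; italicizing the p is left to your word processor.

```python
def format_p(p, decimals=3):
    """Format a p-value in the style described above (no leading zero)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if p < 0.001:
        return "p < .001"
    # Drop the leading zero because p cannot exceed 1.
    return "p = " + f"{p:.{decimals}f}".lstrip("0")

print(format_p(0.034))   # p = .034
print(format_p(0.0004))  # p < .001
print(format_p(0.2))     # p = .200
```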

Effect sizes. Always report effect sizes alongside significance tests. Significance tells you whether an observed effect is unlikely to be due to chance alone; effect size tells you how large it is, and therefore whether it matters in practice.

Tables and figures. Present complex results in tables rather than in running text. Every table needs a number, a title, and a note explaining abbreviations. Follow your style guide’s table formatting requirements exactly.

Common Mistakes to Avoid

Running analyses before cleaning data. Results from dirty data are meaningless. Always clean first.

Ignoring assumption violations. If your data violates the assumptions of a parametric test, your results may be unreliable. Check assumptions and use alternative tests (nonparametric equivalents) when necessary.

Reporting only significant results. Report all results, significant or not. Selective reporting is a form of bias and committees will ask about analyses that seem to be missing.

Not saving your syntax. Use SPSS syntax (or at minimum the output log) to document every analysis you run. Your committee may ask you to reproduce or modify an analysis, and reconstructing it from memory is error-prone. Paste syntax into a syntax file and save it alongside your data file.

Over-interpreting results. Correlation does not imply causation. A non-significant result does not mean there is no effect – it means you did not detect one with your sample and methodology. Be precise in your language and your committee will respect your rigor.

Getting Help

If you are stuck on a specific analysis, your university’s graduate research office or statistics consulting center may offer free or low-cost support. Many universities also provide SPSS workshops specifically for dissertation students. Our Data Analysis guide covers the broader analytical decision-making process, and the Subthesis ecosystem offers calculators and tools for common statistical operations.
