A Guide to Doing Statistics in Second Language Research

Statistics can seem daunting, especially to researchers who are new to quantitative work in second language (L2) research. This guide provides an overview of statistical methods commonly used in L2 research. It covers foundational statistical ideas and practical applications in SPSS and R, so that researchers can analyze data effectively and draw meaningful conclusions.

Statistical Ideas: Building a Foundation

Before diving into specific statistical tests, understanding fundamental concepts is crucial.

Getting Started with Statistical Software

SPSS: A widely used statistical package with a user-friendly graphical interface.
- Entering Data: Learn how to input data directly into SPSS or import it from other file formats (e.g., Excel, CSV).
- Data View and Variable View: Navigate the two primary views in SPSS to manage data and define variable properties.
- Saving Your Work: Understand how to save data files (.sav) and output files (.spv).
R: A powerful, open-source statistical computing environment and programming language.
- Downloading and Installing R: Detailed instructions for obtaining and setting up R on various operating systems.
- R Commander: A graphical user interface (GUI) for R that simplifies common statistical tasks.
- Working with Data: Learn how to enter, import, and save data within R and R Commander.
- The R Environment: Get familiar with the R console, calculator functions, objects, data types, and functions.

Preliminaries to Understanding Statistics

Levels of Measurement: Categorize variables as nominal, ordinal, interval, or ratio.
Dependent and Independent Variables: Identify the variables being measured and manipulated.
Hypothesis Testing: Learn about null and alternative hypotheses.
Populations vs. Samples: Understand how a sample is used to infer information about a population.
Understanding Statistical Reporting: Learn how to interpret a p-value and how statistical tests work.
Parametric and Non-Parametric Statistics: Learn how they function and differ from each other.

Describing Data Numerically and Graphically

Descriptive statistics provide a summary of your data.

Numerical Summaries:
- Mean, Median, and Mode: Calculate measures of central tendency.
- Standard Deviation, Variance, and Standard Error: Calculate measures of dispersion (standard deviation and variance) and the precision of the sample mean (standard error).
- Confidence Intervals: Understand the range within which the true population parameter is likely to fall.
Graphic Summaries: Visualizing data distributions.
- Histograms: Examine the shape of data distribution.
- Skewness and Kurtosis: Identify the asymmetry and tail heaviness (often described as peakedness) of distributions.
- Stem and Leaf Plots: A simple way to visualize data distribution.
- Quantile-Quantile Plots (Q-Q Plots): Assess normality by comparing sample quantiles to theoretical quantiles.
Checking Homogeneity of Variance: Ensure that the variances of different groups are similar.
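
The numerical summaries listed above can be sketched in a few lines. This example uses plain Python rather than SPSS or R, purely for illustration, and the scores are hypothetical. The confidence interval uses a large-sample normal approximation, which is an assumption made here for simplicity; with a sample this small, a t-based interval would be wider and is usually more appropriate.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical vocabulary test scores for ten L2 learners (illustrative only).
scores = [12, 15, 15, 18, 20, 21, 21, 21, 24, 33]

n = len(scores)
mean = statistics.mean(scores)          # measures of central tendency
median = statistics.median(scores)
mode = statistics.mode(scores)

variance = statistics.variance(scores)  # sample variance (n - 1 denominator)
sd = statistics.stdev(scores)           # sample standard deviation
se = sd / math.sqrt(n)                  # standard error of the mean

# 95% confidence interval for the mean, using the normal approximation
# (an assumption of this sketch, not a recommendation for small samples).
z = NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean - z * se, mean + z * se

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance}, sd={sd:.2f}, se={se:.2f}")
print(f"95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

In this made-up sample the mean sits slightly below the median even though one score (33) is unusually high, which is a reminder to inspect a histogram or Q-Q plot rather than rely on a single summary number.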

Changing the Way We Do Statistics: The New Statistics

Traditional statistical methods often rely heavily on p-values. Modern approaches, known as the “new statistics,” emphasize:

Confidence Intervals: Provide a range of plausible values for a parameter, offering a more informative measure of uncertainty than p-values.
Effect Sizes: Quantify the magnitude of an effect, independent of sample size. Common effect size measures include Cohen’s d and eta-squared.
Precision instead of Power: Shift the focus from power analysis to research designs that yield precise estimates, as reflected in narrow confidence intervals.
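
As a rough illustration of an effect size, the sketch below computes Cohen's d for two independent groups using the pooled standard deviation. It is written in plain Python, the function name cohens_d and the scores are hypothetical, and other variants of d exist that this sketch does not cover.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical post-test scores (illustrative only).
treatment = [14, 16, 18, 20, 22]
control = [10, 12, 14, 16, 18]

d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

Because d is expressed in standard deviation units, it can be compared across studies that used different tests or scales, which a p-value alone does not allow.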

Statistical Tests: A Practical Guide

This section covers specific statistical tests frequently used in L2 research.

Choosing a Statistical Test

Selecting the appropriate statistical test is crucial for drawing valid conclusions. The choice depends on the type of data and the research question:

Correlation: Examines the relationship between two variables.
- Pearson Correlation: Measures the linear relationship between two continuous variables.
- Spearman Correlation: Measures the monotonic relationship between two variables; suitable for ordinal or non-normally distributed data.
Multiple Regression: Predicts the value of a dependent variable based on multiple independent variables.
T-Tests: Compare the means of two groups.
- Independent Samples T-Test: Compares the means of two independent groups.
- Paired Samples T-Test: Compares the means of two related groups (e.g., pre-test and post-test scores).
Analysis of Variance (ANOVA): Compares the means of three or more groups.
- One-Way ANOVA: Compares the means of three or more independent groups on one factor.
- Factorial ANOVA: Examines the effects of two or more independent variables on a dependent variable.
- Repeated-Measures ANOVA: Examines the effects of an independent variable on the same group of participants measured at multiple time points.
Analysis of Covariance (ANCOVA): A general linear model that blends ANOVA and regression. ANCOVA evaluates whether the means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV), often called a treatment, while statistically controlling for the effects of other continuous variables that are not of primary interest, known as covariates (CVs).

Finding Relationships Using Correlation: Age of Learning

Correlation measures the strength and direction of a linear relationship between two variables.

Visual Inspection: Scatterplots. Inspecting a scatterplot is a necessary first step.
Creating Scatterplots in SPSS: Adding a Regression or Loess Line & Viewing Simple Scatterplot Data by Categories.
Creating Scatterplots in R: Modifying a Scatterplot in R Console & Viewing Simple Scatterplot Data by Categories.
Assumptions of Parametric Statistics for Correlation.
Effect Size and Confidence Intervals for Correlation.
Calculating Correlation Coefficients and Confidence Intervals.
Robust Correlations.
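
To make the difference between a linear and a rank-based coefficient concrete, here is a minimal sketch of Pearson's r and Spearman's rho in plain Python. The helper names (pearson_r, average_ranks, spearman_rho) and the age and rating values are hypothetical and invented for illustration. Spearman's rho is computed here as Pearson's r on ranks, with tied values given the mean of their ranks.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: age of learning and a pronunciation rating.
age = [3, 5, 8, 12, 16, 20]
rating = [9.0, 8.5, 8.4, 5.0, 3.0, 2.9]

r = pearson_r(age, rating)
rho = spearman_rho(age, rating)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

In these made-up data the ratings fall steadily with age but not along a straight line, so rho reaches its limit while r does not. A gap like this between the two coefficients is one reason to look at the scatterplot before choosing one.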

Looking for Groups of Explanatory Variables Through Multiple Regression: Predicting Important Factors in First Grade Reading

Multiple regression is used to predict a dependent variable from a set of independent variables.

Understanding Regression Design: Sequential (Hierarchical) Regression.
Visualizing Multiple Relationships: Graphs in R for Understanding Complex Relationships.
Assumptions of Multiple Regression.
Performing a Multiple Regression: Examining Regression Assumptions.
Reporting the Results of a Regression Analysis.
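
For readers who want to see what a regression routine does under the hood, the sketch below fits an ordinary least squares model with two predictors by solving the normal equations. It is plain Python with no external libraries. The function names and the data are hypothetical, and the outcome is constructed from known coefficients so the result can be checked. In practice SPSS or R would do this, along with the diagnostics that this sketch omits.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(predictors, outcome):
    """Return [intercept, b1, b2, ...] that minimize the squared error."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical data: two predictor scores per child (illustrative only).
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
# Outcome built as 1 + 2*x1 + 3*x2, so the expected coefficients are known.
outcome = [1 + 2 * x1 + 3 * x2 for x1, x2 in predictors]

coefs = fit_ols(predictors, outcome)
print([round(c, 3) for c in coefs])
```

Each coefficient describes the change in the outcome for a one-unit change in its predictor while the other predictor is held constant.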

Looking for Differences between Two Means with T-Tests: Think-Aloud Methodology and Teaching Sarcasm

T-Tests are used to determine whether there is a statistically significant difference between the means of two groups.

Types of T-Tests: Choosing a T-Test.
Data Summaries and Numerical Inspection. Visual Inspection: Box Plots.
Assumptions of T-Tests. Data Formatting for Tests of Group Differences.
The Independent Samples T-Test: Performing an Independent Samples T-Test in SPSS & Performing an Independent Samples T-Test in R.
The Paired Samples T-Test: Performing a Paired Samples T-Test in SPSS & Performing a Paired Samples T-Test in R.
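
The two t-tests differ mainly in what counts as an observation: the independent samples version compares two sets of scores, while the paired version works on each participant's difference score. The plain-Python sketch below computes the t statistic and degrees of freedom for each. The function names and scores are hypothetical, the independent version assumes equal variances (Student's t rather than Welch's), and no p-value is computed because that step needs the t distribution, which SPSS and R supply.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t for two independent groups (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, n_a + n_b - 2

def paired_t(before, after):
    """Paired t: a one-sample test on each participant's difference score."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1

# Hypothetical scores for two separate groups (illustrative only).
think_aloud = [14, 16, 18, 20, 22]
silent = [10, 12, 14, 16, 18]
t_ind, df_ind = independent_t(think_aloud, silent)

# Hypothetical pre-test and post-test scores from the same five learners.
pre = [10, 12, 9, 14, 11]
post = [13, 14, 10, 17, 13]
t_pair, df_pair = paired_t(pre, post)

print(f"independent: t({df_ind}) = {t_ind:.2f}")
print(f"paired:      t({df_pair}) = {t_pair:.2f}")
```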

Looking for Group Differences with a One-Way Analysis of Variance: Effects of Planning Time

One-Way ANOVA is used to compare the means of three or more independent groups.

Understanding the Analysis of Variance Design.
Numerical and Visual Inspection of the Data.
Assumptions for an Analysis of Variance.
One-Way Analysis of Variance: Omnibus Tests and Post-Hoc Tests.
Performing an Omnibus One-Way Analysis of Variance Test in SPSS with Subsequent Post-Hoc Tests.
Performing an Omnibus One-Way Analysis of Variance Test in R with Subsequent Post-Hoc Tests.
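
The omnibus F statistic compares variability between group means with variability within groups. A minimal plain-Python sketch follows; the function name one_way_anova, the group labels, and the scores are hypothetical. Post-hoc comparisons and the p-value are left to SPSS or R.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [s for g in groups for s in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((s - sum(g) / len(g)) ** 2 for g in groups for s in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores under three planning conditions (illustrative only).
no_planning = [4, 5, 6]
pre_task_planning = [6, 7, 8]
online_planning = [8, 9, 10]

f_stat, df_b, df_w = one_way_anova([no_planning, pre_task_planning, online_planning])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F says only that at least one group mean differs from the others; identifying which groups differ is the job of the post-hoc tests.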

Looking for Group Differences with Factorial Analysis of Variance When There is More than One Independent Variable: Learning with Music

Factorial ANOVA extends the one-way design to analyses with more than one independent variable.

Analysis of Variance Design: Interaction.
Numerical and Visual Inspection.
Assumptions of a Factorial Analysis of Variance.
Getting Ready to Perform a Factorial Analysis of Variance: Making Sure Your Data is in the Correct Format for a Factorial Analysis of Variance.
Factorial Analysis of Variance: Extending Analyses to More than One Independent Variable.
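
An interaction means that the effect of one independent variable depends on the level of another. The sketch below illustrates the idea with cell means from a 2 x 2 design stored in long format (one row per participant). It is plain Python, and the factor names, levels, and scores are hypothetical. It shows only the pattern in the means, not the significance test that a factorial ANOVA would provide.

```python
from collections import defaultdict

# Long format: one row per participant, one column per factor plus the score.
# All values are made up for illustration.
rows = [
    ("music", "easy", 8), ("music", "easy", 10),
    ("music", "hard", 3), ("music", "hard", 5),
    ("silence", "easy", 9), ("silence", "easy", 11),
    ("silence", "hard", 8), ("silence", "hard", 10),
]

# Collect the scores that fall in each cell of the design.
cells = defaultdict(list)
for sound, difficulty, score in rows:
    cells[(sound, difficulty)].append(score)
cell_means = {key: sum(v) / len(v) for key, v in cells.items()}

# Simple effect of task difficulty within each sound condition.
effect_music = cell_means[("music", "easy")] - cell_means[("music", "hard")]
effect_silence = cell_means[("silence", "easy")] - cell_means[("silence", "hard")]

# A nonzero difference of differences points to an interaction in the means.
interaction_contrast = effect_music - effect_silence
print(cell_means)
print(f"difficulty effect: music={effect_music}, silence={effect_silence}")
print(f"interaction contrast = {interaction_contrast}")
```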

Looking for Group Differences When the Same People are Tested More than Once: Repeated-Measures Analysis of Variance

Repeated-measures ANOVA is used when the same participants are tested more than once.

Understanding Repeated-Measures Analysis of Variance Designs.
Arranging the Data for a Repeated-Measures Analysis of Variance.
Visualizing Repeated-Measures Data.
Repeated-Measures Analysis of Variance Assumptions.
Performing a Repeated-Measures Analysis of Variance with the Least-Squares Approach.
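
Arranging the data is often the trickiest practical step, because repeated-measures routines differ in whether they expect a wide layout (one row per participant, one column per testing time) or a long layout (one row per participant per testing time). The plain-Python sketch below converts wide to long. The function name wide_to_long, the column names, and the scores are hypothetical, and which layout a given SPSS or R procedure needs should be checked in that procedure's documentation.

```python
def wide_to_long(rows, id_key, time_keys):
    """Turn one row per participant into one row per participant per time point."""
    long_rows = []
    for row in rows:
        for time in time_keys:
            long_rows.append({"id": row[id_key], "time": time, "score": row[time]})
    return long_rows

# Wide format: each participant's three test scores sit side by side.
wide = [
    {"id": "P1", "pretest": 10, "posttest": 14, "delayed": 13},
    {"id": "P2", "pretest": 12, "posttest": 15, "delayed": 15},
]

long_format = wide_to_long(wide, "id", ["pretest", "posttest", "delayed"])
for record in long_format:
    print(record)
```

Keeping the participant identifier in every long-format row is what lets the analysis recognize that the scores come from the same people.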

Appendices

Appendix A: Doing Things in R

Conclusion

Statistical analysis is a powerful tool for understanding data in second language research. By grounding yourself in statistical principles, you can choose appropriate analyses, interpret results clearly, and support claims rigorously. Whether you prefer SPSS or R, mastering these techniques will strengthen your ability to contribute to this evolving field.