9 Task Four - Paired t-tests

9.1 Objective

In the previous exercise {Chapter 8}, you used R and the lm() function to perform a two-sample t-test. In this assignment, you’ll extend that work by conducting a paired t-test, using the Darwin plant height dataset as an example.

Your task is to:

Import and inspect the data
Conduct a paired t-test
Explore the relationship between paired observations
Visualise and interpret your results
Present your analysis in a Quarto report {Chapter 20}

9.2 Learning Outcomes

By completing this assignment, you will:

Apply the concept of paired data to control for variation between pairs, so that treatments are compared within pairs.
Understand the equivalence between a paired t-test and a linear model with a pair factor.
Build confidence in constructing reproducible analysis workflows using Quarto.
Communicate results clearly with integrated text, code, and visuals.

9.3 Task Outline

9.3.1 Set up your Quarto document

Create a new .qmd file with a clear title, author, and date.

---
title: "My Report"
author: "Your Name"
date: today
format:
  html:
    embed-resources: true
execute:
  echo: false
  warning: false
  message: false
---

Load packages and read in the dataset:

```{r}
#| include: false

library(tidyverse)
library(here)
library(rstatix)     # simple t-tests with tidy output
library(broom)       # tidy model summaries (optional)
library(sandwich)    # robust (heteroskedasticity-consistent) variance estimators
library(lmtest)      # coeftest/coefci for robust inference
library(performance) # easy model checks

darwin <- read_csv(here("data", "darwin.csv"))
```

Include a brief introduction: What question are we testing?

9.4 Explore and understand the data

Before analysis, ensure you understand what each variable represents. Check-in:

How are “Cross” and “Self” plants related within pairs?
Why might it make sense to treat “pair” as a factor rather than a numeric variable?

Question: Does this design look like independent samples, or are observations linked in some way?
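As a minimal sketch of how a paired layout can be inspected, the chunk below uses a small simulated stand-in called `toy` (hypothetical values, not Darwin's measurements). The column names `pair`, `type` and `height` are assumptions made for illustration; swap in `darwin` and its actual column names once your data are loaded.

```r
library(tidyverse)

# Simulated stand-in for a paired design (hypothetical values, NOT Darwin's data)
set.seed(42)
toy <- tibble(
  pair   = rep(1:8, each = 2),
  type   = rep(c("Cross", "Self"), times = 8),
  height = round(rnorm(16, mean = 20, sd = 2), 1)
)

# Column names and types at a glance
glimpse(toy)

# In a paired design, each pair should contribute exactly one plant of each type
pair_counts <- toy |> count(pair, type)
pair_counts

# pair is a label identifying a unit, not a quantity, so store it as a factor
toy <- toy |> mutate(pair = factor(pair))

# One row per pair makes the link between the two plants visible
toy_wide <- toy |> pivot_wider(names_from = type, values_from = height)
toy_wide
```

If `pair` were left numeric, a model would fit a single slope across pair numbers instead of a separate level for each pair.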

9.5 Carry out a paired t-test

Test whether plant height differs significantly between the two pollination types, comparing the plants within each pair.

Think about:

How could you account for pairs in your analysis?
How could you model this using lm()?

You can add covariates to a linear model by including them on the right-hand side of the formula (after ~), separated with +, e.g. lm(variable ~ predictor_1 + predictor_2).

Note

This model analyses height based on type while controlling for pair. Including pair will produce a coefficient for every pair, but for this assignment you can ignore the individual pair coefficients and focus on the main effect of type. This tells you the average difference in height between the Cross and Self plants within a pair.
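The equivalence between the two approaches can be checked on simulated data. The sketch below is illustrative only: `toy`, its column names and all values are hypothetical (not Darwin's measurements), and it is one possible way to set the comparison up, not a model answer.

```r
library(tidyverse)

# Simulated paired data (hypothetical values, NOT Darwin's measurements):
# each pair shares an environment effect, and Self plants are shifted by -2
set.seed(1)
n_pairs <- 10
pair_effect <- rnorm(n_pairs, mean = 0, sd = 3)
toy <- tibble(
  pair   = factor(rep(1:n_pairs, each = 2)),
  type   = rep(c("Cross", "Self"), times = n_pairs),
  height = 20 + rep(pair_effect, each = 2) +
    if_else(type == "Self", -2, 0) + rnorm(2 * n_pairs, sd = 1)
)

# Linear model: effect of type while controlling for pair
paired_lm <- lm(height ~ type + pair, data = toy)
lm_row <- coef(summary(paired_lm))["typeSelf", ]

# Classic paired t-test on one-row-per-pair data
wide <- toy |> pivot_wider(names_from = type, values_from = height)
paired_t <- t.test(wide$Self, wide$Cross, paired = TRUE)

lm_row    # estimate, standard error, t value and p-value for type
paired_t  # mean within-pair difference, t, df and p-value
```

With one plant of each type in every pair, the `typeSelf` row of the linear model and the paired t-test should report the same estimate, t statistic and degrees of freedom.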

9.6 Examine the correlation between paired measurements

Run a correlation test between the Cross and Self values.

Check-in:

Is there evidence that pairs are correlated?
How does this justify the use of a paired rather than independent t-test?

Correlation with rstatix

```{r}
darwin |> 
  pivot_wider(names_from = type, 
              values_from = height) |> 
  rstatix::cor_test(`Self`,`Cross`)
```
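One way to see why the correlation matters is through the variance of a within-pair difference. The notation here is introduced for illustration only: $C_i$ and $S_i$ are the Cross and Self heights in pair $i$, $\sigma_C$ and $\sigma_S$ are their standard deviations, and $\rho$ is their correlation.

$$
\operatorname{Var}(C_i - S_i) = \sigma_C^2 + \sigma_S^2 - 2\rho\,\sigma_C\,\sigma_S
$$

An independent-samples analysis in effect works with $\sigma_C^2 + \sigma_S^2$. When $\rho$ is positive, the differences vary less than that, so a paired analysis gives a smaller standard error. When $\rho$ is close to zero or negative, pairing gains little and still costs degrees of freedom.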

9.7 Visualise your data

Create a plot that includes:

Raw data (all individual points),
(Optionally) Paired connections (lines between pairs),
A summary geom (mean ± SE/95% CI or boxplots).

Jitter plots

If you jitter your points, make sure your lines use the same jitter so they connect correctly (you can share a position_jitter object between geom_line() and geom_point()).
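A minimal sketch of the shared-jitter idea, using simulated data (`toy` and its values are hypothetical, not Darwin's measurements). It assumes the rows are already sorted by pair and then by type, so that the line and point layers receive their random offsets in the same row order.

```r
library(tidyverse)

# Simulated paired data (hypothetical values, NOT Darwin's measurements),
# already sorted by pair and then type
set.seed(7)
toy <- tibble(
  pair   = rep(1:8, each = 2),
  type   = rep(c("Cross", "Self"), times = 8),
  height = rnorm(16, mean = 20, sd = 2)
)

# One jitter object with a fixed seed; horizontal jitter only,
# so the plotted heights are not altered
pj <- position_jitter(width = 0.1, height = 0, seed = 123)

p <- ggplot(toy, aes(x = type, y = height)) +
  # lines join the two plants in each pair
  geom_line(aes(group = pair), position = pj, colour = "grey60") +
  # raw data points, jittered with the same object
  geom_point(position = pj) +
  # mean and standard error per type, nudged aside so it does not hide the points
  stat_summary(fun.data = mean_se, geom = "pointrange",
               colour = "red", position = position_nudge(x = 0.25))
p
```

The fixed `seed` is what keeps the two layers aligned. Without it, each layer draws its own random offsets and the lines miss the points.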


9.8 Summarise your findings

In 4–6 sentences, interpret your analysis:

Write a short results summary that includes the estimated difference, its uncertainty (standard error or confidence interval), the test statistic and the p-value.
How does the estimate of height difference compare to the two-sample t-test in the previous chapter {Chapter 8}?
What is the strength of correlation between paired observations?
How do your visuals complement your findings?
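The numbers needed for a results summary can be pulled from a fitted model into one tidy row. This is a sketch on simulated data: `toy`, `paired_lm` and `type_effect` are hypothetical names, and the values are not Darwin's measurements.

```r
library(tidyverse)
library(broom)

# Simulated paired data (hypothetical values, NOT Darwin's measurements)
set.seed(1)
n_pairs <- 10
toy <- tibble(
  pair   = factor(rep(1:n_pairs, each = 2)),
  type   = rep(c("Cross", "Self"), times = n_pairs),
  height = 20 + rep(rnorm(n_pairs, sd = 3), each = 2) +
    if_else(type == "Self", -2, 0) + rnorm(2 * n_pairs, sd = 1)
)

paired_lm <- lm(height ~ type + pair, data = toy)

# Keep only the type effect: estimate, SE, t statistic, p-value and 95% CI,
# plus the residual degrees of freedom needed to report the t statistic
type_effect <- tidy(paired_lm, conf.int = TRUE) |>
  filter(term == "typeSelf") |>
  mutate(df = df.residual(paired_lm))

type_effect
```

Values stored this way can be dropped into your narrative with inline R code in Quarto, so the text updates automatically if the data or model change.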

Discussion point

Discuss with your classmates: what effect does adding pair have on our confidence in the effect of inbreeding?
Should we exclude variables that are a part of our study design if they make us less confident about the results?

9.9 Rendering your Quarto doc

After completing your analysis, render your .qmd file to HTML to produce a polished report. Check the following:

Code runs top to bottom – all chunks execute without errors.
Figures and tables appear correctly – plots are visible, captions make sense.
Formatting is clear – headers, text, and code chunks are readable.
Interpretation is included – your narrative explains results, not just code output.
- Include a figure
- Include a short results section (with a model write-up and correlation test)

My document won’t render

If the document doesn’t render, check that all packages are loaded and all objects are created in previous chunks.

Check out {Chapter 20} for more help.