4 Worked Example 2 - Nested design

This experiment involved a simple one-way anova with 3 treatments given to 6 rats, the researchers the measured glycogen levels in their liver. The analysis was complicated by the fact that three preparations were taken from the liver of each rat, and two readings of glycogen content were taken from each preparation. This generated 6 pseudoreplicates per rat to give a total of 36 readings in all.

Clearly, it would be a mistake to analyse these data as if they were a straightforward one-way anova, because that would give us 33 degrees of freedom for error. In fact, since there are only two rats in each treatment, we have only one degree of freedom per treatment, giving a total of 3 d.f. for error.

The variance is likely to be different at each level of this nested analysis because:

the readings differ because of variation in the glycogen detection method within each liver sample (measurement error); the pieces of liver may differ because of heterogeneity in the distribution of glycogen within the liver of a single rat; the rats will differ from one another in their glycogen levels because of sex, age, size, genotype, etc.; rats allocated different experimental treatments may differ as a result of the fixed effects of treatment. If all we want to test is whether the experimental treatments have affected the glycogen levels, then we are not interested in liver bits within rat’s livers, or in preparations within liver bits. We could combine all the pseudoreplicates together, and analyse the 6 averages. This would have the virtue of showing what a tiny experiment this really was. This latter approach also ignores the nested sources of uncertainties. Instead we will use a linear mixed model.

rats <- readRDS("rats.rds")
rats %>% 
  aggregate(Glycogen ~ Rat + Treatment + Liver, data = ., mean)
   Glycogen Rat Treatment Liver
1       131   1         1     1
2       130   1         1     1
3       131   1         1     2
4       125   1         1     2
5       136   1         1     3
6       142   1         1     3
7       150   2         1     1
8       148   2         1     1
9       140   2         1     2
10      143   2         1     2
11      160   2         1     3
12      150   2         1     3
13      157   3         2     1
14      145   3         2     1
15      154   3         2     2
16      142   3         2     2
17      147   3         2     3
18      153   3         2     3
19      151   4         2     1
20      155   4         2     1
21      147   4         2     2
22      147   4         2     2
23      162   4         2     3
24      152   4         2     3
25      134   5         3     1
26      125   5         3     1
27      138   5         3     2
28      138   5         3     2
29      135   5         3     3
30      136   5         3     3
31      138   6         3     1
32      140   6         3     1
33      139   6         3     2
34      138   6         3     2
35      134   6         3     3
36      127   6         3     3
rats_lmer.1 <- lmer(Glycogen ~ Treatment + (1 | Rat) + (1 | Liver), data = rats)
summary(rats_lmer.1)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Glycogen ~ Treatment + (1 | Rat) + (1 | Liver)
##    Data: rats
## 
## REML criterion at convergence: 221.9
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.79334 -0.66536  0.01792  0.59281  2.05206 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Rat      (Intercept) 39.187   6.260   
##  Liver    (Intercept)  2.168   1.472   
##  Residual             30.766   5.547   
## Number of obs: 36, groups:  Rat, 6; Liver, 3
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  140.500      4.783   3.174  29.373 5.65e-05 ***
## Treatment2    10.500      6.657   3.000   1.577    0.213    
## Treatment3    -5.333      6.657   3.000  -0.801    0.482    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##            (Intr) Trtmn2
## Treatment2 -0.696       
## Treatment3 -0.696  0.500

Q. Why is this model above wrong

It is wrong because this would add two random effect terms, one for rat 1 and one for rat 2. In fact there are 6 rats altogether. The way that the data have been coded allows for these kinds of mistakes to happen. The same is true for Liver, which is coded as a 1, 2 or 3. This means that we could write this thinking that we are including the correct random effects for Rat and Liver. In fact, this assumes that the data come from a crossed design, in which there are 2 rats and 3 parts of the liver, and that Liver = 1 corresponds to the same type of measurement in rats 1 and 2 and so on. Sometimes this is appropriate, but not here!

The nature of the way that many data sets are coded makes these kinds of mistakes very easy to make!

A better design is the one below:

rats_lmer.2 <- lmer(Glycogen ~ Treatment + (1 | Rat / Liver), data = rats)
summary(rats_lmer.2)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Glycogen ~ Treatment + (1 | Rat/Liver)
##    Data: rats
## 
## REML criterion at convergence: 219.6
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.48212 -0.47263  0.03062  0.42934  1.82935 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  Liver:Rat (Intercept) 14.17    3.764   
##  Rat       (Intercept) 36.06    6.005   
##  Residual              21.17    4.601   
## Number of obs: 36, groups:  Liver:Rat, 18; Rat, 6
## 
## Fixed effects:
##             Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)  140.500      4.707   3.000  29.848 8.26e-05 ***
## Treatment2    10.500      6.657   3.000   1.577    0.213    
## Treatment3    -5.333      6.657   3.000  -0.801    0.482    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##            (Intr) Trtmn2
## Treatment2 -0.707       
## Treatment3 -0.707  0.500

And we can see the effects in the figures below:

plot(ggpredict(rats_lmer.2, terms = c("Treatment")))
plot(ggpredict(rats_lmer.2, 
               terms = c("Treatment", "Rat"), 
               type = "random"), 
     add.data = TRUE)