28  Foundations of Mixed Modelling

28.1 Introduction

So far all of our analyses have dealt with a few explanatory variables - referred to as fixed effects. In these examples one of our core assumptions was the independence of each data point. That is no two observations are linked, or in any way part of a subgroup or block effect.

However, there are many occasions where this might not be the case:

  • We make multiple/repeated measures on a single individual or group of individuals

  • We repeat an experiment in different areas or times

  • Samples processed in batches (on different days or machines etc.)

  • Paired or matched designs

Each of these break the assumption of independence in a standard linear model.

Sometimes we might know exactly how these blocks of data differ, on other occasions we may have little idea how or why they differ. But when we use a mixed model we can attempt to capture and account for this non-independence.

28.2 What is a mixed model?

Mixed models (also known as linear mixed models or hierarchical linear models) are statistical tests that build on the simpler tests of regression, t-tests and ANOVA. All of these tests are special cases of the general linear model; they all fit a straight line to data to explain variance in a systematic way.

The key difference with a linear mixed-effects model is the inclusion of random effects - alongside fixed effects.

28.2.1 Fixed and Random effects

  • Fixed effects: are parameters associated with specific, non-random levels of a factor. You estimate a separate coefficient for each level (or contrast) and make inferences about those specific levels. They’re treated as systematic, not varying by chance.

  • Random effects: are modeled as draws from a probability distribution, typically assumed to be Normal with mean zero and an estimated variance component. Instead of estimating a separate coefficient for each group, the model estimates the variance of the distribution from which those group effects are drawn. You’re not interested in the specific group effects per se, but in how much they vary around a population mean.

The random effect \(U_i\) is often assumed to follow a normal distribution with a mean of zero and a variance estimated during the model fitting process.

\[Y_i = β_0 + U_j + ε_i\] Defining what a random effect actually is is tricky. The key practical distinction is how they are treated in mixed models.

In the book “Data analysis using regression and multilevel/hierarchical models” By Gelman Hill 2006. The authors examined five definitions of fixed and random effects and found no consistent agreement.

  1. Fixed effects are constant across individuals, and random effects vary. For example, in a growth study, a model with random intercepts a_i and fixed slope b corresponds to parallel lines for different individuals i, or the model y_it = a_i + b t thus distinguish between fixed and random coefficients.

  2. Effects are fixed if they are interesting in themselves or random if there is interest in the underlying population.

  3. “When a sample exhausts the population, the corresponding variable is fixed; when the sample is a small (i.e., negligible) part of the population the corresponding variable is random.”

  4. “If an effect is assumed to be a realized value of a random variable, it is called a random effect.”

  5. Fixed effects are estimated using least squares (or, more generally, maximum likelihood) and random effects are estimated with shrinkage.

Thus it turns out fixed and random effects are not born but made. We must make the decision to treat a variable as fixed or random in a particular analysis.

Note that the golden rule is that you generally want your random effect to have at least five levels. So, for instance, if we wanted to control for the effects of fish sex on body length, we would fit sex (a two level factor: male or female) as a fixed, not random, effect.

This is because estimating variance on few data points is very imprecise. Mathematically you could, but you wouldn’t have a lot of confidence in it. If you only have two or three levels, the model will struggle to partition the variance - it will give you an output, but not necessarily one you can trust.

There’s no firm rule as to what the minimum number of factor levels should be for random effect. It is commonly reported that you may want five or more factor levels for a random effect in order to really benefit from what the random effect can do (though some argue for even more, 10 levels).

28.3 Using random effects

Random effects are usually grouping factors for which we are trying to control. They are always categorical, as you can’t force R to treat a continuous variable as a random effect. A lot of the time we are not specifically interested in their impact on the response variable, but we know that they might be influencing the patterns we see.

The mixed-effects model can be used in many situations instead of one of our more straightforward tests when this structure may be important. The main advantages of this approach are:

  1. mixed-effects models account for more of the variance

  2. mixed-effects models incorporate group and even individual-level differences

  3. mixed-effects models cope well with missing data, unequal group sizes and repeated measurements

NoteOverdispersion

Another application of mixed models is to account for overdispersion in fixed-dipsersion models (Poisson and Binomial). If a random effect is included to account for individual-level variation we may find our models have a better fit.

28.4 Making a mixed models

28.4.1 The data

This dataset represents results from replicate feeding experiments investigating how Spodoptera frugiperda larvae respond to increasing concentrations of a plant defensive compound.

  • The explanatory variable is the concentration of the benzoxazinoid DIMBOA incorporated into the larval diet (mM)

  • The response variable y is a composite detoxification index combining cytochrome P450 and glutathione S-transferase activity together with expression of a marker detoxification gene.

  • Each group corresponds to an independent experimental replicate—separate batches of larvae reared and assayed under identical conditions on different days—to capture between-replicate variation in baseline activity and toxin responsiveness.

Fall Army Worm

28.4.2 All in one model

We will begin by highlighting the importance of considering data structure and hierarchy when building linear models. To illustrate this, we will delve into an example that showcases the consequences of ignoring the underlying data structure. We might naively construct a single linear model that ignores the group-level variation and treats all observations as independent. This oversimplified approach fails to account for the fact that observations within groups may be more similar to each other.

\[Y_i = \beta_0 + \beta_1X_i + \epsilon_i\]

basic_model <- lm(detox_exp ~ benzo_um, data = benzo_data)
summary(basic_model)

Call:
lm(formula = detox_exp ~ benzo_um, data = benzo_data)

Residuals:
   Min     1Q Median     3Q    Max 
-37.23 -12.11  -2.36  11.00  44.53 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  21.0546     1.6490  12.768   <2e-16 ***
benzo_um      2.5782     0.2854   9.034   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 16.96 on 428 degrees of freedom
Multiple R-squared:  0.1602,    Adjusted R-squared:  0.1582 
F-statistic: 81.62 on 1 and 428 DF,  p-value: < 2.2e-16

Here we can see that the basic linear model has produced a statistically significant regression analysis (t428 = 9.034, p <0.001) with an \(R^2\) of 0.16. There is a medium effect positive relationship between changes in x and y (Estimate = 2.58, S.E. = 0.29).

We can see that clearly if we produce a simple plot of x against y:

plot(benzo_data$benzo_um, benzo_data$detox_exp)

Simple scatter plot x against y

Here if we use the function geom_smooth() on the scatter plot, the plot also includes a fitted regression line obtained using the lm method. This allows us to examine the overall trend and potential linear association between the variables.

ggplot(benzo_data, aes(x = benzo_um, 
                 y = detox_exp)) +
  geom_point() +
  labs(x = "Toxin concentration in diet", 
       y = "Detox capacity index")+
  geom_smooth(method = "lm")

Scatter plot displaying the relationship between the independent variable and the dependent variable. The points represent the observed data, while the fitted regression line represents the linear relationship between the variables. The plot helps visualize the trend and potential association between the variables.

Looking at the fit of our model we would be tempted to conclude that we have an accurate and robust model.

However, when data is structured, such as individuals nested within groups, there is typically correlation or similarity among observations within the same group. By not accounting for this clustering effect, estimates derived from a single model can be biased and inefficient. The assumption of independence among observations is violated, leading to incorrect standard errors and inflated significance levels.

From the figures below we can see the difference in median and range of x values within each of our groups:

Visualise the groups

  • Make a simple ggplot that highlights the differences between the groups.

Hint try boxplots for average difference between groups or colour the original scatterplot by group

ggplot(benzo_data, aes(x = group, 
                 y = detox_exp,
                 fill = group)) +
  geom_boxplot() +
  labs(x = "Toxin concentration in diet", 
       y = "Detox capacity index")+
  theme(legend.position = "none")

# Color tagged by group
plot_group <- ggplot(benzo_data, aes(x = benzo_um, 
                               y = detox_exp, 
                               color = group,
                               group = group)) +
  geom_point(alpha = 0.6) +
  labs(title = "Data Coloured by Group", 
       x = "Toxin concentration in diet", 
       y = "Detox capacity index")+
  theme(legend.position="none")

plot_group

From the above plots, it confirms that our observations from within each of the ranges aren’t independent. We can’t ignore that: as it might make our estimates biased.

28.4.3 Multiple analyses approach

Running separate linear models per group, also known as stratified analysis, could a feasible approach in certain situations. However, there are several drawbacks including

  • Increased complexity

  • Inability to draw direct conclusions on overall variability

  • Reduced statistical power

  • Inflated Type 1 error risk

  • Inconsistent estimates

  • Limited ability to handle unbalanced/missing data

# Plotting the relationship between x and y with group-level smoothing
ggplot(benzo_data, aes(x = benzo_um, y = detox_exp, color = group, group = group)) +
  geom_point(alpha = 0.6) +  # Scatter plot of x and y with transparency
  labs(title = "Data Colored by Group", x = "Toxin concentration in diet", 
       y = "Detox capacity index") +
  theme(legend.position = "none") +
  geom_smooth(method = "lm") +  # Group-level linear regression smoothing
  facet_wrap(~group)  # Faceting the plot by group

Scatter plot showing the relationship between the independent variable (x) and the dependent variable (y) colored by group. Each subplot represents a different group. The line represents the group-level linear regression smoothing.
## Custom function to make pretty p-reporting
 report_p <- function(p, digits = 3) {
     reported <- if_else(p < 0.001,
             "p < 0.001",
             paste("p=", round(p, digits)))
     
     return(reported)
 }


# Creating nested benzo_data by grouping the benzo_data by 'group'
nested_data <- benzo_data |> 
  group_by(group) |> 
  nest()

# Fitting linear regression models to each nested data group
models <- map(nested_data$data, ~ lm(detox_exp ~ benzo_um, data = .)) |>  
          map(broom::tidy)

# Combining the model results into a single data frame
combined_models <- bind_rows(models)

# Filtering the rows to include only the 'x' predictor
filtered_models <- combined_models |> 
                   filter(term == "benzo_um")

# Adding a column for the group index using rowid_to_column function
group_indexed_models <- filtered_models |> 
                        rowid_to_column("group")

# Modifying the p-values using a custom function report_p
final_models <- group_indexed_models |> 
                mutate(p.value = report_p(p.value))

final_models
group term estimate std.error statistic p.value
1 benzo_um 3.0643940 0.3620254 8.4645823 p < 0.001
2 benzo_um 2.5162898 0.3354848 7.5004590 p < 0.001
3 benzo_um 1.3904720 0.3192878 4.3549169 p < 0.001
4 benzo_um 1.6856649 0.3550511 4.7476686 p < 0.001
5 benzo_um -0.0755633 0.6576868 -0.1148926 p= 0.909

Each analysis has produced a different estimate - and in some cases we did not find a statistically significant association.

28.4.4 Fixed effects model

Using a group level term with an effect on x as a fixed effect means explicitly including the terms benzo_um and the group as a predictors in the model equation. This approach assumes that the relationship between x and the outcome variable differs across groups and that these differences are constant and fixed.

It implies that each group has a unique intercept (baseline level) for the relationship between x and the outcome variable. By treating the group level term as a fixed effect, the model estimates specific parameter values for each group.

\[Y_i = \beta_0 + \beta_1X_i + \beta_2.group_i+\beta_3(X_i.group_i)+\epsilon_i\]

28.4.4.1 Include group as a fixed effect in your model

additive_model <- lm(detox_exp ~ benzo_um + group, data = benzo_data)

summary(additive_model)

Call:
lm(formula = detox_exp ~ benzo_um + group, data = benzo_data)

Residuals:
     Min       1Q   Median       3Q      Max 
-26.3011  -6.3665  -0.3373   6.6295  31.5816 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  11.7808     1.2931   9.111  < 2e-16 ***
benzo_um      2.0242     0.1704  11.883  < 2e-16 ***
group2        3.8771     1.4214   2.728 0.006642 ** 
group3       36.3036     1.4286  25.412  < 2e-16 ***
group4        9.2323     1.4215   6.495 2.33e-10 ***
group5        8.0609     2.0922   3.853 0.000135 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.05 on 424 degrees of freedom
Multiple R-squared:  0.7077,    Adjusted R-squared:  0.7042 
F-statistic: 205.3 on 5 and 424 DF,  p-value: < 2.2e-16

If each group has

  • reasonably large sample sizes

  • a balanced design (same n per group)

  • relatively consistent group differences

  • when you have few groups (<5)

Then the difference between a fixed-effect model and a mixed-effect model will be small. And this may be the preferred option.

BUT

  • When replicate experiments vary in n

  • When you want to generalise across groups

  • When you have lots of groups (> 5 - 10)

  • When you have incomplete groups or timepoints

Then mixed models can be much more powerful.

28.5 Our first mixed model

A mixed model is a good choice here: it will allow us to use all the data we have (higher sample size) and account for the correlations between data coming from the groups. We will also estimate fewer parameters and avoid problems with multiple comparisons that we would encounter while using separate regressions.

We can now join our random effect \(U_j\) to the full dataset and define our y values as

\[Y_{ij} = β_0 + β_1*X_{ij} + U_j + ε_{ij}\].

We have a response variable, and we are attempting to explain part of the variation through fitting an independent variable as a fixed effect. But the response variable has some residual variation (i.e. unexplained variation) associated with group. By using random effects, we are modeling that unexplained variation through variance.

We now want to know if an association between y ~ x exists after controlling for the variation in group.

28.5.1 Running mixed effects models with lmerTest

This section will detail how to run mixed models with the lmer function in the R package lmerTest (Kuznetsova et al. (2020)). There are other R packages that can be used to run mixed-effects models such as the glmmTMB package.

plot_function2 <- function(model, data, title = "Data Coloured by Group"){
  
data <- data |>  
  mutate(fit.m = predict(model, re.form = NA),
         fit.c = predict(model, re.form = NULL))

data |> 
  ggplot(aes(x = x, y = y, col = group)) +
  geom_point(pch = 16, alpha = 0.6) +
  geom_line(aes(y = fit.c, col = group), linewidth = 2)  +
  coord_cartesian(ylim = c(-40, 100))+
  labs(title = title, 
       x = "Independent Variable", 
       y = "Dependent Variable") 
}

mixed_model <- lmer(y ~ x + (1|group), data = data)

plot_function2(mixed_model,benzo_data, "Random intercept")

Here we fit our random intercepts with (1|random_effect_variable).

# random intercept model
mixed_model <- lmer(detox_exp ~ benzo_um + (1|group), data = benzo_data)

summary(mixed_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: detox_exp ~ benzo_um + (1 | group)
   Data: benzo_data

REML criterion at convergence: 3224.4

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.61968 -0.63654 -0.03584  0.66113  3.13597 

Random effects:
 Groups   Name        Variance Std.Dev.
 group    (Intercept) 205      14.32   
 Residual             101      10.05   
Number of obs: 430, groups:  group, 5

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)  23.2692     6.4818   4.1570    3.59   0.0215 *  
benzo_um      2.0271     0.1703 424.0815   11.90   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
         (Intr)
benzo_um -0.131

Here our groups clearly explain a lot of variance


205/(205 + 101) = 0.669 / 66.9%

So the differences between groups explain ~67% of the variance that’s “left over” after the variance explained by our fixed effects.

28.5.2 Standard Errors

Below we can take a look at the estimates and standard errors for three of our previously constructed models:

pooled <- basic_model |>  
  broom::tidy() |>  
  mutate(Approach = "Pooled", .before = 1) |>  
  dplyr::select(term, estimate, std.error, Approach)

no_pool <- additive_model |>  
  broom::tidy() |>  
  mutate(Approach = "Fixed Effect", .before = 1) |>  
  dplyr::select(term, estimate, std.error, Approach)

partial_pool <- mixed_model |>  
  broom.mixed::tidy() |>  
  mutate(Approach = "Mixed Model", .before = 1) |>  
  dplyr::select(Approach, term, estimate, std.error)

bind_rows(pooled, no_pool, partial_pool) |>
  filter(term %in% c("(Intercept)", "benzo_um"))
term estimate std.error Approach
(Intercept) 21.054614 1.6490185 Pooled
benzo_um 2.578217 0.2853798 Pooled
(Intercept) 11.780827 1.2931031 Fixed Effect
benzo_um 2.024207 0.1703505 Fixed Effect
(Intercept) 23.269187 6.4818211 Mixed Model
benzo_um 2.027065 0.1703423 Mixed Model
  • Pooled: in the pooled model the averaged estimates may not accurately reflect the true values within each group. As a result, the estimates in pooled models can be biased towards the average behavior across all groups. We can see this in the too small standard error of the intercept, underestimating the variance in our dataset. At the same time if there is substantial variability in the relationships between groups, the pooled estimates can be less precise. This increased variability across groups can contribute to larger standard errors of the difference (SED) for fixed effects in pooled models.

  • Fixed effect: This model is extremely precise with the smallest errors, however these estimates reflect conditions only for the first group in the model.

  • Mixed models: this model reflects the greater uncertainty of the Mean and SE of the intercept. However, the SED in a partial pooling model accounts for both the variability within groups and the uncertainty between groups. This adjusted SED provides a more accurate measure of the uncertainty associated with the estimated differences between groups or fixed effects.

28.6 Predictions

It’s not always easy/straightforward to make prediciton or confidence intervals from complex general and generalized linear mixed models, luckily there are some excellent R packages that will do this for us.

28.6.1 ggpredict

The argument type = random in the ggpredict function (from the ggeffects package Lüdecke (2025a)) is used to specify the type of predictions to be generated in the context of mixed effects models. The main difference between using ggpredict with and without type = random lies in the type of predictions produced:

type = fixed: ggpredict will generate fixed-effects predictions. These estimates are based solely on the fixed effects of the model, ignoring any variability associated with the random effects. The resulting estimate represent the average or expected values of the response variable for specific combinations of predictor values, considering only the fixed components of the model.

library(ggeffects)

ggpredict(mixed_model, 
          terms = "benzo_um", 
          type = "fixed") |>  
plot(x = _, show_data = TRUE)

With type = random: ggpredict will generate predictions that incorporate both fixed and random effects. These predictions take into account the variability introduced by the random effects in the model. The resulting predictions reflect not only the average trend captured by the fixed effects but also the additional variability associated with the random effects at different levels of the grouping factor(s).

ggpredict(mixed_model, 
          terms = c("benzo_um", "group"), 
          type = "random") |>  
plot(x = _, show_data = TRUE) + 
  facet_wrap(~group)

28.6.2 sjPlot

Another way to visualise mixed model results is with the package sjPlot(Lüdecke (2025b)), and if you are interested in showing the variation among levels of your random effects, you can plot the departure from the overall model estimate for intercepts - and slopes, if you have a random slope model:

library(sjPlot)
library(patchwork)

plot_model(mixed_model, terms = c("benzo_um", "group"), type = "re")/
(plot_model(mixed_model, terms = c("benzo_um", "group"), type = "est")+ggtitle("Fixed Effects"))

Warning

The values you see are NOT actual values, but rather the difference between the general intercept or slope value found in your model summary and the estimate for this specific level of random/fixed effect.

sjPlot is also capable of producing prediction plots in the same way as ggeffects:

plot_model(mixed_model,type="pred",
           terms=c("benzo_um", "group"),
           pred.type="re",
           show.data = T)+
  facet_wrap( ~ group)

28.6.3 emmeans

Emmeans will only produce predictions based on fixed effects:

emmeans::emmeans(mixed_model, 
                 specs = "benzo_um", 
                 at = list("benzo_um" = seq(0,10,2)))
 benzo_um emmean   SE   df lower.CL upper.CL
        0   23.3 6.48 4.14     5.51     41.0
        2   27.3 6.45 4.05     9.51     45.1
        4   31.4 6.43 4.01    13.54     49.2
        6   35.4 6.43 4.01    17.59     53.3
        8   39.5 6.45 4.05    21.68     57.3
       10   43.5 6.48 4.14    25.78     61.3

Degrees-of-freedom method: kenward-roger 
Confidence level used: 0.95 

In mixed models, df are not straightforward because of:

  • The hierarchical structure (e.g., subjects nested in groups),

  • Random effects, which reduce the effective number of independent observations.

Method What it does Used by
Satterthwaite Estimates df based on uncertainty in SE lmerTest, emmeans()
Kenward-Roger More advanced, adjusts SE and df emmeans()
Asymptotic (∞) Assumes large sample, uses normal distribution lme4, ggpredict()

When df is assumed to be essentially infinite (∞) - we default to the z-distribution, which will produce the narrowest confidence intervals.

28.6.4 Interval types

28.6.4.1 Confidence Intervals:

A confidence interval is used to estimate the range of plausible values for a population parameter, such as the mean or the regression coefficient, based on a sample from that population. It provides a range within which the true population parameter is likely to fall with a certain level of confidence. For example, a 95% confidence interval implies that if we were to repeat the sampling process many times, 95% of the resulting intervals would contain the true population mean.

ggpredict(mixed_model, 
          terms = "benzo_um", 
          interval = "confidence")
x predicted std.error conf.low conf.high group
0 23.26919 6.481821 10.52885 36.00952 1
2 27.32332 6.446136 14.65313 39.99351 1
4 31.37745 6.428332 18.74225 44.01265 1
6 35.43158 6.428560 22.79594 48.06722 1
8 39.48571 6.446816 26.81418 52.15724 1
10 43.53984 6.482949 30.79729 56.28239 1

28.6.4.2 Prediction Intervals:

On the other hand, a prediction interval is used to estimate the range of plausible values for an individual observation or a future observation from the population. It takes into account both the uncertainty in the estimated model parameters and the inherent variability within the data. A 95% prediction interval provides a range within which a specific observed or predicted value is likely to fall with a certain level of confidence. The wider the prediction interval, the greater the uncertainty around the specific value being predicted.

ggpredict(mixed_model, 
          terms = "benzo_um", 
          interval = "prediction")
x predicted std.error conf.low conf.high group
0 23.26919 11.95906 -0.2369238 46.77530 1
2 27.32332 11.93976 3.8551499 50.79149 1
4 31.37745 11.93015 7.9281549 54.82674 1
6 35.43158 11.93028 11.9820450 58.88111 1
8 39.48571 11.94012 16.0168209 62.95460 1
10 43.53984 11.95967 20.0325298 67.04715 1

28.7 Checking model assumptions

28.7.1 performance

The check_model() function from the performance package in R is a useful tool for evaluating the performance and assumptions of a statistical model, particularly in the context of mixed models. It provides a comprehensive set of diagnostics and visualizations to assess the model’s fit, identify potential issues, and verify the assumptions underlying the model, which can be difficult with complex models

library(performance)
check_model(mixed_model,
            detrend = FALSE)

Note this package requires Hartig (2024) to work with simulated residuals for mixed model checking

28.8 Reporting Mixed Model Results

Once you get your model, you have to present it in an accurate, clear and attractive form.

The summary() function provides us with some useful numbers such as the amount of variance left over after fitting our fixed effects that can be assigned to each of our random effects. It also provides information on how correlated our random effects are, which may be of interest in understanding the structure of our data as well.

We can use it to check that the modelled random effect structure matches our data.

It includes the coefficient estimates, t-statistics and p-values of our fixed effects

summary(mixed_model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: detox_exp ~ benzo_um + (1 | group)
   Data: benzo_data

REML criterion at convergence: 3224.4

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.61968 -0.63654 -0.03584  0.66113  3.13597 

Random effects:
 Groups   Name        Variance Std.Dev.
 group    (Intercept) 205      14.32   
 Residual             101      10.05   
Number of obs: 430, groups:  group, 5

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)  23.2692     6.4818   4.1570    3.59   0.0215 *  
benzo_um      2.0271     0.1703 424.0815   11.90   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
         (Intr)
benzo_um -0.131

We may also wish to report \(R^2\) values from our model fit. This requires the MuMIn package.

library(MuMIn)
r.squaredGLMM(mixed_model)
            R2m       R2c
[1,] 0.09949133 0.7027654

This calculates two values the first \(R^2_{m}\) is the marginal \(R^2\) value, representing the proportion of variance explained by our fixed effects. The second \(R^2_{c}\) is the conditional \(R^2\), which is the proportion of variance explained by the full model, with both fixed and random effects.

28.9 Tables

There are a few packages for producing beautiful summary tables from regression models, but not all of them can handle mixed-effects models, the sjPlot Lüdecke (2025b) package is one of the most robust and produces a simple HTML table detailing fixed and random effects, produces 95% confidence intervals for fixed effects and calculates those \(R^2\) values for you:

sjPlot::tab_model(mixed_model, 
                  df.method = "satterthwaite")
  detox exp
Predictors Estimates CI p
(Intercept) 23.27 5.54 – 41.00 0.022
benzo um 2.03 1.69 – 2.36 <0.001
Random Effects
σ2 101.01
τ00group 205.00
ICC 0.67
N group 5
Observations 430
Marginal R2 / Conditional R2 0.099 / 0.703

28.10 Figures

We have covered some of the packages that allow us to easily produce marginal and conditional fits, along with 95% confidence or prediction intervals: these will output ggplot2 figures, allowing plenty of customisation for presentation

# Custom colors

ggpredict(mixed_model, terms = c("benzo_um", "group" ), 
          type = "random") |>  
plot(x = _, show_data = TRUE) + 
  scale_colour_discrete(palette = "Dark2")+
  scale_fill_discrete(palette = "Dark2")+
  facet_wrap(~group)+theme(legend.position = "none")

28.11 Write-ups

We fitted a linear mixed model (estimated using REML) to estimate the effect of “benzo_um” on “detox_exp”. The model included a random intercept design to allow for the repeated experiment design. The model’s total explanatory power is substantial (conditional R^2 = 0.7), and the part related to the fixed effects alone (marginal R2) is 0.099. The effect of x on y was a steady increase of 2.03 per unit of x ([1.69 - 2.36], t(424.1) = 11.9, p < 0.001). These 95% Confidence Intervals (CIs) and p-values were computed using a satterthwaite approximation of the degrees of freedom and a t-distribution approximation.

28.12 Practice Questions

  1. In mixed-effects models, independent variables are also called:

  1. A random effect is best described as?
  1. Which of the following is not an advantage of mixed-effects models?
  1. Mixed-effects models cope well with missing data because?

28.13 Worked Example 1 - Dolphins

28.13.1 How do I decide if it is a fixed or random effect?

One of the most common questions in mixed-effects modelling is how to decide if an effect should be considered as fixed or random. This can be quite a complicated question, as we have touched upon briefly before the definition of a random effect is not universal. It is a considered process that can include the hypothesis or question you are asking.

1 ) Are you directly interested in the effect in question. If the answer is yes it should be a fixed effect.

2 ) Is the variable continuous? If the answer is yes it should be a fixed effect.

3 ) Does the variable have less than five levels? If ther answer is yes it should be a fixed effect.

28.14 Dolphins

This dataset was collected to measure resting lung function in 32 bottlenose dolphins. The main dependent variable was the tidal volume (\(V_T\)) measured in litres, as an index of lung capacity.

dolphins <- read_csv("files/dolphins.csv") 

dolphins$direction <- factor(dolphins$direction)

dolphins <- drop_na(dolphins)

Here we are interested in the relationship between (\(V_T\)) and body mass (kg), we have measurements which are taken on the breath in and the breath out, and each dolphin has been observed between one and four times.

We need to determine our fixed effects, random effects and model structure:

  1. Body Mass

  2. Direction

  3. Animal 1) Body Mass

  4. With the basic structure y ~ x + z + (1|group) what do you think this model should be?:

  1. We are clearly interested in the effect of body mass on (\(V_T\)) so this is a fixed effect.

  2. We may think that the relationship with (\(V_T\)) and body mass may be different on the in and out breath. We may not be directly interested in this, but it has fewer than fivel levels so this is a fixed effect. (outbreath coded as 1, inbreath coded as 2).

  3. Individual dolphins - if we averaged across measurements for each dolphin, our measurement precision would be different for each animal. If we include each data point, we would be double-counting some animals and our observations would not be independent. To account for the multiple observations we should treat animal as a random effect.

dolphmod <- lmer(vt ~ bodymass + direction + (1|animal), data=dolphins)

With our basic linear model in place - we should carry out model fit checks with DHARMa or performance::check_model(), but assuming this is a good fit we can look at interpretation:

summary(dolphmod)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: vt ~ bodymass + direction + (1 | animal)
   Data: dolphins

REML criterion at convergence: 387.4

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.30795 -0.51983  0.04156  0.62404  2.26396 

Random effects:
 Groups   Name        Variance Std.Dev.
 animal   (Intercept) 1.039    1.019   
 Residual             1.158    1.076   
Number of obs: 112, groups:  animal, 31

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)  2.226398   0.627115 28.758081   3.550  0.00135 ** 
bodymass     0.016782   0.003259 26.720390   5.150  2.1e-05 ***
direction2   1.114821   0.203389 77.414421   5.481  5.1e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
           (Intr) bdymss
bodymass   -0.927       
direction2 -0.162  0.000
dolphins.1 <- dolphins |>  
    mutate(fit.m = predict(dolphmod, re.form = NA),
           fit.c = predict(dolphmod, re.form = NULL))
dolphins.1 |> 
  ggplot(aes(x = bodymass, y = vt, group = direction)) +
  geom_point(pch = 16, aes(colour = direction)) +
  geom_line(aes(y = fit.m, 
                linetype = direction), 
            linewidth = 1)  +
  labs(x = "Body Mass", 
       y = "VT") 

Scatter plot of VT as a function of body mass for dolphins. Different directions of breath are represented by different colors. The solid lines indicate the marginal fitted values from our model.
plot_model(dolphmod,type="pred",
           terms=c("bodymass", "direction"),
           pred.type="fe",
           show.data = T)

We fitted a linear mixed model (estimated using REML and nloptwrap optimizer) to predict (\(V_T\)) with bodymass(kg) and direction (in/out breath). 95% Confidence Intervals (CIs) and p-values were computed using a Wald t-distribution approximation. We included a random intercept effect of animal to account for repeated measurements (of between 1 to 4 observations) across a total of 32 bottlenosed dolphins.

We found that for every 1kg increase in bodymass, (\(V_T\)) increased by 0.02 litres (95% CI [0.01 - 0.02]), . The inbreath had on average a higher volume than the outbreath (1.11 litre difference [0.71 - 1.52]..

Q. This is not a perfect write-up, what else could we consider including?

28.15 Multiple random effects

Previously we used (1|group) to fit our random effect. Whatever is on the right side of the | operator is a factor and referred to as a “grouping factor” for the term.

28.15.1 Crossed or Nested

When we have more than one possible random effect these can be crossed or nested - it depends on the relationship between the variables. Let’s have a look.

A common issue that causes confusion is this issue of specifying random effects as either ‘crossed’ or ‘nested’. In reality, the way you specify your random effects will be determined by your experimental or sampling design. A simple example can illustrate the difference.

28.15.1.1 1. Crossed Random Effects:

Crossed random effects occur when the levels of two or more grouping variables are crossed or independent of each other. In this case, the grouping variables are unrelated, and each combination of levels is represented in the data.

Fully Crossed

Example: Imagine there is a long-term study on breeding success in passerine birds across multiple woodlands, and the researcher returns every year for five years to carry out measurements. Here “Year” is a crossed random effect with “Woodland” as each Woodland can appear in multiple years of study. Imagine a researcher was interested in understanding the factors affecting the clutch mass of a passerine bird. They have a study population spread across five separate woodlands, each containing 30 nest boxes. Every week during breeding they measure the foraging rate of females at feeders, and measure their subsequent clutch mass. Some females have multiple clutches in a season and contribute multiple data points.

lmer(y ~ x + (1 | Year) + (1 | Woodland), data = dataset)

28.15.1.2 2. Nested Random Effects:

Nested random effects occur when the levels of one grouping variable are completely nested within the levels of another grouping variable. In other words, the levels of one variable are uniquely and exclusively associated with specific levels of another variable.

Fully Nested

Example: In the same bird study female ID is said to be nested within woodland : each woodland contains multiple females unique to that woodland (that never move among woodlands). The nested random effect controls for the fact that (i) clutches from the same female are not independent, and (ii) females from the same woodland may have clutch masses more similar to one another than to females from other woodlands

lmer(y ~ x + (1 | Woodland/Female ID), data = dataset)

or if we remember year we have a model with both crossed and nested random effects

lmer(y ~ x + (1 | Woodland/Female ID) + (1|Year), data = dataset)

For more on designing models around crossed and nested designs check out this amazing nature paper

28.16 Worked Example 2 - Nested design

This experiment involved a simple one-way anova with 3 treatments given to 6 rats, the researchers the measured glycogen levels in their liver. The analysis was complicated by the fact that three preparations were taken from the liver of each rat, and two readings of glycogen content were taken from each preparation. This generated 6 pseudoreplicates per rat to give a total of 36 readings in all.

Clearly, it would be a mistake to analyse these data as if they were a straightforward one-way anova, because that would give us 33 degrees of freedom for error. In fact, since there are only two rats in each treatment, we have only one degree of freedom per treatment, giving a total of 3 d.f. for error.

The variance is likely to be different at each level of this nested analysis because:

the readings differ because of variation in the glycogen detection method within each liver sample (measurement error); the pieces of liver may differ because of heterogeneity in the distribution of glycogen within the liver of a single rat; the rats will differ from one another in their glycogen levels because of sex, age, size, genotype, etc.; rats allocated different experimental treatments may differ as a result of the fixed effects of treatment. If all we want to test is whether the experimental treatments have affected the glycogen levels, then we are not interested in liver bits within rat’s livers, or in preparations within liver bits. We could combine all the pseudoreplicates together, and analyse the 6 averages. This would have the virtue of showing what a tiny experiment this really was. This latter approach also ignores the nested sources of uncertainties. Instead we will use a linear mixed model.

rats <- readRDS("rats.rds")
rats |>  
  aggregate(Glycogen ~ Rat + Treatment + Liver, data = _, mean)
   Glycogen Rat Treatment Liver
1       131   1         1     1
2       130   1         1     1
3       131   1         1     2
4       125   1         1     2
5       136   1         1     3
6       142   1         1     3
7       150   2         1     1
8       148   2         1     1
9       140   2         1     2
10      143   2         1     2
11      160   2         1     3
12      150   2         1     3
13      157   3         2     1
14      145   3         2     1
15      154   3         2     2
16      142   3         2     2
17      147   3         2     3
18      153   3         2     3
19      151   4         2     1
20      155   4         2     1
21      147   4         2     2
22      147   4         2     2
23      162   4         2     3
24      152   4         2     3
25      134   5         3     1
26      125   5         3     1
27      138   5         3     2
28      138   5         3     2
29      135   5         3     3
30      136   5         3     3
31      138   6         3     1
32      140   6         3     1
33      139   6         3     2
34      138   6         3     2
35      134   6         3     3
36      127   6         3     3
rats_lmer.1 <- lmer(Glycogen ~ Treatment + (1 | Rat) + (1 | Liver), data = rats)
summary(rats_lmer.1)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Glycogen ~ Treatment + (1 | Rat) + (1 | Liver)
   Data: rats

REML criterion at convergence: 221.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.79334 -0.66536  0.01792  0.59281  2.05206 

Random effects:
 Groups   Name        Variance Std.Dev.
 Rat      (Intercept) 39.187   6.260   
 Liver    (Intercept)  2.168   1.472   
 Residual             30.766   5.547   
Number of obs: 36, groups:  Rat, 6; Liver, 3

Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)    
(Intercept)  140.500      4.783   3.174  29.373 5.65e-05 ***
Treatment2    10.500      6.657   3.000   1.577    0.213    
Treatment3    -5.333      6.657   3.000  -0.801    0.482    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
           (Intr) Trtmn2
Treatment2 -0.696       
Treatment3 -0.696  0.500

Q. Why is this model above wrong

It is wrong because this would add two random effect terms, one for rat 1 and one for rat 2. In fact there are 6 rats altogether. The way that the data have been coded allows for these kinds of mistakes to happen. The same is true for Liver, which is coded as a 1, 2 or 3. This means that we could write this thinking that we are including the correct random effects for Rat and Liver. In fact, this assumes that the data come from a crossed design, in which there are 2 rats and 3 parts of the liver, and that Liver = 1 corresponds to the same type of measurement in rats 1 and 2 and so on. Sometimes this is appropriate, but not here!

The nature of the way that many data sets are coded makes these kinds of mistakes very easy to make!

A better design is the one below:

rats_lmer.2 <- lmer(Glycogen ~ Treatment + (1 | Rat / Liver), data = rats)
summary(rats_lmer.2)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Glycogen ~ Treatment + (1 | Rat/Liver)
   Data: rats

REML criterion at convergence: 219.6

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-1.48212 -0.47263  0.03062  0.42934  1.82935 

Random effects:
 Groups    Name        Variance Std.Dev.
 Liver:Rat (Intercept) 14.17    3.764   
 Rat       (Intercept) 36.06    6.005   
 Residual              21.17    4.601   
Number of obs: 36, groups:  Liver:Rat, 18; Rat, 6

Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)    
(Intercept)  140.500      4.707   3.000  29.848 8.26e-05 ***
Treatment2    10.500      6.657   3.000   1.577    0.213    
Treatment3    -5.333      6.657   3.000  -0.801    0.482    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
           (Intr) Trtmn2
Treatment2 -0.707       
Treatment3 -0.707  0.500

And we can see the effects in the figures below:

plot(ggpredict(rats_lmer.2, terms = c("Treatment")))

plot(ggpredict(rats_lmer.2, 
               terms = c("Treatment", "Rat"), 
               type = "random"), 
     show_data = TRUE)

28.17 Worked Example 3 - GLMM Binomial Mixed models

Fit a Binomial GLM

  • Read in this csv and assign it to an R object egg_hatch_data

  • Fit a Binomial model to analyse the hatch rate by treatment for the egg_hatch_data

egg_hatch_data <- egg_hatch_data |> 
  mutate(unhatched_eggs = (number_of_eggs-number_of_larvae))

binomial_model <- glm(cbind(number_of_larvae,unhatched_eggs) ~ cross, family = binomial, data = egg_hatch_data) 
  • Look at the coefficients and run a model check on the data

    - Are there any issues with the model?
summary(binomial_model)

Call:
glm(formula = cbind(number_of_larvae, unhatched_eggs) ~ cross, 
    family = binomial, data = egg_hatch_data)

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  1.05143    0.04658  22.573  < 2e-16 ***
crossB       0.38815    0.09124   4.254  2.1e-05 ***
crossC       1.29224    0.11812  10.940  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1307.6  on 54  degrees of freedom
Residual deviance: 1157.3  on 52  degrees of freedom
AIC: 1363.6

Number of Fisher Scoring iterations: 4
performance::check_model(binomial_model)

There are several issues with outliers, the distribution of quantiles, under-prediction of high values.

If we run a performance::check_overdispersion() we have high overdispersion

Fit a Binomial GLMM

Mixed models can be very helpful for dealing with apparent overdispersion in fixed dispersion models.

If variance can be accounted for by random intercepts that align with the structure in our data.

  • Which variables could be used to build random intercepts?

  • Do we need crossed or nested effects?

  • Use the glmer() to fit a binomial model with random effects

glmer(cbind(win,loss) ~ x + (1|random), family = binomial)

binomial_mixed_rep <- glmer(cbind(number_of_larvae,unhatched_eggs) ~ cross +
                              (1|replicate), 
                            family = binomial, 
                            data = egg_hatch_data) 

binomial_mixed_rep_fem <- glmer(cbind(number_of_larvae,unhatched_eggs) ~ cross +
                              (1|replicate/female_num), 
                            family = binomial, 
                            data = egg_hatch_data) 

AIC(binomial_model, binomial_mixed_rep, binomial_mixed_rep_fem)
df AIC
binomial_model 3 1363.551
binomial_mixed_rep 4 1263.221
binomial_mixed_rep_fem 5 644.059
performance::check_model(binomial_mixed_rep)

performance::check_model(binomial_mixed_rep_fem)

In this example - our main source of variance was between different females - allowing for a random intercept here stabilised dispersion to within acceptable tolerances for a binomial model.

Variance between replicate experiments is actually extremely low - but female ids are nested within replicate experiments so this must be reflected in our random intercept structure.

Tip: Check the number of obs, and groups in the random effect summary - does it match your experimental design?

28.18 Summary

28.18.1 Checklist for Pooling vs. Fixed vs. Random Effects

  1. Are the groups or replicates intentionally different (experimental manipulation)?

    • Yes → Fixed effect.
      You want explicit estimates for each level (treatment A vs B, genotype 1 vs 2).

    • No → Go to 2.

  2. Are the groups/replicates randomly sampled from a broader population you want to generalize over?

    • Yes → Random effect.
      Each level (e.g., plate, batch, cage) is a random draw; you want to estimate variance among levels, not each level’s mean.
      Efficient when many levels; saves degrees of freedom.

    • No → Go to 3.

  3. Are the groups technically identical and under identical conditions?

    • Yes → Pool results.
      If variation among replicates is trivial or purely measurement error, averaging is acceptable.
      Always verify that between-replicate variance is negligible before pooling.
    • No → Go to 4.
  4. Do group differences represent a nuisance (not of direct interest but possibly confounding)?

    • Yes → Random effect (if ≥ 5 levels).
      If only 2–3 levels → safer to model as Fixed, or combine with other factors.

    • No → Go to 5.

  5. Are there only a few levels (≤ 3)?

    • Yes → Fixed effect.
      Variance estimation is unreliable with so few groups.

    • No → Random effect (if sampled at random from a larger population).

  6. Are you uncertain about which specification is appropriate?

    • Fit both fixed and random models: compare AIC/BIC and model checks

28.18.1.1 Final Check

Situation Reasoning Recommended Handling
Designed treatment differences Specific levels of interest Fixed effect
Randomly sampled groups (many levels) Interested in variance, not per-level means Random effect
Identical replicates, same conditions No real grouping variance Pool (average)
Few nuisance groups (≤ 3) Random variance unidentifiable Fixed effect
Unknown influence Test both structures Model comparison

28.18.2 Practical problems

Below are two common issues or warnings you will likely encounter when fitting linear models:

  • boundary (singular) fit: see help('isSingular'): Your model did fit, but it generated that warning because some of your random effects are very small, common with complex mixed-effect models. You can read more about this in help page

  • Convergence warnings: Values for the mixed-effects models are determined using optimisation algorithms. Sometimes these algorithms fail to converge on a best parameter estimate, which will produce an error. There are several possible solutions:

    • Normalise and rescale: Rescaling the variables can mitigate issues caused by the differences in scales or magnitudes of the predictors, which can affect the optimization process. You can rescale the fixed effects predictors by subtracting the mean and dividing by the standard deviation. This centers the variables around zero and scales them to have a standard deviation of 1. This can be done using the scale() function in R.
    • Try alternative optimisation algorithms:

    lmer3 <- lmer(y ~ x + (x | group), data = data, control = lmerControl(optimizer ="Nelder_Mead"))

    • Finally, although it pains me to admit it, you should try running the model using a different package - each has their own unique and slightly different optimisation protocols.

28.19 Further Reading

Some essential reading to continue your mixed-model journey!