library(dplyr)
library(lme4); library(lmerTest); library(broom.mixed)
library(emmeans); library(performance)
7 Split-Split Plot Design
The split-split-plot design is an extension of the split-plot design to accommodate a third factor: one factor in main-plot, other in subplot and the third factor in sub-subplot
7.1 Details for split-split plot designs
The statistical model structure this design:
\[y_{ijk} = \mu + \rho_j + \alpha_i + \beta_k + (\alpha_i\beta_k) + \tau_n + (\alpha_i\tau_n) + (\tau_n\beta_k) + (\alpha_i\beta_k\tau_n) + \epsilon_{ijk} + \delta_{ijkn}\] Where:
\(\mu\)= overall experimental mean, \(\alpha\) = main effect of whole plot (fixed), \(\beta\) = main effect of subplot (fixed), \(\tau\) = main effect of sub-subplot, \(\epsilon_{ij}\) = whole plot error, \(\delta_{ijk}\) = subplot error.
\[ \epsilon \sim N(0, \sigma_\epsilon)\]
\[\ \delta \sim N(0, \sigma_\delta)\]
The assumptions of the model includes normal distribution of both the error and the rep effects with a mean of zero and standard deviations of \(\sigma_\epsilon\) and \(\sigma_\delta\), respectively.
7.2 Example Analysis
library(dplyr)
library(nlme); library(emmeans)
library(broom.mixed); library(performance)
In this example, we have a rice yield data from ‘agricolae’ package. This consists of of 3 different rice varieties grown under 3 management practices and 5 Nitrogen levels in the split-split design. Here, we are using rice yield data from the (agricolae) package.
<- read.csv(here::here("data", "rice_ssp.csv")) rice
block | blocking unit |
nitrogen | different nitrogen fertilizer rates as main plot with 5 levels |
management | management practices as subplot with 3 levels |
variety | crop variety being a sub-subplot with 3 levels |
yield | yield (bushels per acre) |
7.2.1 Data integrity checks
Look at the structure of the data, class of block, nitrogen, management and variety should be a character/factor and yield should be numeric.
str(rice)
'data.frame': 135 obs. of 6 variables:
$ X : int 1 2 3 4 5 6 7 8 9 10 ...
$ block : int 1 1 1 1 1 1 1 1 1 1 ...
$ nitrogen : int 0 0 0 50 50 50 80 80 80 110 ...
$ management: chr "m1" "m2" "m3" "m1" ...
$ variety : int 1 1 1 1 1 1 1 1 1 1 ...
$ yield : num 3.32 3.77 4.66 3.19 3.62 ...
Convert block, nitrogen, variety, and management to characters.
$block <- as.character(rice$block)
rice$nitrogen <- as.character(rice$nitrogen)
rice$management <- as.character(rice$management)
rice$variety <- as.character(rice$variety) rice
Next, run a cross tabulations to check balance of observations across independent variables:
table(rice$variety, rice$nitrogen, rice$management)
, , = m1
0 110 140 50 80
1 3 3 3 3 3
2 3 3 3 3 3
3 3 3 3 3 3
, , = m2
0 110 140 50 80
1 3 3 3 3 3
2 3 3 3 3 3
3 3 3 3 3 3
, , = m3
0 110 140 50 80
1 3 3 3 3 3
2 3 3 3 3 3
3 3 3 3 3 3
It looks perfectly balanced, with exactly 3 observation per treatment group.
Last, check the distribution of the dependent variable by plotting a histogram using hist()
.
hist(rice$yield)
7.2.2 Model Building
The variance analysis of a split-split plot design is divided into three parts: the main-plot, subplot and sub-subplot analysis. We can use the nesting notation in the random part because nitrogen and management are nested in blocks. We can do blocks as fixed or random.
<- lmer(yield ~ nitrogen * management * variety +
model_lmer 1 | block / nitrogen / management),
(data = rice,
na.action = na.exclude)
boundary (singular) fit: see help('isSingular')
tidy(model_lmer)
# A tibble: 49 × 8
effect group term estimate std.error statistic df p.value
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 fixed <NA> (Intercept) 3.90 0.386 10.1 89.7 1.79e-16
2 fixed <NA> nitrogen110 0.753 0.545 1.38 89.7 1.71e- 1
3 fixed <NA> nitrogen140 0.165 0.545 0.302 89.7 7.63e- 1
4 fixed <NA> nitrogen50 0.335 0.545 0.614 89.7 5.41e- 1
5 fixed <NA> nitrogen80 1.33 0.545 2.44 89.7 1.68e- 2
6 fixed <NA> managementm2 0.420 0.540 0.779 80.0 4.38e- 1
7 fixed <NA> managementm3 1.43 0.540 2.65 80.0 9.82e- 3
8 fixed <NA> variety2 1.45 0.540 2.68 80.0 8.83e- 3
9 fixed <NA> variety3 1.48 0.540 2.74 80.0 7.49e- 3
10 fixed <NA> nitrogen110:managem… 0.377 0.763 0.493 80.0 6.23e- 1
# ℹ 39 more rows
<- lme(yield ~ nitrogen*management*variety,
model_lme random = ~ 1|block/nitrogen/management,
data = rice,
na.action = na.exclude)
tidy(model_lme)
Warning in tidy.lme(model_lme): ran_pars not yet implemented for multiple
levels of nesting
# A tibble: 45 × 7
effect term estimate std.error df statistic p.value
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 fixed (Intercept) 3.90 0.386 60 10.1 1.43e-14
2 fixed nitrogen110 0.753 0.545 8 1.38 2.05e- 1
3 fixed nitrogen140 0.165 0.545 8 0.302 7.70e- 1
4 fixed nitrogen50 0.335 0.545 8 0.614 5.56e- 1
5 fixed nitrogen80 1.33 0.545 8 2.44 4.08e- 2
6 fixed managementm2 0.420 0.540 20 0.779 4.45e- 1
7 fixed managementm3 1.43 0.540 20 2.65 1.55e- 2
8 fixed variety2 1.45 0.540 60 2.68 9.38e- 3
9 fixed variety3 1.48 0.540 60 2.74 7.99e- 3
10 fixed nitrogen110:managementm2 0.377 0.763 20 0.493 6.27e- 1
# ℹ 35 more rows
boundary (singular) fit: We get a message that the fit is singular. What does this mean? Some components of the variance-covariance matrix of the random effects are either exactly zero or exactly one. OK what about in English? Basically it means that the algorithm that fits the model parameters doesn’t have enough data to get a good estimate. This often happens when we are trying to fit a model that is too complex for the amount of data we have, or when the random effects are very small and can’t be distinguished from zero. We still get some output but this message should make us take a close look at the random effects and their variances.
7.2.3 Check Model Assumptions
Model Diagnostics: we are looking for a constant variance and normality of residuals. Checking normality requiring first extracting the model residuals and then generating a qq-plot and qq-line. we can do all at one using one function check_model()
.
check_model(model_lmer, check = c('normality', 'linearity'))
check_model(model_lme, check = c('normality', 'linearity'))
Here, we didn’t observe any anomaly in model assumptions.
7.2.4 Inference
Analysis of variance
::Anova(model_lmer, type = 'III', test.statistic="F") car
Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
Response: yield
F Df Df.res Pr(>F)
(Intercept) 102.1211 1 89.706 < 2e-16 ***
nitrogen 1.9160 4 86.474 0.11496
management 3.6962 2 77.143 0.02932 *
variety 4.9129 2 60.000 0.01057 *
nitrogen:management 0.2118 8 77.143 0.98797
nitrogen:variety 2.6681 8 60.000 0.01413 *
management:variety 2.2193 4 60.000 0.07754 .
nitrogen:management:variety 0.5289 16 60.000 0.92105
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(model_lme, type = "marginal")
numDF denDF F-value p-value
(Intercept) 1 60 102.12108 <.0001
nitrogen 4 8 1.91603 0.2012
management 2 20 3.69617 0.0431
variety 2 60 4.91295 0.0106
nitrogen:management 8 20 0.21177 0.9850
nitrogen:variety 8 60 2.66810 0.0141
management:variety 4 60 2.21929 0.0775
nitrogen:management:variety 16 60 0.52893 0.9210
We can estimated the marginal means for each treatment factor (variety, nitrogen, management) which will averaged across other factors. and their interaction.
emmeans(model_lmer, ~ nitrogen)
NOTE: Results may be misleading due to involvement in interactions
nitrogen emmean SE df lower.CL upper.CL
0 5.38 0.139 10 5.08 5.69
110 6.94 0.139 10 6.63 7.25
140 7.23 0.139 10 6.93 7.54
50 6.22 0.139 10 5.91 6.53
80 7.00 0.139 10 6.69 7.30
Results are averaged over the levels of: management, variety
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
emmeans(model_lmer, ~ nitrogen*variety|management)
management = m1:
nitrogen variety emmean SE df lower.CL upper.CL
0 1 3.90 0.386 89.7 3.13 4.66
110 1 4.65 0.386 89.7 3.88 5.42
140 1 4.06 0.386 89.7 3.30 4.83
50 1 4.23 0.386 89.7 3.47 5.00
80 1 5.23 0.386 89.7 4.46 5.99
0 2 5.35 0.386 89.7 4.58 6.11
110 2 6.25 0.386 89.7 5.49 7.02
140 2 6.78 0.386 89.7 6.01 7.54
50 2 5.92 0.386 89.7 5.16 6.69
80 2 5.98 0.386 89.7 5.21 6.75
0 3 5.38 0.386 89.7 4.61 6.14
110 3 7.45 0.386 89.7 6.69 8.22
140 3 8.62 0.386 89.7 7.85 9.38
50 3 6.78 0.386 89.7 6.02 7.55
80 3 7.93 0.386 89.7 7.17 8.70
management = m2:
nitrogen variety emmean SE df lower.CL upper.CL
0 1 4.32 0.386 89.7 3.55 5.08
110 1 5.45 0.386 89.7 4.68 6.21
140 1 5.20 0.386 89.7 4.43 5.97
50 1 4.58 0.386 89.7 3.81 5.34
80 1 5.73 0.386 89.7 4.97 6.50
0 2 4.71 0.386 89.7 3.95 5.48
110 2 7.00 0.386 89.7 6.24 7.77
140 2 7.08 0.386 89.7 6.32 7.85
50 2 5.82 0.386 89.7 5.05 6.58
80 2 6.50 0.386 89.7 5.73 7.27
0 3 6.50 0.386 89.7 5.73 7.26
110 3 8.62 0.386 89.7 7.86 9.39
140 3 9.40 0.386 89.7 8.63 10.17
50 3 7.82 0.386 89.7 7.05 8.58
80 3 8.57 0.386 89.7 7.80 9.33
management = m3:
nitrogen variety emmean SE df lower.CL upper.CL
0 1 5.33 0.386 89.7 4.56 6.09
110 1 6.24 0.386 89.7 5.47 7.00
140 1 5.97 0.386 89.7 5.21 6.74
50 1 5.48 0.386 89.7 4.72 6.25
80 1 6.55 0.386 89.7 5.78 7.31
0 2 5.43 0.386 89.7 4.66 6.20
110 2 7.52 0.386 89.7 6.75 8.28
140 2 8.01 0.386 89.7 7.24 8.77
50 2 6.31 0.386 89.7 5.55 7.08
80 2 7.29 0.386 89.7 6.52 8.05
0 3 7.56 0.386 89.7 6.79 8.33
110 3 9.25 0.386 89.7 8.49 10.02
140 3 9.99 0.386 89.7 9.22 10.76
50 3 9.05 0.386 89.7 8.28 9.81
80 3 9.19 0.386 89.7 8.42 9.96
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
emmeans(model_lme, ~ nitrogen)
NOTE: Results may be misleading due to involvement in interactions
nitrogen emmean SE df lower.CL upper.CL
0 5.38 0.139 2 4.79 5.98
110 6.94 0.139 2 6.34 7.53
140 7.23 0.139 2 6.64 7.83
50 6.22 0.139 2 5.62 6.82
80 7.00 0.139 2 6.40 7.59
Results are averaged over the levels of: management, variety
Degrees-of-freedom method: containment
Confidence level used: 0.95
emmeans(model_lme, ~ nitrogen*variety|management)
management = m1:
nitrogen variety emmean SE df lower.CL upper.CL
0 1 3.90 0.386 2 2.24 5.56
110 1 4.65 0.386 2 2.99 6.31
140 1 4.06 0.386 2 2.40 5.72
50 1 4.23 0.386 2 2.57 5.89
80 1 5.23 0.386 2 3.57 6.89
0 2 5.35 0.386 2 3.69 7.01
110 2 6.25 0.386 2 4.59 7.91
140 2 6.78 0.386 2 5.12 8.43
50 2 5.92 0.386 2 4.26 7.58
80 2 5.98 0.386 2 4.32 7.64
0 3 5.38 0.386 2 3.72 7.04
110 3 7.45 0.386 2 5.79 9.11
140 3 8.62 0.386 2 6.96 10.28
50 3 6.78 0.386 2 5.12 8.44
80 3 7.93 0.386 2 6.27 9.59
management = m2:
nitrogen variety emmean SE df lower.CL upper.CL
0 1 4.32 0.386 2 2.66 5.98
110 1 5.45 0.386 2 3.79 7.11
140 1 5.20 0.386 2 3.54 6.86
50 1 4.58 0.386 2 2.92 6.24
80 1 5.73 0.386 2 4.07 7.39
0 2 4.71 0.386 2 3.05 6.37
110 2 7.00 0.386 2 5.34 8.66
140 2 7.08 0.386 2 5.43 8.74
50 2 5.82 0.386 2 4.16 7.47
80 2 6.50 0.386 2 4.84 8.16
0 3 6.50 0.386 2 4.84 8.16
110 3 8.62 0.386 2 6.97 10.28
140 3 9.40 0.386 2 7.74 11.06
50 3 7.82 0.386 2 6.16 9.48
80 3 8.57 0.386 2 6.91 10.23
management = m3:
nitrogen variety emmean SE df lower.CL upper.CL
0 1 5.33 0.386 2 3.67 6.98
110 1 6.24 0.386 2 4.58 7.90
140 1 5.97 0.386 2 4.31 7.63
50 1 5.48 0.386 2 3.82 7.14
80 1 6.55 0.386 2 4.89 8.21
0 2 5.43 0.386 2 3.77 7.09
110 2 7.52 0.386 2 5.86 9.18
140 2 8.01 0.386 2 6.35 9.66
50 2 6.31 0.386 2 4.65 7.97
80 2 7.29 0.386 2 5.63 8.95
0 3 7.56 0.386 2 5.90 9.22
110 3 9.25 0.386 2 7.59 10.91
140 3 9.99 0.386 2 8.33 11.65
50 3 9.05 0.386 2 7.39 10.70
80 3 9.19 0.386 2 7.53 10.85
Degrees-of-freedom method: containment
Confidence level used: 0.95
Notice we get a message that the estimated means for ‘nitrogen’ are averaged over the levels of ‘management’ and ‘variety’. So we need to be careful about how we interpret these estimates.