library(lme4); library(lmerTest); library(emmeans)
library(dplyr); library(broom.mixed); library(performance)
9 Incomplete Block Design
9.1 Background
The block design in Chapter 4 was complete, meaning that every block contained all the treatments. In practice, it may not be possible to have too many treatments in each block. Sometimes, there are also situations where it is advised to not have many treatments in each block.
In such cases, incomplete block designs are used where we have to decide what subset of treatments to be used in an individual block. This will work well if we enough blocks. However, if we only have small number of blocks, there would be the risk that certain quantities are not estimable anymore.
Incomplete block designs are grouped into balanced lattice design and partially balanced (or alpha-lattice) designs.
To avoid having a disconnected design, a balanced incomplete block design can be used
The statistical model for balanced incomplete block design is:
\[y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}\] Where:
\(\mu\) = overall experimental mean \(\alpha\) = treatment effects (fixed) \(\beta\) = block effects (random) \(\epsilon\) = error terms
\[ \epsilon \sim N(0, \sigma)\]
\[ \beta \sim N(0, \sigma_b)\] There are few key points that we need to keep in mind while designing incomplete block designs:
A drawback of this design is that block effect and treatment effects are confounded.
To eliminate the block effects, better compare treatments within a block.
No treatment should appear twice in any block as they contributes nothing to within block comparisons.
An excellent description of incomplete block design is provided in ANOVA and Mixed Models by Lukas Meier.
The balanced incomplete block designs are guided by strict principles and guidelines including: the number of treatments must be a perfect square (e.g. 25, 36, and so on); number of replicates must be equal to no. of blocks +1;
Because the blocks are incomplete, the Type I and Type III sums of squares will be different. That is, the missing treatments in each block represent missing observations (but not missing ‘at random’).
9.2 Balanced Incomplete Block Design
9.2.1 Example Analysis
We will demonstrate an example data set designed in a balanced incomplete block design. First, load the libraries required for analysis and estimation.
library(nlme); library(broom.mixed); library(emmeans)
library(dplyr); library(performance)
The data used for this example analysis was extracted from the agridat
package. This example is comprised of soybean balanced incomplete block experiment.
<- agridat::weiss.incblock dat
block | blocking unit |
gen | genotype (variety) factor |
row | row position for each plot |
col | column position for each plot |
yield | grain yield in bu/ac |
9.2.1.1 Data integrity checks
We will start inspecting the data set firstly by looking at the class of variables:
str(dat)
'data.frame': 186 obs. of 5 variables:
$ block: Factor w/ 31 levels "B01","B02","B03",..: 1 2 3 4 5 6 7 8 9 10 ...
$ gen : Factor w/ 31 levels "G01","G02","G03",..: 24 15 20 18 20 5 22 1 9 14 ...
$ yield: num 29.8 24.2 30.5 20 35.2 25 23.6 23.6 29.3 25.5 ...
$ row : int 42 36 30 24 18 12 6 42 36 30 ...
$ col : int 1 1 1 1 1 1 1 2 2 2 ...
The variables we need for the model are block, gen and yield. The block and gen are classified as factor variables and yield is numeric. Therefore, we don’t need to change class of any of the required variables.
Next, let’s check the independent variables. We can look at this by running a cross tabulations among block and gen factors.
table(dat$block, dat$gen)
G01 G02 G03 G04 G05 G06 G07 G08 G09 G10 G11 G12 G13 G14 G15 G16 G17 G18
B01 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
B02 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0
B03 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0
B04 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1
B05 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
B06 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0
B07 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
B08 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0
B09 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0
B10 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 0 1 0
B11 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 1 0
B12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
B13 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
B14 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 1
B15 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0
B16 0 1 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0
B17 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0
B18 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1
B19 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
B20 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
B21 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0
B22 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0
B23 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0
B24 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 1 0 0
B25 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
B26 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
B27 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1
B28 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0
B29 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0
B30 1 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0
B31 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0
G19 G20 G21 G22 G23 G24 G25 G26 G27 G28 G29 G30 G31
B01 0 0 1 1 1 1 1 1 0 0 0 0 0
B02 0 0 0 0 0 0 0 1 0 0 0 0 0
B03 0 1 0 0 0 1 0 0 0 1 0 0 0
B04 0 0 1 0 0 0 0 0 0 0 1 0 0
B05 0 1 0 0 0 0 1 0 1 0 0 0 0
B06 1 0 0 1 0 0 0 0 0 0 1 0 0
B07 0 0 0 1 0 0 0 0 1 0 0 0 0
B08 0 0 0 0 0 0 0 1 0 0 0 0 0
B09 0 0 0 1 0 0 0 0 0 0 0 0 1
B10 0 0 0 0 0 0 1 0 0 0 1 0 0
B11 0 0 0 0 1 0 0 0 0 0 0 0 1
B12 1 1 0 0 0 0 0 1 0 0 0 0 0
B13 0 0 0 0 1 0 0 0 1 0 0 0 0
B14 0 0 0 0 0 1 0 0 0 0 0 0 1
B15 0 0 1 0 0 0 0 0 0 1 0 0 0
B16 0 0 0 0 0 1 0 0 0 0 1 0 0
B17 0 0 0 0 0 1 0 0 0 0 0 1 0
B18 0 0 0 1 0 0 0 0 0 1 0 0 0
B19 0 0 1 0 0 0 0 0 1 0 0 0 0
B20 1 0 0 0 1 0 0 0 0 1 0 0 0
B21 0 0 0 0 0 0 1 0 0 1 0 0 0
B22 0 0 0 0 0 0 0 1 0 0 0 0 0
B23 1 0 1 0 0 0 0 0 0 0 0 1 0
B24 0 0 0 0 1 0 0 0 0 0 0 1 0
B25 1 0 0 0 0 1 0 0 1 0 0 0 0
B26 0 0 0 0 0 0 0 1 1 1 1 1 1
B27 0 0 0 0 0 0 1 0 0 0 0 1 0
B28 0 1 1 0 0 0 0 0 0 0 0 0 1
B29 1 0 0 0 0 0 1 0 0 0 0 0 1
B30 0 1 0 0 1 0 0 0 0 0 1 0 0
B31 0 1 0 1 0 0 0 0 0 0 0 1 0
<- dat %>% group_by(gen) %>%
agg_tbl summarise(total_count=n(),
.groups = 'drop')
agg_tbl
# A tibble: 31 × 2
gen total_count
<fct> <int>
1 G01 6
2 G02 6
3 G03 6
4 G04 6
5 G05 6
6 G06 6
7 G07 6
8 G08 6
9 G09 6
10 G10 6
# ℹ 21 more rows
<- aggregate(dat$gen, by=list(dat$block), FUN=length)
agg_df agg_df
Group.1 x
1 B01 6
2 B02 6
3 B03 6
4 B04 6
5 B05 6
6 B06 6
7 B07 6
8 B08 6
9 B09 6
10 B10 6
11 B11 6
12 B12 6
13 B13 6
14 B14 6
15 B15 6
16 B16 6
17 B17 6
18 B18 6
19 B19 6
20 B20 6
21 B21 6
22 B22 6
23 B23 6
24 B24 6
25 B25 6
26 B26 6
27 B27 6
28 B28 6
29 B29 6
30 B30 6
31 B31 6
There are 31 varieties (gen) and it is perfectly balanced, with exactly one observation per treatment per block.
We can calculate the sum of missing values in variables in this data set to evaluate the extent of missing values in different variables
apply(dat, 2, function(x) sum(is.na(x)))
block gen yield row col
0 0 0 0 0
We observed no missing data!
Last, let’s plot a histogram of the dependent variable. This is a quick check before analysis to see if there is any strong deviation in values.
hist(dat$yield, main = "", xlab = "yield")
Response variable values fall within expected range, with few extreme values on right tail. This data set is ready for analysis!
9.2.1.2 Model Building
We will be evaluating the response of yield as affected by gen (fixed effect) and block (random effect).
Please note that incomplete block effect can be analyzed as a fixed (intra-block analysis) or a random (inter-block analysis) effect. When we consider block as a random effect, the mean values of a block also contain information about the treatment effects.
<- lmer(log(yield) ~ gen + (1|block),
model_icbd data = dat,
na.action = na.exclude)
tidy(model_icbd)
# A tibble: 33 × 8
effect group term estimate std.error statistic df p.value
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 fixed <NA> (Intercept) 3.20 0.0335 95.5 153. 3.36e-138
2 fixed <NA> genG02 0.0908 0.0424 2.14 129. 3.40e- 2
3 fixed <NA> genG03 0.281 0.0424 6.63 129. 8.29e- 10
4 fixed <NA> genG04 0.0924 0.0424 2.18 129. 3.12e- 2
5 fixed <NA> genG05 0.0651 0.0424 1.54 129. 1.27e- 1
6 fixed <NA> genG06 0.260 0.0424 6.14 129. 9.57e- 9
7 fixed <NA> genG07 -0.0252 0.0424 -0.593 129. 5.54e- 1
8 fixed <NA> genG08 0.115 0.0424 2.71 129. 7.60e- 3
9 fixed <NA> genG09 0.171 0.0424 4.04 129. 9.06e- 5
10 fixed <NA> genG10 -0.00444 0.0424 -0.105 129. 9.17e- 1
# ℹ 23 more rows
<- lme(yield ~ gen,
model_icbd1 random = ~ 1|block,
data = dat,
na.action = na.exclude)
tidy(model_icbd1)
# A tibble: 33 × 8
effect group term estimate std.error df statistic p.value
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 fixed <NA> (Intercept) 24.6 0.922 125 26.7 2.10e-53
2 fixed <NA> genG02 2.40 1.17 125 2.06 4.18e- 2
3 fixed <NA> genG03 8.04 1.17 125 6.88 2.54e-10
4 fixed <NA> genG04 2.37 1.17 125 2.03 4.43e- 2
5 fixed <NA> genG05 1.60 1.17 125 1.37 1.73e- 1
6 fixed <NA> genG06 7.39 1.17 125 6.32 4.11e- 9
7 fixed <NA> genG07 -0.419 1.17 125 -0.359 7.20e- 1
8 fixed <NA> genG08 3.04 1.17 125 2.60 1.04e- 2
9 fixed <NA> genG09 4.84 1.17 125 4.14 6.33e- 5
10 fixed <NA> genG10 -0.0429 1.17 125 -0.0367 9.71e- 1
# ℹ 23 more rows
9.2.1.3 Check Model Assumptions
Let’s verify the assumption of linear mixed models including normal distribution and constant variance of residuals.
check_model(model_icbd, check = c('normality', 'linearity'))
check_model(model_icbd1, check = c('normality', 'linearity'))
Here we observed a right skewness in residuals, this can be resolved by using data transformation e.g. log transformation of response variable. Please refer to chapter to read more about data transformation.
9.2.1.4 Inference
We can extract information about ANOVA using anova()
from lmer and lme models, respectively. ::: panel-tabset #### lme4
anova(model_icbd, type = "1")
Type I Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
gen 2.5488 0.084959 30 129.02 17.995 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
9.2.1.5 nlme
anova(model_icbd1, type = "sequential")
numDF denDF F-value p-value
(Intercept) 1 125 4042.016 <.0001
gen 30 125 17.675 <.0001
:::
Let’s look at the estimated marginal means of yield for each variety (gen). ::: panel-tabset #### lme4
emmeans(model_icbd, ~ gen)
gen emmean SE df lower.CL upper.CL
G01 3.20 0.0335 153 3.13 3.27
G02 3.29 0.0335 153 3.22 3.36
G03 3.48 0.0335 153 3.41 3.55
G04 3.29 0.0335 153 3.23 3.36
G05 3.26 0.0335 153 3.20 3.33
G06 3.46 0.0335 153 3.39 3.53
G07 3.17 0.0335 153 3.11 3.24
G08 3.31 0.0335 153 3.25 3.38
G09 3.37 0.0335 153 3.30 3.44
G10 3.20 0.0335 153 3.13 3.26
G11 3.29 0.0335 153 3.22 3.35
G12 3.37 0.0335 153 3.30 3.44
G13 3.39 0.0335 153 3.32 3.46
G14 3.19 0.0335 153 3.12 3.25
G15 3.25 0.0335 153 3.19 3.32
G16 3.24 0.0335 153 3.17 3.31
G17 2.95 0.0335 153 2.89 3.02
G18 3.23 0.0335 153 3.17 3.30
G19 3.37 0.0335 153 3.30 3.43
G20 3.50 0.0335 153 3.44 3.57
G21 3.44 0.0335 153 3.37 3.50
G22 3.22 0.0335 153 3.15 3.29
G23 3.39 0.0335 153 3.32 3.46
G24 3.52 0.0335 153 3.45 3.58
G25 3.29 0.0335 153 3.22 3.36
G26 3.30 0.0335 153 3.23 3.36
G27 3.17 0.0335 153 3.10 3.23
G28 3.27 0.0335 153 3.20 3.33
G29 3.20 0.0335 153 3.13 3.27
G30 3.57 0.0335 153 3.51 3.64
G31 3.29 0.0335 153 3.23 3.36
Degrees-of-freedom method: kenward-roger
Results are given on the log (not the response) scale.
Confidence level used: 0.95
9.2.1.6 nlme
emmeans(model_icbd1, ~ gen)
gen emmean SE df lower.CL upper.CL
G01 24.6 0.922 30 22.7 26.5
G02 27.0 0.922 30 25.1 28.9
G03 32.6 0.922 30 30.7 34.5
G04 26.9 0.922 30 25.1 28.8
G05 26.2 0.922 30 24.3 28.1
G06 32.0 0.922 30 30.1 33.8
G07 24.2 0.922 30 22.3 26.0
G08 27.6 0.922 30 25.7 29.5
G09 29.4 0.922 30 27.5 31.3
G10 24.5 0.922 30 22.6 26.4
G11 27.1 0.922 30 25.2 28.9
G12 29.3 0.922 30 27.4 31.1
G13 29.9 0.922 30 28.1 31.8
G14 24.2 0.922 30 22.4 26.1
G15 26.1 0.922 30 24.2 28.0
G16 25.9 0.922 30 24.0 27.8
G17 19.7 0.922 30 17.8 21.6
G18 25.7 0.922 30 23.8 27.6
G19 29.0 0.922 30 27.2 30.9
G20 33.2 0.922 30 31.3 35.0
G21 31.1 0.922 30 29.2 33.0
G22 25.2 0.922 30 23.3 27.1
G23 29.8 0.922 30 27.9 31.7
G24 33.6 0.922 30 31.8 35.5
G25 27.0 0.922 30 25.1 28.9
G26 27.1 0.922 30 25.3 29.0
G27 23.8 0.922 30 21.9 25.7
G28 26.5 0.922 30 24.6 28.4
G29 24.8 0.922 30 22.9 26.6
G30 36.2 0.922 30 34.3 38.1
G31 27.1 0.922 30 25.2 29.0
Degrees-of-freedom method: containment
Confidence level used: 0.95
:::
##data2
The data used in this example is extracted from the agridat
package. This data is a balanced lattice experiment in cotton containing 16 treatments in a 4x4 layout in each of 5 replicates. The response variable in this data is the percentage of young flower buds attacked by boll weevils.
<- agridat::cochran.lattice
dat3 #head(dat3)
rep | replication unit |
trt | treatment factor |
row | row position for each plot |
col | column position for each plot |
y | % of young flower buds attacked |
str(dat3)
'data.frame': 80 obs. of 5 variables:
$ y : num 9 20.3 17.7 26.3 4.7 9 7.3 8.3 9 6.7 ...
$ rep: Factor w/ 5 levels "R1","R2","R3",..: 1 1 1 1 1 1 1 1 1 1 ...
$ row: int 1 1 1 1 2 2 2 2 3 3 ...
$ col: int 1 2 3 4 1 2 3 4 1 2 ...
$ trt: Factor w/ 16 levels "T01","T02","T03",..: 10 12 9 11 2 4 1 3 14 16 ...
table(dat3$trt, dat3$rep)
R1 R2 R3 R4 R5
T01 1 1 1 1 1
T02 1 1 1 1 1
T03 1 1 1 1 1
T04 1 1 1 1 1
T05 1 1 1 1 1
T06 1 1 1 1 1
T07 1 1 1 1 1
T08 1 1 1 1 1
T09 1 1 1 1 1
T10 1 1 1 1 1
T11 1 1 1 1 1
T12 1 1 1 1 1
T13 1 1 1 1 1
T14 1 1 1 1 1
T15 1 1 1 1 1
T16 1 1 1 1 1
library(desplot)
desplot(dat3, y~row*col|rep,
text=trt, # aspect unknown, should be 2 or .5
main="cochran.lattice")
Here, we can use the desplot()
function from the ‘desplot’ package to visualize the plot plan from lattice design.
::desplot(dat3, y~row*col|rep,
desplottext=trt, shorten='none', cex=.6,
aspect=252/96, # true aspect
main="Balanced incomplete block")
9.2.2 Data integrity checks
9.2.3 Data integrity checks
str(dat3)
'data.frame': 80 obs. of 5 variables:
$ y : num 9 20.3 17.7 26.3 4.7 9 7.3 8.3 9 6.7 ...
$ rep: Factor w/ 5 levels "R1","R2","R3",..: 1 1 1 1 1 1 1 1 1 1 ...
$ row: int 1 1 1 1 2 2 2 2 3 3 ...
$ col: int 1 2 3 4 1 2 3 4 1 2 ...
$ trt: Factor w/ 16 levels "T01","T02","T03",..: 10 12 9 11 2 4 1 3 14 16 ...
#dat2$row <- as.factor(dat2$row)
#dat2$col <- as.factor(dat2$col)
$row <- as.factor(dat3$row)
dat3$col <- as.factor(dat3$col) dat3
hist(dat3$y, main = "", xlab = "yield")
<- lmer(y ~ trt + (1|rep) + (1|rep:row:rep) + (1|rep:col),
m1_a data = dat3,
na.action = na.exclude)
boundary (singular) fit: see help('isSingular')
tidy(m1_a)
# A tibble: 20 × 8
effect group term estimate std.error statistic df p.value
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 fixed <NA> (Intercept) 6.09 2.57 2.37 61.6 0.0208
2 fixed <NA> trtT02 7.64 3.45 2.21 47.8 0.0316
3 fixed <NA> trtT03 2.40 3.45 0.696 47.8 0.490
4 fixed <NA> trtT04 5.25 3.45 1.52 47.8 0.135
5 fixed <NA> trtT05 3.47 3.45 1.01 47.8 0.319
6 fixed <NA> trtT06 1.21 3.45 0.350 47.8 0.728
7 fixed <NA> trtT07 1.21 3.45 0.350 47.8 0.728
8 fixed <NA> trtT08 3.32 3.45 0.963 47.8 0.340
9 fixed <NA> trtT09 4.02 3.45 1.17 47.8 0.249
10 fixed <NA> trtT10 9.09 3.45 2.64 47.8 0.0113
11 fixed <NA> trtT11 11.8 3.45 3.42 47.8 0.00128
12 fixed <NA> trtT12 6.76 3.45 1.96 47.8 0.0561
13 fixed <NA> trtT13 4.94 3.45 1.43 47.8 0.159
14 fixed <NA> trtT14 8.16 3.45 2.36 47.8 0.0222
15 fixed <NA> trtT15 3.20 3.45 0.929 47.8 0.358
16 fixed <NA> trtT16 4.54 3.45 1.31 47.8 0.195
17 ran_pars rep:col sd__(Interc… 1.78 NA NA NA NA
18 ran_pars rep:row:rep sd__(Interc… 3.30 NA NA NA NA
19 ran_pars rep sd__(Interc… 0 NA NA NA NA
20 ran_pars Residual sd__Observa… 4.88 NA NA NA NA
$dummy <- factor(1)
dat3
<- lme(y ~ trt,
m1_b random = list(dummy = pdBlocked(list(
pdIdent(~rep - 1),
pdIdent(~rep:row - 1),
pdIdent(~rep:col)))),
data = dat3,
na.action = na.exclude)
VarCorr(m1_b)
dummy = pdIdent(rep - 1), pdIdent(rep:row - 1), pdIdent(rep:col)
Variance StdDev
repR1 6.046133e-08 0.0002458889
repR2 6.046133e-08 0.0002458889
repR3 6.046133e-08 0.0002458889
repR4 6.046133e-08 0.0002458889
repR5 6.046133e-08 0.0002458889
repR1:row1 1.091771e+01 3.3041961026
repR2:row1 1.091771e+01 3.3041961026
repR3:row1 1.091771e+01 3.3041961026
repR4:row1 1.091771e+01 3.3041961026
repR5:row1 1.091771e+01 3.3041961026
repR1:row2 1.091771e+01 3.3041961026
repR2:row2 1.091771e+01 3.3041961026
repR3:row2 1.091771e+01 3.3041961026
repR4:row2 1.091771e+01 3.3041961026
repR5:row2 1.091771e+01 3.3041961026
repR1:row3 1.091771e+01 3.3041961026
repR2:row3 1.091771e+01 3.3041961026
repR3:row3 1.091771e+01 3.3041961026
repR4:row3 1.091771e+01 3.3041961026
repR5:row3 1.091771e+01 3.3041961026
repR1:row4 1.091771e+01 3.3041961026
repR2:row4 1.091771e+01 3.3041961026
repR3:row4 1.091771e+01 3.3041961026
repR4:row4 1.091771e+01 3.3041961026
repR5:row4 1.091771e+01 3.3041961026
(Intercept) 3.169947e+00 1.7804345669
repR1:col1 3.169947e+00 1.7804345669
repR2:col1 3.169947e+00 1.7804345669
repR3:col1 3.169947e+00 1.7804345669
repR4:col1 3.169947e+00 1.7804345669
repR5:col1 3.169947e+00 1.7804345669
repR1:col2 3.169947e+00 1.7804345669
repR2:col2 3.169947e+00 1.7804345669
repR3:col2 3.169947e+00 1.7804345669
repR4:col2 3.169947e+00 1.7804345669
repR5:col2 3.169947e+00 1.7804345669
repR1:col3 3.169947e+00 1.7804345669
repR2:col3 3.169947e+00 1.7804345669
repR3:col3 3.169947e+00 1.7804345669
repR4:col3 3.169947e+00 1.7804345669
repR5:col3 3.169947e+00 1.7804345669
repR1:col4 3.169947e+00 1.7804345669
repR2:col4 3.169947e+00 1.7804345669
repR3:col4 3.169947e+00 1.7804345669
repR4:col4 3.169947e+00 1.7804345669
repR5:col4 3.169947e+00 1.7804345669
Residual 2.385960e+01 4.8846291738
9.2.4 Check Model Assumptions
Remember those iid assumptions? Let’s make sure we actually met them.
check_model(m1_a, check = c("linearity", "normality"))
9.2.5 Inference
Estimates for each treatment level can be obtained with the ‘emmeans’ package. And we can extract the ANOVA table from model using anova()
function.
anova(m1_a)
Type III Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
trt 612.21 40.814 15 47.81 1.7106 0.0808 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Estimated marginal means
emmeans(m1_a, ~ trt)
trt emmean SE df lower.CL upper.CL
T01 6.09 2.68 50 0.70 11.5
T02 13.73 2.68 50 8.34 19.1
T03 8.49 2.68 50 3.10 13.9
T04 11.34 2.68 50 5.95 16.7
T05 9.56 2.68 50 4.17 15.0
T06 7.30 2.68 50 1.91 12.7
T07 7.30 2.68 50 1.91 12.7
T08 9.41 2.68 50 4.02 14.8
T09 10.11 2.68 50 4.72 15.5
T10 15.18 2.68 50 9.79 20.6
T11 17.90 2.68 50 12.51 23.3
T12 12.85 2.68 50 7.46 18.2
T13 11.03 2.68 50 5.64 16.4
T14 14.25 2.68 50 8.86 19.6
T15 9.29 2.68 50 3.91 14.7
T16 10.63 2.68 50 5.24 16.0
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
9.3 Aplha Lattice Design (partially-balanced)
9.3.1 Example Analysis
In incomplete block alpha-design, the blocks are grouped into complete replicates. These designs are also termed as “resolvable incomplete block designs” or “partially balanced incomplete block designs”[^analysis-tips-1]. This design has been more commonly used instead of balanced IBD because of it’s practicability, flexibility, and versatility. [^analysis-tips-1]:Patterson, H. D., & Williams, E. (1976). A new class of resolvable incomplete block designs. Biometrika, 63(1), 83-92.
The data used in this example is published in John and Williams (1995)1. The data in this trial was laid out in an alpha lattice design. This trial data had 24 genotypes (gen), 6 incomplete blocks, and each replicated 3 times.
Let’s start analyzing this example firstly by loading the required libraries for linear mixed models:
library(lme4); library(lmerTest); library(emmeans)
library(dplyr); library(broom.mixed); library(performance)
library(nlme); library(broom.mixed); library(emmeans)
library(dplyr); library(performance)
9.3.1.1 Data integrity checks
::library(readr)
base<- "https://raw.githubusercontent.com/SchmidtPaul/DSFAIR/master/data/John%26Williams1995.csv"
dataURL <- read_csv(dataURL) dat
Rows: 72 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): rep, inc.block, gen
dbl (4): plot, yield, row, col
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
dat
# A tibble: 72 × 7
plot rep inc.block gen yield row col
<dbl> <chr> <chr> <chr> <dbl> <dbl> <dbl>
1 1 Rep1 B1 G11 4.12 4 1
2 2 Rep1 B1 G04 4.45 3 1
3 3 Rep1 B1 G05 5.88 2 1
4 4 Rep1 B1 G22 4.58 1 1
5 5 Rep1 B2 G21 4.65 4 2
6 6 Rep1 B2 G10 4.17 3 2
7 7 Rep1 B2 G20 4.01 2 2
8 8 Rep1 B2 G02 4.34 1 2
9 9 Rep1 B3 G23 4.23 4 3
10 10 Rep1 B3 G14 4.76 3 3
# ℹ 62 more rows
<- dat %>%
dat mutate_at(vars(plot:gen), as.factor)
::desplot(data = dat, flip = TRUE,
desplotform = gen ~ col + row | rep, # fill color per genotype, headers per replicate
text = gen, cex = 0.7, shorten = "no", # show genotype names per plot
out1 = rep, # lines between complete blocks/replicates
out2 = inc.block, # lines between incomplete blocks
main = "Field layout", show.key = F) # formatting
https://kwstat.github.io/agridat/reference/john.alpha.html
<- agridat::john.alpha dat4
block | blocking unit |
gen | genotype (variety) factor |
row | row position for each plot |
col | column position for each plot |
yield | grain yield in bu/ac |
Hering, F. (1996). John, JA, Williams, ER: Cyclic and Computer Generated Designs. Chapmann and Hall, London, New York 1995, 255 S., $ L 32,‐.↩︎