Section 3 Spatial Models
This section covers some of the statistical background behind why spatial models are used and how they work. Understanding this material is not essential, but it is extremely helpful. It relies on concepts introduced in Section 2, so please read that section first if you are new to empirical variograms and spatial statistics.
Linear statistical models are commonly written as:
\[Y_i = \beta_0 + X_i\beta_1 + \epsilon_i\]
If \(X_i\) is a continuous variable, \(\beta_1\) is a slope describing its relationship with the dependent variable, \(Y_i\). If \(X_i\) is a categorical variable, such as crop variety, then \(p-1\) slopes will be estimated, where \(p\) is the number of unique treatment levels in \(X\).
The error terms, \(\epsilon_i\), are assumed to be normally distributed with a mean of zero and a variance of \(\sigma^2\):
\[\epsilon_i \sim N(0, \sigma^2)\]
The error terms, or residuals, are assumed to be independently and identically distributed (sometimes abbreviated “iid”). This implies a constant variance of the error terms and zero covariance between residuals.
If \(N = 3\), the expanded model looks like this:
\[\left[ \begin{array}{c} Y_1\\ Y_2\\ Y_3 \end{array} \right] = \beta_0 + \left[ \begin{array}{c} X_1\\ X_2\\ X_3 \end{array} \right] \beta_1 + \left[ \begin{array}{c} \epsilon_1\\ \epsilon_2\\ \epsilon_3 \end{array} \right] \]
\[\left[ \begin{array}{c} \epsilon_1\\ \epsilon_2\\ \epsilon_3 \end{array} \right] \sim N \Bigg( \left[ \begin{array}{c} 0\\ 0\\ 0 \end{array} \right], \left[ \begin{array}{ccc} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0\\ 0 & 0 & \sigma^2 \end{array} \right] \Bigg) \]
If spatial variation is present, the off-diagonal elements of the variance-covariance matrix are not zero, so the error terms are not independently distributed. As a result, hypothesis tests and parameter estimates from uncorrected linear models can be erroneous.
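To make the contrast concrete, here is a minimal sketch that builds both kinds of variance-covariance matrix for three plots. The exponential decay function, the plot positions, and the parameter values are assumptions chosen for illustration only; they are not taken from this section.

```python
import math

def iid_covariance(n, sigma2):
    """Covariance for independent errors: sigma^2 on the diagonal, 0 elsewhere."""
    return [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]

def exponential_covariance(positions, sigma2, range_param):
    """Covariance that decays with distance d: sigma^2 * exp(-d / range_param).

    The exponential form is an assumed example of spatial correlation.
    """
    return [
        [sigma2 * math.exp(-abs(p - q) / range_param) for q in positions]
        for p in positions
    ]

# Three plots in a single row, one unit apart (hypothetical layout).
positions = [0.0, 1.0, 2.0]
iid = iid_covariance(len(positions), sigma2=1.0)
spatial = exponential_covariance(positions, sigma2=1.0, range_param=2.0)

for label, matrix in (("iid", iid), ("spatial", spatial)):
    print(label)
    for row in matrix:
        print("  " + "  ".join(f"{value:.3f}" for value in row))
```

In the second matrix the diagonal is still \(\sigma^2\), but the off-diagonal entries are positive and shrink as plots get farther apart, which is exactly the situation the iid assumption rules out.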
3.2 Spatial Regression methods
These approaches use information from adjacent plots to adjust for spatial autocorrelation.
3.2.1 Spatial autoregressive (SAR)
Sometimes called a “lag” model, the SAR model uses the dependent variable of neighboring plots to predict \(Y\). That is, the autoregressive model explicitly models correlations between neighboring points.
\[\mathbf{Y = \rho W Y + X\beta + \epsilon} \]
While this may look strange, \(\mathbf{W}\) is an \(n \times n\) matrix of neighbor weights with a diagonal of zero, so the value at \(i=j\), that is \(Y_i\) itself, is not used on the right-hand side to predict \(Y_i\) on the left-hand side of the equation. \(\rho\) is the autoregressive coefficient that scales the weighted neighbor values. The error terms are assumed iid.
On Weights
Setting the weights of neighbors is covered in Section 5.
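As a rough illustration of what \(\mathbf{W}\) and the lag term \(\mathbf{WY}\) look like, the sketch below builds a weight matrix for a small grid of plots. It assumes "rook" adjacency (plots sharing an edge are neighbors) and row-standardized weights; both are illustrative choices, as are the function names and the yield values.

```python
def rook_weights(n_rows, n_cols):
    """Row-standardized rook-adjacency weights for plots numbered row by row.

    Each plot's neighbors (up, down, left, right) share equal weight,
    and the diagonal stays at zero so a plot never predicts itself.
    """
    n = n_rows * n_cols
    weights = [[0.0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            neighbors = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    neighbors.append(rr * n_cols + cc)
            for j in neighbors:
                weights[i][j] = 1.0 / len(neighbors)
    return weights

def spatial_lag(weights, y):
    """Compute W @ y: the weighted average of each plot's neighbors."""
    return [sum(w * value for w, value in zip(row, y)) for row in weights]

# A 2 x 3 field with made-up yields, listed row by row.
W = rook_weights(2, 3)
y = [10.0, 12.0, 14.0, 11.0, 13.0, 15.0]
lag = spatial_lag(W, y)

for i, (observed, lagged) in enumerate(zip(y, lag)):
    print(f"plot {i}: Y = {observed:.1f}, WY = {lagged:.2f}")
```

With row-standardized weights, each entry of \(\mathbf{WY}\) is simply the mean of the neighboring values, and \(\rho\) then scales how strongly that neighborhood mean feeds into \(Y\).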
3.3 Trend analysis
3.3.1 Row and column trends
Experiment-wide trends should be modeled with directional trend models. These are comparatively simple models:
\[Y_{ijk} = \beta_0 + X_{i1}\beta_1 + \mathrm{Row}_{j2}\beta_2 + \mathrm{Range}_{k3}\beta_3 + \epsilon_{ijk}\]
If the assumptions of independent, normal, and identically distributed errors are met, then this model will suffice. If spatial variation is still present, additional measures will need to be taken.
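The trend model is an ordinary linear model with row and range added as covariates, so it can be fit by least squares. The following is a minimal sketch using only the Python standard library; the 3 x 3 layout, the checkerboard treatment assignment, and the coefficient values are made up, and the response is generated without noise so that the recovered coefficients can be checked against the known ones.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return solve_linear_system(XtX, Xty)

# Hypothetical coefficients: intercept, treatment, row trend, range trend.
true_beta = [50.0, 5.0, 2.0, -1.0]

design, response = [], []
for row in range(1, 4):
    for rng in range(1, 4):
        treatment = (row + rng) % 2  # checkerboard treatment indicator
        x = [1.0, float(treatment), float(row), float(rng)]
        design.append(x)
        response.append(sum(b * v for b, v in zip(true_beta, x)))  # no noise

estimates = fit_ols(design, response)
for name, value in zip(("intercept", "treatment", "row", "range"), estimates):
    print(f"{name}: {value:.3f}")
```

With real data the residuals from such a fit would then be examined for remaining spatial pattern before deciding whether further adjustment is needed.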