Ver Hoef, J.M. 1993. Universal kriging for ecological data. Pages 447 - 453 in Goodchild, M.F., Parks, B., and Steyaert, L.T. (eds.) Environmental Modeling with GIS, Oxford University Press, 488 p.
45
Universal Kriging for Ecological Data JAYM. VERHOEF
The general problem considered here is spatial prediction for a raster data set, where we have one or more complete layers of explanatory variables, and another sparsely filled layer for a response variable (Figure 45-1). The goal is to predict the response variable for the remaining cells in the grid. Three approaches will be compared: linear regression, kriging, and universal kriging. The type of data depicted in Figure 45-1 occurs frequently in ecology. Often, explanatory variables such as slope, aspect, soil type, geological bedrock type, etc. are easy to obtain and may be in electronic form currently. However, data collection of the response variable for each cell may be too expensive or destructive. For example, to collect data on the biomass of a plant species is time consuming and would be totally destructive if collected on
every cell in a grid. Thus, it would be very useful to predict biomass spatially for all of the cells in the raster grid, based on a smaller subset or using explanatory variables. Since ecology is the relationship of organisms to their environment (McIntosh, 1985), historically ecologists have been concerned with the response of organisms to environmental (explanatory) factors (Whittaker, 1975, p. 111), which is the niche concept (Pianka, 1978, p. 237). Linear and nonlinear models of organismal response to explanatory, environmental factors can be built, and in general this approach will be called regression. Regression is appealing for spatial prediction because it uses information on the relationship of organisms to their environment, that is, the explanatory variables. Optimal linear spatial prediction, or kriging, is another approach for predicting the response variable at unsampled locations. Kriging has been described in many places (for example, Matheron, 1963; Journel and Huijbregts, 1978; Cressie, 1991). For spatial prediction, kriging can be thought of simply as an optimally weighted average of the surrounding data for that same response variable. Kriging is less satisfactory to ecologists since it does not use relationships of the organism to its environment; rather, it only uses the idea that locations close together seem to be more similar. The approach I will fea ture is universal kriging (Matheron, 1969; Hui.jbregts and Matheron, 1971; Cressie 1991). Universal kriging combines aspects of regression and kriging; so it retains the usefulness of relating an organism to its environment, together with the idea that locations in close proximity tend to be more similar. Some of the models given here are also described by Cressie in this volume, but there are notational differences so that comparisons can be made between regression, kriging, and universal kriging. PREDICTION MODELS
Figure 45-1: Example of raster data where the explanatory variable fills the grid but the response variable only sparsely fills the grid.
Denote the response variable as a vector. For example, in Figure 45-1,
448 Jay M. Ver Hoef
z(1,3) z(1,9)
1 2
z(10,10)
1
2), kriging usually assumes a constant mean,
z
z=
(45-4)
a
where St= (iJ)' is the spatial location; the ith row and jth column in the raster grid, indexed by t; t= 1,... ,n. The columns of X contain explanatory variables. For example, in Figure 45-1, [x(si)], [x(sz)] ,
X=
= 1,u + 0,
1 17.0 1 14.6
1 x(1,3) 1 x(1,9)
where is a vector whose statistical dependence is described by a variogram model. That is, Var[o(i,j)o(u,v)]=2y(h), where h=[(i-u)2+(j-v)2]O.5; in words, the variance of the difference between o(i,j) and o(u,v) depends only on their distance apart. The function 2y(h) is the variogram, and y(h) is called the semivariogram. The vector [a(so)] contains the optimal weights for z, where optimal is defined by minimizing the mean-squared prediction error, A
,
,
2
E[p(so) - z(so)] ;
=
where kriging uses a linear combination of the data; that is, minimize ,
1 7.5
1 x(10,10)
The vector f3 contains parameters. For the example here,
{3 = [fiO~l]', where {3o is a parameter for an overall mean and {31 is a coefficient for the explanatory variable. Then, a linear regression model can be written as,
z = Xf3
+ a,
,
2
E{[ a(so)] 'z - z(so)} .
(45-1)
A sufficient condition for the predictor to be uniformly unbiased over all f3 is l'[a(so)]=l. For more complete treatments of kriging, see, for example, Journel and Huijbregts (1978) or Cressie (1991). The kriging equations may be written,
1] [[a(s~)]] _ [y(S~)] A-I'
a
where is assumed to be a zero mean vector of independent random errors (var[a] =021; 1 is the identity matrix). After selecting a regression model, with the appropriate explanatory variables included in X, the parameters fJ can be estimated with ordinary least squares,! = (X'X)-lX'z. Let x(so) be a vector containing the values of the explanatory variables at some location to be predicted, so. For example, from Figure 45-1, x(so) might be x(I,I) = [1,20.2]'. Then the regression predictor p";( so} for the response variable z(l,l) at So is simply
The advantage of statistical models is that prediction variances can also be obtained:
r [ 1'0
(45-5)
where r is a matrix of all pairwise semivariogram values for the data z, and y(so) is a vector of semivariogram values between the data z and the location to be predicted, z( So ), and 1 is a vector of ones. A Lagrange multiplier, A, occurs due to the unbiasedness conditions. Variograms indicate the spatial dependency in the data and must be estimated from the data. See Journel and Huijbregts (1978) or Cressie (1991) for various variogram models and Cressie (1985, 1991) for methods to estimate the variogram. The optimal weights are obtained by solving the kriging equations (45-5) for [a(so )], (45-6) 1 1- (so, ]'r- 1, - 1 pk(SO) = [a(so))'z = [y(so)]' + [:T~11 1 r z. A'
,
,
1
(
where 0 2 is estimated by,
d- -
The prediction variance is given by,
(z-xP) '{z-xP) n-rank(X)
.
Notice that in regression, prediction depends primarily on the explanatory variables at so. Kriging, on the other hand, uses only the surrounding data for the response variable; denote the kriging predictor p~(so) = [a(so)] 'z. In terms of the linear model (45-
Notice that kriging does not use information on explanatory variables. Universal kriging is a generalization of kriging and regression; a good description is given by Cressie (1991).
Universal Kriging for Ecological Data 449
For universal kriging, we assume the linear nlodel (45-1), but in contrast to regression we also assume that 0 has spatial dependence that is described by a variogram model, rather than independence among all o( .). This is related to the idea of kriging the residuals from regression. For example, let
be the residuals from regression. Then a predictor might be
where a(so) comes from the kriging equations using as the "data." This could be thought of as a kriging-adjusted regression predictor. However, it turns out that the optinlal solution is obtained by minimizing the mean-squared prediction error and is given by the universal kriging equations. Let p"'u(so) = [b(so)]'z
CASE STUDY To compare the prediction methods, regression, kriging, and universal kriging, a data set was used which contained all response variables (Figure 45-2). These data come from a glade in the Ozark area of southeastern Missouri. The response variable is a log-transformed cover value of an herbaceous plant species, Coreopsis [anceo[ala, and the explanatory variable is the square root of the total basal area of all woody plant species. Ecologically, the explanatory variable indicates a shading factor. Each cell in the grid represents a real square 7x7 m plot. In this case, we know all of the values of the response and explanatory variables. Th compare methods, the response value at each location was removed, one at a time, and predicted back wi th each method. Next, the difference between the true value and predicted value was squared, and then averaged over all 100 sites. This is called crossvalidation for the mean-square prediction error (MSPE):
1
MSPE(m)
be the universal kriging predictor. The vector of optimal weights can be obtained by solving the universal kriging equations for b(so),
[[b(SO)]] = [r(s?)], [T,X] X 0 A x(so)
(45-8)
100
= 100t~1
""
'2
~m(St) - z(St)] ,
(45-11)
where m is the prediction method " k, or u for .£egression, kriging, or universal kriging, respectively; and Pm(Sl) is the predictor for location St, where ZeSt) was removed during cross-validation.
where X and x(so) are as defined for the linear model (45-1), T is a matrix of all pairwise semivariogram values for the residuals, 0, in the linear model (45-1), and r( So ) is a vector of the semivariogram values between the residuals of the data and the residual at the location to be predicted. As a practical matter, the variogram is estimated from the residuals
t = € = (z - xl) and the semivariogram is used in the universal kriging equations. If X=l and x(so )=1 (no explanatory variables), then the universal kriging equations are identical to the kri~ing equations; if all 0(·) are independent (Var[o] =0 I), universal kriging is identical to regression prediction. The universal kriging predictor is
The prediction variance is given by o;u( so) = [r(so)] 'l~-lr( so) (45-10) [X'T-1r(so) - x(so)]'(X'T-IX)-1[X'T- 1r(so) - x(sO)].
Figure 45-2: Data set used for case study. The response variable is log-transformed cover values for Coreopsis lanceolata in 7x7 m contiguous plots, and the explanatory variable is the square root of basal area for all woody species in the plot.
450 Jay M. Ver Hoef
5
Z
2.89
.0705(X) P < .000 1
4
4 N Q) r--I
,..0
ro H ro
3
.........
.~
:> 2 Q)
en
Q 0
~
"""'-
en
Q)
H
0
..
.
..
-1-+-----------r--------..---------, o 30 10 20
Explanatory variable X Figure 45-3: Linear regression model for all data in case study.
I I
I
Figure 45-4: Regression predictions during cross-validation, given as isopleth lines. The plots are outlined by the dashed line, with the true value in the upper left-hand corner.
7 ,
r--I--I
I
I
I
6
5
., .. . .
2
O-+--r---,---r---r------r---.---~---r-___r_-_r__-.____r_______,
o
2
3
4
5
6
7
8
9
10 11 12 13
Lag Figure 45-5: Regression prediction variances during cross-validation, given as isopleth lines. The plots are outlined by dashed lines.
Figure 45-6: Empirical variogram (solid circles) and fit variogram (curve) for case study data.
Universal Kriging for Ecological Data 451
Figure 45-7: Kriging predictions during cross-validation, given as isopleth lines. The plots are outlined by the dashed line, with the true value in the upper left-hand corner.
The regression model for all one hundred locations is given in Figure 45-3, and it shows that the response variable has a significant linear relationship to the explanatory variable. Each time a datum was removed during cross-validation, a new linear regression equation was fit and used in (45-2). Figure 45-4 shows the predicted values, using (45-2), as isopleth lines for all 100 locations, along with the true value in the upper left-hand corner of each cell. Figure 45-5 shows the prediction variances (45-3) calculated at each location during cross-validation. From (45-11) for regression, MSPE(r) = 1.37. Prediction using kriging depends on the variogram. The empirical variogram was calculated using the classical formula,
2Y(h)
= n(~ ~
[z(ij) - z(u,v)]2;
(45-12)
where H is the set of pairs {[z(i,j), z(u,v)]; [(i-j)2+(uv)2]O.S=h} and n(B) is the number of distinct pairs in H. These values (45-12) are given as solid circles in Figure 45-6 for all of the data. An exponential variogram model was fit to the empirical values (45-12) using the SAS® nonlinear weighted least-squares procedure. The exponential model is
2y(h) = co
+ c1[1- exp(-czh)]
(45-13)
Figure 45-8: Kriging prediction variances during cross-validation, given as isopleth lines. The plots are outlined by the dashed line.
and the estimated parameters were co= 1.61, C1 =5.09, and
c2=0.092. The fitted model is given as a curve in Figure 45-6. Each time a datum was removed during cross-validation, a new variogram was calculated and used in the kriging predictor (45-6). Figure 45-7 shows the predicted values, using (45-6), as isopleth lines for all 100 locations, along with the true values in the upper left-hand corner of each cell. Figure 45-8 shows the prediction variances (457) calculated at each location during cross-validation. Notice that prediction variances for kriging are smaller toward the center and larger toward the edges. This result is well-known to kriging practitioners. From Eq. (45-11) for kriging, MSPE(k) = 1.15, which is smaller than MSPE(r), making kriging a better predictor than regression for these data. To use universal kriging, the empirical variogram (4512) was calculated for the residuals from regression. These residuals are given as isopleth lines in Figure 45-9. The empirical variogram (solid circles) and the fitted exponential variogram (curve) for the residuals are given in Figure 45-10. For the residuals, the fitted parafi1.eters (45-13) were co=0.14, C1 =2.60, c2=1.15. Each time a datum was removed during cross-validation, a new regression equation was estimated and a new variogram was calculated on the residuals and used in the universal kriging predictor (45-9). Figure 45-11 shows the predicted values, using Eq.
452 Jay M. Ver Hoef
4
,0 ~-
oro ;0 0
~--~---~-
0
o
.. ....
•
I
00
O-+--.....---.----r--~--.,.---r--~--,----r--~-.----...----,
o
1
2
3
4
5
6
7
8
9
10 11 12 13
Lag Figure 45-9: Regression residuals, given as isopleth lines. The plots are outlined by the dashed line.
Figure 45-10: Empirical varigran1 (solid circles) and fit variogram (curve) for residuals in Figure 45-9.
, -.J
_
14
Figure 45-11: Universal kriging predictions during cross-validation, given as isopleth lines. The plots are outlined by the dashed line, with the true value in the upper left-hand corner.
Figure 45-12: Universal kriging prediction variances during crossvalidation, given as isopleth lines. The plots are outlined by the dashed line.
Universal Kriging for Ecological Data 453
(45-9), as isopleth lines for all 100 locations, along with the true value in the upper left-hand corner of each cell. Figure 45-12 shows the prediction variances (45-10) calculated at each location during cross-validation. From Eq. (45-11) for universal kriging, MSPE(u) =0.98, making universal kriging a better predictor than both regression and kriging for these data. By conlparing Figures 45-4 and 45-7, it is clear that regression prediction differs from kriging. And although here kriging is a better predictor than regression, there appears to be an interesting ecological relationship between the shading factor and the spatial abundance of Coreopsis lanceolata. Universal kriging makes use of that relationship in providing the best predictive model. CONCLUSIONS
Universal kriging is ideally suited for predicting missing values of a response variable in a GIS when there are explanatory variables. It has the appealing property of using two sources of information. First, as in regression, it uses the relationship between the response variable and explanatory variable, which is appealing to an ecologist. Second, it uses the fact that the residuals may have spatial informalion left in them, such as residuals that are closer together may tend to be more similar. Hence, universal kriging provides the best predictive model of these three methods, and it also incorporates some understanding of ecological relationships through regression. Other data sets have been analyzed, ranging from the case where no explanatory variables affected the response variable, to the case where there was no apparent spatial dependence in the residuals after the effect of the explanatory variables. In all cases, universal kriging did as well as or better than regression or kriging. Computer packages that perform universal kriging are scarce, to be found only in some specialized geostatistical software. It is hoped that in the future the joining of environmental modeling and the GIS framework will contain universal kriging.
ACKNOWLEDGMENTS
This research was funded in part by the following: Federal Aid in Wildlife Restoration and the Alaska Department of Fish and Game, the Whitehall Foundation, the North Atlantic lteaty Organization, the National Park Service, and the National Science Foundation, grant number DMS-9001862.
REFERENCES
Cressie, N. (1985) Fitting variogram models by weighted least squares. Journal of the International Association for Mathematical Geology 17: 563-586. Cressie, N. (1991) Statistics for Spatial Data, New York: John Wiley and Sons. Huijbregts, C.J., and Matheron, G. (1971) Universal kriging (an optimal method for estimating and contouring in trend surface analysis). In McGerrigle, J.1. (ed.) Proceedings ofNinth International Symposium on Techniques for Decision-Making in the Mineral Industry. The Canadian Institute of Mining and Metallurgy, Special Volume 12: 159-169. Journel, A.G., and Huijbregts, C.J. (1978) Mining Geostatistics, London: Academic Press. Matheron, G. (1963) Principles of geostatistics. Economic Geology 58: 1246-1266. Matheron, G. (1969) Le Krigeage Universal, Fontainebleau: Cahiers du Centre de Morphologie Mathematique, No.1. McIntosh, R.P. (1985) The Background of Ecology, Concept and Theory, Cambridge: Cambridge University Press. Pianka, E.R. (1978) Evolutionary Ecology, New York: Harper and Row. Whittaker, R.H. (1975) Communities and Ecosystems, New York: MacMillan Publishing Company.