The Wild Animal Modeling Wiki

**Autocorrelation structures**


## Background

An individual may express a labile trait differently as it ages or when it finds itself in different environments. Hence, a trait may be plastic with respect to a covariate, and one may wish to explore the genetics of the trait across that covariate. If the covariate has N different so-called “character-states” (e.g. ages), solving for the N state-specific variances and the N(N-1)/2 covariances between all states quite rapidly becomes computationally difficult. One possibility is to reduce the complexity of the N x N matrix by employing random regression models (see Random regression models and Random regression with heterogeneous residuals for more information). Another possibility is to use an autocorrelation function. Note that both approaches may substantially reduce the number of parameters that need to be estimated, but that they do so under fairly different assumptions.

### Specifics

Autocorrelation functions are used mainly in spatial statistics, where sites that are close to each other may resemble each other more (in terms of e.g. crop yield) than sites that are far apart. The approach has also been advocated for quantitative genetic purposes (e.g. Pletcher & Geyer 1999, Genetics 151: 825-835). From the quantitative genetic perspective, the assumption is that, of the N character-states, those that are “nearby” in character-state space will resemble each other (have a high covariance), and that the covariance will decrease with increasing “distance” according to some function. The parameters of the function describing the decrease in covariance are to be estimated (e.g. using REML). Pletcher & Geyer provide a number of functions, and advocate their use in particular for age-based applications. For example, in studying body size across ages (growth), it may be reasonable to assume that an individual’s body size at age x is highly (auto)correlated with its size at age x+1, but that this correlation may be lower at older ages. The function is assumed to be symmetrical. Hence, the correlation between x and x+n is assumed to be a decreasing function of |n|. When the variances are estimated separately, the correlation structure can be converted into covariances. The reduction in the number of parameters to be estimated is substantial. For example, if N = 10, the 45 covariances may be described by one, two or three parameters (depending on the autocorrelation function that is implemented).

## Implementation

Autocorrelation models can be implemented within an animal model approach in ASReml. ASReml offers a number of functions (consult the manual on variance structures), some of which are applicable in the context of wild animal models.

The data are structured as described under Random regression with heterogeneous residuals. That is, each individual must have a record for each value of the covariate (even if the trait was not measured), and records must be sorted in the same order within each individual.
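The data layout described above can be sketched as follows. This is an illustration in Python rather than ASReml, and the field names, the example values and the `NA` missing-value marker are all assumptions; check how missing values should be coded for your own analysis.

```python
def complete_grid(records, environments, missing="NA"):
    """Return one row per (animal, environment), in a fixed environment order.

    records: dict mapping (animal, environment) -> observed trait value
    environments: all covariate levels, in the order to be used for everyone
    missing: marker written when the trait was not measured (an assumption)
    """
    animals = sorted({animal for animal, _ in records})
    rows = []
    for animal in animals:
        for env in environments:  # same order within each individual
            rows.append((animal, env, records.get((animal, env), missing)))
    return rows


# Hypothetical data: two animals, three years, one record missing.
observed = {
    ("a1", 2001): 120, ("a1", 2002): 118, ("a1", 2003): 125,
    ("a2", 2001): 131, ("a2", 2003): 129,
}
rows = complete_grid(observed, [2001, 2002, 2003])
for row in rows:
    print(*row)
```

Every animal ends up with three rows, and the unmeasured combination (animal `a2` in 2002) is kept as an explicit missing record rather than dropped.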

Below, in part 1 of the code, I present a simple implementation of the most basic autocorrelation function (the one-dimensional power model; the EXP function in ASReml). The EXP function assumes that the correlation between two values of the covariate is PHI^(absolute difference between the covariate values), where PHI is the parameter to be estimated using REML. This function can be used if individuals are characterised by some kind of continuous environmental value (yearly temperature in my example below), or by their age. ASReml also implements more complicated correlation models, such as autoregressive models.
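To make the power model concrete, here is a minimal Python sketch (not ASReml code) that builds the correlation matrix PHI^|difference|, scales it by state-specific variances, and counts parameters. The function names, the covariate values and the value of PHI are hypothetical and chosen only for illustration.

```python
def power_correlation(x, phi):
    """Correlation matrix with entries phi ** |x_i - x_j|, for 0 < phi < 1."""
    return [[phi ** abs(xi - xj) for xj in x] for xi in x]


def to_covariance(corr, variances):
    """Scale a correlation matrix by state-specific variances."""
    sd = [v ** 0.5 for v in variances]
    n = len(sd)
    return [[corr[i][j] * sd[i] * sd[j] for j in range(n)] for i in range(n)]


def n_parameters(n_states):
    """Parameter counts: (unstructured matrix, power model with N variances)."""
    unstructured = n_states * (n_states + 1) // 2  # N variances + N(N-1)/2 covariances
    power_het = n_states + 1                       # N variances + PHI
    return unstructured, power_het


x = [-1.0, 0.0, 1.0]                      # hypothetical standardised covariate values
corr = power_correlation(x, phi=0.9)      # hypothetical PHI
cov = to_covariance(corr, [4.0, 9.0, 16.0])
print(n_parameters(20))                   # (210, 21)
```

With 20 environments, an unstructured matrix needs 210 parameters, whereas the power model with heterogeneous variances needs 21, which matches the 21 starting values supplied to EXPH in the code.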

### Disclaimer

Despite its computational appeal, this approach has, as far as I know, not been used in wild animal models to date. A number of issues currently restrict its potential, which I discuss below. By contributing this section, I hope to stimulate further work on this topic.

```
ASREML Autocorrelation model
#next section describes data file, each column heading must be indented
 ANIMAL !P
 DAY !M999     #response variable
 YEAR !A       #each year is a different environment
 st_temp       #covariate standardised; 20 different annual temperatures
ped.ped !alpha !make                  # pedigree file with qualifiers
all_environments_data.asd !skip 1     #datafile
# individual-specific effects as autocorrelation evaluated at the different E's
##################################################################################
DAY ~ mu YEAR !r YEAR.ide(ANIMAL) !f mv
#Model statement above
#Variance structure model below
0 0 1 !STEP 0.001
# 1 UNIVARIATE error structure, 1 G structure
# G structure based on one-dimensional power model (EXP)
#heterogeneous variances fitted here, by the specification ‘H’
#use EXPV 0.1 0.1 for homogeneous variance, compare the likelihood as a test
YEAR.ide(ANIMAL) 2
YEAR st_temp EXPH 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
#starting values above: first one for the coefficient, rest for the 20 variances to be estimated
ide(ANIMAL)
#The above can also be applied to the ANIMAL effect
```

The above will produce a matrix with, as usual, the variances on the diagonal (a unique variance for each yearly temperature), the covariances below the diagonal and the correlations above the diagonal. For example, the first three rows for the first 10 environments (with the variances printed in bold) look like:

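| Environment | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | **4.30** | 0.97 | 0.96 | 0.87 | 0.86 | 0.86 | 0.84 | 0.80 | 0.78 | 0.77 |
| 2 | 6.66 | **10.86** | 0.98 | 0.90 | 0.88 | 0.88 | 0.86 | 0.82 | 0.81 | 0.79 |
| 3 | 5.72 | 9.33 | **8.28** | 0.91 | 0.90 | 0.89 | 0.88 | 0.83 | 0.82 | 0.81 |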

It is clear that the correlation is a smoothly decreasing function of the “environmental distance”. To test whether the variances really are different, one can contrast the Log(Likelihood) of this model with that of a model in which the variance is assumed to be homogeneous.
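One common way to contrast the two log-likelihoods is a likelihood-ratio test, sketched below in Python. The log-likelihood values are hypothetical and are not output from the model above. The 19 degrees of freedom are an assumption: they follow from counting 20 variances in the heterogeneous model against a single variance in the homogeneous one.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)             # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # closed form for df = 1
    # Recurrence for the regularised upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_full, loglik_reduced, df):
    """Return the test statistic 2 * (logL_full - logL_reduced) and its p-value."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods, NOT results from the model above.
stat, p = likelihood_ratio_test(loglik_full=-1500.0, loglik_reduced=-1520.0, df=19)
print(stat, p)
```

A small p-value would suggest that the heterogeneous-variance model fits better than the homogeneous one.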

### Notice

At present, there does not seem to be a built-in autocorrelation model that allows negative correlations. This is likely because the models were primarily developed for spatial statistics (where the autocorrelation “fades out” across space). For some applications, this may not be an issue. For example, body size is likely to be positively correlated across ages; Pletcher & Geyer (1999, Genetics) present a number of age-based examples where correlations do not fall below zero. In general, however, assuming that correlations are not negative seems a fairly strong *a priori* assumption. In the study of senescence, for example, we actually have a theoretical prediction (antagonistic pleiotropy) that the genetic correlation between performance at young and old ages could be negative. Pletcher & Geyer (1999, Genetics) present some example autocorrelation models, some of which are not constrained to be positive. Hence, a future challenge is to implement these in ASReml, which should be possible using the ‘OWN’ variance structure.

### A hybrid approach

As indicated in the code above, it is straightforward to extend the model to an animal model by (additionally) introducing the ANIMAL term in a similar fashion to the ide(ANIMAL) term. Another option is to approximate the permanent environmental effect using an autocorrelation model, and to assume that the additive genetic variances are described by a linear random regression function. To some extent, this is similar to what is described as “approach 3” in Random regression with heterogeneous residuals, but it takes advantage of the autocorrelation function to reduce the number of non-genetic covariances to be estimated. For example:

```
!PVAL st_temp -1.000 -0.907 -0.523 -0.447 -0.387 -0.197 -0.138 -0.073 -0.019 0.017 0.035 0.110 0.190 0.276 0.406 0.536 0.650 0.870 0.926 1.000
DAY ~ mu YEAR !r YEAR.ide(ANIMAL) leg(st_temp,1).ANIMAL !f mv
#Model statement above
#Variance structure model below
0 0 2
# 1 UNIVARIATE error structure, 2 G structures
YEAR.ide(ANIMAL) 2
YEAR st_temp EXPH 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1
ide(ANIMAL)
leg(st_temp,1).ANIMAL 2
leg(st_temp,1) 0 US 0.1 0.1 0.1
ANIMAL
```

On the other hand, the above can probably be more effectively modelled by using random regression for both the permanent environment and the additive genetic effects.
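For the additive genetic part of the hybrid model, the intercept and slope (co)variances of a linear random regression imply a full covariance matrix across covariate values. The Python sketch below illustrates this under stated assumptions: it uses the unscaled basis P0(x) = 1, P1(x) = x (software may rescale the Legendre polynomials, so estimates from a real run may not plug in directly), and the matrix `k` is hypothetical.

```python
def linear_rr_covariance(k, x):
    """Covariance matrix implied by a first-order random regression.

    k: 2x2 (co)variance matrix of intercept and slope, [[v_int, c], [c, v_slope]]
    x: covariate values at which to evaluate the covariance function
    Assumes the unscaled basis P0(x) = 1, P1(x) = x.
    """
    basis = [(1.0, xi) for xi in x]

    def quad(b1, b2):
        # b1' K b2 for two basis vectors
        return sum(b1[i] * k[i][j] * b2[j] for i in range(2) for j in range(2))

    return [[quad(bi, bj) for bj in basis] for bi in basis]


k = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical intercept/slope (co)variances
g = linear_rr_covariance(k, [-1.0, 0.0, 1.0])
for row in g:
    print(row)
```

Unlike the power model, this covariance function is not constrained to be positive: a sufficiently large slope variance gives a negative covariance between the two ends of the covariate range.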