Multivariate/ Vector Autoregressive Model with Covariates Stan users mailing list December 11, 2015 Multivariate autogressive (MAR) and vector autoregressive (VAR) are the same thing, ecologists call them MAR. Here we fit a MAR(1) model and include covariates. The model is fit using stats::ar(), vars::VAR(), and a model written in Stan. A VAR model was written in Stan by Rob Trangucci, Ryan Batt added covariates to it and put this document together. Find related discussions on the Stan users mailing list here and here. The model presented here is framed after Ives et al. (2003); in particular, Eq. 27 (quoting from the paper): Xt = A + BXt−1 + CUt + Et “where Ut is a q × 1 vector containing the values of q covariates at time t, and C is a p × q matrix whose elements ci j give the strength of effect of covariates j on species i. The covariates Ut can be any factors that affect the system . . . ” The purpose of this document is to fit a MAR model in Stan. The data are simulated with an ecological process in mind (nutrients, phytoplankton [algae], and zooplankton [herbivores]), but that’s just a loose framing and shouldn’t be read-into too much. # ================= # = Load Packages = # ================= library(vars) library(rstan) # ========= # = Setup = # ========= set.seed(1337) nT <- 2E2 nX <- 2 # don't change nU <- 1 # don't change e_sd <- 1 A <- matrix(c(0.1, 10), nrow=nX, ncol=1) B <- matrix(c(0.75, -0.8, 0.2, 0.95), nrow=nX, ncol=nX) C <- matrix(c(0,1), nrow=nX, ncol=nU) X <- matrix(NA, nrow=nT, ncol=2, dimnames=list(NULL, c("zoop","phyto"))) U <- matrix(rlnorm(nT*nU), nrow=nT, ncol=nU, dimnames=list(NULL, c("nuts"))) E <- matrix(rnorm(nT*nX) * e_sd, nrow=nT, ncol=nX) # ============ # = Simulate = # ============ X[1,] <- c(3,10) 1

for(i in 2:nT){ X[i,] <- A + B%*%matrix(X[i-1,],ncol=1) + C%*%matrix(U[i,],ncol=1) + E[i,] } # ================= # = Plot Simulate = # ================= par(mfrow=c(2,1), mar=c(2,2,0.1,2), cex=1, ps=9, mgp=c(1,0.1,0), tcl=-0.1) plot(X[,1], type="l", ylim=c(min(X,0), max(X)), ylab="X (zoops [black] phytos [blue])", xlab="time step" ) lines(X[,2], col="blue") par(new=TRUE) plot(U, type="l", col="red", xlab="", ylab="", xaxt="n", yaxt="n", lwd=0.5, lty="dashed" ) axis(side=4, col="red") mtext("U (nuts [red])", side=4, line=1) plot(X[,2:1], type="o", lwd=0.25, pch=20) Nutrients come into the system as a log-normal random variate; this representation loosely represents the distribution of nutrient pulses into a lake that would be observed in nature. This description and the simulation doesn’t emphasize the distinction between nutrient loading and nutrient concentration. When there is a spike in the nutrients (red dashed line), phytoplankton jump up (C[2,1]). Then in the next time step, the zooplankton benefit from a greater presence of phytoplankton in the previous time step (B[1,2] in the simulation, or B[1,1,2] in the Stan model output). On the other hand, phytoplankton abundance is intensely suppressed by zooplankton abundance in the previous time step (B[2,1] in simulation, or B[1,2,1] in Stan output). Thus, a nutrient pulse quickly increases phytoplankton, which are subsequently suppressed by zooplankton. Further, the diagonal elements of B are less than 1 and greater than 0, indicating a steady decline in the abundance of both phytoplankton and zooplankton, but the decline is balanced a steady (constant) input from A. The A parameter is quite small for zooplankton (A1 = 0.1) and much larger for phytoplankton (A2 = 10). The top panel shows the time series of these state variables, where the spikes in nutrients and plankton are all plainly visible, as are the shifted timings and differences in the relative sizes of the spikes. The bottom panel of the figure shows the phase plot of the two state variables (X). The effects of large nutrient pulses are seen as the rapid excursions to the right (more phytoplankton), which are followed by the system moving up (more zooplankton) and to the left (fewer phytoplankton), and eventually spiralling counterclockwise back in toward the equilibrium point (which the system may never perfectly reach due to shocks [E] and nutrient pulses [U ] bombarding the system at each time step).

2

30 5

10

15 20 U (nuts [red])

25

X (zoops [black] phytos [blue]) 10 20 30 40

0

0

50

100 time step

150

200

5

10

zoop 15

20

25

0

0

10

20 phyto

30

40

Figure 1: Time series of zooplankton (X[,1], black), phytoplankton (X[,2], blue), and nutrients (U, red)

3

# ============== # = Stan Model = # ============== cat(readLines('VAR_p_cov.stan', warn=FALSE), sep="\n") ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##

data { int T; // No. of time periods observed int M; // No. of variables in y_t int P; // Lag order vector[M] Y[T]; // Obs of state variables int nU; // No. of covariates vector[nU] U[T]; // Covariates } transformed data { vector[M] Y_obs[T-P]; for (t in 1:(T-P)) Y_obs[t] <- Y[t + P]; } parameters { matrix[M, M] B[P]; cholesky_factor_corr[M] L_corr_noise; vector[M] sd_noise; vector[M] A; matrix[M, nU] C; } transformed parameters { matrix[M,M] L_sigma; L_sigma <- diag_pre_multiply(sd_noise, L_corr_noise); } model { vector[M] mus[T-P]; for (t in 1:(T-P)) { mus[t] <- A + C * U[t+P]; for (p in 1:P) mus[t] <- mus[t] + B[p] * Y[t+p-1]; } L_corr_noise ~ lkj_corr_cholesky(2.0); sd_noise ~ normal(0,1); A ~ normal(0,1); for (p in 1:P) to_vector(B[p]) ~ normal(0, 1); Y_obs ~ multi_normal_cholesky(mus,L_sigma);

} generated quantities { matrix[M,M] Sigma; Sigma <- L_sigma * L_sigma'; }

# ============== # = Fit Models = # ============== 4

# MAR ar(X) # can't handle covariates :( ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##

Call: ar(x = X) $ar , , 1 zoop phyto zoop 0.7197 0.1918 phyto -0.7783 0.9448 $var.pred zoop phyto zoop 1.7656 0.8752 phyto 0.8752 14.3658

# VAR VAR(X, exogen=U) ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##

VAR Estimation Results: ======================= Estimated coefficients for equation zoop: ========================================= Call: zoop = zoop.l1 + phyto.l1 + const + nuts zoop.l1 0.720085330

phyto.l1 0.200444289

const nuts 0.417703606 -0.007097964

Estimated coefficients for equation phyto: ========================================== Call: phyto = zoop.l1 + phyto.l1 + const + nuts zoop.l1 -0.822652

phyto.l1 const 0.948470 10.362954

nuts 1.009031

# Stan VAR/ MAR library(rstan) stan_data <- list(T = nT, M = nX, P = 1, Y = X, U = U, nU = nU) model_w_data <- stan(file = 'VAR_p_cov.stan', data=stan_data, iter=200, chains=4, cores=4) print(model_w_data, pars=c("A","B","C"))

5

## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ##

Inference for Stan model: VAR_p_cov. 4 chains, each with iter=200; warmup=100; thin=1; post-warmup draws per chain=100, total post-warmup draws=400. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat A[1] 0.40 0.03 0.32 -0.25 0.22 0.41 0.63 1.06 92 1.02 A[2] 9.43 0.04 0.34 8.67 9.21 9.47 9.66 10.05 67 1.07 B[1,1,1] 0.72 0.00 0.02 0.68 0.71 0.72 0.73 0.77 134 1.01 B[1,1,2] 0.20 0.00 0.01 0.18 0.19 0.20 0.21 0.22 400 1.01 B[1,2,1] -0.77 0.00 0.02 -0.81 -0.79 -0.77 -0.75 -0.72 97 1.04 B[1,2,2] 0.96 0.00 0.01 0.94 0.95 0.96 0.96 0.98 298 1.00 C[1,1] -0.01 0.00 0.02 -0.05 -0.02 -0.01 0.01 0.04 400 0.99 C[2,1] 1.01 0.00 0.03 0.97 1.00 1.02 1.03 1.06 400 1.00 Samples were drawn using NUTS(diag_e) at Fri Dec 11 17:27:55 2015. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1).

traceplot(model_w_data, inc_warmup=FALSE, pars=c("A","B","C"))

A[1]

A[2] 10.0

1.0 0.5 0.0 −0.5

9.5

0.75

9.0

0.70

8.5

0.65

100 125 150 175 200 B[1,1,2] 0.22 0.21 0.20 0.19 0.18 0.17

0.80

100 125 150 175 200 B[1,2,1]

B[1,1,1]

100 125 150 175 200 B[1,2,2] chain

−0.72

0.98

1

−0.76

0.96

2

−0.80

0.94

3 4

100 125 150 175 200 C[1,1] 0.04

100 125 150 175 200 C[2,1] 1.05

0.00

1.00

−0.04 0.95 100 125 150 175 200

100 125 150 175 200 Figure 2:

# =========== # = Summary = # ===========

6

100 125 150 175 200

Summary All three models estimated the coefficients and intercepts ~correctly (ar didn’t print out the intercept terms, oddly, but it at least got the coefficients pretty close). Both VAR and the Stan model estimated all parameters quite accurately.

7

Vector Autoregressive Model with Covariates -

Dec 11, 2015 - Multivariate autogressive (MAR) and vector autoregressive (VAR) are the same thing, ecologists call them. MAR. Here we fit a MAR(1) model and include covariates. The model is fit using stats::ar(), vars::VAR(), and a model written in Stan. A VAR model was written in Stan by Rob Trangucci, Ryan Batt ...

251KB Sizes 0 Downloads 198 Views

Recommend Documents

AN AUTOREGRESSIVE PROCESS WITH CORRELATED RANDOM ...
Notations. In the whole paper, Ip is the identity matrix of order p, [v]i refers ... correlation between two consecutive values of the random coefficient. In a time.

Model Selection for Support Vector Machines
New functionals for parameter (model) selection of Support Vector Ma- chines are introduced ... tionals, one can both predict the best choice of parameters of the model and the relative quality of ..... Computer Science, Vol. 1327. [6] V. Vapnik.

The multi-armed bandit problem with covariates
2Supported in part by the NSF Grants DMS-09-06424, DMS-10-53987. MSC2010 subject classifications ... “good” policy can (asymptotically) achieve a smaller regret; see also [4]. The el- egance of the theory and .... we obtain in the bandit setup fo

Semi-supervised learning of the hidden vector state model for ...
capture hierarchical structure but which can be ... sibly exhibit the similar syntactic structures which ..... on protein pairs to gauge the relatedness of the abstracts ...

A Linear 3D Elastic Segmentation Model for Vector ...
Mar 7, 2007 - from a databank. .... We assume that a bounded lipschitzian open domain. O of IR3 ... MR volume data set according to the Gradient Vector.

Semi-supervised learning of the hidden vector state model for ...
trained automatically from only lightly annotated data. To train the HVS model, an abstract annotation needs to be provided for each sentence. For exam- ple, for the ...... USA, 2005. [14] Xu L, Schuurmans D. Unsupervised and semi-supervised multi-cl

AUTOMORPHIC VECTOR BUNDLES WITH GLOBAL ...
Oct 1, 2017 - example is given to show that our conjecture can fail for zip data not of .... We will say that a reduced scheme S is pseudo-complete if every h ∈ H0(S, OS) ...... Define a Zariski open subset U ⊂ SL2 as the non-vanishing locus.

Make a map with vector data - MOBILPASAR.COM
Supported vector formats include Environmental Systems Research Institute (ESRI) shapefiles. (.shp), comma-separated value (.csv) files, and Keyhole Markup Language (KML) files. Vector data files are typically organized in a group called a dataset. A

Make a map with vector data
2. Upload vector data. About vector data. With Google Maps Engine, you can .... 2. Edit the HTML content in the tab's text box (on the left) to match the HTML ...

Multichannel Shopper Segments and Their Covariates (PDF ...
that make shopping more enjoyable, Indeed, this approach views .... nicely with this point of view. That is ...... Profile of the final segments (LCA) (N= 360).

Creating covariates using cohort attributes - GitHub
Mar 28, 2016 - 3.1 Creating the cohort attributes and attributes definitions . ... covariate builders that create covariates on the fly and can be re-used across ...

Using Fractional Autoregressive Integrated Moving Average (FARIMA ...
Using Fractional Autoregressive Integrated Moving Averag ... arture (Interior) in Sulaimani International Airport.pdf. Using Fractional Autoregressive Integrated ...

Automatic Music Transcription using Autoregressive ...
Jun 14, 2001 - indispensable to mix and manipulate the necessary wav-files. The Matlab ..... the problems related to automatic transcription are discussed, and a system trying to resolve the ..... and then deleting a certain number of samples.

Mixtures of Sparse Autoregressive Networks
Given training examples x. (n) we would ... be learned very efficiently from relatively few training examples. .... ≈-86.34 SpARN (4×50 comp, auto) -87.40. DARN.

Multivariate Portmanteau Test For Autoregressive ...
Mar 24, 2012 - The paper provides a method of correction for heteroskedastic errors ... by linear combination of past p values Xt−1,..., Xt−p and an error ϵt .

Improved Hidden Vector Encryption with Short ...
For instance, suppose that the ciphertexts associated with keywords are in a database server, and a user who has permission to read the ciphertexts that are associated with some ..... Let Σ = Zm for some integer m and set Σ∗ = Zm ∪ {∗}. Our s

Decision Making and Games with Vector Outcomes
Apr 13, 2018 - we need to model outcomes as elements in an infinite-dimensional vector space. Hence, the enhanced level of generality in our framework offers flexibility and richness in modeling. In fact, the main result of Shapley (1959) has been ge

A New Measure of Vector Dependence, with ...
vector X = (X1, ..., Xd) have received substantial attention in the literature. Such .... Let F be the joint cdf of X = (X1,...,Xd), Xj ∈ R,j = 1,...,d, and let F1,...,Fd.

Vector Fields with the Oriented Shadowing Property 1 ...
Roosevelt Road, Taipei 106, Taiwan. email: [email protected]. Abstract. We give a description of the C1-interior (Int1(OrientSh)) of the set of smooth vector fields on a smooth closed manifold that have the oriented shadowing property. A sp

"Support vector machine active learning with ... -
"Support vector machine active learning with applications ... The Journal of Machine Learning Research 2 ... Increasing the pool size improves results (Figure 7) ...