Stat 635 (Autumn 2009) – Peter F. Craigmile

Nonlinear time series models

Reading: Brockwell and Davis, Section 10.3.

• Limitations of Gaussian linear models
• Nonlinear models
• Nonlinearities in the mean and variance
• Three motivating examples
• When do we have IID noise?
• Useful classes of nonlinear models
  – Bilinear models
  – Random coefficient AR(p) models
  – Threshold models
  – ARCH(p) processes
  – GARCH(p,q) processes

Limitations of Gaussian linear models

• So far in this course we have considered (causal) linear time series models, which are also Gaussian. We can write our process $\{X_t\}$ as
  $X_t = \sum_{j=0}^{\infty} \psi_j Z_{t-j}$, where $\{Z_t\} \sim \mathrm{IID}\ N(0, \sigma^2)$.
  We can add a mean term.
• Implications:
  1. With Gaussian linear models, for equally spaced times $t_1 < \cdots < t_n$, $(X_{t_1},\ldots,X_{t_n})^T$ has the same distribution as $(X_{t_n},\ldots,X_{t_1})^T$ (the process is time reversible).
     – This assumption is violated when rates of increase are different from the rates of decrease in a time series.
  2. Bursts of outlying values are rarely seen in Gaussian linear processes.
  3. Often time series (especially financial time series) are more volatile (i.e. less predictable) than Gaussian linear processes.

Nonlinear models

• We can rewrite the model above as
  $X_t = f(Z_t, Z_{t-1}, Z_{t-2}, \ldots)$, where $f(a_0, a_1, a_2, \ldots) = \sum_{j=0}^{\infty} \psi_j a_j$
  is a linear function of its arguments.
• To capture deviations from linearity we allow $f(\cdot)$ to be nonlinear, and we obtain a nonlinear process.
• Good references on nonlinear processes:
  – Tong, H. "Non-linear time series: a dynamical system approach", Clarendon Press: New York and Oxford, 1990.
  – Tsay, R. S. "Analysis of Financial Time Series", Wiley: New York, 2002.

Nonlinearities in the mean and variance (Tsay, Chapter 4)

• Consider a sequence of RVs $\{X_t : t \in \mathbb{Z}\}$.
• A common way to define a nonlinear process is via the conditional moments.
• Let $E(X_t \mid F_{t-1}) = g(F_{t-1})$ denote the conditional expectation of $X_t$ given $F_{t-1}$, the σ-field generated by available information at time $t-1$.
• Let $\mathrm{var}(X_t \mid F_{t-1}) = h(F_{t-1})$ denote the conditional variance of $X_t$ given $F_{t-1}$.
• $g(\cdot)$ and $h(\cdot)$ must be well-defined functions with $h(\cdot) > 0$.
• If $g(\cdot)$ is nonlinear, $\{X_t\}$ is said to be nonlinear in mean.
• If $h(\cdot)$ varies with time (is time-variant), $\{X_t\}$ is said to be nonlinear in variance.

Example 1 (Sunspot numbers)

Wolf's Sunspot Numbers, 1700–1988 (Source: Tong). (This is a longer version of the series that we looked at earlier in the course.)
[Figure: number of sunspots against year, 1700 to 1988.]

Example 2 (Canadian lynx numbers)

Annual number of Canadian lynx trapped around the MacKenzie River between 1821 and 1934. (Original source: Elton, C. and Nicholson, M. (1942) "The ten year cycle in numbers of Canadian lynx", J. Animal Ecology, 11, 215–244.)
[Figure: number of lynx trapped against year.]

Example 3 (Log exchange rate)

Log returns of the exchange rate between the Deutsche Mark and US Dollar measured every 10 minutes (Source: Tsay).
[Figure: exchange rate log returns against minutes.]

Example 4 (Daily Rosslare series)

A time series plot of the residuals after accounting for a sine and cosine term plus ARMA(1,1) errors.
[Figure: ARMA(1,1) residuals against year, 1965 to 1970.]

Motivating example: the tent map

• Let $X_1$ be uniformly distributed on $[0,1]$, and $a \in (0,1)$. Then for $t = 2, 3, \ldots$ define
  $X_t = \begin{cases} X_{t-1}/a, & \text{if } X_{t-1} \in [0,a]; \\ (1 - X_{t-1})/(1-a), & \text{if } X_{t-1} \in (a,1]. \end{cases}$
[Figure: left panel, the tent map on the unit square (axes labelled t−1 and t); right panel, a realization of $X_t$ for t = 0 to 30.]
• With a good deal of math, it can be shown that $\{X_t\}$ has ACF $\rho_X(h) = (2a-1)^{|h|}$.
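The tent map recursion and its ACF can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names `tent_map_path` and `sample_acf`, the seed, and the path length are illustrative choices. It iterates the map from a uniform starting value and compares the sample ACF at small lags with $(2a-1)^{|h|}$.

```python
import random


def tent_map_path(a, n, seed=0):
    """Simulate n values of the tent map with parameter a, with X_1 ~ Uniform[0, 1]."""
    rng = random.Random(seed)
    x = rng.random()
    path = [x]
    for _ in range(n - 1):
        # Rising branch on [0, a], falling branch on (a, 1].
        x = x / a if x <= a else (1.0 - x) / (1.0 - a)
        path.append(x)
    return path


def sample_acf(x, h):
    """Sample autocorrelation of the sequence x at lag h >= 0."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    ch = sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n
    return ch / c0


a = 0.2
path = tent_map_path(a, n=20000, seed=1)
for h in (1, 2, 3):
    # Columns: lag, sample ACF, theoretical ACF (2a - 1)^h.
    print(h, round(sample_acf(path, h), 3), round((2 * a - 1) ** h, 3))
```

One caveat with this sketch: in binary floating point the choice a = 0.5 is best avoided, because each step then doubles the value exactly and the simulated path collapses to 0 after a few dozen iterations.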
Comparing the tent map and the AR(1) process

• This is the same ACF as the AR(1) process
  $Y_t = (2a-1) Y_{t-1} + Z_t$, with $\{Z_t\} \sim \mathrm{WN}(0, \sigma^2)$.
• Moral: Based on ACFs alone we cannot tell linear and nonlinear processes apart.
[Figure: a centered tent map series with a = 0.2 and an AR(1) series with phi = −0.6, each shown with its sample ACF and sample PACF.]

When do we have IID noise?

• Suppose $\{X_t\}$ is IID noise.
• Then $\{|X_t|\}$ and $\{X_t^2\}$ are also IID sequences (although no longer with mean zero).
• Examine the sample ACFs of $\{X_t\}$, $\{|X_t|\}$, and $\{X_t^2\}$ (or use tests of hypothesis) to see if the IID assumption is reasonable.
• This extends to higher powers.

Example: Rosslare time series residuals

[Figure: sample ACF and sample PACF against lag (days, 0 to 30) for $X_t$, $|X_t|$, and $X_t^2$.]

Useful classes of nonlinear models

1. The bilinear model of order (p,q,r,s) is
   $X_t = \sum_{i=1}^{p} \phi_i X_{t-i} + Z_t + \sum_{j=1}^{q} \theta_j Z_{t-j} + \sum_{i=1}^{r} \sum_{j=1}^{s} \delta_{ij} X_{t-i} Z_{t-j}$,
   where $\{Z_t\} \sim \mathrm{IID}(0, \sigma^2)$. (These models are due to Granger and Andersen (1978).)
2. The random coefficient AR(p) model is defined by
   $X_t = \sum_{i=1}^{p} (\phi_i + U_t^{(i)}) X_{t-i} + Z_t$,
   where $\{Z_t\} \sim \mathrm{IID}(0, \sigma_Z^2)$ is independent of $\{U_t^{(i)}\} \sim \mathrm{IID}(0, \sigma_U^2)$.

Useful classes of nonlinear models, continued

3. Threshold models
• We consider piecewise linear models in which the linear relationship varies according to the value (state) of the process.
• Consider this example.
  $X_t = \phi^{(1)} X_{t-1} + Z_t$, if $X_{t-1}$ [...]
[...] 0 and $\alpha_i \ge 0$ for $i = 1,\ldots,p$ (you need more restrictions for certain moments to exist).

Useful classes of nonlinear models, continued

5. GARCH models
• The generalized autoregressive conditional heteroscedasticity model, GARCH(p, q), is due to Bollerslev (1986, 1988).
• As in the ARCH model, $X_t = \sigma_t Z_t$, where $\{Z_t\}$ is an IID(0, 1) process, but now
  $\sigma_t^2 = \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2 + \alpha_0 + \sum_{j=1}^{q} \alpha_j X_{t-j}^2$,
  with $\alpha_0 > 0$ and $\alpha_j, \beta_i \ge 0$ for all $i$ and $j$ (again you need more restrictions for certain moments to exist).
• The GARCH(1,1) process is a very common model for financial time series.

Useful classes of nonlinear models, continued

6. SV models
• The stochastic volatility (SV) model (Melino and Turnbull, 1990; Harvey, Ruiz, and Shephard, 1994; Jacquier, Polson and Rossi, 1994) is given by $X_t = \sigma_t Z_t$, where $\{Z_t\}$ is an IID N(0, 1) process.
• The difference from the ARCH and GARCH models is that we model $\log(\sigma_t)$:
  $\phi(B) \log(\sigma_t) = \alpha_0 + V_t$,
  where $\phi(B) = 1 - \sum_{j=1}^{p} \alpha_j B^j$ is a causal AR polynomial and $\{V_t\}$ is an IID $N(0, \sigma_V^2)$ process.
• We assume $\{Z_t\}$ is independent of $\{V_t\}$.

Properties of the ARCH(1) model

• Let $X_t = \sigma_t Z_t$, where $\{Z_t\} \sim \mathrm{IID}(0,1)$ and $\sigma_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2$.
• If $|\alpha_1| < 1$ then
  $X_t^2 = \sigma_t^2 Z_t^2$
  $\quad = (\alpha_0 + \alpha_1 X_{t-1}^2) Z_t^2$
  $\quad = \alpha_0 Z_t^2 + \alpha_1 Z_t^2 X_{t-1}^2$
  $\quad = \alpha_0 Z_t^2 + \alpha_1 Z_t^2 (\alpha_0 + \alpha_1 X_{t-2}^2) Z_{t-1}^2$
  $\quad = \alpha_0 Z_t^2 + \alpha_0 \alpha_1 Z_t^2 Z_{t-1}^2 + \alpha_1^2 Z_t^2 Z_{t-1}^2 X_{t-2}^2$
  $\quad \;\vdots$
  $\quad = Z_t^2 \, \alpha_0 \left( 1 + \sum_{j=1}^{\infty} \alpha_1^j Z_{t-1}^2 \cdots Z_{t-j}^2 \right)$.
• That is,
  $X_t = Z_t \sqrt{ \alpha_0 \left( 1 + \sum_{j=1}^{\infty} \alpha_1^j Z_{t-1}^2 \cdots Z_{t-j}^2 \right) }$,
  a nonlinear function of $Z_t, Z_{t-1}, \ldots$.

Statistical properties of the ARCH(1) process

• We have that $E(X_t) = 0$ for each $t$.
• With more work we find that
  $\mathrm{cov}(X_t, X_{t+h}) = \begin{cases} \alpha_0/(1-\alpha_1), & h = 0; \\ 0, & h \ne 0. \end{cases}$
• Thus an ARCH(1) model is a white noise model, but is certainly not an IID process. How could we show this?

Simulated ARCH(1) example

• Let $X_t = \sigma_t Z_t$, where $\{Z_t\} \sim \mathrm{IID}\ N(0,1)$, and $\sigma_t^2 = 1 + 0.5 X_{t-1}^2$.
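The simulated example can be reproduced in outline with a few lines of code. This is a minimal Python sketch under stated assumptions, not the code used for the notes: the function names, the burn-in length, the seed and the sample size are all illustrative. It simulates the ARCH(1) recursion with $\alpha_0 = 1$ and $\alpha_1 = 0.5$, then compares the lag-1 sample autocorrelation of $X_t$ (expected to be near zero, since the process is white noise) with that of $X_t^2$ (expected to be clearly positive, since the process is not IID).

```python
import math
import random


def simulate_arch1(alpha0, alpha1, n, burn_in=500, seed=0):
    """Simulate n values of X_t = sigma_t * Z_t with sigma_t^2 = alpha0 + alpha1 * X_{t-1}^2."""
    rng = random.Random(seed)
    x_prev = 0.0  # arbitrary starting value, forgotten during the burn-in
    out = []
    for t in range(n + burn_in):
        sigma2 = alpha0 + alpha1 * x_prev ** 2
        x_prev = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x_prev)
    return out


def sample_acf(x, h):
    """Sample autocorrelation of the sequence x at lag h >= 0."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    ch = sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / n
    return ch / c0


x = simulate_arch1(alpha0=1.0, alpha1=0.5, n=20000, seed=2)
x_sq = [v * v for v in x]

sample_var = sum(x_sq) / len(x)   # the mean is zero, so this estimates var(X_t)
acf_x = sample_acf(x, 1)          # lag-1 ACF of the series itself
acf_sq = sample_acf(x_sq, 1)      # lag-1 ACF of the squared series

print("sample variance:", round(sample_var, 3), "theory:", 1.0 / (1.0 - 0.5))
print("lag-1 ACF of X_t:", round(acf_x, 3))
print("lag-1 ACF of X_t^2:", round(acf_sq, 3))
```

Comparing the last two printed values is one practical answer to the question of how to show that a white noise series is not IID.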
[Figure: time series plots of the simulated $X_t$, $|X_t|$, and $X_t^2$, each with its sample ACF and sample partial ACF against lag.]

Estimation of ARCH(1) processes

• Suppose that $\{Z_t\}$ is a set of IID N(0,1) RVs.
• Then for $t = 2, 3, \ldots$ the conditional distribution of $X_t$ given $X_{t-1}$ is Gaussian with
  $X_t \mid X_{t-1} \sim N(0, \sigma_t^2)$, where $\sigma_t^2 = \alpha_0 + \alpha_1 X_{t-1}^2$.
• Remembering that the likelihood is the probability density function (pdf) evaluated for our data, we can write
  $L(\alpha) = f_{X_1}(x_1) \prod_{t=2}^{n} f_{X_t \mid X_{t-1}}(x_t \mid x_{t-1})$,
  where $\alpha = (\alpha_0, \alpha_1)$ and $f_{X_t \mid X_{t-1}}(\cdot)$ is the conditional pdf of $X_t$ given $X_{t-1}$.

Conditional MLEs of ARCH(1) processes

• We do not know $f_{X_1}(x_1)$ so we just maximize
  $\tilde{L}(\alpha) = \prod_{t=2}^{n} f_{X_t \mid X_{t-1}}(x_t \mid x_{t-1})$
  with respect to the parameters $\alpha$.
• $\tilde{L}(\alpha)$ is called the conditional likelihood. (Can use this method for estimating ARIMA processes too!)
• This method extends to ARCH(p) models (but we need to condition on more past values).
• Commonly people replace the normal distribution with a scaled t distribution to model heavier tails.

Fitting ARCH models in R

• R by default cannot fit GARCH models. You can use the tseries R library written by Adrian Trapletti and maintained by Kurt Hornik.
• For Windows machines, to install the library:
  1. Select the menu, Packages → Install Package(s).
  2. Select a CRAN mirror (a location nearby that you will download the R package from).
  3. Under Packages, select tseries, and click OK.
  4. The library will download and install.
  5. To use the library, type library(tseries).
  (It is similar for the Mac, and a bit more complicated for Linux systems.)
• You can fit GARCH models using the garch function.
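To make the conditional likelihood for the ARCH(1) model concrete, here is a minimal Python sketch. It is an illustration only and is separate from the R tseries workflow described above: the function names, the simulated data, and the crude grid search are all assumptions made for the example. Under Gaussian $Z_t$, the conditional log-likelihood is $\log \tilde{L}(\alpha) = -\frac{1}{2} \sum_{t=2}^{n} \left[ \log(2\pi\sigma_t^2) + x_t^2/\sigma_t^2 \right]$ with $\sigma_t^2 = \alpha_0 + \alpha_1 x_{t-1}^2$.

```python
import math
import random


def simulate_arch1(alpha0, alpha1, n, burn_in=500, seed=0):
    """Simulate n values of X_t = sigma_t * Z_t with sigma_t^2 = alpha0 + alpha1 * X_{t-1}^2."""
    rng = random.Random(seed)
    x_prev = 0.0  # arbitrary starting value, forgotten during the burn-in
    out = []
    for t in range(n + burn_in):
        sigma2 = alpha0 + alpha1 * x_prev ** 2
        x_prev = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            out.append(x_prev)
    return out


def cond_loglik(alpha0, alpha1, x):
    """Gaussian conditional log-likelihood: sum over t = 2..n of log f(x_t | x_{t-1})."""
    total = 0.0
    for prev, cur in zip(x[:-1], x[1:]):
        sigma2 = alpha0 + alpha1 * prev ** 2
        total += -0.5 * (math.log(2.0 * math.pi * sigma2) + cur ** 2 / sigma2)
    return total


# Simulated data with known parameters (alpha0, alpha1) = (1.0, 0.5).
x = simulate_arch1(1.0, 0.5, n=5000, seed=3)

# Crude grid search over alpha0 in [0.5, 1.5] and alpha1 in [0.05, 0.95].
grid0 = [0.5 + 0.05 * i for i in range(21)]
grid1 = [0.05 * j for j in range(1, 20)]
loglik_hat, alpha0_hat, alpha1_hat = max(
    (cond_loglik(a0, a1, x), a0, a1) for a0 in grid0 for a1 in grid1
)

print("conditional MLE on the grid:", round(alpha0_hat, 2), round(alpha1_hat, 2))
```

The grid search is used here only because it is transparent; in practice a numerical optimizer would take its place, and the grid keeps both parameters inside the region $\alpha_0 > 0$, $\alpha_1 \ge 0$.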