lecture 1
Statistics 520 with Percival at University of Washington - Seattle Campus
About this note
By: Maziar Rezakhani
Created: 2009-03-05
File Size: 94 page(s)
Views: 1
Created: 2009-03-05
File Size: 94 page(s)
Views: 1
About StudyBlue
STUDYBLUE makes things that make you better at school.
Things like online flashcards with photos and audio.
Things like personalized quizzes and friendly reminders about when (and what) to study next.
Think of it as a digital backpack™: access to all of your study materials online and on your phone.
STUDYBLUE exists to make studying efficient and effective for every student, for free. Join us.
“Simply amazing. The flash cards are smooth, there are many different types of studying tools, and there is a great search engine. I praise you on the awesomeness.”
Dennis
Dennis
Sign up (free) to study this.
What is Spectral Analysis? ? one of most widely used (& lucrative!) methods in data analysis ? can be regarded as ?analysis of variance of time series using cosines & sines ?cosines & sines + statistics (or Fourier theory + statistics) ? today?s lecture: introduction to spectral analysis ?notion of a ?time? series ?$0.25 introduction to time series analysis, with some basic notions from ?time domain? analysis (subject of Stat 519) ?definition of simplified version of spectrum and two methods for estimating (nonparametric and parametric) ?see Chapter 1 for details I?1 Time Series ? what is a time series? ??one damned thing after another? (R. A. Fisher?) ?denote by x t , t = 1,...,N ?four examples, each with N = 128 (Figs. 2 & 3 in textbook) I?2 First Example: Wind Speed Time Series 0 20 40 60 80 100 120 ? 4 ? 2 0 2 t I?3 Second Example: Atomic Clock Time Series 0 20 40 60 80 100 120 ? 10 ? 5 0 5 10 t I?4 Third Example: Willamette River Time Series 0 20 40 60 80 100 120 8.5 9.0 9.5 10.0 11.0 t I?5 Fourth Example: Ocean Noise Time Series 0 20 40 60 80 100 120 ? 4 ? 2 0 2 4 t I?6 Time Series Analysis ? goal of time series analysis: ?quantify characteristics of time series ? sample mean & variance (two well-know statistics) ¯x ? 1 N N X t=1 x t and ?? 2 ? 1 N N X t=1 (x t ? ¯x) 2 , captureunivariate properties, but do not capturebivariate prop- erties, i.e., do not tell us how x t and x t+k are related I?7 Lagged Scatter Plots: I ? tells us about bivariate distribution of separated pairs ? x t+1 versus x t , t = 1,...,N ?1: lag 1 scatter plot ? four examples (Fig. 4) I?8 Lag 1 Scatter Plot for Wind Speed Series ?4 ?2 0 2 ? 4 ? 2 0 2 I?9 Lag 1 Scatter Plot for Atomic Clock Series ?10 ?5 0 5 10 ? 10 ? 5 0 5 10 I?10 Lag 1 Scatter Plot for Willamette River Series 8.5 9.0 9.5 10.0 10.5 11.0 11.5 8.5 9.0 9.5 10.0 10.5 11.0 11.5 I?11 Lag 1 Scatter Plot for Ocean Noise Series ?4 ?2 0 2 4 ? 4 ? 2 0 2 4 I?12 Lagged Scatter Plots: II ? x t+k versus x t , t = 1,...,N ?k: lag k scatter plot ? summarize scatter plots using linear model: x t+k = ? k + ? k x t + ? t,k (not always reasonable: see Fig. 9) ? Pearson product moment correlation coe?cient ?let y 1 ,...,y N & z 1 ,...,z N be 2 collections of ordered values ?let ¯y & ¯z be sample means (thus ¯y ? P y t /N) ?sample correlation coe?cient: ?? = P (y t ? ¯y)(z t ? ¯z) £ P (y t ? ¯y) 2 P (z t ? ¯z) 2 § 1/2 , ?measures strength of linearity (?1 ? ?? ? 1) I?13 Sample Autocorrelation Sequence ? let {y t } = {x t+k : t = 1,...,N ?k} and {z t } = {x t : t = 1,...,N ?k} ? for each lag k, plug these into ?? = P (y t ? ¯y)(z t ? ¯z) £ P (y t ? ¯y) 2 P (z t ? ¯z) 2 § 1/2 , and get (after a little tweaking) ?? k ? P N?k t=1 (x t+k ? ¯x)(x t ? ¯x) P N t=1 (x t ? ¯x) 2 ? ?? k , k = 0,...,N ?1, called sample acs ? four examples (Figs. 6 and 7) I?14 Sample ACS for Wind Speed Series 0 5 10 15 20 25 30 ? 1.0 ? 0.5 0.0 0.5 1.0 k I?15 Sample ACS for Atomic Clock Series 0 5 10 15 20 25 30 ? 1.0 ? 0.5 0.0 0.5 1.0 k I?16 Sample ACS for Willamette River Series 0 5 10 15 20 25 30 ? 1.0 ? 0.5 0.0 0.5 1.0 k I?17 Sample ACS for Ocean Noise Series 0 5 10 15 20 25 30 ? 1.0 ? 0.5 0.0 0.5 1.0 k I?18 Modeling of Time Series ? assume x t is realization of random variable X t ? need to specify properties of X t (i.e., model x t ) ? simplifying assumptions (related to stationarity) ? ?? k estimates time-independent theoretical acs ? k ? cov{X t ,X t+k } ± ? 2 ? E{(X t ?µ)(X t+k ?µ)} ± ? 2 , where µ ? E{X t } and ? 2 ? E{(X t ?µ) 2 } ?X t ?s are multivariate Gaussian ? statistics of X t ?s completely known if µ, ? 2 and ? k ?s known ? critique of ?time domain? characterization (µ, ? 2 , ? k ): ?not easy to visualize x t from ? k ?s ?statistical properties of ?? k ?s di?cult to use I?19 Frequency Domain Modeling: I ? idea: express X t in terms of cosines and sines (i.e., sinusoids) ? considerartificialtime series cos(2?ft) & sin(2?ft),t = 1,...,128, where f is the frequency of the sinusoid (and 1/f is the period) ? consider ten di?erent frequencies (carefully chosen!): f = 1 128 , 3 128 ,..., 17 128 , 19 128 ? let f j = j 128 , where j = 1,3,...,19 ? in following twenty overheads, top plots show sinusoidal time series whose tth elements are cos(2?f 1 t),sin(2?f 1 t),cos(2?f 3 t),sin(2?f 3 t), ...,cos(2?f 19 t),sin(2?f 19 t) I?20 Frequency Domain Modeling: II ? bottom plots show cumulative sums of series: cos(2?f 1 t) cos(2?f 1 t) + sin(2?f 1 t) cos(2?f 1 t) + sin(2?f 1 t) + cos(2?f 3 t) cos(2?f 1 t) + sin(2?f 1 t) + cos(2?f 3 t) + sin(2?f 3 t) . . . cos(2?f 1 t) + sin(2?f 1 t) + ··· + cos(2?f 19 t) cos(2?f 1 t) + sin(2?f 1 t) + ··· + cos(2?f 19 t) + sin(2?f 19 t) I?21 Sinusoid and Sum of Sinusoids f = 1/128 0 32 64 96 128 I?22 Sinusoid and Sum of Sinusoids f = 1/128 0 32 64 96 128 sum of 2 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 3/128 0 32 64 96 128 sum of 3 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 3/128 0 32 64 96 128 sum of 4 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 5/128 0 32 64 96 128 sum of 5 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 5/128 0 32 64 96 128 sum of 6 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 7/128 0 32 64 96 128 sum of 7 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 7/128 0 32 64 96 128 sum of 8 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 9/128 0 32 64 96 128 sum of 9 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 9/128 0 32 64 96 128 sum of 10 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 11/128 0 32 64 96 128 sum of 11 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 11/128 0 32 64 96 128 sum of 12 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 13/128 0 32 64 96 128 sum of 13 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 13/128 0 32 64 96 128 sum of 14 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 15/128 0 32 64 96 128 sum of 15 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 15/128 0 32 64 96 128 sum of 16 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 17/128 0 32 64 96 128 sum of 17 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 17/128 0 32 64 96 128 sum of 18 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 19/128 0 32 64 96 128 sum of 19 sinusoids I?22 Sinusoid and Sum of Sinusoids f = 19/128 0 32 64 96 128 sum of 20 sinusoids I?22 Frequency Domain Modeling: III ? sum of all 20 sinusoids highly structured and nonrandom in appearance ? let?s repeat this exercise, but now multiply each sinusoid by a random amplitude A (each sinusoid gets a di?erent amplitude) ? A?s chosen from a standard Gaussian (normal) distribution (zero mean, unit variance) I?23 Random Amplitude Sinusoid & Sum of Sinusoids f = 1/128, A = ?1.24 0 32 64 96 128 I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 1/128, A = 0.6 0 32 64 96 128 sum of 2 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 3/128, A = ?0.33 0 32 64 96 128 sum of 3 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 3/128, A = ?0.72 0 32 64 96 128 sum of 4 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 5/128, A = ?0.41 0 32 64 96 128 sum of 5 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 5/128, A = 0.66 0 32 64 96 128 sum of 6 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 7/128, A = 1.32 0 32 64 96 128 sum of 7 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 7/128, A = 1.18 0 32 64 96 128 sum of 8 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 9/128, A = 1.11 0 32 64 96 128 sum of 9 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 9/128, A = 0.66 0 32 64 96 128 sum of 10 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 11/128, A = ?0.07 0 32 64 96 128 sum of 11 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 11/128, A = 0.03 0 32 64 96 128 sum of 12 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 13/128, A = ?2.34 0 32 64 96 128 sum of 13 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 13/128, A = 0.94 0 32 64 96 128 sum of 14 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 15/128, A = ?2 0 32 64 96 128 sum of 15 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 15/128, A = ?1.24 0 32 64 96 128 sum of 16 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 17/128, A = ?1.78 0 32 64 96 128 sum of 17 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 17/128, A = ?0.04 0 32 64 96 128 sum of 18 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 19/128, A = ?0.26 0 32 64 96 128 sum of 19 sinusoids I?24 Random Amplitude Sinusoid & Sum of Sinusoids f = 19/128, A = ?1.15 0 32 64 96 128 sum of 20 sinusoids I?24 Frequency Domain Modeling: IV ? generalize to following simple model for X t : X t = µ + N/2 X j=1 £ A j cos(2?f j t) + B j sin(2?f j t) § ?holds for t = 1,2,...,N, where N is even ?f j ? j/N fixed frequencies (cycles/unit time) (called Fourier or standard frequencies) ?A j ?s and B j ?s are random variables: ?E{A j } = E{B j } = 0 ?var{A j } = var{B j } = ? 2 j (now allowed to depend on j) ?cov{A j ,A k } = cov{B j ,B k } = 0 for j 6= k ?cov{A j ,B k } = 0 for all j,k I?25 The Spectrum: I ? properties of simple model (Exercise [1.1]): ?E{X t } = µ ?? 2 j ?s decompose population variance: ? 2 = E{(X t ?µ) 2 } = N/2 X j=1 ? 2 j ?? 2 j ?s determine acs: ? k = P N/2 j=1 ? 2 j cos(2?f j k) P N/2 j=1 ? 2 j ? define spectrum as S j ? ? 2 j , 1 ? j ? N/2 I?26 The Spectrum: II ? fundamental relationship: N/2 X j=1 S j = ? 2 ?decomposes ? 2 into components related to f j ?S j ?s equivalent to acs and ? 2 (Exercise [1.5]) ? easy to simulate x t ?s from simple model ? four examples of ?spectra versus f j ?acs?s versus k ?x t ?s versus t I?27 Theoretical Spectrum for Wind Speed Series 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.1 0.2 0.3 0.4 0.5 0.6 f j I?28 Theoretical and Sample ACSs for Wind Speed 0 5 10 15 20 25 30 ? 1.0 ? 0.5 0.0 0.5 1.0 k I?29 Actual and Simulated Wind Speed Series 0 32 64 96 128 I?30 Theoretical Spectrum for Atomic Clock Series 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.5 1.0 1.5 f j I?31 Theoretical and Sample ACSs for Atomic Clock 0 5 10 15 20 25 30 ? 1.0 ? 0.5 0.0 0.5 1.0 k I?32 Actual and Simulated Atomic Clock Series 0 32 64 96 128 I?33 Theoretical Spectrum for Willamette River Series 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.1 0.2 0.3 f j I?34 Theoretical and Sample ACSs for Willamette River 0 5 10 15 20 25 30 ? 1.0 ? 0.5 0.0 0.5 1.0 k I?35 Actual and Simulated Willamette River Series 0 32 64 96 128 I?36 Theoretical Spectrum for Ocean Noise Series 0.0 0.1 0.2 0.3 0.4 0.5 0.00 0.02 0.04 0.06 0.08 0.10 0.1 2 f j I?37 Theoretical and Sample ACSs for Ocean Noise 0 5 10 15 20 25 30 ? 1.0 ? 0.5 0.0 0.5 1.0 k I?38 Actual and Simulated Ocean Noise Series 0 32 64 96 128 I?39 Nonparametric Estimation of S j : I ? problem: estimate spectrum S j from X 1 ,...,X N ? mine out A j ?s & B j ?s since S j = var{A j } = var{B j } ? could use linear algebra (N knowns and N unknowns) ? can get A j ?s via discrete Fourier cosine transform since N X t=1 X t cos(2?f j t) = NA j 2 ? yields (for 1 ? j < N/2): A j = 2 N N X t=1 X t cos(2?f j t) I?40 Nonparametric Estimation of S j : II ? B j ?s from sine transform: B j = 2 N N X t=1 X t sin(2?f j t) ? since S j = var{A j } = var{B j }, can estimate S j using ? S j ? A 2 j + B 2 j 2 = 2 N 2 ? ? ? ? ? N X t=1 X t cos(2?f j t) ? ? 2 + ? ? N X t=1 X t sin(2?f j t) ? ? 2 ? ? ? ? examples: Figs. 20 and 21 I?41 Theoretical/Estimated Spectra for Wind Speed 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.2 0.4 0.6 0.8 f j I?42 Theoretical/Estimated Spectra for Atomic Clock 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 f j I?43 Theoretical/Estimated Spectra for Willamette River 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.1 0.2 0.3 f j I?44 Theoretical/Estimated Spectra for Ocean Noise 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.1 0.2 0.3 0.4 0.5 f j I?45 Nonparametric Estimation of S j : III ? points about ? S j ?uncorrelatednessof A j ?s andB j ?s implies ? S j ?s approximately uncorrelated (exact under Gaussian assumption) ?easy to test hypothesis using ? S j ?s (di?cult for sample acs) ? ? S j is ?2 degrees of freedom? estimate; if S j ?s slowly varying, can average ? S j ?s locally I?46 Parametric Estimation of S j : I ? assume S j ?s depend on small number of parameters ? simple model: S j (?,?) = ? 1 + ? 2 ?2?cos(2?f j ) (related to first-order autoregressive process) ? estimate S j ?s by estimating ?, ?: ? S j (??, ? ?) = ? ? 1 + ?? 2 ?2??cos(2?f j ) I?47 Parametric Estimation of S j : II ? can show that ? 1 ? ?, so let ?? = ?? 1 ? requiring N/2 X j=1 ? S j (??, ? ?) = 1 N N X t=1 (X t ? ¯ X) 2 ? ?? 2 yields estimator ? ? = ?? 2 ? ? N/2 X j=1 1 1 + ?? 2 ?2??cos(2?f j ) ? ? ?1 ? examples: ?theoretical? spectra for wind speed, atomic clock and ocean noise (doesn?t work well for Willamette River series, which points out need to be careful about parameterization) I?48 Parametric/Nonparametric Estimated Spectra for Wind Speed 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.2 0.4 0.6 0.8 f j I?49 Parametric/Nonparametric Estimated Spectra for Atomic Clock 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 f j I?50 Parametric/Nonparametric Estimated Spectra for Ocean Noise 0.0 0.1 0.2 0.3 0.4 0.5 0.0 0.1 0.2 0.3 0.4 0.5 f j I?51 ?Industrial Strength? Theory: I ? simple model not adequate in practice ?frequencies in model tied to sample size N ?time series treated as if it were ?circular?; i.e., X k ,X k+1 ,...,X N?1 ,X N ,X 1 ,X 2 ,...,X k?1 has same spectrum as X 1 ,X 2 ,...,X N . ? assume stationarity, which means that E{X t } = µ, var{X t } = ? 2 and cov{X t ,X t+k } = ? k ? 2 , I?52 ?Industrial Strength? Theory: II ? under stationarity, simple model extends to become X t = µ + Z 1/2 ?1/2 e i2?ft dZ(f) ? µ + X f [A(f)cos(2?ft) + B(f)sin(2?ft)], where dZ(f) yields A(f) and B(f), and we now use e i2?ft ? cos(2?ft) + isin(2?ft), i ? ? ?1 ? analogous to simple model, we use var{dZ(f)} = S(f)df to define a spectral density function S(f) I?53 ?Industrial Strength? Theory: III ? fundamental relationship now becomes Z 1/2 ?1/2 S(f)df = ? 2 ? S(f) and ? k ? 2 related via ? k ? 2 = Z 1/2 ?1/2 S(f)e i2?fk df and S(f) = ? 2 ? X k=?? ? k e ?i2?fk ? basic estimator of S(f) is periodogram: ? S (p) (f) ? 1 N Ø Ø Ø Ø Ø Ø N X t=1 (X t ?X)e ?i2?ft Ø Ø Ø Ø Ø Ø 2 , where X ? 1 N N X t=1 X t I?54 ?Industrial Strength? Theory: IV ? ideally it would be nice if 1. E{ ? S (p) (f)} = S(f) 2. var{ ? S (p) (f)} ? 0 as N ?? but, alas, 1. periodogram can be badly biased for finite N (can correct using data tapers) 2. var{ ? S (p) (f)} = S 2 (f) as N ??if 0 < f < 1 2 (can correct using smoothing windows) I?55 Uses of Spectral Analysis ? analysis of variance technique for time series ? some uses ?testing theories (e.g., wind data) ?exploratory data analysis (e.g., rainfall data) ?discriminating data (e.g., neonates) ?diagnostic tests (e.g., ARIMA modeling) ?assessing predictability (e.g., atomic clocks) ? applications ?tout le monde! I?56 ied ied 1-intro
Back
Next
About this note
By: Maziar Rezakhani
Created: 2009-03-05
File Size: 94 page(s)
Views: 1
Created: 2009-03-05
File Size: 94 page(s)
Views: 1
About StudyBlue
STUDYBLUE makes things that make you better at school.
Things like online flashcards with photos and audio.
Things like personalized quizzes and friendly reminders about when (and what) to study next.
Think of it as a digital backpack™: access to all of your study materials online and on your phone.
STUDYBLUE exists to make studying efficient and effective for every student, for free. Join us.
“Simply amazing. The flash cards are smooth, there are many different types of studying tools, and there is a great search engine. I praise you on the awesomeness.”
Dennis
Dennis