Adaptive Sequential Bayesian Change Point Detection

Ryan Turner, Yunus Saatci, and Carl Edward Rasmussen


Pruning: The total run time of a naive implementation is $O(T^2)$. In practice the run length distribution is highly peaked, so we can prune out run lengths with low probability. The modified algorithm runs in $O(T)$, where the constant factor depends on the pruning threshold.

Modularity: Any hazard function $H(t) \in [0, 1]$ can be plugged in, and any model that provides a posterior predictive can be used. We have implemented BOCPD modules for changing Gaussian process regression, Bayesian linear regression, and kernel density estimation.

Caching: Predictions under given run lengths are made repeatedly. Predictive modules for $p(x_t \mid r_{t-1}, x_t^{(r)})$ can usually be sped up using intelligent caching.
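As a rough illustration of the pruning step, the sketch below drops run lengths whose posterior probability falls under a threshold and renormalizes what is left. The function name `prune_run_lengths`, the threshold value `1e-4`, and the toy distribution are assumptions made for this example; the poster does not state the threshold or how the released MATLAB code prunes.

```python
def prune_run_lengths(run_lengths, probs, threshold=1e-4):
    """Keep run lengths with probability >= threshold, then renormalize.

    `threshold` is a hypothetical default, not a value from the poster.
    """
    kept = [(r, p) for r, p in zip(run_lengths, probs) if p >= threshold]
    if not kept:
        # Never return an empty distribution: fall back to the single
        # most probable run length.
        kept = [max(zip(run_lengths, probs), key=lambda rp: rp[1])]
    total = sum(p for _, p in kept)
    return [r for r, _ in kept], [p / total for _, p in kept]


# Toy, sharply peaked run length distribution over r = 0..5 (made-up numbers).
run_lengths = [0, 1, 2, 3, 4, 5]
probs = [0.02, 0.00001, 0.00004, 0.9, 0.07995, 0.0]
r_kept, p_kept = prune_run_lengths(run_lengths, probs)
print(r_kept, p_kept)
```

Because the number of surviving run lengths stays roughly constant when the distribution is peaked, each time step costs a bounded amount of work, which is where the linear overall run time comes from.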

• Treat change points as latent variables handled in a coherently Bayesian fashion
• Closed form and online inference algorithm
• Learn hyper-parameters efficiently from data
• Highly modular framework
• MATLAB code made publicly available at [1]

3 Improving BOCPD

1 Introduction

• Many Bayesian change point approaches are retrospective, while many applications demand online behavior

• Bayesian online change point detection (BOCPD) was introduced by [2]

4 Results

Well Log Data: We used the logistic hazard, $H(t) = h\,\sigma(at + b)$, and an IID Gaussian UPM, with the aim of detecting changes in mean and variance. After learning the parameters, our method has a better predictive likelihood than [2].
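The logistic hazard above can be written down directly. The sketch below assumes that $\sigma$ is the standard logistic sigmoid and that the argument of $H$ is the current run length; the parameter values are placeholders chosen for illustration, not the learned values from the experiments.

```python
import math


def logistic_hazard(t, h, a, b):
    """H(t) = h * sigmoid(a * t + b), with h in [0, 1] so that H(t) is in [0, 1]."""
    z = a * t + b
    # Numerically stable logistic sigmoid: never exponentiates a large
    # positive number.
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        s = e / (1.0 + e)
    return h * s


# Hypothetical parameters: the change point probability rises with t
# and saturates at h.
rates = [logistic_hazard(t, h=0.02, a=0.01, b=-3.0) for t in (0, 100, 1000)]
print(rates)
```

Setting `a = 0` recovers a constant hazard, so this family contains the memoryless (geometric) prior on run lengths as a special case.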

[Figure 2 plot. Axes: Run Length (trading days) against Date (years), 1999 to 2008. Event labels: Northern Rock bank run, Lehman collapse, US presidential election, Major rate cut.]

Figure 2: The BOCPD run length distribution between 1998 and 2008. Many events of market impact create change points. Some of the other change points correspond to minor rallies or rate changes but not to a historical event.

Figure 3: A summary of comparing the negative log predictive likelihoods (NLL) (nats/observation) on test data. We also include the 95% error bars on the NLL and the p-value that the joint model/learned hypers has a higher NLL using a one sided t-test. A reference method, the time independent model (TIM), treats the data as iid, normal for the well log and t for industry data. The TIM parameters are fit to the training set. Well Log: The learned hyper-parameter method was trained using the first 1000 points and tested on 3050 points. Industry: We test on the last 8455 points of the portfolio data, 3 July 1975–31 December 2008. The methods were trained using the first 3000 points, 1 July 1963–2 July 1975. We compare running BOCPD independently on all 30 time series and using one joint BOCPD.

[Figure 1 plot. Panels: NMR and Run Length, both against Measurements.]

• Define run length $r_t$ as time since last change point at time $t$
• Goal is to calculate $p(r_t \mid x_{1:t})$ from observations $x_{1:t}$
• Two components are underlying predictive model (UPM), $p(x_t \mid x_{(t-\tau):t}, \theta_m)$, and change point hazard function, $H(r \mid \theta_h)$
• BOCPD sensitive to hyper-parameters, but we learn them from data

2 The BOCPD algorithm

$$
p(x_{t+1} \mid x_{1:t}) = \sum_{r_t} p(x_{t+1} \mid x_{1:t}, r_t)\, p(r_t \mid x_{1:t}) = \sum_{r_t} p(x_{t+1} \mid x_t^{(r)})\, p(r_t \mid x_{1:t}), \tag{1}
$$

Consider,

$$
\gamma_t := p(r_t, x_{1:t}) = \sum_{r_{t-1}} p(r_t, r_{t-1}, x_{1:t}) = \sum_{r_{t-1}} p(r_t, x_t \mid r_{t-1}, x_{1:t-1})\, p(r_{t-1}, x_{1:t-1})
$$

$$
= \sum_{r_{t-1}} \underbrace{p(r_t \mid r_{t-1})}_{\text{hazard}}\; \underbrace{p(x_t \mid r_{t-1}, x_t^{(r)})}_{\text{likelihood (UPM)}}\; \underbrace{p(r_{t-1}, x_{1:t-1})}_{\gamma_{t-1}}. \tag{2}
$$

Defines forward message passing scheme. Learn the parameters by maximizing the marginal likelihood

$$
\log p(x_{1:T} \mid \theta) = \sum_{t=1}^{T} \log p(x_t \mid x_{1:t-1}, \theta). \tag{3}
$$

Using the derivatives of the UPM, $\frac{\partial}{\partial \theta_m} p(x_t \mid r_{t-1}, x_t^{(r)}, \theta_m)$, and those of the hazard function, $\frac{\partial}{\partial \theta_h} p(r_t \mid r_{t-1}, \theta_h)$, the derivatives of the one-step ahead predictors can be propagated forward.
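The forward recursion in equation (2) and the marginal likelihood in equation (3) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the released MATLAB implementation: the UPM here is a Gaussian with known noise variance and a conjugate normal prior on the mean (so it only tracks changes in mean), the hazard is passed in as a function of the previous run length, and the names `bocpd`, `mu0`, `var0`, and `noise_var` are hypothetical. The messages are renormalized at every step, so the stored vector is $p(r_t \mid x_{1:t})$ and the normalizer is the one-step predictive $p(x_t \mid x_{1:t-1})$.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def bocpd(data, hazard, mu0=0.0, var0=100.0, noise_var=1.0):
    """Return (p(r_T | x_{1:T}) as a list indexed by run length, log p(x_{1:T}))."""
    run_probs = [1.0]                  # p(r_0 = 0) = 1
    means, variances = [mu0], [var0]   # posterior over the mean, per run length
    log_ml = 0.0
    for x in data:
        # UPM predictive p(x_t | r_{t-1}, x_t^{(r)}) for every previous run length.
        pred = [gaussian_pdf(x, m, v + noise_var) for m, v in zip(means, variances)]
        weighted = [p * q for p, q in zip(run_probs, pred)]
        haz = [hazard(r) for r in range(len(run_probs))]
        # Equation (2): either a change point occurs (r_t = 0) or the run grows.
        change = sum(w * h for w, h in zip(weighted, haz))
        growth = [w * (1.0 - h) for w, h in zip(weighted, haz)]
        joint = [change] + growth
        evidence = sum(joint)          # p(x_t | x_{1:t-1}), as in equation (1)
        log_ml += math.log(evidence)   # running sum from equation (3)
        run_probs = [j / evidence for j in joint]
        # Conjugate update of the mean posterior; run length 0 restarts at the prior.
        new_means, new_vars = [mu0], [var0]
        for m, v in zip(means, variances):
            post_var = 1.0 / (1.0 / v + 1.0 / noise_var)
            new_vars.append(post_var)
            new_means.append(post_var * (m / v + x / noise_var))
        means, variances = new_means, new_vars
    return run_probs, log_ml


# Synthetic series (not data from the poster): 40 points near 0, then 25 near 5.
data = [0.1 * math.sin(i) for i in range(40)] + [5.0 + 0.1 * math.sin(i) for i in range(25)]
run_probs, log_ml = bocpd(data, hazard=lambda r: 0.01)
map_run_length = max(range(len(run_probs)), key=run_probs.__getitem__)
print(map_run_length, log_ml)
```

The per-run-length posterior parameters are updated incrementally rather than refit from scratch, which is one simple form of the caching discussed under Improving BOCPD. Maximizing `log_ml` over the hazard and UPM parameters corresponds to the hyper-parameter learning step; the gradient propagation described above is not shown here.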

Abstract

[Figure 2 plot event labels: Dot-com bubble burst; September 11; Asia crisis, Dot-com bubble.]

Figure 1: The BOCPD run length distribution on the well log data. The color represents the CDF of the run length distribution, while the red line represents the median of the distribution. Areas of a quick transition from black (CDF of zero) to white (CDF of one) indicate a sharply peaked run length distribution.

Industry Portfolio Data: We tried the “30 industry portfolios” data set [3]. The change points found coincide with significant events: the climax of the Internet bubble, the burst of the Internet bubble, and the 2004 presidential election.

Well Log
Method            NLL      error bars   p-value
TIM               1.53     0.0449       <1e-10
fixed hypers      0.313    0.0267       6e-4
learned hypers    0.247    0.0293       NA

Industry Portfolios
Method            NLL      error bars   p-value
TIM               42.6     0.246        <1e-10
indep.            39.64    0.217        0.271
joint             39.54    0.213        NA
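For readers who want to produce a comparable summary, the sketch below shows one way to compute a mean NLL with 95% error bars and a one-sided p-value from per-observation NLLs. Several details are assumptions, since the poster does not spell them out: the error bar is taken as 1.96 standard errors, the test is treated as paired across observations, and the p-value uses a large-sample normal approximation to the t distribution. The input numbers are synthetic and unrelated to the table above.

```python
import math
from statistics import mean, stdev


def nll_summary(nll):
    """Mean NLL (nats/observation) and half-width of a 95% error bar."""
    half_width = 1.96 * stdev(nll) / math.sqrt(len(nll))
    return mean(nll), half_width


def one_sided_p_value(nll_best, nll_other):
    """p-value for the hypothesis that the best method has a higher mean NLL
    than the other method, using a paired test with a normal approximation."""
    diffs = [o - b for b, o in zip(nll_best, nll_other)]
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    # Upper tail of a standard normal evaluated at t_stat.
    return 0.5 * math.erfc(t_stat / math.sqrt(2.0))


# Synthetic per-observation NLLs for two hypothetical methods.
nll_ref = [1.0 + 0.1 * math.sin(i) for i in range(200)]
nll_best = [0.5 + 0.1 * math.cos(i) for i in range(200)]
m, hw = nll_summary(nll_ref)
p = one_sided_p_value(nll_best, nll_ref)
print(m, hw, p)
```

A small p-value here means it is unlikely that the best method is in fact worse than the comparison method, which is how the p-value column in the table reads.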

References
[1] http://mlg.eng.cam.ac.uk/rdturner/bocpd/
[2] R. P. Adams and D. J. C. MacKay, “Bayesian online changepoint detection,” Technical Report, University of Cambridge, Cambridge, UK, 2007. arXiv:0710.3742v1 [stat.ML].
[3] http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html
