# Kernel change point detection
## Problem formulation
In this section, the kernel change point detection setting is briefly described.
The interested reader can refer to [Celisse2018, Arlot2019] for a more complete introduction.
Let \(y = \{y_0,y_1,\dots,y_{T-1}\}\) denote an \(\mathbb{R}^d\)-valued signal with \(T\) samples.
This signal is mapped onto a reproducing kernel Hilbert space (rkhs) \(\mathcal{H}\) associated with a user-defined kernel function \(k(\cdot, \cdot):\mathbb{R}^d\times\mathbb{R}^d\rightarrow\mathbb{R}\).
The mapping function \(\phi:\mathbb{R}^d\rightarrow\mathcal{H}\) onto this rkhs is implicitly defined by \(\phi(y_t) = k(y_t, \cdot)\in\mathcal{H}\), resulting in the following inner-product and norm:

$$\langle\phi(y_s) \mid \phi(y_t)\rangle_{\mathcal{H}} = k(y_s, y_t)$$

and

$$\|\phi(y_t)\|^2_{\mathcal{H}} = k(y_t, y_t)$$

for any samples \(y_s,y_t\in\mathbb{R}^d\).

Kernel change point detection consists in finding mean-shifts in the mapped signal \(\phi(y)\) by minimizing \(V(\cdot)\), where

$$V(t_1,\dots,t_K) := \sum_{k=0}^{K} \sum_{t=t_k}^{t_{k+1}-1} \left\| \phi(y_t) - \bar{\mu}_{t_k..t_{k+1}} \right\|^2_{\mathcal{H}}$$

where \(\bar{\mu}_{t_k..t_{k+1}}\) is the empirical mean of the sub-signal \(\phi(y_{t_k}), \phi(y_{t_k+1}),\dots,\phi(y_{t_{k+1}-1})\), and \(t_1,t_2,\dots,t_K\) are change point indexes, in increasing order. (By convention \(t_0=0\) and \(t_{K+1}=T\).)
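Because \(V(\cdot)\) only involves inner products in \(\mathcal{H}\), it can be evaluated from the Gram matrix \([k(y_s, y_t)]_{s,t}\) alone: for a segment of length \(n\), the within-segment scatter equals \(\sum_t k(y_t,y_t) - \tfrac{1}{n}\sum_{s,t} k(y_s,y_t)\). The snippet below is an illustrative NumPy sketch of this computation (not the library's implementation); the Gaussian kernel, the value of \(\gamma\) and the toy signal are arbitrary choices.

```python
import numpy as np


def segment_cost(gram: np.ndarray) -> float:
    """Within-segment scatter sum_t ||phi(y_t) - mean||^2, from the Gram matrix only."""
    n = gram.shape[0]
    return np.trace(gram) - gram.sum() / n


def total_cost(gram: np.ndarray, bkps: list) -> float:
    """V(t_1, ..., t_K) for change point indexes bkps (the last element is T)."""
    cost, start = 0.0, 0
    for end in bkps:
        cost += segment_cost(gram[start:end, start:end])
        start = end
    return cost


# toy signal with a single mean-shift at t=50
rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(0, 1, (50, 3)), rng.normal(4, 1, (50, 3))])

# Gaussian Gram matrix (gamma chosen arbitrarily)
gamma = 0.1
sq_dists = ((signal[:, None, :] - signal[None, :, :]) ** 2).sum(axis=-1)
gram = np.exp(-gamma * sq_dists)

# the true segmentation should have the lower cost
print(total_cost(gram, [50, 100]), total_cost(gram, [30, 100]))
```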
If the number of changes is known beforehand, we solve the following optimization problem, over all possible change positions \(t_1<t_2<\dots<t_K\) (where the number \(K\) of changes is provided by the user):

$$\hat{t}_1,\hat{t}_2,\dots,\hat{t}_K := \arg\min_{t_1<t_2<\dots<t_K} V(t_1,t_2,\dots,t_K).$$
The exact optimization procedure is described in [Celisse2018].
If the number of changes is not known, we solve the following penalized optimization problem:

$$\hat{t}_1,\hat{t}_2,\dots,\hat{t}_{\hat{K}} := \arg\min_{K,\, t_1<t_2<\dots<t_K} \left[ V(t_1,t_2,\dots,t_K) + \beta K \right]$$

where \(\beta>0\) is the smoothing parameter (provided by the user) and \(\hat{K}\) is the estimated number of change points. Higher values of \(\beta\) produce a smaller \(\hat{K}\). The exact optimization procedure is described in [Killick2012].
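In practice, both formulations are handled by the same estimator (the KernelCPD class presented below). A minimal sketch, where the simulated signal, the number of changes and the penalty level are arbitrary placeholders:

```python
import ruptures as rpt

# simulated piecewise-constant signal (500 samples, 2 dimensions, 3 change points);
# all values below are placeholders
signal, true_bkps = rpt.pw_constant(500, 2, n_bkps=3, noise_std=2)

algo = rpt.KernelCPD(kernel="rbf").fit(signal)

# known number of changes: dynamic programming with K = 3
bkps_fixed_k = algo.predict(n_bkps=3)

# unknown number of changes: penalized (PELT) formulation with beta = 10
bkps_penalized = algo.predict(pen=10)

print(true_bkps, bkps_fixed_k, bkps_penalized)
```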
## Available kernels
We list below a number of kernels that are already implemented in ruptures.
In the following, \(u\) and \(v\) are two d-dimensional vectors and \(\|\cdot\|\) is the Euclidean norm.
| Kernel | Description | Cost function |
|---|---|---|
| Linear, `model="linear"` | \(k_{\text{linear}}(u, v) = u^T v\) | `CostL2` |
| Gaussian, `model="rbf"` | \(k_{\text{Gaussian}}(u,v)=\exp(-\gamma \|u-v\|^2)\) where \(\gamma>0\) is a user-defined parameter | `CostRbf` |
| Cosine, `model="cosine"` | \(k_{\text{cosine}}(u, v) = (u^T v)/(\|u\|\|v\|)\) | `CostCosine` |
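Kernel parameters can be supplied alongside the kernel choice. As a sketch, assuming the installed version accepts a `params` dictionary (here the bandwidth `gamma` of the Gaussian kernel; `1e-2` is an arbitrary value) in the `KernelCPD` class presented in the next section:

```python
import ruptures as rpt

# kernel choice through the `kernel` argument of KernelCPD
algo_linear = rpt.KernelCPD(kernel="linear")    # uses CostL2
algo_cosine = rpt.KernelCPD(kernel="cosine")    # uses CostCosine

# kernel parameters are passed through a `params` dictionary
# (bandwidth gamma of the Gaussian kernel; 1e-2 is an arbitrary value)
algo_rbf = rpt.KernelCPD(kernel="rbf", params={"gamma": 1e-2})  # uses CostRbf
```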
## Implementation and usage
Kernel change point detection is implemented in the class `KernelCPD`, which is a C implementation of dynamic programming and PELT.
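The kernel costs listed above are also available through the pure Python estimators (for instance `Dynp` for a fixed number of changes), which are slower but should return the same segmentations when the kernel parameters coincide. A sketch, with an arbitrary toy signal and bandwidth:

```python
import ruptures as rpt

signal, _ = rpt.pw_constant(200, 2, n_bkps=2, noise_std=1)
gamma = 1e-1  # arbitrary bandwidth, fixed so that both searches use the same cost

# fast C implementation (dynamic programming, since n_bkps is given)
fast_bkps = (
    rpt.KernelCPD(kernel="rbf", params={"gamma": gamma}).fit(signal).predict(n_bkps=2)
)

# pure Python dynamic programming with the same rbf cost (jump=1 for an exact search)
slow_bkps = (
    rpt.Dynp(model="rbf", params={"gamma": gamma}, min_size=2, jump=1)
    .fit(signal)
    .predict(n_bkps=2)
)

print(fast_bkps, slow_bkps)  # both lists of breakpoints should coincide
```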
To see it in action, please look at the gallery of examples.
The exact class API is available here.
## References
[Gretton2012] Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., & Smola, A. (2012). A kernel two-sample test. The Journal of Machine Learning Research, 13, 723–773.
[Killick2012] Killick, R., Fearnhead, P., & Eckley, I. (2012). Optimal detection of changepoints with a linear computational cost. Journal of the American Statistical Association, 107(500), 1590–1598.
[Celisse2018] Celisse, A., Marot, G., Pierre-Jean, M., & Rigaill, G. (2018). New efficient algorithms for multiple change-point detection with reproducing kernels. Computational Statistics and Data Analysis, 128, 200–220.
[Arlot2019] Arlot, S., Celisse, A., & Harchaoui, Z. (2019). A kernel multiple change-point algorithm via model selection. Journal of Machine Learning Research, 20(162), 1–56.