import matplotlib.pyplot as plt # for display purposes
import ruptures as rpt # our package
Generate and display the signal#
Let us generate a 3-dimensional piecewise constant signal with Gaussian noise.
n_samples, n_dims, sigma = 1000, 3, 2
n_bkps = 4 # number of breakpoints
signal, bkps = rpt.pw_constant(n_samples, n_dims, n_bkps, noise_std=sigma)
The true change points of this synthetic signal are available in the bkps
variable.
print(bkps)
Note that the first four element are change point indexes while the last is simply the number of samples.
(This is a technical convention so that functions in ruptures
always know the length of the signal at hand.)
It is also possible to plot our \(\mathbb{R}^3\)-valued signal along with the true change points with the rpt.display
function.
In the following image, the color changes whenever the mean of the signal shifts.
fig, ax_array = rpt.display(signal, bkps)
Change point detection#
We can now perform change point detection, meaning that we find the indexes where the signal mean changes.
To that end, we minimize the sum of squared errors when approximating the signal by a piecewise constant signal.
Formally, for a signal \(y_0,y_1,\dots,y_{T-1}\) (\(T\) samples), we solve the following optimization problem, over all possible change positions \(t_1<t_2<\dots<t_k\) $\(="" \(\bar{y}_{t_k..t_{k+1}}\)="" \(k\)="" \(t_0="0\)" \(t_{k+1}="T\).)" (y_{t_k},="" (by="" (more="" (where="" :="\sum_{k=0}^K\sum_{t=t_k}^{t_{k+1}-1}" <="" [dynp
](..="" [user="" [what="" \hat{t}_1,="" \hat{t}_2,\dots,\hat{t}_k="\arg\min_{t_1,\dots,t_K}" \|y_t-\bar{y}_{t_k..t_{k+1}}\|^2="" and="" by="" change="" changes="" class.="" convention="" defined="" detection="" detection?](="" div="" dynamic="" dynp.md)="" empirical="" guide](="" in="" information="" is="" mean="" number="" of="" optimization="" point="" programming,="" section="" solved="" sub-signal="" the="" this="" user):="" user-guide="" user-guide).)="" using="" v(t_1,t_2,\dots,t_k)="" what-is-cpd)="" where="" with="" y_{t_k+1},\dots,y_{t_{k+1}-1}\).="">
</t_2<\dots<t_k\)>
# detection
algo = rpt.Dynp(model="l2").fit(signal)
result = algo.predict(n_bkps=4)
print(result)
Again the first elements are change point indexes and the last is the number of samples.
Display the results#
To visualy compare the true segmentation (bkps
) and the estimated one (result
), we can resort to rpt.display
a second time.
In the following image, the alternating colors indicate the true breakpoints and the dashed vertical lines, the estimated breakpoints.
# display
rpt.display(signal, bkps, result)
plt.show()
In this simple example, both are quite similar and almost undistinguishable.