Functional Labeled Optimal Partitioning

10/05/2022
by Toby D. Hocking, et al.

Peak detection is a problem in sequential data analysis that involves differentiating regions with higher counts (peaks) from regions with lower counts (background noise). It is crucial to correctly predict the areas that deviate from background noise, in both the train and test sets of labels. Dynamic programming changepoint algorithms have been proposed to solve the peak detection problem by constraining the mean to alternately increase and then decrease. Current constrained changepoint algorithms only make predictions on the test set and completely ignore the train set. Changepoint algorithms that both fit the train set accurately and make predictions on the test set have been proposed, but not in the context of peak detection models. We propose to resolve these issues with a new dynamic programming algorithm, FLOPART, which has zero train label errors and provides highly accurate predictions on the test set. We provide an empirical analysis showing that FLOPART has time complexity similar to that of existing algorithms while being more accurate in terms of train and test label errors.
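To make the notion of a "label error" concrete, the following is a minimal sketch of how predicted peaks could be scored against a set of labels. It is not taken from the paper: the two label kinds ("peak" and "background"), the half-open interval convention, and the function names are all assumptions chosen for illustration, and the paper may define its labels differently.

```python
from typing import List, Tuple

# Assumption: intervals are half-open [start, end) on the data sequence.
Interval = Tuple[int, int]
# Assumption: a label is (start, end, kind) with kind "peak" or "background".
Label = Tuple[int, int, str]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True when two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def count_label_errors(peaks: List[Interval], labels: List[Label]) -> int:
    """Count labels that the predicted peaks disagree with.

    A "peak" label is an error when no predicted peak overlaps it
    (false negative). A "background" label is an error when any
    predicted peak overlaps it (false positive).
    """
    errors = 0
    for start, end, kind in labels:
        if kind not in ("peak", "background"):
            raise ValueError(f"unknown label kind: {kind!r}")
        hit = any(overlaps((start, end), p) for p in peaks)
        if kind == "peak" and not hit:
            errors += 1
        elif kind == "background" and hit:
            errors += 1
    return errors


if __name__ == "__main__":
    predicted = [(10, 20), (40, 50)]
    train_labels = [
        (12, 18, "peak"),        # covered by the first predicted peak
        (25, 35, "background"),  # no predicted peak here
        (45, 60, "background"),  # second predicted peak intrudes
        (70, 80, "peak"),        # no predicted peak here
    ]
    print(count_label_errors(predicted, train_labels))
```

Under this simplified scoring, a model with zero train label errors is one for which `count_label_errors` returns 0 on the train labels; the same function applied to held-out labels gives the test label errors.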

research
06/24/2020

Labeled Optimal Partitioning

In data sequences measured over space or time, an important problem is a...
research
09/29/2018

Generalized Functional Pruning Optimal Partitioning (GFPOP) for Constrained Changepoint Detection in Genomic Data

We describe a new algorithm and R package for peak detection in genomic ...
research
10/28/2020

Test Set Optimization by Machine Learning Algorithms

Diagnosis results are highly dependent on the volume of test set. To der...
research
03/20/2019

A Novel Dynamic Programming Approach to the Train Marshalling Problem

Train marshalling is the process of reordering the railcars of a train i...
research
03/05/2020

Linear time dynamic programming for the exact path of optimal models selected from a finite set

Many learning algorithms are formulated in terms of finding model parame...
research
03/09/2017

A log-linear time algorithm for constrained changepoint detection

Changepoint detection is a central problem in time series and genomic da...
research
06/03/2015

PeakSegJoint: fast supervised peak detection via joint segmentation of multiple count data samples

Joint peak detection is a central problem when comparing samples in geno...