Ranking top-k trees in tree-based phylogenetic networks
'Tree-based' phylogenetic networks proposed by Francis and Steel have attracted much attention of theoretical biologists in the last few years. At the heart of the definitions of tree-based phylogenetic networks is the notion of 'support trees', about which there are numerous algorithmic problems that are important for evolutionary data analysis. Recently, Hayamizu (arXiv:1811.05849 [math.CO]) proved a structure theorem for tree-based phylogenetic networks and obtained linear-time and linear-delay algorithms for many basic problems on support trees, such as counting, optimisation, and enumeration. In the present paper, we consider the following fundamental problem in statistical data analysis: given a tree-based phylogenetic network N whose arcs are associated with probability, create the top-k support tree ranking for N by their likelihood values. We provide a linear-delay (and hence optimal) algorithm for the problem and thus reveal the interesting property of tree-based phylogenetic networks that ranking top-k support trees is as computationally easy as picking k arbitrary support trees.
READ FULL TEXT