An Extensible, Scalable Spark Platform for Alignment-free Genomic Analysis – Version 1

05/02/2020
by Umberto Ferraro Petrillo, et al.

Alignment-free similarity/distance functions are a computationally convenient alternative to alignment-based methods for Computational Biology tasks such as classification and taxonomy. Computing them at scale is a largely ignored Big Data problem, which limits their potentially vast impact. We provide the first user-friendly, extensible and scalable Spark platform for their computation, including (a) statistical significance tests of their output and (b) useful novel indications about their day-to-day use. Our contribution addresses an acute need in alignment-free sequence analysis.
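
To make the notion of an alignment-free distance concrete, the following minimal sketch compares two sequences through their k-mer counts, with no alignment step. It is a single-machine illustration only and is not the platform's API: the abstract does not name the functions the platform implements, so the choice of the squared Euclidean distance and the helper names `kmer_counts` and `squared_euclidean` are assumptions made for this example.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer (substring of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def squared_euclidean(a: str, b: str, k: int) -> int:
    """Squared Euclidean distance between the k-mer count vectors of a and b.

    Hypothetical helper for illustration; a k-mer absent from one
    sequence contributes a count of 0 for that sequence.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum((ca[w] - cb[w]) ** 2 for w in ca.keys() | cb.keys())


if __name__ == "__main__":
    # Identical sequences have distance 0; differing ones a positive value.
    print(squared_euclidean("ACGT", "ACGT", 2))
    print(squared_euclidean("ACGT", "ACGA", 2))
```

Because each sequence is reduced to a count vector independently of the others, the counting step is the kind of work that can be distributed across many machines, which is where a framework such as Spark becomes relevant.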
