EXCLUVIS: A MATLAB GUI Software for Comparative Study of Clustering and Visualization of Gene Expression Data

08/18/2020
by   Sudip Poddar, et al.
0

Clustering is a popular data mining technique that aims to partition an input space into multiple homogeneous regions. There exist several clustering algorithms in the literature. The performance of a clustering algorithm depends on its input parameters which can substantially affect the behavior of the algorithm. Cluster validity indices determine the partitioning that best fits the underlying data. In bioinformatics, microarray gene expression technology has made it possible to measure the gene expression levels of thousands of genes simultaneously. Many genomic studies, which aim to analyze the functions of some genes, highly rely on some clustering technique for grouping similarly expressed genes in one cluster or partitioning tissue samples based on similar expression values of genes. In this work, an application package called EXCLUVIS (gene EXpression data CLUstering and VISualization) has been developed using MATLAB Graphical User Interface (GUI) environment for analyzing the performances of different clustering algorithms on gene expression datasets. In this application package, the user needs to select a number of parameters such as internal validity indices, external validity indices and number of clusters from the active windows for evaluating the performance of the clustering algorithms. EXCLUVIS compares the performances of K-means, fuzzy C-means, hierarchical clustering and multiobjective evolutionary clustering algorithms. Heatmap and cluster profile plots are used for visualizing the results. EXCLUVIS allows the users to easily find the goodness of clustering solutions as well as provides visual representations of the clustering outcomes.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset