Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

A Robust Symmetric Nonnegative Matrix Factorization Framework for Clustering Multiple Heterogeneous Microbiome Data

Version 1 : Received: 17 April 2017 / Approved: 18 April 2017 / Online: 18 April 2017 (03:31:04 CEST)

How to cite: Ma, Y.; Hu, X.; He, T.; Jiang, X. A Robust Symmetric Nonnegative Matrix Factorization Framework for Clustering Multiple Heterogeneous Microbiome Data. Preprints 2017, 2017040105. https://doi.org/10.20944/preprints201704.0105.v1

Abstract

Integrating multi-view datasets, which comprise heterogeneous sources or different representations, in order to understand the subtle and complex relationships in data is challenging. Data integration methods attempt to efficiently combine the complementary information of multiple data types to construct a comprehensive view of the underlying data. Nonnegative matrix factorization (NMF), an approach that can be used for signal compression and noise reduction, has attracted widespread attention in the last two decades. The Kullback–Leibler divergence (or relative entropy) can be used as the loss function of NMF. In this article, we propose a fast and robust framework (RSNMF) based on symmetric nonnegative matrix factorization (SNMF) and similarity network fusion (SNF) for clustering human microbiome data, including functional, metabolic, and phylogenetic profiles. Many existing methods use all the information provided by each view to create a consensus representation, which often suffers from noise in the data and cannot provide a precise representation of the latent data structures. In contrast, RSNMF combines the strengths of SNMF and SNF to form a robust clustering indicator matrix and can thus reduce the influence of noise. We conduct experiments on one synthetic and two real datasets (microbiome data, text data), and the results show that the proposed RSNMF performs better than the baseline and state-of-the-art methods, which demonstrates the potential of RSNMF for microbiome data analysis.
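To illustrate the kind of factorization the abstract refers to, the following is a minimal sketch of plain symmetric NMF, not the RSNMF framework itself. It assumes a Frobenius-norm objective, min ||A - H H^T||^2 with H >= 0, and a commonly used damped multiplicative update. The function names `symmetric_nmf` and `cluster_labels`, the damping parameter `beta`, and the toy similarity matrix are all illustrative assumptions and do not come from the article.

```python
import numpy as np


def symmetric_nmf(A, k, n_iter=1000, beta=0.5, seed=0, eps=1e-12):
    """Approximate a nonnegative symmetric matrix A by H @ H.T with H >= 0.

    Uses a damped multiplicative update (an assumption of this sketch,
    not necessarily the update used by the article's authors).
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Strictly positive random start so multiplicative updates can move H.
    H = rng.random((n, k)) + 0.1
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps  # eps guards against division by zero
        H *= (1.0 - beta) + beta * numer / denom
    return H


def cluster_labels(H):
    """Read H as a clustering indicator matrix: the largest entry in a row wins."""
    return H.argmax(axis=1)


# Toy similarity matrix (hypothetical data): two groups of three samples,
# strong similarity inside a group and weak similarity between groups.
A = np.full((6, 6), 0.05)
A[:3, :3] = 1.0
A[3:, 3:] = 1.0

H = symmetric_nmf(A, k=2)
labels = cluster_labels(H)
print(labels)
```

In a multi-view setting, one would presumably pass a fused similarity matrix (for example, one produced by SNF) as `A` rather than this hand-built toy matrix; that step is not shown here.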

Keywords

symmetric nonnegative matrix factorization; similarity network fusion; human microbiome; multi-view clustering

Subject

Computer Science and Mathematics, Mathematics
