Preprint Article, Version 1. This version is not peer-reviewed.

Assessment of Drugs Toxicity and Associated Biomarker Genes Using Hierarchical Clustering

Version 1: Received: 30 June 2019 / Approved: 3 July 2019 / Online: 3 July 2019 (07:44:28 CEST)

How to cite: Hasan, M.N.; Malek, M.B.; Begum, A.A.; Rahman, M.; Mollah, M.N.H. Assessment of Drugs Toxicity and Associated Biomarker Genes Using Hierarchical Clustering. Preprints 2019, 2019070047 (doi: 10.20944/preprints201907.0047.v1).

Abstract

Assessing drug toxicity and identifying the associated biomarker genes is one of the most important tasks in the pre-clinical phase of the drug development pipeline, as well as in toxicogenomic studies. Only a few statistical methods exist for assessing the toxicity of doses of drugs (DDs) and their associated biomarker genes, and these methods are computationally expensive because they estimate the model parameters with iterative approaches based on the EM (Expectation-Maximization) algorithm. To overcome this problem, this paper proposes an alternative approach based on hierarchical clustering (HC). There are several types of HC approaches, and their performance depends on the similarity/distance measure used. We therefore explored combinations of distance measures and HC methods on Japanese Toxicogenomics Project (TGP) datasets to find those that give better clustering/co-clustering of DDs and genes and that detect toxic DDs and their associated biomarker genes. Among the combinations explored, Ward's HC method with each of the Euclidean, Manhattan and Minkowski distance measures produced the better clustering/co-clustering results. For example, for the glutathione metabolism pathway (GMP) dataset, our proposed co-clustering algorithm with the distance and HC method combination Euclidean: Ward identified LOC100359539/Rrm2, Gpx6, RGD1562107, Gstm4, Gstm3, G6pd, Gsta5, Gclc, Mgst2, Gsr, Gpx2, Gclm, Gstp1, LOC100912604/Srm, Gstm4, Odc1, Gsr, Gss as the biomarker genes and Acetaminophen_Middle, Acetaminophen_High, Methapyrilene_High, Nitrofurazone_High, Nitrofurazone_Middle, Isoniazid_Middle, Isoniazid_High as their regulatory (associated) DDs.
Similarly, for the PPAR signaling pathway (PPAR-SP) dataset, Cpt1a, Cyp8b1, Cyp4a3, Ehhadh, Plin5, Plin2, Fabp3, Me1, Fabp5, LOC100910385, Cpt2, Acaa1a, Cyp4a1, LOC100365047, Cpt1a, LOC100365047, Angptl4, Aqp7, Cpt1c, Cpt1b, Me1 are the biomarker genes and Aspirin_Low, Aspirin_Middle, Aspirin_High, Benzbromarone_Middle, Benzbromarone_High, Clofibrate_Middle, Clofibrate_High, WY14643_Low, WY14643_High, WY14643_Middle, Gemfibrozil_Middle, Gemfibrozil_High are their regulatory DDs. These results are validated by the available literature and functional annotation.
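
The general idea of co-clustering genes and DDs with Ward's method under different distance measures can be sketched in Python with SciPy. This is only an illustration under stated assumptions, not the authors' algorithm: the fold-change matrix is synthetic, the function names `ward_clusters` and `co_cluster`, the two-cluster cut, and the Minkowski order p = 3 are all hypothetical choices, and the abstract does not say which software the authors used. Note also that SciPy documents Ward linkage as correctly defined only for Euclidean distances, so the Manhattan and Minkowski runs below simply apply the Ward update to a precomputed distance vector.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def ward_clusters(matrix, metric="euclidean", n_clusters=2, **metric_kwargs):
    """Cluster the rows of `matrix` with Ward linkage and cut into n_clusters groups."""
    # Pairwise distances between rows, as a condensed distance vector.
    condensed = pdist(matrix, metric=metric, **metric_kwargs)
    # Ward linkage applied to the precomputed distances.
    tree = linkage(condensed, method="ward")
    # Cut the dendrogram so that at most n_clusters flat clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def co_cluster(fold_change, metric="euclidean", n_clusters=2, **metric_kwargs):
    """Cluster genes (rows) and DDs (columns) of a fold-change matrix separately."""
    gene_labels = ward_clusters(fold_change, metric, n_clusters, **metric_kwargs)
    dd_labels = ward_clusters(fold_change.T, metric, n_clusters, **metric_kwargs)
    return gene_labels, dd_labels


# Synthetic data (assumption): 12 genes x 8 DDs of near-zero fold changes,
# with a planted block where the first 5 genes respond to the first 3 DDs.
rng = np.random.default_rng(0)
fold_change = rng.normal(0.0, 0.1, size=(12, 8))
fold_change[:5, :3] += 3.0

# The three distance measures named in the abstract; "cityblock" is SciPy's
# name for the Manhattan distance.
metrics = [("euclidean", {}), ("cityblock", {}), ("minkowski", {"p": 3})]

for metric, kwargs in metrics:
    gene_labels, dd_labels = co_cluster(fold_change, metric, **kwargs)
    print(metric, "genes:", gene_labels, "DDs:", dd_labels)
```

On real data, the rows would be genes from a pathway dataset and the columns drug-dose combinations such as Acetaminophen_High. How a gene cluster is paired with a DD cluster and then flagged as toxic is part of the authors' method and is not reproduced here.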

Subject Areas

biomarker gene; doses of drugs; fold change gene expression; error rate; toxicity; hierarchical clustering
