Subject: Medicine & Pharmacology, Oncology & Oncogenics Keywords: lncRNA; melanoma; prognosis; immune checkpoint inhibit; WGCNA; NMF
Online: 11 July 2020 (04:35:06 CEST)
Immune checkpoint inhibitors (ICI) are widely used in melanoma, but identifying the patients who gain a survival benefit from ICI remains a major challenge. There is an urgent need for prognostic signatures that improve the prediction of immunotherapy response in cancer patients. We used data from the EMBL-EBI database and analyzed RNA-seq data and clinical profiles in melanoma. Weighted gene co-expression network analysis (WGCNA) was used to identify the key module, then nonnegative matrix factorization (NMF) was applied to group patients into two clusters, which were compared with respect to overall survival (OS) and progression-free survival (PFS). Subsequently, the differentially expressed genes (DEGs) between the clusters were identified and annotated for function and pathway. In total, 91 melanoma biopsies with complete survival information were included in our analyses. We first identified the key module (magenta) by WGCNA, then used NMF to identify a signature of nine prognostic lncRNAs (ENSG00000258869, ENSG00000179840, ENSG00000206344, ENSG00000226777, ENSG00000205018, ENSG00000204261, ENSG00000163597, ENSG00000197536, and ENSG00000263069) that predicted OS and PFS in patients treated with ICI. Finally, enrichment analysis showed that the functions of the DEGs between the two consensus clusters were mainly related to immune processes and treatment. In summary, the nine-lncRNA signature is a novel and effective predictor of OS and PFS in melanoma patients treated with ICI.
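The NMF clustering step described in this abstract can be illustrated with a minimal sketch. It assumes scikit-learn's `NMF` and a synthetic non-negative expression matrix; the helper `nmf_cluster`, the block-structured toy data, and the rule of assigning each sample to its dominant component are illustrative assumptions, not the authors' pipeline (which refers to consensus clusters and is not detailed here).

```python
import numpy as np
from sklearn.decomposition import NMF


def nmf_cluster(expr, n_clusters=2, seed=0):
    """Cluster samples by their dominant NMF component.

    expr: non-negative array of shape (n_samples, n_features).
    Returns an integer label per sample (hypothetical helper).
    """
    model = NMF(n_components=n_clusters, init="nndsvda",
                random_state=seed, max_iter=1000)
    weights = model.fit_transform(expr)  # shape (n_samples, n_clusters)
    return weights.argmax(axis=1)


rng = np.random.default_rng(0)

# Synthetic stand-in: 91 samples x 9 features, split into two made-up groups.
group = np.array([0] * 45 + [1] * 46)
expr = rng.uniform(0.0, 0.2, size=(91, 9))
expr[group == 0, :4] += 5.0   # group 0 expresses the first four features
expr[group == 1, 4:] += 5.0   # group 1 expresses the remaining five

labels = nmf_cluster(expr)

# Cluster labels are arbitrary, so compare up to a label swap.
agreement = max(np.mean(labels == group), np.mean(labels != group))
```

In a real analysis the resulting labels would then be compared on OS and PFS, for example with Kaplan-Meier curves; that step is omitted here.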
ARTICLE | doi:10.20944/preprints202112.0184.v2
Subject: Earth Sciences, Other Keywords: Spectral; Geochemistry; Random Forest; Regression; Whole Rock; MIR; SWIR; VNIR; NMF
Online: 21 December 2021 (12:35:45 CET)
The efficacy of predicting geochemical parameters with a 2-chain workflow using spectral data as the initial input is evaluated. Spectral measurements spanning approximately the 400-25000 nm range are used to train a workflow consisting of a non-negative matrix factorization (NMF) step, for data reduction, and a random forest regression (RFR) to predict 8 geochemical parameters. Approximately 175000 spectra with their corresponding chemical analyses were available for training, testing and validation purposes. The samples and their spectral and chemical parameters represent 9399 drillcores. Of those, approximately 20000 spectra and their accompanying analyses were used for training and 5000 for model validation. The remaining paired data (150000 samples) were used for testing the method. The data are distributed over 2 large spatial extents (980 km2 and 3025 km2 respectively), which allowed the proposed method to be tested against samples that are spatially distant from the initial training points. Global R2 scores and wt.% RMSE on the 150000 test samples are Fe(0.95/3.01), SiO2(0.96/3.77), Al2O3(0.92/1.27), TiO(0.68/0.13), CaO(0.89/0.41), MgO(0.87/0.35), K2O(0.65/0.21) and LOI(0.90/1.14), given as Parameter(R2/RMSE). They demonstrate that the proposed method is capable of predicting the 8 parameters and is stable enough, in the environment tested, to extend beyond the training set's initial spatial location.
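A two-step workflow of this kind (NMF for data reduction, followed by a random forest regression) might be sketched as follows. The sketch assumes scikit-learn and synthetic spectra built from three hypothetical endmembers; the number of components, the two made-up targets, and all variable names are assumptions for illustration and do not reproduce the authors' data or settings.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for spectra: non-negative mixtures of 3 hypothetical endmembers.
n_samples, n_bands, n_endmembers = 400, 50, 3
endmembers = rng.uniform(0.0, 1.0, size=(n_endmembers, n_bands))
abundances = rng.uniform(0.0, 1.0, size=(n_samples, n_endmembers))
noise = rng.normal(0.0, 0.01, size=(n_samples, n_bands))
spectra = np.clip(abundances @ endmembers + noise, 0.0, None)  # NMF needs X >= 0

# Two made-up "geochemical" targets that depend on the mixing abundances.
targets = np.column_stack([
    10.0 * abundances[:, 0] + 5.0 * abundances[:, 1],
    3.0 * abundances[:, 2],
])

X_train, X_test, y_train, y_test = train_test_split(
    spectra, targets, test_size=0.25, random_state=0
)

# Step 1: NMF reduces each spectrum to a few non-negative coefficients.
# Step 2: the random forest regresses the targets on those coefficients.
workflow = make_pipeline(
    NMF(n_components=n_endmembers, init="nndsvda", max_iter=1000, random_state=0),
    RandomForestRegressor(n_estimators=100, random_state=0),
)
workflow.fit(X_train, y_train)
predictions = workflow.predict(X_test)

r2 = r2_score(y_test, predictions)                            # averaged over targets
rmse = np.sqrt(np.mean((y_test - predictions) ** 2, axis=0))  # one value per target
```

Wrapping both steps in one pipeline means the NMF basis is learned only from the training spectra and then reused unchanged on held-out spectra, which matters when the held-out samples come from a different spatial extent.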
ARTICLE | doi:10.20944/preprints202011.0056.v1
Subject: Mathematics & Computer Science, Algebra & Number Theory Keywords: COVID-19; Deep Learning; Natural Language Processing; Topic Modelling; Text Classification; Latent Dirichlet allocation (LDA); Non-negative matrix factorization (NMF)
Online: 2 November 2020 (15:24:20 CET)
The ongoing COVID-19 pandemic has caused massive damage to many sectors of the global economy and has disrupted human livelihoods. Natural Language Processing has been used extensively in different organizations to categorize sentiment, make recommendations, summarize information and model topics. This research aims to understand the non-medical impact of COVID-19 on the global economy by leveraging natural language processing. The methodology comprises text classification, including topic modelling, on an unstructured dataset of COVID-19 media articles provided by Anacode. Two algorithms, Latent Dirichlet allocation (LDA) and Non-negative matrix factorization (NMF), are used to classify the media articles in order to analyze the impact of the COVID-19 pandemic on different sectors of the global economy. Model accuracy was assessed using the coherence and perplexity scores, which were 0.51 and -10.90 respectively for the LDA algorithm. Both the LDA and NMF algorithms identified similar prevalent topics that were impacted by the COVID-19 pandemic across multiple sectors of the economy. From the intertopic distance map visualization produced by the LDA algorithm, it can be inferred that general industries, which include children's schooling, parental care, and family gatherings, were affected most, followed by the business sector and the financial industry.