Preprint Article Version 1 This version is not peer-reviewed

# Inchoative Discovery of Plausible (Un)explored Synergistic Combinatorial Biological Hypotheses for Static/Time Series Wnt Measurements via Ranking Search Engine : BioSearch Engine Design

Version 1 : Received: 4 December 2018 / Approved: 5 December 2018 / Online: 5 December 2018 (07:52:17 CET)

How to cite: Sinha, S. Inchoative Discovery of Plausible (Un)explored Synergistic Combinatorial Biological Hypotheses for Static/Time Series Wnt Measurements via Ranking Search Engine : BioSearch Engine Design. Preprints 2018, 2018120064

## Abstract

Background: Often in biology, we face the problem of exploring relevant unknown biological hypotheses in the form of myriad combinations of factors that might affect a pathway under certain conditions. Currently, a major problem is that combinations are cherry-picked for investigation based on expert advice, literature surveys, or guesses. Searching for and wet-lab testing these combinations is costly in time, investment, and energy. In the recent development of the PORCN-WNT inhibitor ETC-1922159 for colorectal cancer, a list of down-regulated genes was recorded in a time buffer after administration of the drug. The regulation of each gene was recorded individually, but for the majority it is still not known which higher-order ($\geq 2$) combinations might be playing a greater role in the pathway.

Results: The pipeline provides a prioritised list of important $2^{nd}$ order combinations across a range of gene families involved in the Wnt pathway. More specifically, it reveals various unexplored FZD-WNT combinations that have so far been untested in the pathway. Among the combinations affected by ETC-1922159, the down-regulation of the LGR-RNF family after drug treatment is evident in these rankings, as the LGR5-RNF43 combination takes bottom priority. The LGR6-RNF43 combination ranks higher than LGR5-RNF43, indicating that LGR6 might not play as great a role as LGR5 during Wnt-enhancing signals. These rankings confirm the efficacy of the proposed search engine design.

Conclusion: A pipeline has been developed to prioritise $n^{th}$ order combinations of factors that affect a signaling pathway. It uses sensitivity indices computed from variance-based (Sobol) and density-kernel-based (HSIC) methods to estimate the influence of each factor or combination of factors. These indices are then fed as feature vectors into a support vector ranking algorithm, which produces a ranked list of the interactions/combinations.
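
To make the idea of scoring and ordering $2^{nd}$ order combinations concrete, the following is a minimal sketch, not the pipeline described above. It assumes a biased empirical HSIC estimate with a Gaussian kernel as the only sensitivity index, and it replaces the support vector ranking step with a plain sort on that single score. The factor labels (`G1` to `G4`), the toy measurements, the bandwidth, and all function names are hypothetical and chosen only for illustration.

```python
import itertools
import math


def gaussian_gram(rows, bandwidth=1.0):
    """Gram matrix of a Gaussian (RBF) kernel over a list of sample vectors."""
    n = len(rows)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
            gram[i][j] = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return gram


def center(gram):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 11^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, bandwidth=1.0):
    """Biased empirical HSIC, trace(K H L H) / (n - 1)^2, for paired samples."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, bandwidth))
    l_gram = gaussian_gram(ys, bandwidth)
    # trace(K H L H) equals the elementwise sum of (H K H) * L because L is symmetric.
    trace = sum(k_centered[i][j] * l_gram[i][j]
                for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


def rank_pairs(factors, response, bandwidth=1.0):
    """Score every 2nd order combination of factors against the response and
    return (pair, score) tuples sorted from highest to lowest score."""
    ys = [[y] for y in response]
    scores = {}
    for a, b in itertools.combinations(sorted(factors), 2):
        xs = [list(sample) for sample in zip(factors[a], factors[b])]
        scores[(a, b)] = hsic(xs, ys, bandwidth)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy measurements (made up for illustration, not real data).
# G3 and G4 are held constant, so the pair (G3, G4) carries no information.
toy_factors = {
    "G1": [0.1, 0.4, 0.5, 0.9, 1.3, 1.7],
    "G2": [1.2, 0.3, 0.8, 0.2, 1.5, 0.6],
    "G3": [1.0] * 6,
    "G4": [0.5] * 6,
}
toy_response = [0.2, 0.3, 0.6, 0.7, 1.6, 1.4]

ranking = rank_pairs(toy_factors, toy_response)
for pair, score in ranking:
    print(pair, round(score, 4))
```

In this sketch a pair of constant factors receives a score of zero and therefore lands at the bottom of the list. A fuller implementation would presumably stack several indices (for example Sobol and HSIC variants) into one feature vector per combination and learn the ordering with a ranking model instead of sorting on a single score.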

## Subject Areas

combinatorial search forest; sensitivity analysis; support vector ranking algorithm; unknown biological hypotheses; Wnt pathway; PORCN-WNT inhibitors; ETC-1922159; Sobol indices; Hilbert-Schmidt independence criterion (HSIC) indices; F-divergence (Fdiv); systems biology