Search | Preprints.org

Indoor scene recognition is important for intelligent applications such as mobile robots and location-based services (LBS). Wherever we are and whatever we do, we are in a specific scene. The human brain can discern a scene at a glance. For a machine, however, the task is hard for two reasons. First, it usually requires large amounts of well-annotated data, and annotation is time-consuming and labor-intensive. Second, effective visual representations are difficult to learn because indoor scenes show large intra-category variation and high inter-category similarity. To address these problems, we adopt an unsupervised visual representation learning method that learns from unlabeled data with a Siamese Convolutional Neural Network (Siamese ConvNet) and graph-based constraints. Specifically, we first mine relationships between unlabeled samples with a graph structure and then use these relationships as supervision for representation learning with a Siamese network. A k-NN graph is constructed by taking each image as a node and linking it to its k nearest neighbors to form the edges. On this graph, cycle consistency and geodesic distance serve as the criteria for mining positive and negative pairs, respectively. By detecting cycles in the graph, images that differ substantially but lie in the same cycle can be treated as belonging to the same category (positive pairs). By computing the geodesic distance rather than the Euclidean distance from one node to another, two nodes separated by a large geodesic distance can be regarded as belonging to different categories (negative pairs). Visual representations of indoor scenes are then learned by a Siamese network in an unsupervised manner, with the mined pairs as inputs. To evaluate the proposed method, we tested it on two scene-centric datasets, MIT67 and Places365. Experiments with different numbers of categories were conducted to explore the potential of the proposed method. The results demonstrate that semantic visual representations of indoor scenes can be learned in this unsupervised manner. In addition, indoor scene recognition models trained with the learned representations and a few labeled samples achieve performance competitive with state-of-the-art approaches.
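
The pair-mining step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy 2-D points standing in for image features, and the parameters `max_cycle_len` and `min_geodesic` are all assumptions made for illustration. "Lying in the same cycle" is approximated by a short directed round trip between two nodes that are not direct neighbors, and the geodesic distance is the shortest path length along graph edges.

```python
import heapq
import math
from collections import deque


def knn_graph(points, k):
    """Directed k-NN graph: node -> {neighbor: Euclidean distance}."""
    graph = {}
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = {j: d for d, j in ranked[:k]}
    return graph


def symmetrize(graph):
    """Undirected copy of the graph, used for geodesic distances."""
    undirected = {u: dict(nbrs) for u, nbrs in graph.items()}
    for u, nbrs in graph.items():
        for v, d in nbrs.items():
            undirected[v][u] = d
    return undirected


def hops_from(graph, source):
    """Breadth-first search: directed edge count from source to each reachable node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops


def geodesic_from(graph, source):
    """Dijkstra: shortest path length along graph edges from source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in graph[u].items():
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def mine_pairs(points, k, max_cycle_len, min_geodesic):
    """Return (positive_pairs, negative_pairs) as lists of node-index tuples."""
    directed = knn_graph(points, k)
    undirected = symmetrize(directed)
    hops = {u: hops_from(directed, u) for u in directed}
    positives, negatives = [], []
    for u in directed:
        geo = geodesic_from(undirected, u)
        for v in directed:
            if v <= u or v in directed[u] or u in directed[v]:
                continue  # skip self, duplicates, and direct neighbors
            round_trip = hops[u].get(v, math.inf) + hops[v].get(u, math.inf)
            if round_trip <= max_cycle_len:
                positives.append((u, v))  # non-neighbors on a short cycle
            elif geo.get(v, math.inf) > min_geodesic:
                negatives.append((u, v))  # far apart along the graph
    return positives, negatives


# Toy data (hypothetical): two well-separated groups of points on a line.
points = [(0, 0), (1, 0), (2.1, 0), (3.3, 0), (20, 0), (21, 0), (22.1, 0)]
positives, negatives = mine_pairs(points, k=2, max_cycle_len=4, min_geodesic=5.0)
print(positives)
print(len(negatives))
```

In a real pipeline the points would be image feature vectors, and the mined pairs would feed a Siamese network; both thresholds would need tuning on the data.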

Preprint ARTICLE | doi:10.20944/preprints201809.0197.v2

Unsupervised Metric Learning Using Low Dimensional Embedding

Parag Jain

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: unsupervised, metric learning, embedding learning, laplacian, information theoretic, diffusion maps

Online: 19 September 2018 (13:53:42 CEST)

Preprint ARTICLE | doi:10.20944/preprints202407.0425.v1

An Unsupervised Error Detection Methodology for Detecting Mislabels in Healthcare Analytics

Pei-Yuan Zhou, Faith Lum, Tony Jiecao Wang, Chen Dan, San Lee, Andrew K.C. Wong

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: Unsupervised Learning; Error detection; Pattern Discovery and Disentanglement; Healthcare Data Analysis

Online: 4 July 2024 (14:15:25 CEST)

Preprint ARTICLE | doi:10.20944/preprints202405.1792.v1

Unsupervised Characterization of Water Composition with UAV-based Hyperspectral Imaging and Generative Topographic Mapping

John Waczak, Adam Aker, Lakitha O. H. Wijeratne, Shawhin Talebi, Ashen Fernando, Prabuddha M. H. Dewage, Mazhar Iqbal, Matthew Lary, David Schaefer, Gokul Balagopal, and David J. Lary

Subject: Environmental And Earth Sciences, Environmental Science Keywords: Hyperspectral Imaging; Remote Sensing; Unsupervised Classification; Endmember Extraction; Generative Topographic Mapping

Online: 3 June 2024 (17:09:23 CEST)

Preprint COMMUNICATION | doi:10.20944/preprints202402.0275.v3

Diversity of the Japanese Gut Microbiome Analysis: Relative Approach Using Principal Component Analysis

Tatsuki Itagaki, Ken-ichiro Sakata, Akira Hasebe, Yoshimasa Kitagawa

Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: gut microbiome; compositional data; principal component analysis; unsupervised machine learning; diversity

Online: 11 March 2024 (16:59:03 CET)

Preprint ARTICLE | doi:10.20944/preprints202312.2278.v1

Robust Method for Unsupervised Scoring of Immunohistochemical Staining

Iván Durán-Díaz, Auxiliadora Sarmiento, Irene Fondón, Clément Bodineau, Mercedes Tomé, Raúl V. Durán

Subject: Engineering, Telecommunications Keywords: Histopathological images; Principal Component Analysis; Unsupervised Stain Separation; Semi-quantitative scoring

Online: 29 December 2023 (10:37:23 CET)

Preprint ARTICLE | doi:10.20944/preprints202311.0008.v1

Forest Communities Spatial Modeling as a Basis for Assessing the Sequestration Potential of Ecosystems (Republic of Tatarstan, Russia)

Artur Gafurov, Vadim Prokhorov, Maria Kozhevnikova, Bulat Usmanov

Subject: Environmental And Earth Sciences, Environmental Science Keywords: unsupervised classification; forest communities; carbon balance; remote sensing; Google Earth Engine

Online: 1 November 2023 (04:10:24 CET)

Preprint ARTICLE | doi:10.20944/preprints201912.0015.v1

Evaluation on Sports Facility Resource in Primary School Using a Combined Approach of Unsupervised Machine Learning: a Case Study of Shanghai, China

Jun Xia, Pei-Jie Chen, Ji-Hong Wang, Jie Zhuang, Zhen-Bo Cao, Qiang Zhang

Subject: Social Sciences, Tourism, Leisure, Sport And Hospitality Keywords: school sports facility; assessment; t-SNE; fuzzy c-means; unsupervised learning

Online: 3 December 2019 (05:24:26 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.1049.v1

Unsupervised Color Based Flood Segmentation in UAV Imagery

Georgios Simantiris, Costas Panagiotakis

Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: Flood detection; image segmentation; remote sensing; unmanned aerial vehicle (UAV); unsupervised segmentation

Online: 16 April 2024 (11:40:48 CEST)

Preprint ARTICLE | doi:10.20944/preprints202305.2218.v1

A New Data Balancing Approach based Generative Adversarial Network for Network Intrusion Detection System

Mohammad Jamoos, Antonio García, Mohammad AlKhanafseh, Ola Surakhi

Subject: Computer Science And Mathematics, Security Systems Keywords: Generative Adversarial Network; Intrusion Detection System; Imbalanced Dataset; Machine Learning; Unsupervised Learning

Online: 31 May 2023 (10:22:58 CEST)

Preprint ARTICLE | doi:10.20944/preprints202108.0409.v1

Comparative Analysis on Molecular Characteristics of Chromophobe Renal Cancer and Oncocytoma

Khaled bin Satter, Paul Minh Huy Tran, Lynn Kim Hoang Tran, Shan Bai, Natasha M. Savage, Sravan K Kavuri, Martha Terris, Jin-Xiong She, Sharad Purohit

Subject: Medicine And Pharmacology, Oncology And Oncogenics Keywords: Renal cancers; oncocytoma; chromophobe; transcriptomics; machine learning; clustering; gene signature; unsupervised learning

Online: 20 August 2021 (11:23:43 CEST)

Preprint ARTICLE | doi:10.20944/preprints202011.0696.v1

Geometric Morphometric Data Augmentation using Generative Computational Learning Algorithms

Lloyd A. Courtenay, Diego González-Aguilera

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Archaeological Data Science; Artificial Intelligence; Unsupervised Learning; Generative Adversarial Networks; Robust Statistics.

Online: 27 November 2020 (14:43:36 CET)

Preprint ARTICLE | doi:10.20944/preprints202004.0524.v2

A New Advanced In silico Drug Discovery Method for Novel Coronavirus (SARS-CoV-2) with Tensor Decomposition-based Unsupervised Feature Extraction

Y-H. Taguchi, Turki Turki

Subject: Biology And Life Sciences, Virology Keywords: unsupervised learning; tensor decomposition; feature selection; COVID-19; drug discovery; gene expression

Online: 3 June 2020 (05:29:09 CEST)

Preprint ARTICLE | doi:10.20944/preprints202209.0347.v1

Modelling the Agricultural Soil Landscape of Germany – A Data Science Approach Involving Spatially Allocated Functional Soil Process Units

Mareike Ließ

Subject: Environmental And Earth Sciences, Soil Science Keywords: digital soil mapping; soil process units; soil parameter space; machine learning; unsupervised classification

Online: 22 September 2022 (15:08:05 CEST)

Preprint ARTICLE | doi:10.20944/preprints202201.0452.v1

Robust Beamforming Based On Graph Attention Networks For IRS-assisted Satellite IoT Communications

Hailin Cao, Wang Zhu, Wenjuan Feng, Jin Fan

Subject: Engineering, Electrical And Electronic Engineering Keywords: intelligent reflecting surface; low Earth orbit satellite; graph attention networks; unsupervised learning; beamforming

Online: 31 January 2022 (11:47:07 CET)

Preprint ARTICLE | doi:10.20944/preprints202105.0674.v1

Optimal Life Extension Management of Offshore Wind Farms Based on the Modern Portfolio Theory

Baran Yeter, Yordan Garbatov

Subject: Engineering, Automotive Engineering Keywords: Offshore wind; life extension; modern portfolio theory; unsupervised machine learning; monopile; risk management

Online: 27 May 2021 (14:01:13 CEST)

Preprint ARTICLE | doi:10.20944/preprints202011.0605.v1

A Parameter-Free Spectral Clustering Approach to Coherent Structure Detection in Geophysical Flows

Margaux Filippi, Irina Rypina, Alireza Hadjighasem, Thomas Peacock

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: parameter-free spectral clustering; Lagrangian Coherent Structures; clusters; geophysical flows; unsupervised machine learning

Online: 24 November 2020 (09:25:02 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.0754.v1

Unsupervised Vehicle Re-identification Method Based on Source-free Knowledge Transfer

Zhigang Song, Daisong Li, Zhongyou Chen, Wenqin Yang

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: vehicle re-identification; unsupervised domain adaptation; source-free knowledge transfer; pseudo-label; joint training

Online: 12 September 2023 (08:54:10 CEST)

Preprint ARTICLE | doi:10.20944/preprints202103.0753.v1

Unsupervised Feature Selection for Histogram-Valued Symbolic Data by Hierarchical Conceptual Clustering

Manabu Ichino, Kadri Umbleja, Hiroyuki Yaguchi

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: unsupervised feature selection; histogram-valued data; compactness; hierarchical conceptual clustering; multi-role measure; visualization

Online: 31 March 2021 (07:53:39 CEST)

Preprint ARTICLE | doi:10.20944/preprints202103.0276.v1

Improved Growth Pattern of Tiger Prawn (TP) Arthropoda in a Pond by Analytical Hierarchical Process (AHP)

Adnan Alam Khan, Asif Ali

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Artificial Intelligent algorithms; Analytical Hierarchical Process (AHP); Prediction methods; unsupervised learning; Biological neural networks

Online: 10 March 2021 (11:07:15 CET)

Artificial intelligence (AI) is a versatile term for solving problems from past data after careful analysis, drawing on basic statistics, data carving, and familiarity with common AI algorithms. Seafood export, especially of tiger prawn, can bring a country substantial foreign exchange if farmers overcome the vulnerabilities associated with prawn farming. This research elucidates shortcomings in tiger prawn (TP) farming, such as the control of oxygen, pH, water temperature, and nutrients. Hatchery statistics on juveniles are used to give a clear picture of constrained aquaculture. The Analytical Hierarchical Process (AHP) is used for normative decisions. Local prawn farmers face stagnant TP growth in ponds, which is attributed to the high sensitivity of TP. They therefore need thirteen factors to be compensated with natural resources to obtain plausible results and peace of mind. This study focuses solely on the TP growth model, and the monitoring effect is established by an artificial intelligence algorithm. The study employs AHP, the 0-1 scaling method, data curation techniques, and ecological statistics. The life of TP depends mainly on (a) physical and (b) chemical parameters. Physical parameters cover the environment (E) provided to TP, such as season (S) and temperature (T), whereas ammonia NH3 (N) from fish waste, the oxygen level (O), and water quality, hard or soft (W), belong to the chemical domain. This research elucidates the factors that cause confusion about the aquatic life of TP; statistical tools assess the current results and forecast the gap. AHP analyzes the domain inputs and their ramifications to reveal the underlying factors, and the results then indicate which pond suits TP. In short, these factors will be controlled to improve the growth of TP in a controlled environment.

Working Paper REVIEW

Automatic Segmentation of White Matter Hyperintensities from Brain Magnetic Resonance Images in the Era of Deep Learning and Big Data – A Systematic Review

Ramya Balakrishnan, Maria Valdes Hernandez, Andrew Farrall

Subject: Medicine And Pharmacology, Immunology And Allergy Keywords: White matter lesions; white matter hyperintensities; supervised segmentation; unsupervised segmentation; deep learning; FLAIR hyperintensities

Online: 20 November 2020 (13:44:46 CET)

Background: White matter hyperintensities (WMH), of presumed vascular origin, are visible and quantifiable neuroradiological markers of brain parenchymal change. These changes may range from damage secondary to inflammation and other neurological conditions, through to healthy ageing. Fully automatic WMH quantification methods are promising, but traditional semi-automatic methods still seem to be preferred in clinical research. We systematically reviewed the literature for fully automatic methods developed in the last five years, to assess what are considered state-of-the-art techniques, as well as trends in the analysis of WMH of presumed vascular origin. Method: We registered the systematic review protocol with the International Prospective Register of Systematic Reviews (PROSPERO), registration number CRD42019132200. We searched Medline, ScienceDirect, IEEE Xplore, and Web of Science for fully automatic methods developed from 2015 to July 2020. We assessed risk of bias and applicability of the studies using QUADAS-2. Results: The search yielded 2327 papers after removing 104 duplicates. After screening titles, abstracts and full text, 37 were selected for detailed analysis. Of these, 16 proposed a supervised segmentation method, 10 proposed an unsupervised segmentation method, and 11 proposed a deep learning segmentation method. Average Dice similarity coefficient (DSC) values ranged from 0.538 to 0.91, with the highest value obtained by an unsupervised segmentation method. Only four studies validated their method in longitudinal samples, and eight performed an additional validation using clinical parameters. Only 8/37 studies made their method available in public repositories. Conclusions: We found no evidence that favours deep learning methods over the more established k-NN, linear regression and unsupervised methods in this task. Data and code availability, bias in study design and ground truth generation influence the wider validation and applicability of these methods in clinical research.
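
The DSC values reported in this abstract refer to the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), which measures the overlap between a predicted segmentation A and a reference segmentation B. The following Python sketch is only an illustration of that formula; the function name, the flat 0/1 mask representation, and the convention of returning 1.0 for two empty masks are assumptions, not details taken from the review.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    if len(predicted) != len(reference):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labelled as lesion in both masks.
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    # Total number of lesion voxels across the two masks.
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy masks (hypothetical): 1 marks a voxel labelled as WMH.
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]
print(dice_coefficient(predicted, reference))
```

A DSC of 1 means the two masks coincide exactly and 0 means they share no lesion voxels, so the reported range of 0.538 to 0.91 spans moderate to high overlap.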

Preprint ARTICLE | doi:10.20944/preprints202007.0325.v1

A Machine Learning Solution for Data Center Thermal Characteristics Analysis

Marta Chinnici, Anastasiia Grishina, Ah-Lian Kor, Eric Rondeau, Jean Philippe Georges

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: Data Center; Thermal Characteristics Analysis; Machine Learning, Energy Efficiency, Hotspots, Clustering Technique, Unsupervised Learning

Online: 15 July 2020 (09:16:23 CEST)

Preprint ARTICLE | doi:10.20944/preprints202002.0113.v1

Blind Source Separation for NMR Spectra with Negative Intensity

Ryan J. McCarty, Nimish Ronghe, Mandy Woo, Todd M. Alam

Subject: Chemistry And Materials Science, Analytical Chemistry Keywords: Blind Source Separation; Component Analysis; Chemometrics; Unsupervised Machine Learning; Endmember Extraction; Spectral Unmixing; NMR

Online: 9 February 2020 (17:18:38 CET)

NMR spectral datasets, especially in systems with limited samples, can be difficult to interpret if they contain multiple chemical components (phases, polymorphs, molecules, crystals, glasses, etc.) and the possibility of overlapping resonances. In this paper, we benchmark several blind source separation techniques for analysis of NMR spectral datasets containing negative intensity. For benchmarking purposes, we generated a large synthetic database of quadrupolar solid-state NMR-like spectra that model spin-lattice T1 relaxation or nutation tip/flip angle experiments. Our benchmarking approach focused exclusively on the ability of blind source separation techniques to reproduce the spectra of the underlying pure components. In general, we find that FastICA (Fast Independent Component Analysis), SIMPLISMA (SIMPLe-to-use-Interactive Self-modeling Mixture Analysis), and NNMF (Non-Negative Matrix Factorization) are top-performing techniques. We demonstrate that dataset normalization approaches prior to blind source separation do not considerably improve outcomes. Within the range of noise levels studied, we did not find drastic changes to the ranking of techniques. The accuracy of FastICA and SIMPLISMA degrades quickly if excess (unreal) pure components are predicted. Our results indicate poor performance of SVD (Singular Value Decomposition) methods, and we propose alternative techniques for matrix initialization. The benchmarked techniques are also applied to real solid-state NMR datasets. In general, the recommendations from the synthetic datasets agree with the recommendations and results from the real data analysis. The discussion provides some additional recommendations for spectroscopists applying blind source separation to NMR datasets, and for future benchmark studies. Applications of blind source separation to NMR datasets containing negative intensity may be especially useful for understanding complex and disordered systems with limited samples and mixtures of chemical components.

Preprint ARTICLE | doi:10.20944/preprints202201.0202.v1

Crop Detection Using Time Series of Sentinel-2 and Sentinel-1 and Existing Land Parcel Information Systems

Herman Snevajs, Karel Charvat, Vincent Onckelet, Jiri Kvapil, Frantisek Zadrazil, Hana Kubickova, Jana Seidlova, Iva Bartlova

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: crop detection; Sentinel 1; Sentinel 2; supervised classification; unsupervised classification; time series; agriculture; food security

Online: 14 January 2022 (11:18:59 CET)

Preprint ARTICLE | doi:10.20944/preprints202112.0366.v1

An Integrated Epigenomic and Genomic View on Phyllodes and Phyllodes-Like Breast Tumors

Juergen Hench, Tatjana Vlajnic, Savas Deniz Soysal, Ellen C Obermann, Stephan Frank, Simone Muenst

Subject: Medicine And Pharmacology, Pathology And Pathobiology Keywords: fibroepithelial breast lesions; phyllodes tumors; methylation analysis; copy number alterations; dimension reduction; unsupervised machine learning

Online: 22 December 2021 (12:46:50 CET)

Preprint ARTICLE | doi:10.20944/preprints202002.0019.v1

“NoTaMe”: Workflow for Non-Targeted LC-MS Metabolic Profiling

Marietta Kokla, Anton Klåvus, Stefania Noerman, Ville M. Koistinen, Marjo Tuomainen, Iman Zarei, Topi Meuronen, Merja R. Häkkinen, Soile Rummukainen, Ambrin Farizah Babu, Taisa Sallinen, Olli Kärkkäinen, Jussi Paananen, David Broadhurst, Carl Brunius, Kati Hanhineva

Subject: Biology And Life Sciences, Endocrinology And Metabolism Keywords: metabolomics; LC-MS; mass spectrometry; metabolic profiling; computational; statistical; unsupervised learning; supervised learning; pathway analysis

Online: 3 February 2020 (05:54:14 CET)

Preprint ARTICLE | doi:10.20944/preprints201807.0207.v1

Smart Home Anti-Theft System: A Novel Approach for Near Real-Time Monitoring, Smart Home Security and Large Video Data Handling for Wellness Protocol

Sharnil Pandya, Hemant Ghayvat, Ketan Kotecha, Moi Hoon Yap, Prosanta Gope

Subject: Computer Science And Mathematics, Security Systems Keywords: smart anti-theft system; intruder detection; unsupervised activity monitoring; smart home; partially/fully covered faces

Online: 11 July 2018 (16:47:59 CEST)

Preprint ARTICLE | doi:10.20944/preprints202001.0375.v1

Exploring Geometric Feature Hyper-Space in Data to Learn Representations of Abstract Concepts

Rahul Sharma, Bernardete Ribeiro, Alexandre Miguel Pinto, Amilcar F. Cardoso

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: unsupervised machine learning; hierarchical learning; computational representation; computational cognitive modeling; contextual modeling; classification; IoT data modeling

Online: 31 January 2020 (04:38:51 CET)

Preprint REVIEW | doi:10.20944/preprints201708.0003.v1

De-Anonymizing Authors of Electronic Texts: A Survey on Electronic Text Stylometry

Mahmoud Khonji, Youssef Iraqi

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: stylometry; author identification; author verification; author profiling; stylistic inconsistency; text analysis; supervised learning; unsupervised learning; classification; forensics

Online: 2 August 2017 (12:38:17 CEST)

Electronic text stylometry is a collection of forensics methods that analyze the writing styles of input electronic texts in order to extract information about their authors. Such extracted information could be the identity of the authors, or aspects of the authors, such as their gender, age group, ethnicity, etc. This survey paper presents the following contributions: 1) A description of all stylometry problems in probability terms, under a unified notation. To the best of our knowledge, this is the most comprehensive definition to date. 2) A survey of key methods, with particular attention to data representation (or feature extraction) methods. 3) An evaluation of 23,760 feature extraction methods, which is the most comprehensive evaluation of feature extraction methods in the literature of stylometry to date. The importance of this evaluation is twofold. First, it identifies the relative effectiveness of the features (since, currently, many are not evaluated jointly; e.g. syntactic n-grams are not evaluated against k-skip n-grams, and so forth). Second, thanks to our generalizations, we could evaluate novel grams, such as what we name compound grams. 4) The release of our associated Python feature extraction library, namely Fextractor. Essentially, the library generalizes all existing n-gram based feature extraction methods under the "at least l-frequent, dir-directed, k-skipped n-grams", and allows grams to be diversely defined, including definitions that are based on high-level grammatical aspects, such as POS tags, as well as lower-level ones, such as distribution of function words, word shapes, etc. This makes the library, by far, the most extensive in this domain to date. 5) The construction, evaluation, and release of the first dataset for Emirati social media text. This evaluation represents the first evaluation of author identification against Emirati social media texts. Interestingly, we find that, when using our models and feature extraction library (Fextractor), authors could be identified significantly more accurately than what is reported with similarly sized datasets. The dataset also contains sub-datasets that represent other languages (Dutch, English, Greek and Spanish), and our findings are consistent across them.
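
To make the "k-skipped n-grams" and "at least l-frequent" notions in this abstract concrete, here is a small Python sketch. It does not use or reproduce the Fextractor API; the function names and parameters are hypothetical, and it assumes the common definition in which a k-skip n-gram keeps token order and skips at most k tokens in total. The "dir-directed" aspect is not modelled because the abstract does not define it.

```python
from collections import Counter
from itertools import combinations


def k_skip_ngrams(tokens, n, k):
    """All n-grams that keep token order and skip at most k tokens in total."""
    grams = []
    for start in range(len(tokens) - n + 1):
        # The remaining n - 1 tokens must fall inside a window of n + k tokens.
        window_end = min(start + n + k, len(tokens))
        for rest in combinations(range(start + 1, window_end), n - 1):
            grams.append((tokens[start],) + tuple(tokens[i] for i in rest))
    return grams


def frequent_grams(tokens, n, k, min_count):
    """Keep only the grams that occur at least min_count times."""
    counts = Counter(k_skip_ngrams(tokens, n, k))
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Toy input (hypothetical); tokens could equally be POS tags or word shapes.
tokens = "the cat sat on the mat".split()
print(k_skip_ngrams(tokens, n=2, k=1))
print(frequent_grams(["a", "b", "a", "b", "c"], n=2, k=0, min_count=2))
```

With k = 0 the function reduces to ordinary contiguous n-grams, which is one way to see how a single parameterized definition can cover several existing feature types.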

Preprint ARTICLE | doi:10.20944/preprints202312.2274.v1

Effect of Network Architecture on Physics-Informed Deep Learning of the Reynolds-Averaged Turbulent Flow Field around Cylinders without Training Data

Jan Hauke Harmening, Franz-Josef Peitzmann, Ould el Moctar

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Physics-informed deep learning; unsupervised learning; Reynolds-averaged Navier-Stokes equations; high Reynolds number flow; turbulence modeling

Online: 29 December 2023 (13:04:32 CET)

Preprint ARTICLE | doi:10.20944/preprints202009.0729.v1

Machine Learning Algorithms Based on an Optimization Model

Mirpouya Mirmozaffari, Noorbakhsh Amiri Golilarz, Shahab S. Band

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Data Envelopment Analysis; Machine learning; Optimization; Parametric and non-parametric methods; Supervised and unsupervised models; CVS model

Online: 30 September 2020 (08:19:51 CEST)

The main purpose of this paper is to propose a novel optimization model, presented in the first section, together with a new machine learning approach, presented in the second section, to achieve the best results for financial institutions. Since the consistency of efficiency results derived from parametric and non-parametric methods is not significant, the optimization section provides a scientific assessment and proposes a novel combined parametric and non-parametric model, which is new to the literature. Banks are assessed with a combination of efficiency measurement methods: the CCR (Charnes, Cooper and Rhodes) model, or CRS (Constant Return to Scale), and the BCC (Banker, Charnes, and Cooper) model, or VRS (Variable Return to Scale), in Data Envelopment Analysis (DEA), as well as the Stochastic Frontier Approach (SFA), applied to 65 banks from February to July 2020. To analyze the performance of the parametric and non-parametric approaches, we consider linear regression and the Unreplicated Linear Functional Relationship (ULFR). In the machine learning section, a novel four-layer data mining filtering pre-process is applied to selected supervised classification and unsupervised clustering algorithms to increase accuracy and to remove unrelated attributes and data. For the four kinds of preprocessing (unsupervised attributes, supervised attributes, supervised instances, and unsupervised instances), we chose discretization, attribute selection, stratified remove folds, and resample filters, respectively. Based on the nature of the financial institution dataset and its attributes, the most appropriate preprocessing filter in each layer is suggested to achieve the highest performance. Finally, the superior bank, the best-performing model, and the most accurate algorithm are identified. The results indicate that bank number 56 is the superior bank. Among the proposed techniques, the novel recommended CVS model, compared with the CCR-BCC and SFA models, has a more positive correlation with profit risk and shows higher coefficient of determination values. The Sequential Minimal Optimization (SMO) algorithm achieves the highest accuracy in all four suggested filtering layers.

Preprint ARTICLE | doi:10.20944/preprints201902.0233.v1

A Review of Deep Learning with Special Emphasis on Architectures, Applications and Recent Trends

Saptarshi Sengupta, Sanchita Basak, Pallabi Saikia, Sayak Paul, Vasilios Tsalavoutis, Frederick Ditliac Atiah, Vadlamani Ravi, Richard Alan Peters II

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep neural network architectures; supervised learning; unsupervised learning; testing neural networks; applications of deep learning; evolutionary computation

Online: 26 February 2019 (04:02:00 CET)

Working Paper ARTICLE

Textual Data Distributions: Kullback Leibler Textual Distributions Contrasts on GPT-2 Generated Texts, with Supervised, Unsupervised Learning on Vaccine & Market Topics & Sentiment

Jim Samuel, Ratnakar Palle, Eduardo Correa Soares

Subject: Computer Science And Mathematics, Computer Science Keywords: Textual data distributions; supervised learning; unsupervised learning; Kullback-Leibler divergence; sentiment; textual analytics; text generation; vaccine; stock market

Online: 17 June 2021 (10:03:41 CEST)

Preprint ARTICLE | doi:10.20944/preprints202203.0093.v1

Evaluation of 6-Hydroxydopamine and Rotenone In Vitro Neurotoxicity on Differentiated SH-SY5Y Cells Using Applied Computational Statistics

Rui F. Simões, Paulo J. Oliveira, Teresa Cunha-Oliveira, Francisco B. Pereira

Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: 6-hydroxydopamine; rotenone; in vitro neurotoxicity; mitochondrial dysfunction; exploratory data analysis; applied computational statistics; unsupervised and supervised machine learning

Online: 7 March 2022 (09:16:28 CET)

Preprint ARTICLE | doi:10.20944/preprints202109.0389.v1

Optimizing Few-Shot Learning based on Variational Autoencoders

Ruoqi Wei, Ausif Mahmood

Subject: Engineering, Control And Systems Engineering Keywords: Deep learning; Variational Autoencoders (VAEs); data representation learning; generative models; unsupervised learning; few shot learning; latent space; transfer learning

Online: 22 September 2021 (16:04:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints202407.0959.v1

Critical Investigation of Surrogate Modeling Based on Simultaneous Physics-Informed Deep Learning of the High Reynolds Number Flow around Airfoils under Variable Angles of Attack

Jan Hauke Harmening, Franz-Josef Peitzmann, Ould el Moctar

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Physics-informed deep learning; unsupervised learning; Reynolds-averaged Navier-Stokes equations; high Reynolds number flow; variable geometry; parameterized surrogate modeling

Online: 11 July 2024 (12:28:57 CEST)

Preprint ARTICLE | doi:10.20944/preprints202302.0070.v1

Overcoming Domain Shift in Neural Networks for Accurate Plant Counting in Aerial Images

Javier Rodriguez-Vazquez, Miguel Fernandez-Cortizas, David Perez-Saura, Martin Molina, Pascual Campoy

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: deep learning; aerial imagery; precision agriculture; plant detection; domain adaptation; unsupervised learning; self-supervision; adversarial learning; domain shift; tropical crops

Online: 3 February 2023 (10:14:09 CET)

Preprint ARTICLE | doi:10.20944/preprints201801.0125.v1

Cloud Service Providers Optimized Ranking Algorithm Based on Machine Learning and Multi-Criteria Decision Analysis

Muhammad Umer Wasim, Abdallah A. Z. A. Ibrahim, Pascal Bouvry, Tadas Limba

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: multi-criteria decision analysis (MCDA); online broker; misspecification of criteria; structural uncertainty; unsupervised machine learning; factor analysis; quality of service (QoS)

Online: 15 January 2018 (11:29:56 CET)

Working Paper ARTICLE

Real Time Aircraft Atypical Approach Detection for Air Traffic Control

Gabriel Jarry, Daniel Delahaye, Stephane Puechmorel, Eric Feron

Subject: Computer Science And Mathematics, Applied Mathematics Keywords: Approach Path Management; Atypical Flight Event; Non-Compliant Approach; Real Time; Anomaly Detection; Functional Principal Component Analysis; Unsupervised Learning; Dubins Path

Online: 12 March 2021 (21:17:22 CET)

Preprint ARTICLE | doi:10.20944/preprints201712.0110.v1

A Novel Strategy for Very-Large-Scale Cash-Crop Mapping in the Context of Weather-Related Risk Assessment, Combining Global Satellite Multispectral Datasets, Environmental Constraints, and in-Situ Acquisition of Geospatial Data

Fabio Dell'Acqua, Gianni Cristian Iannelli, Marco A. Torres, Mario L. V. Martina

Subject: Environmental And Earth Sciences, Geography Keywords: best practice; crop mapping; crowdsourcing; drought risk assessment; exposure; flood risk assessment; geospatial data; spaceborne remote sensing; unsupervised classification; rule-based classification

Online: 17 December 2017 (08:26:29 CET)

Cash crops are agricultural crops intended to be sold for profit, as opposed to subsistence crops, which are meant to support the producer or to support livestock. Since cash crops are intended for future sale, they translate into large financial value when considered on a wide geographical scale, so their production directly involves financial risk. At a national level, extreme weather events, including destructive rain or hail as well as drought, can have a significant impact on the overall economic balance. It is thus important to map such crops in order to set up insurance and mitigation strategies. Using locally generated data, such as municipality-level records of crop seeding, for mapping purposes means facing a series of issues such as data availability, quality, and homogeneity. We thus opted for a different approach relying on global datasets. Global datasets ensure homogeneity and availability of data, although sometimes at the expense of precision and accuracy. A typical global approach makes use of spaceborne remote sensing, for which different land cover classification strategies are available in the literature at different levels of cost and accuracy. We selected the optimal strategy from the perspective of a global processing chain. Thanks to a specifically developed strategy for fusing unsupervised classification results with environmental constraints and other geospatial inputs, including ground-based data, we managed to obtain good classification results despite the constraints placed. The overall production process was composed using "good-enough" algorithms at each step, ensuring that the precision, accuracy, and data-hunger of each algorithm were commensurate with the precision, accuracy, and amount of data available. This paper describes the tailored strategy developed on this occasion as a cooperation among different groups with diverse backgrounds, a strategy which is believed to be profitably reusable in other, similar contexts.
The paper presents the problem, the constraints, and the adopted solutions; it then summarizes the main findings, including that effort and cost can be saved on the Earth Observation data processing side when additional ground-based data are available to support the mapping task.

Preprint ARTICLE | doi:10.20944/preprints202407.1075.v1

Advancements in Financial Market Predictions Using Machine Learning Techniques

Emmanuel Idowu

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: financial market prediction; machine learning; supervised learning; unsupervised learning; reinforcement learning; deep learning; neural networks; alternative data; stock prices; currency exchange rates; commodity prices; model interpretability

Online: 12 July 2024 (23:57:06 CEST)

Preprint ARTICLE | doi:10.20944/preprints202302.0066.v1

Smarter Sustainable Tourism: Data-Driven Multi-Perspective Parameter Discovery for Autonomous Design and Operations

Raniah Alsahafi, Ahmed Alzahrani, Rashid Mehmood

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Smart Tourism; Sustainable Tourism; Natural Language Processing (NLP); Big Data Analytics; Deep Learning; Machine Learning; Unsupervised Learning; Bidirectional Encoder Representations from Transformers (BERT); Literature Review; Smart Societies

Online: 3 February 2023 (09:47:55 CET)

Global natural and man-made events are exposing the fragility of the tourism industry and its impact on the global economy. Prior to the COVID-19 pandemic, tourism contributed 10.3% to global GDP and employed 333 million people, but it saw a significant decline due to the pandemic. Sustainable and smart tourism requires collaboration from all stakeholders and a comprehensive understanding of global and local issues to drive responsible and innovative growth in the sector. This paper presents an approach for leveraging big data and deep learning to discover holistic, multi-perspective (e.g., local, cultural, national, and international) and objective information on a subject. Specifically, we develop a machine learning pipeline to extract parameters from academic literature and public opinions on Twitter, providing a unique and comprehensive view of the industry from both academic and public perspectives. The academic-view dataset was created from the Scopus database and contains 156,759 research articles from 2000 to 2022, which were modelled to identify 33 distinct parameters in 4 categories: Tourism Types, Planning, Challenges, and Media & Technologies. A Twitter dataset of 485,813 tweets was collected over 18 months, from March 2021 to August 2022, to showcase public perception of tourism in Saudi Arabia; it was modelled to reveal 13 parameters categorized into two broader sets: Tourist Attractions and Tourism Services. Discovering system parameters is required to embed autonomous capabilities in systems and for decision-making and problem-solving during system design and operations. The proposed approach improves AI-based information discovery by extending the use of scientific literature, Twitter, and other sources for autonomous, dynamic optimizations of systems, promoting novel research in the tourism sector and contributing to the development of smart and sustainable societies.
The paper also presents a comprehensive knowledge structure and literature review of the tourism sector based on over 250 research articles.

Preprint ARTICLE | doi:10.20944/preprints202312.0825.v1

Improving Data-Driven Estimation of Significant Wave Height through Preliminary Training on Synthetic X-band Radar Sea Clutter Imagery

Vadim Rezvov, Mikhail Krinitskiy, Alexander Gavrikov, Viktor Golikov, Mikhail Borisov, Alexander Suslov, Natalia Tilinina