Search | Preprints.org

Preprint ARTICLE | doi:10.20944/preprints201808.0350.v2

Integration of Data Mining Clustering Approach with the Personalized E-Learning System

Samina Kausar, Huahu Xu, Iftikhar Hussain, Wenhau Zhu, Misha Zahid

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: big data; clustering; data mining; educational data mining; e-learning; profile learning

Online: 19 October 2018 (05:58:05 CEST)

Preprint ARTICLE | doi:10.20944/preprints202103.0593.v1

Creating a Business and Supporting Digital Transformation

Miguel Ayala, Jorge Portella, Sergio Martinez, Maria Rojas, Luis Jimenez

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Business Intelligence; Data Mining; Data Warehouse.

Online: 24 March 2021 (13:47:31 CET)

Preprint ARTICLE | doi:10.20944/preprints202308.1237.v1

A Method to Enable Automatic Extraction of Cost and Quantity Data from Hierarchical Construction Information Documents to Enable Rapid Digital Comparison and Analysis

Daniel Adanza Dopazo, Lamine Mahdjoubi, Bill Gething

Subject: Engineering, Transportation Science And Technology Keywords: data mining; data extraction; data science; cost infrastructure projects

Online: 17 August 2023 (09:25:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints202105.0102.v1

Implying Association Rule Mining and Market Basket Analysis for Knowing Consumer Behavior and Buying Pattern in Lockdown - A Data Mining Approach

Anurag Sinha

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Market basket analysis; association rule mining; buying pattern; data mining

Online: 6 May 2021 (15:14:25 CEST)

Preprint ARTICLE | doi:10.20944/preprints201908.0019.v1

Performance Evaluation of Supervised Machine Learning Techniques for Efficient Detection of Emotions from Online Content

Muhammad Zubair Asghar, Fazli Subhan, Muhammad Imran, Fazal Masud Kundi, Shahab Shamshirband, Amir Mosavi, Peter Csiba, Annamária R. Várkonyi-Kóczy

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: emotion classification; machine learning classifiers; ISEAR dataset; data mining; performance evaluation; data science; opinion-mining

Online: 2 August 2019 (08:49:27 CEST)

Preprint ARTICLE | doi:10.20944/preprints202208.0224.v1

Recognition of Vehicles Entering Expressway Service Areas and Estimation of Dwell Time Using ETC Data

Qiqin Cai, Dingrong Yi, Fumin Zou, Zhaoyi Zhou, Nan Li, Feng Guo

Subject: Engineering, Automotive Engineering Keywords: VR-XGBoost; K-VDTE; ETC data; ESAs; data mining

Online: 12 August 2022 (03:53:23 CEST)

Preprint ARTICLE | doi:10.20944/preprints201909.0040.v1

Application of Data Mining on Web Usage Data for Security: WebSecuDMiner

Muhammad Zia Aftab Khan, Jihyun Park

Subject: Business, Economics And Management, Business And Management Keywords: data mining; security; association rule; ECLAT

Online: 4 September 2019 (03:48:58 CEST)

Preprint ARTICLE | doi:10.20944/preprints201610.0012.v1

Bio-Resource Exchange: Study of Prevalence of Antibody Donation and Development of a Web Portal to Facilitate it

Sandeep Subramanian, Madhavi Ganapathiraju

Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: data exchange; resource donations; text mining

Online: 5 October 2016 (15:08:32 CEST)

Preprint ARTICLE | doi:10.20944/preprints202311.1570.v1

Consore: A Powerful Federated Data Mining Tool Driving a French Research Network to Accelerate Cancer Research

Julien Guérin, Amine Nahid, Louis Tassy, Marc Deloger, François Bocquet, Simon Thézenas, Emmanuel Desandes, Marie-Cécile Le Deley, Xavier DURANDO, Anne Jaffré, Ikram Es Saad, Hugo Crochet, Marie Le Morvan, François Lion, Judith Raimbourg, Oussama Khay, Franck Craynest, Alexia Giro, Yec'han Laizet, Aurélie Bertaut, Frédérik Joly, Alain Livartowski, Pierre Etienne Heudel

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: cancer research; cancer; natural language processing; data mining; data warehouse; big data

Online: 26 November 2023 (05:13:14 CET)

Preprint ARTICLE | doi:10.20944/preprints202108.0256.v1

Effect of Non-Academic Parameters on Student’s Performance

Shantanu Lokhande, Vedant Bahel

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Learning Analytics, Education, Educational Data Mining, Pattern Recognition, Data Visualization.

Online: 11 August 2021 (11:23:48 CEST)

Preprint ARTICLE | doi:10.20944/preprints201806.0440.v1

An Efficient Grid-based K-prototypes Algorithm for Sustainable Decision Making Using Spatial Objects

Hong-Jun Jang, Byoungwook Kim, Jongwan Kim, Soon-Young Jung

Subject: Computer Science And Mathematics, Computational Mathematics Keywords: clustering; spatial data; grid-based k-prototypes; data mining; sustainability

Online: 27 June 2018 (10:21:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints202401.1143.v1

Effects of Different Physical Exercise Programs on the Anthropometric, Cardiovascular, Metabolic, and Strength Variables of Elderly Participating in Health Care Programs

Aristéia Sampaio, Altemir Braga, Jader Bezerra, Antônio Castro, Luis Carlos Gonçalves, Aníbal Magalhães-Neto, Thiago Fidale, Miguel Bortolini, Romeu Silva

Subject: Biology And Life Sciences, Aging Keywords: health promotion; data mining; sports medicine; metabolism

Online: 15 January 2024 (16:12:34 CET)

OBJECTIVE: To analyze the effects of different physical exercise programs on the anthropometric, cardiovascular, metabolic, and strength variables of elderly people participating in health care programs. METHODS: Controlled clinical trial with 60 elderly participants from health care groups, allocated into four groups: resistance exercises in open-air gyms, GT1 (n = 17); aerobic and localized exercises, GT2 (n = 11); resistance exercises at a gym, GT3 (n = 17); and a control group of non-exercise practitioners, CG (n = 15). Anthropometric (BMI, % body fat, and waist circumference), cardiovascular (SBP, DBP, HR, and DP), metabolic (total cholesterol, triglycerides, and glycemia), and strength variables were evaluated before and after 16 weeks of intervention. The analysis applied descriptive statistics, Shapiro-Wilk and Kolmogorov-Smirnov normality tests, an equal variance test, Student's t-test, the non-parametric Mann-Whitney test, one-way ANOVA, multivariate data analysis using data mining and machine learning techniques, Pearson and Spearman correlation tests, classical clustering (agglomerative hierarchical method), Principal Component Analysis (PCA), Z scores, the Fruchterman-Reingold algorithm, the Euclidean similarity index, and Cohen's equations. RESULTS: In terms of effect size, the morphofunctional variables for the GT1 group showed a small effect for fat percentage, WHR, and ULS and a medium effect for LLS. For the GT2 group, there was a small effect for fat percentage, WHR, and ULS and a large effect for LLS. For the GT3 group, there was a small effect for fat percentage and HC, a medium effect for ULS, and a large effect for LLS. For the metabolic variables, there was a small effect for glycemia in GT1, a medium effect for glycemia and triglycerides in GT2, a small effect for total cholesterol in GT3, and a large effect for glycemia in CG, the latter being an increase in this analyte.
For the cardiovascular variables, there was a small effect for SBP and HR in GT1; a small effect for DBP, HR, and DP and a large effect for SBP in GT2; a small effect for SBP, DBP, and HR in GT3; and small effects for SBP and HR in CG. BMI (P = 0.0007) and body fat % (P = 0.007) were correlated with ULS. The variables TG, LLS, and ULS differed most across the types of stimulus chosen as interventions in the present study. CONCLUSIONS: The results of this study indicate that GT2 caused larger percentage reductions in body mass and BMI, whereas GT3 caused a greater reduction in fat percentage. GT2 caused the greatest decrease in waist circumference, while GT3 caused the largest decline in hip circumference. All three exercise groups showed an increase in strength (ULS and LLS): GT1 the smallest, GT2 an intermediate increase, and GT3 the greatest, with considerable effects in GT2 and GT3. Finally, GT2 caused a greater percentage reduction in systolic blood pressure (large effect) and double product (small effect) than the other groups. The three methods proved to be efficient, but each has particularities that may favor one over another depending on the patient's health conditions, objectives, or characteristics at the time the prescriber makes the choice.
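
The small, medium, and large effects reported in this abstract can be illustrated with a minimal sketch of Cohen's d. The abstract does not say which variant of Cohen's equations or which cut-offs were used, so the pooled-standard-deviation form and the conventional 0.2 / 0.5 / 0.8 thresholds below are assumptions, and the sample values are hypothetical rather than data from the study.

```python
import math
from statistics import mean, stdev


def cohens_d(before, after):
    """Cohen's d for two samples, using the pooled standard deviation."""
    n1, n2 = len(before), len(after)
    s1, s2 = stdev(before), stdev(after)
    pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(after) - mean(before)) / pooled


def effect_label(d):
    """Map |d| to Cohen's conventional labels (assumed cut-offs: 0.2, 0.5, 0.8)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


# Hypothetical lower-limb strength scores before and after an intervention.
before = [20.0, 22.0, 19.0, 24.0, 21.0]
after = [24.0, 27.0, 22.0, 28.0, 25.0]

d = cohens_d(before, after)
print(round(d, 2), effect_label(d))
```

The sign of d shows the direction of the change, which matters for readings such as the glycemia increase in the control group; the label alone only reflects magnitude.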

Preprint ARTICLE | doi:10.20944/preprints202312.1377.v1

Informer Model with Season-Aware Block for Efficient Power Series Forecasting

Yunlong Cui, Zhao Li, Peng Zhang

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: LSTF; self-attention; data mining; temporal covariates

Online: 29 December 2023 (09:17:31 CET)

Preprint ARTICLE | doi:10.20944/preprints201906.0144.v1

Applications of Data Mining Algorithms for Network Security

Kai Chain

Subject: Computer Science And Mathematics, Security Systems Keywords: data mining; network security; association rules; DDoS

Online: 16 June 2019 (02:42:59 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.1391.v1

An Automated Method for Extracting and Analyzing Railway Infrastructure Cost Data

Daniel Adanza Dopazo, Lamine Mahdjoubi, Bill Gething

Subject: Engineering, Transportation Science And Technology Keywords: data extraction; data mining; railway infrastructure costs; infrastructure costs data analysis; cost analysis

Online: 18 August 2023 (16:03:08 CEST)

Preprint ARTICLE | doi:10.20944/preprints202007.0078.v1

Data Driven Analytics for Personalized Medical Decision Making

Nataliia Melnykova, Nataliya Shakhovska, Michal Gregus, Volodymyr Melnykov, Mariana Zakharchuk, Olena Vovk

Subject: Computer Science And Mathematics, Information Systems Keywords: personalization; decision making; medical data; artificial intelligence; Data-driving; Big Data; Data Mining; Machine Learning

Online: 5 July 2020 (15:04:17 CEST)

Preprint ARTICLE | doi:10.20944/preprints202008.0487.v1

Extracting Reliable Twitter Data for Flood Risk Communication using Manual Assessment and Google Vision API from Text and Images

Xiaohui Liu, Bandana Kar, Francisco Alejandro Montiel Ishino, Chaoyang Zhang, Faustine Williams

Subject: Social Sciences, Geography, Planning And Development Keywords: Twitter; data reliability; risk communication; data mining; Google Cloud Vision API

Online: 22 August 2020 (02:32:40 CEST)

Preprint ARTICLE | doi:10.20944/preprints202310.1998.v1

Marburg Virus Outbreak and a New Conspiracy Theory: Findings from a Comprehensive Analysis of Web Behavior

Nirmalya Thakur, Shuqi Cui, Kesha A. Patel, Nazif Azizi, Victoria Knieling, Changhee Han, Audrey Poon, Rishika Shah

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: Marburg virus; big data; data mining; data analysis; google trends; web behavior; data science; conspiracy theory

Online: 31 October 2023 (07:02:07 CET)

During recent virus outbreaks, web behavior mining, modeling, and analysis have served as means to examine, explore, interpret, assess, and forecast the worldwide perception, readiness, reactions, and response linked to these outbreaks. The recent outbreak of Marburg virus disease (MVD), its high fatality rate, and the conspiracy theory linking the FEMA alert signal in the United States on October 4, 2023, with MVD and a zombie outbreak produced a diverse range of reactions in the general public, which led to a surge in web behavior in this context. As a result, “Marburg Virus” featured in the list of top trending topics on Twitter on October 3, 2023, and “Emergency Alert System” and “Zombie” featured in that list on October 4, 2023. No prior work in this field has mined and analyzed the emerging trends in web behavior in this context. The work presented in this paper aims to address this research gap and makes multiple scientific contributions to this field. First, it presents the results of time series forecasting of the search interests related to MVD emerging from 216 different regions on a global scale using ARIMA, LSTM, and Autocorrelation. The results of this analysis present the optimal model for forecasting web behavior related to MVD in each of these regions. Second, the correlation between search interests related to MVD and search interests related to zombies (in the context of this conspiracy theory) was investigated. The findings show a statistically significant correlation between MVD-related searches and zombie-related searches (in the context of this conspiracy theory) on Google on October 4, 2023, in several regions. Finally, the correlation between zombie-related searches (in the context of this conspiracy theory) in the United States and other regions was investigated.
This analysis helped to identify those regions where this correlation was statistically significant.
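
As a rough illustration of the correlation analysis described in this abstract, the sketch below computes a Pearson correlation between two search-interest series and estimates its significance with a permutation test. The abstract does not name the correlation measure or the significance test, so both choices are assumptions, and the two series are hypothetical values, not Google Trends data.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical relative search interest (0-100) for one region over eight intervals.
mvd_interest = [10, 25, 40, 80, 100, 60, 30, 15]
zombie_interest = [5, 20, 35, 70, 95, 55, 25, 10]

r = pearson_r(mvd_interest, zombie_interest)
p = permutation_p_value(mvd_interest, zombie_interest)
print(f"r = {r:.3f}, p = {p:.4f}")
```

Repeating this per region and keeping the regions whose p-value falls below a chosen threshold (for example 0.05) would mirror the region-by-region reading in the abstract; with many regions, a multiple-comparison correction would also be worth considering.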

Preprint COMMUNICATION | doi:10.20944/preprints202206.0172.v3

MonkeyPox2022Tweets: The First Public Twitter Dataset on the 2022 MonkeyPox Outbreak

Nirmalya Thakur

Subject: Computer Science And Mathematics, Information Systems Keywords: Monkeypox; monkey pox; Twitter; Dataset; Tweets; Social Media; Big Data; Data Mining; Data Science

Online: 25 July 2022 (09:41:19 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.0858.v6

Detecting CSV File Dialects by Table Uniformity Measurement and Data Type Inference

Wilfredo García

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: comma separated values; CSV dialect detection; data mining

Online: 18 March 2024 (07:25:57 CET)

Preprint ARTICLE | doi:10.20944/preprints202105.0601.v1

A Study on Ways to Improve Mobile RPG Using Big Data Text Mining

DongHyun Youm, JungYoon Kim

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Mobile RPG; Big Data; Text Mining; Topic Modeling

Online: 25 May 2021 (10:21:36 CEST)

Preprint DATA DESCRIPTOR | doi:10.20944/preprints202308.1701.v1

A Dataset of Search Interests Related to Disease X Originating from Different Geographic Regions

Nirmalya Thakur, Kesha A. Patel, Isabella Hall, Yuvraj Nihal Duggal, Shuqi Cui

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: disease X; big data; data science; data analysis; dataset development; database; google trends; data mining; healthcare; epidemiology

Online: 24 August 2023 (05:48:54 CEST)

Preprint COMMUNICATION | doi:10.20944/preprints202303.0453.v1

Analysis of Public Discourse on Twitter involving COVID-19 and MPox: Findings from Sentiment Analysis and Text Analysis

Nirmalya Thakur

Subject: Social Sciences, Media Studies Keywords: COVID-19; MPox; Twitter; Big Data; Data Mining; Data Analysis; Sentiment Analysis; Data Science; Social Media; Monkeypox

Online: 27 March 2023 (08:39:28 CEST)

Mining and analysis of the Big Data of Twitter conversations have been of significant interest to the scientific community in the fields of healthcare, epidemiology, big data, data science, computer science, and their related areas, as can be seen from several works in the last few years that focused on sentiment analysis and other forms of text analysis of Tweets related to Ebola, E-Coli, Dengue, Human papillomavirus (HPV), Middle East Respiratory Syndrome (MERS), Measles, Zika virus, H1N1, influenza-like illness, swine flu, flu, Cholera, Listeriosis, cancer, Liver Disease, Inflammatory Bowel Disease, kidney disease, lupus, Parkinson's, Diphtheria, and West Nile virus. The recent outbreaks of COVID-19 and MPox have served as "catalysts" for Twitter usage related to seeking and sharing information, views, opinions, and sentiments involving both these viruses. While a few works published in the last few months have focused on sentiment analysis of Tweets related to either COVID-19 or MPox, no prior work in this field has analyzed Tweets focusing on both COVID-19 and MPox at the same time. To address this research gap, a total of 61,862 Tweets that focused on MPox and COVID-19 simultaneously, posted between May 7, 2022, and March 3, 2023, were studied to perform sentiment analysis and text analysis. The findings of this study are manifold. First, the results of sentiment analysis show that almost half the Tweets (46.88%) had a negative sentiment. These were followed by Tweets that had a positive sentiment (31.97%) and Tweets that had a neutral sentiment (21.14%). Second, this paper presents the top 50 hashtags that were used in these Tweets. Third, it presents the top 100 most frequently used words that are featured in these Tweets. The findings of text analysis show that some of the commonly used words involved directly referring to either or both viruses.
In addition to this, the presence of words such as "Polio", "Biden", "Ukraine", "HIV", "climate", and "Ebola" in the list of the top 100 most frequent words indicates that topics of conversation on Twitter in the context of COVID-19 and MPox also included a high level of interest in other viruses, President Biden, and Ukraine. Finally, a comprehensive comparative study that compares this work with 49 prior works in this field is presented to uphold the scientific contributions and relevance of the same.
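
The summary statistics described in this abstract (sentiment shares, top hashtags, top words) can be sketched with simple counting. The abstract does not describe the tokenization or the sentiment classifier used, so the regular expressions below are assumptions, the sentiment labels are taken as already assigned, and the Tweets are hypothetical examples rather than items from the dataset.

```python
import re
from collections import Counter


def sentiment_shares(labels):
    """Percentage of items per sentiment label, rounded to two decimals."""
    counts = Counter(labels)
    total = len(labels)
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


def top_hashtags(texts, k):
    """The k most frequent hashtags, case-insensitive."""
    tags = Counter()
    for text in texts:
        tags.update(re.findall(r"#\w+", text.lower()))
    return tags.most_common(k)


def top_words(texts, k):
    """The k most frequent alphanumeric tokens, case-insensitive."""
    words = Counter()
    for text in texts:
        words.update(re.findall(r"[a-z0-9]+", text.lower()))
    return words.most_common(k)


# Hypothetical Tweets and pre-assigned sentiment labels.
tweets = [
    "Worried about #mpox and #covid19 this winter",
    "New #covid19 booster and #mpox vaccine clinics open",
    "Is polio next after covid and mpox?",
]
labels = ["negative", "negative", "positive", "neutral"]

print(sentiment_shares(labels))
print(top_hashtags(tweets, 2))
print(top_words(tweets, 5))
```

In practice a stop-word filter would usually be applied before ranking words, otherwise function words such as "and" crowd out topical terms.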

Working Paper ARTICLE

Business Intelligence and Its Big Evolution

Andres Velosa, Gustavo Pabon

Subject: Engineering, Automotive Engineering Keywords: Business Intelligence; Data warehouse; Data Marts; Architecture; Data; Information; cloud; Data Mining; evolution; technologic companies; tools; software

Online: 24 March 2021 (13:06:53 CET)

Preprint ARTICLE | doi:10.20944/preprints202008.0074.v1

Clustering of Cardiovascular Disease Patients Using Data Mining Techniques with Principal Component Analysis and K-Medoids

Edy Irwansyah, Ebiet Salim Pratama, Margaretha Ohyver

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: data mining; cardiovascular diseases; cluster analysis; principal component analysis

Online: 4 August 2020 (03:56:19 CEST)

Preprint ARTICLE | doi:10.20944/preprints202006.0161.v1

Applications of Artificial Intelligence Technologies in COVID-19 Research: A Bibliometric Study

Md Mahbub Hossain, Shah Akib Sarwar, E. Lisako J. McKyer, Ping Ma

Subject: Medicine And Pharmacology, Epidemiology And Infectious Diseases Keywords: COVID-19; Coronavirus; Artificial intelligence; Machine learning; Data mining

Online: 14 June 2020 (03:34:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints201907.0338.v1

How Artificial Intelligence Can Improve Understanding in Challenging Chaotic Environments

Reza Hafezi

Subject: Engineering, Automotive Engineering Keywords: prediction; futures studies; complex environment; machine learning; data mining

Online: 30 July 2019 (03:48:37 CEST)

Preprint COMMUNICATION | doi:10.20944/preprints202206.0383.v2

Twitter Big Data as A Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions

Nirmalya Thakur

Subject: Computer Science And Mathematics, Information Systems Keywords: Exoskeleton; Twitter; Tweets; Big Data; social media; Data Mining; dataset; Data Science; Natural Language Processing; Information Retrieval

Online: 21 July 2022 (04:06:53 CEST)

Preprint COMMUNICATION | doi:10.20944/preprints202210.0351.v1

Machine Learning Heuristics on Gingivobuccal Cancer Gene Datasets Reveals Key Candidate Attributes for Prognosis

Tanvi Singh, Girik Malik, Saloni Someshwar, Hien Thi Thu Le, Rathnagiri Polavarapu, Laxmi Chavali, Jayaraman K. Valadi, Vijayaraghava Seshadri Sundararajan, Nidheesh M, Kavi Kishor PB, Prashanth N Suravajhala

Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: oral cancer; machine learning; gene prioritization; genomic datasets; data mining

Online: 24 October 2022 (07:10:08 CEST)

Preprint ARTICLE | doi:10.20944/preprints202107.0230.v1

Diverging from News Media: An Exploratory Study on the Changing Dynamics between Media and Public Attention on Cancer in China from 2011-2020

Yangkun Huang, Xiaoping Xu, Sini Su

Subject: Business, Economics And Management, Accounting And Taxation Keywords: Cancer; Public Attention; News Media; Granger Causality Test; Data Mining

Online: 9 July 2021 (15:44:24 CEST)

Preprint ARTICLE | doi:10.20944/preprints202103.0738.v1

Impact of the Coronavirus Pandemic on Science and Society: Insights from Temporal Bibliometric Networks

Ramya Gupta, Abhishek Prasad, Suresh Babu, Gitanjali Yadav

Subject: Computer Science And Mathematics, Analysis Keywords: bibliometry; coronavirus; text and data mining; SARS; MERS; COVID-19

Online: 31 March 2021 (17:30:56 CEST)

Preprint ARTICLE | doi:10.20944/preprints202001.0048.v1

The NBS-LRR Gene Class is a Small Family in Cucurbita pepo

Belen Roman, Pedro Gomez, Belen Pico, Jose V. Die

Subject: Biology And Life Sciences, Horticulture Keywords: cis-regulatory element; data mining; NBS-LRR resistance genes; Zucchini

Online: 5 January 2020 (17:22:10 CET)

Preprint ARTICLE | doi:10.20944/preprints201812.0056.v1

Exploring Group Movement Pattern through Cellular Data: A Case Study of Tourists in Hainan

Xinning Zhu, Tianyue Sun, Hao Yuan, Zheng Hu, Jiansong Miao

Subject: Computer Science And Mathematics, Computer Science Keywords: Low accuracy CDRs; Group movement pattern; Data mining; Travel behaviors

Online: 4 December 2018 (10:02:30 CET)

Preprint ARTICLE | doi:10.20944/preprints201809.0466.v1

Topological Signature of 19th Century Novelists: Persistence Homology in Context-Free Text Mining

Shafie Gholizadeh, Armin Seyeditabari, Wlodek Zadrozny

Subject: Computer Science And Mathematics, Information Systems Keywords: topological data analysis; text mining; computational topology; style; persistent homology

Online: 24 September 2018 (15:33:02 CEST)

Preprint ARTICLE | doi:10.20944/preprints201801.0231.v1

Analysis of Occupational Accidents in Underground and Surface Mining in Spain Using Data Mining Techniques

Lluís Sanmiquel, Marc Bascompta, Josep Ma. Rossell, Hernán Anticoi, Eduard Guash

Subject: Engineering, Control And Systems Engineering Keywords: Data mining; Association rules; Previous Cause; Type of Accident; Overexertion

Online: 24 January 2018 (19:40:52 CET)

Preprint ARTICLE | doi:10.20944/preprints201801.0017.v1

Determining Quality of Articles in Polish Wikipedia Based on Linguistic Features

Włodzimierz Lewoniewski, Krzysztof Węcel, Witold Abramowicz

Subject: Computer Science And Mathematics, Information Systems Keywords: Wikipedia; Polish; information quality; linguistic features; linguistics; data mining; NLP

Online: 3 January 2018 (02:03:51 CET)

Preprint ARTICLE | doi:10.20944/preprints202310.1450.v1

Analysis of Open-Social Data Behavior Concerning Gasoline Stealing: A Case Study of the Mexican Petroleum Crisis

Roberto Zagal-Flores, Felix Mata-Rivera, Miguel Torres-Ruiz, Violeta Shaid Benitez-Valerio, Rolando Quintero, Giovanni Guzmán, Joel Omar Juárez-Gambino

Subject: Computer Science And Mathematics, Computer Science Keywords: semantic and linguistic technologies; spatial data mining; spatial data analytics; spatio-temporal characterization; social media

Online: 23 October 2023 (16:15:23 CEST)

Preprint ARTICLE | doi:10.20944/preprints202201.0229.v1

Applying the FAIR4Health Solution to Identify Multimorbidity Patterns and Their Association With Mortality Through a Frequent Pattern Growth Association Algorithm

Jonás Carmona-Pírez, Beatriz Poblador-Plou, Antonio Poncel-Falcó, Jessica Rochat, Celia Alvarez-Romero, Alicia Martínez-García, Carmen Angioletti, Marta Almada, Mert Gencturk, Anil Sinaci, Jara Eloisa Ternero-Vega, Christophe Gaudet-Blavignac, Christian Lovis, Rosa Liperoti, Elisio Costa, Carlos Luis Parra-Calderón, Aida Moreno-Juste, Antonio Gimeno-Miguel, Alexandra Prados-Torres

Subject: Public Health And Healthcare, Public Health And Health Services Keywords: FAIR principles; Multimorbidity; Mortality; Research data management; Pathfinder case study; Privacy-Preserving Distributed Data Mining.

Online: 17 January 2022 (13:04:03 CET)

Preprint ARTICLE | doi:10.20944/preprints202308.0219.v1

Course Prophet: A Machine Learning-Based System for Early Prediction of Student Failure in Numerical Methods Course in the Bachelor’s Degree in Engineering at the University of Córdoba, Colombia

Isaac Caicedo-Castro

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Machine learning; educational data mining; supervised methods; classifiers; course failure risk

Online: 3 August 2023 (02:43:48 CEST)

Preprint ARTICLE | doi:10.20944/preprints202305.1894.v1

Analysis of Relation Between Brainwave Activity and Reaction Time of Short-Haul Pilots Based on EEG Data

Bartosz Binias, Dariusz Myszor, Sandra Binias, Krzysztof Cyran

Subject: Biology And Life Sciences, Neuroscience And Neurology Keywords: Aircraft control human factors; Cognitive workload; Data Mining; Electroencephalography; Fatigue; Safety

Online: 26 May 2023 (08:40:35 CEST)

Preprint ARTICLE | doi:10.20944/preprints202206.0360.v1

The Effect of Small Particulate Matter on Tourism and Related SMEs in Chiang Mai, Thailand

Phisek Srinamphon, Sainatee Chernbumroong, Korrakot Tippayawong

Subject: Business, Economics And Management, Business And Management Keywords: tourism and related; SMEs; small particulate matters; association rules; data mining

Online: 27 June 2022 (10:24:27 CEST)

Preprint ARTICLE | doi:10.20944/preprints202111.0499.v1

A Herd Effect Detection Method Based on Text Features

Tingzhen Liu, Tong Zhou, Yuxin Shi, Siyuan Liu, Jin Gao

Subject: Computer Science And Mathematics, Information Systems Keywords: Social Networks; Data Mining; Graph Structure; Natural Language Processing; Machine Learning

Online: 26 November 2021 (10:45:06 CET)

Preprint ARTICLE | doi:10.20944/preprints202108.0564.v1

Initial Experience in Developing AI Algorithms in Medical Imaging Based on Annotations Derived From an E-Learning Platform

Maurice Henkel, Hanns-Christian Breit, Patricia Wiesner, Jakob Wasserthal, Victor Parmar, Thomas Weikert, Verena Hofmann, Sebastian Eiden, Lena Schmülling, Konrad Appelt, David Winkel, Fabiano Paciolla, Christian A. Lechtenboehmer, Moritz Vogt, Laurent Binsfeld, Raphael Sexauer, Christian Wetterauer, Kirsten D. Mertz, Alexander Sauter, Bram Stieltjes

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: E-learning derived annotations; Pneumothorax; Artificial intelligence; Crowdsourcing; Educational data mining

Online: 31 August 2021 (11:23:12 CEST)

Preprint ARTICLE | doi:10.20944/preprints201811.0216.v1

Safety Helmet Wearing Management System for Construction Workers Using Three-Axis Accelerometer Sensor

SungHun Kim, Changwon Wang, Se Dong Min, Seung-Hyun Lee

Subject: Engineering, Architecture, Building And Construction Keywords: Construction, worker safety, safety helmet, three-axis accelerometer sensor, data mining

Online: 8 November 2018 (14:03:21 CET)

Preprint ARTICLE | doi:10.20944/preprints201810.0678.v1

Unstructured Text in EMR Improves Prediction of Death after Surgery in Children

Oguz Akbilgic, Ramin Homayouni, Kevin Heinrich, Max Raymond Langham Jr., Robert Lowell Davis

Subject: Medicine And Pharmacology, Pediatrics, Perinatology And Child Health Keywords: post-operative death; unstructured data; logistic regression; text mining; surgery outcome

Online: 29 October 2018 (11:46:18 CET)

Preprint ARTICLE | doi:10.20944/preprints201806.0247.v1

Association Rules for Understanding Policyholder Lapses

Himchan Jeong, Guojun Gan, Emiliano A. Valdez

Subject: Computer Science And Mathematics, Analysis Keywords: data mining; association rule learning; policyholder lapse; auto insurance; market inefficiency

Online: 15 June 2018 (09:01:03 CEST)

Preprint ARTICLE | doi:10.20944/preprints201708.0055.v1

A Survey of Data Processing of EMR (Electronic Medical Record) Based on Data Mining

Wencheng Sun, Fang Liu, Zhiping Cai, Shengqun Fang, Guoyan Wang

Subject: Computer Science And Mathematics, Information Systems Keywords: EMR; data preprocessing; text mining; information extraction; medical decision support system

Online: 15 August 2017 (05:46:43 CEST)

Preprint ARTICLE | doi:10.20944/preprints202108.0301.v1

Fusion of Unobtrusive Sensing Solutions for Sprained Ankle Rehabilitation Exercises Monitoring in Home Environments

Idongesit Ekerete, Matias Garcia-Constantino, Yohanca Diaz, Chris Nugent, James Mclaughlin

Subject: Engineering, Electrical And Electronic Engineering Keywords: Unobtrusive Sensing; Data Fusion; Data Mining; Radar Sensing; Thermal Sensing; Sprained Ankle; Infrared Thermopile Array; Home Environment.

Online: 13 August 2021 (15:12:24 CEST)

Working Paper ARTICLE

Adolescents Are More than Twice as Likely to Consume Soft Drinks and Chips at Locations Away from Home and School: Correspondence Analysis and Logistic Regression Results from the UK National Diet and Nutrition Survey Rolling Programme

Luigi Palla, Andrew Chapman, Eric Beh, Gerda Pot, Eva Almiron-Roig

Subject: Medicine And Pharmacology, Dietetics And Nutrition Keywords: obesity; eating context; nutrient-poor foods; nutritional surveillance; adolescents; survey data analysis; data-mining; correspondence analysis; biplots

Online: 9 June 2020 (13:52:45 CEST)

Preprint ARTICLE | doi:10.20944/preprints202111.0440.v1

Hybrid Algorithm for Anomaly Removal in Time Series Data Mining

Abdul Razaque, Marzhan Abenova, Munif Alotaibi, Bandar Alotaibi, Hamoud Alshammari, Salim Hariri, Aziz Alotaibi

Subject: Engineering, Control And Systems Engineering Keywords: time series; NMP algorithm; anomalies; data mining; similarities in time series; clustering

Online: 23 November 2021 (17:51:42 CET)

Preprint REVIEW | doi:10.20944/preprints202108.0345.v1

Educational Data Mining, Student Academic Performance Prediction, Prediction Methods, Algorithms and Tools: An Overview of Reviews

Chaka Chaka

Subject: Social Sciences, Education Keywords: student academic performance; educational data mining; methods; algorithms; tools; higher education; overview

Online: 16 August 2021 (14:04:57 CEST)

Preprint ARTICLE | doi:10.20944/preprints201811.0328.v1

Medi-Test: Generating Tests from Medical Reference Texts

Íonuț Pistol, Diana Trandabăț, Mădălina Răschip

Subject: Computer Science And Mathematics, Analysis Keywords: e-learning; automatic test generation; medical ontology; data mining for medical texts

Online: 14 November 2018 (09:45:38 CET)

Preprint ARTICLE | doi:10.20944/preprints201804.0008.v1

A Parallel Software Pipeline for Personalized Medicine

Giuseppe Agapito, Pietro Hiram Guzzi, Mario Cannataro

Subject: Computer Science And Mathematics, Information Systems Keywords: SNP; multiple analysis pipeline; pharmacogenomics; overall survival curves; data mining; statistical analysis

Online: 2 April 2018 (07:53:23 CEST)

Personalized medicine is one aspect of P4 medicine (predictive, preventive, personalized and participatory) and is based on tailoring medical care to the characteristics of each subject. In personalized medicine, the development of medical treatments and drugs is tailored to the individual characteristics and needs of each subject, according to the study of diseases at different scales, from genotype to phenotype. To achieve the goal of personalized medicine, it is necessary to employ high-throughput methodologies such as Next Generation Sequencing (NGS), Genome-Wide Association Studies (GWAS), Mass Spectrometry or Microarrays, which can investigate a single disease from a broader perspective. For example, by using genotyping microarrays (e.g. collections of Single Nucleotide Polymorphisms, SNPs) it is possible to uncover the reasons (i.e. mutations in genes) why a treatment works properly in some patients (for example, absence of mutated genes) but does not work in others (presence of mutated genes). A side effect of high-throughput methodologies is the massive amount of data produced by each experiment, which poses several challenges (e.g. high execution time and memory requirements) to bioinformatic software. Thus, a main requirement of modern bioinformatic software is the use of good software engineering methods and efficient programming techniques able to face those challenges, including parallel programming and efficient, compact data structures. To exploit the full potential of this massive amount of data in the shortest possible time (before the data becomes obsolete), parallel software tools for efficient data collection and analysis are needed.
Moreover, due to the heterogeneity of the data produced by the different kinds of experimental platforms, it is necessary to automate, in a comprehensive software pipeline, the various steps that compose a bioinformatic analysis, such as: the preprocessing of raw data to remove noise or corrupted data; the annotation of data with external knowledge (e.g. Gene Ontology); and the integration of molecular data with clinical data. Such steps are necessary to make statistical or data mining analysis more effective. This paper presents the design and experimental evaluation of a comprehensive software pipeline, named microPipe, for the preprocessing, annotation and analysis of microarray-based SNP genotyping data. A case study in pharmacogenomics is presented. The main advantages of using microPipe are: the reduction of errors that may occur when trying to make data compatible among different tools; the ability to analyze huge datasets in parallel; and the easy annotation and integration of data.
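The three steps named in the abstract (preprocessing, annotation, integration) applied to data chunks in parallel can be sketched as follows. This is an illustrative sketch only, not microPipe's implementation: the record layout, the `ANNOTATIONS` table, and all function names are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical annotation table; a real pipeline would query an external
# knowledge source such as Gene Ontology.
ANNOTATIONS = {"rs1": "GENE_A", "rs2": "GENE_B"}

VALID_CALLS = {"AA", "AB", "BB"}  # assumed genotype encoding


def preprocess(records):
    """Drop records whose genotype call is missing or not a valid value."""
    return [r for r in records if r.get("genotype") in VALID_CALLS]


def annotate(records):
    """Attach a gene label to each SNP record (None if the SNP is unknown)."""
    return [{**r, "gene": ANNOTATIONS.get(r["snp"])} for r in records]


def integrate(records, clinical):
    """Join molecular records with clinical data by patient id."""
    return [
        {**r, **clinical[r["patient"]]}
        for r in records
        if r["patient"] in clinical
    ]


def run_pipeline(chunks, clinical, workers=4):
    """Run the three steps on every chunk concurrently, keeping chunk order."""
    def process(chunk):
        return integrate(annotate(preprocess(chunk)), clinical)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(process, chunks)  # map preserves input order
        return [row for chunk in results for row in chunk]
```

Threads keep the sketch short; for CPU-bound work on large genotyping datasets, a process pool or a distributed scheduler would be the more likely choice.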

Preprint ARTICLE | doi:10.20944/preprints201707.0011.v1

RGCA: a Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization

Yuling Fang, Qingkui Chen, Neal N. Xiong, Deyu Zhao, Jingjuan Wang

Subject: Computer Science And Mathematics, Computer Science Keywords: Internet of Things; data mining algorithms; GPU cluster; performance; energy consumption; reliability

Online: 6 July 2017 (12:40:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints202401.0571.v1

Hybrid-electric Vehicle Powertrain Mounting System Optimization Based on Cross-industry Standard Process for Data Mining

Yudong Wu, Dandan Zhao, Jingyuan Peng, Xingyu Xiang, Haibo Huang

Subject: Engineering, Automotive Engineering Keywords: Hybrid-electric vehicle powertrain mounting; Data-mining; Mounting stiffness; Multi-SVR; MRTs; MLPR

Online: 8 January 2024 (07:10:00 CET)

Preprint ARTICLE | doi:10.20944/preprints202311.1073.v2

Predicting Students' Progress in Intelligent Tutoring Systems

Guijia He, Chengwei Huang, Steven Yang, Kelvin Lwin, Eng Lieh Ouh, Ran Ju, Xiaoming Zhu

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Academic Performance; Progress Prediction; Score Prediction; Learning Behavior; Learning Dataset; Educational Data Mining

Online: 20 December 2023 (10:19:25 CET)

Preprint ARTICLE | doi:10.20944/preprints202310.1988.v1

Assistive Learning Intelligence Navigator (ALIN) Dataset: Predicting Test Results from Learning Data

Guijia He, Chengwei Huang, Steven Yang, Kelvin Lwin, Eng Lieh Ouh, Ran Ju, Xiaoming Zhu

Subject: Computer Science And Mathematics, Computer Science Keywords: academic performance; progress prediction; score prediction; learning behavior; learning dataset; educational data mining

Online: 31 October 2023 (09:40:38 CET)

Preprint COMMUNICATION | doi:10.20944/preprints202104.0575.v2

A Personalized Machine-Learning-enabled Method for Efficient Research in Ethnopharmacology. The case of Southern Balkans and Coastal zone of Asia Minor

Evangelos Axiotis, Andreas Kontogiannis, Eleftherios Kalpoutzakis, George Giannakopoulos

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Ethnopharmacology; Artificial Intelligence; Web Crawling; Active Learning; Reinforcement Learning; Text Mining; Big Data

Online: 23 June 2021 (11:47:32 CEST)

Preprint REVIEW | doi:10.20944/preprints201911.0338.v1

Sentiment Analysis on Indian Indigenous Languages: A Review on Multilingual Opinion Mining

Sonali Rajesh Shah, Abhishek Kaushik

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Indian; Sentiment Analysis; Indigenous Languages; Machine Learning; Deep learning; Data; Opinion Mining; Languages.

Online: 27 November 2019 (09:30:07 CET)

Preprint ARTICLE | doi:10.20944/preprints201906.0202.v1

Developing a Data Mining Based Model to Extract Predictor Factors in Energy Systems: Application of Global Natural Gas Demand

Reza Hafezi, Amir Naser Akhavan, Mazdak Zamani, Saeed Pakseresht, Shahab Shamshirband

Subject: Engineering, Mechanical Engineering Keywords: Natural gas demands; Prediction; Energy market; Genetic algorithm; Artificial neural network; Data mining.

Online: 20 June 2019 (15:58:25 CEST)

Preprint REVIEW | doi:10.20944/preprints202310.0056.v2

Semiconductor Manufacturing Process Improvement Using Data-Driven Methodologies

Hribhu Chowdhury

Subject: Engineering, Industrial And Manufacturing Engineering Keywords: semiconductor manufacturing; statistical process control (SPC); data mining techniques; data-driven decision models; process improvement; product yield rate; process stability

Online: 7 October 2023 (09:53:09 CEST)

The paper investigates the intricacies of semiconductor manufacturing, a highly complex process entailing a wide array of subprocesses and diverse equipment. Semiconductors are miniaturized integrated circuits comprising numerous components. The semiconductor manufacturing process begins with thin, disc-shaped silicon wafers. On each wafer, up to thousands of identical chips can be prepared, depending on the diameter of the wafer, by building up the circuits layer by layer in a wafer fab. The size of the semiconductors requires a high number of units to be produced, which in turn generates a large amount of data that must be controlled to improve the manufacturing process. Therefore, the collection and analysis of equipment data, process data, and machine history data throughout the manufacturing process are required to diagnose faults, monitor the process, and manage the manufacturing process effectively. This research focuses on improving the semiconductor manufacturing process through a rigorous analysis of collected manufacturing process data, employing statistical process control (SPC), data mining techniques, and data-driven decision models. The project's primary objective is to increase manufacturing process stability and productivity by using the latest data-driven technologies from the scientific community. A structured review was undertaken, exploring contemporary data-driven methodologies in semiconductor manufacturing process improvement, specifically pertaining to process capability, product yield rate, and process stability. This review provides a comprehensive evaluation of data-driven methodologies applicable to conventional semiconductor manufacturing facilities, aiming to drive substantial process improvements. It features a detailed demonstration facilitating the selection of optimal semiconductor manufacturing processes to enhance overall operational performance.
This study of process improvement in the semiconductor manufacturing steps through the application of data-driven methodologies will be effective in delivering advanced, real-time, and proactive control decisions throughout the manufacturing facilities. This endeavor is expected to promptly provide critical insights for enterprise manufacturing decision-makers to reduce manufacturing cycle time, improve the product yield rate, and increase the overall efficiency of the manufacturing process.
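As one concrete instance of the statistical process control (SPC) mentioned above, a Shewhart individuals chart flags measurements that fall outside three-sigma control limits. The sketch below is a generic textbook construction, not a method taken from the paper; the function names and the sample measurements are hypothetical. Sigma is estimated from the average moving range divided by the constant d2 = 1.128 (subgroup size 2).

```python
from statistics import mean


def individuals_limits(baseline):
    """Return (LCL, center line, UCL) for an individuals control chart.

    Sigma is estimated as the mean moving range divided by d2 = 1.128.
    """
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma_hat = mean(moving_ranges) / 1.128
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


def out_of_control(values, lcl, ucl):
    """Return the indices of measurements outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical in-control baseline (e.g. a film thickness in arbitrary units).
baseline = [10.0, 10.2, 9.8, 10.1, 9.9]
lcl, center, ucl = individuals_limits(baseline)
flagged = out_of_control([10.1, 10.9, 9.0], lcl, ucl)
```

Limits are computed once from a baseline that is believed to be in control and then applied to new measurements, so that a drifting process does not widen its own limits.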

Preprint ARTICLE | doi:10.20944/preprints202309.1930.v1

Data Mining and Fusion Framework for In-Home Monitoring Applications

Idongesit Ekerete, Matias Garcia-Constantino, Paul McCullagh, Christopher Nugent, James McLaughlin

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: sensing solution; thermal sensor; Radar sensor; sensor fusion; data mining; in-home; machine learning

Online: 28 September 2023 (10:06:06 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.1302.v1

The Combined Effect of Big 5 Personality Traits on Fourth-Graders' Math Performance

Roberto Araya, Pablo González-Vicente

Subject: Social Sciences, Education Keywords: Big 5; Child Personality; Elementary School Mathematic Performance; Socio Emotional Effects; Educational Data Mining

Online: 18 August 2023 (09:56:56 CEST)

Preprint ARTICLE | doi:10.20944/preprints202208.0495.v1

Deep Learning Approaches to Automatic Chronic Venous Disease Classification

Marina Barulina, Aschat Sanbaev, Sergey Okunkov, Ivan Ulitin, Ivan Okoneshnikov

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: chronic venous disease; deep learning; data mining; Resnet50; DeiT; automatic classification; automatic CEAP classification

Online: 29 August 2022 (12:46:56 CEST)

Preprint ARTICLE | doi:10.20944/preprints202005.0051.v1

Mapping the Spread of COVID-19 Outbreak in India

Vanshika Bidhan, Bhavini Malhotra, Mansi Pandit, Narayanan Latha

Subject: Public Health And Healthcare, Health Policy And Services Keywords: COVID-19; Data mining; Infection in India; R package; State-wise analysis; Statistical analysis

Online: 5 May 2020 (02:28:26 CEST)

Background & Objectives: The global pandemic caused by the novel coronavirus SARS-CoV-2 has claimed many lives worldwide. As the virus has spread rapidly, the world has witnessed an increasing number of confirmed cases and an increasing mortality rate. India is not far behind, with approximately 37,000 affected individuals as of May 2, 2020. The ongoing pandemic has raised several questions that need to be answered by analysing the transmission of the infection. The data were collected on a daily basis from WHO and other sites. We have represented the collated data graphically using statistical packages, R and other online software. The present study provides a holistic overview of the spread of COVID-19 infection in India. Methods: Real-time data queries were made based on daily observations, using publicly available data from reference websites for COVID-19 and other official government reports for the period February 15, 2020 to April 28, 2020. Statistical analysis was performed to draw important inferences regarding the COVID-19 trend in India. Results: A decrease in the growth rate of COVID-19 cases in India after lockdown and an improvement in the recovery rate during April were identified. The case fatality rate was estimated to be 3.22% of the total reported cases. State-wise analysis revealed a deteriorating situation in Maharashtra and Gujarat, among other states, as cases continued to increase rapidly there. A positive linear correlation between the number of deaths and total cases, and an exponential relation between population density and the number of cases reported per square km, were established. Interpretation & Conclusions: Despite early preventive measures taken by the Government of India, the increasing number of cases in India is a concern. This study compiles state-wise and district-wise data to report the daily confirmed cases, case fatalities and strategies adopted, in the form of case studies.
Understanding the transmission of SARS-CoV-2 in a diverse and populous country like India will be crucial in assessing the effectiveness of control policies against the spread of COVID-19 infection.
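The two statistics reported above, the case fatality rate and the linear correlation between deaths and total cases, can be computed as in the following sketch. The study itself used R; Python is used here only for illustration, the function names are assumptions, and the numbers are made-up placeholders rather than data from the study.

```python
from math import sqrt


def case_fatality_rate(deaths, confirmed):
    """Case fatality rate as a percentage of confirmed cases."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return 100.0 * deaths / confirmed


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder per-state counts, NOT taken from the study.
total_cases = [120, 450, 900, 1500]
total_deaths = [3, 12, 30, 47]

cfr = case_fatality_rate(sum(total_deaths), sum(total_cases))
r = pearson_r(total_cases, total_deaths)
```

A value of `r` close to 1 indicates the kind of positive linear relationship the abstract describes; it says nothing about causation, and the case fatality rate depends heavily on how many infections are actually detected.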

Preprint TECHNICAL NOTE | doi:10.20944/preprints201911.0073.v1

Beyond Traditional Covariates in Medical Informatics

Uri Kartoun

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: deep behavioral covariates; clinical informatics; predictive modeling; electronic medical records; machine-learning; data-mining

Online: 7 November 2019 (09:25:04 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.1771.v1

Exploring Predictive Factors for Heart Failure Progression in Hypertensive Patients Based on Medical diagnosis Data from the MIMIC-IV Database

Jinmyung Jung, Doyoon Kim, Inkyung Hwang

Subject: Public Health And Health Services, Public Health And Healthcare Keywords: heart failure; hypertension; predictive factors; MIMIC-IV database; data mining; XGBoost modeling; chi-square test

Online: 26 April 2024 (20:12:01 CEST)

Preprint REVIEW | doi:10.20944/preprints202309.2111.v1

Qualitative Comparative Analysis of Medical and Epidemiological Data

Valerii Tsvetkov, Ivan Tokin

Subject: Medicine And Pharmacology, Other Keywords: qualitative comparative analysis; qualitative analysis; data mining; calibration; truth table; logical minimization; QMC; eQMC; CCubes

Online: 29 September 2023 (13:05:00 CEST)

Preprint ARTICLE | doi:10.20944/preprints202304.0048.v1

Evolutionary Multi-Objective Optimization of Extrusion Barrier Screws: Data Mining and Decision Making

António Gaspar-Cunha, Paulo Costa, Alexandre Cláudio Botazzo Delbem, Francisco Monaco, M. J. Ferreira, José A. Covas

Subject: Engineering, Other Keywords: Polymer Extrusion; Barrier Screws; Multi-Objective Optimization; Data Mining; Decision Making; Number of Objectives reduction

Online: 4 April 2023 (14:33:09 CEST)

Preprint ARTICLE | doi:10.20944/preprints202203.0178.v1

Digital Triage Tool Using Artificial Intelligence and Patient History for Detecting Selected Neurological Diseases and Sensing the Bottleneck between Symptoms, Diagnosis, and Therapy

Lorenz Grigull, Werner Lechner, Frank Klawonn

Subject: Computer Science And Mathematics, Computational Mathematics Keywords: artificial intelligence; data mining; diagnostic decision support; rare diseases; questionnaire anamnesis; neuromuscular diseases; high latencies

Online: 14 March 2022 (08:58:29 CET)

Preprint ARTICLE | doi:10.20944/preprints202201.0445.v1

Internet of Things-Driven Data Mining for Smart Crop Production Prediction in the Peasant Farming Domain

Luis Omar Colombo-Mendoza, Mario Andrés Paredes-Valverde, María del Pilar Salas-Zárate, Rafael Valencia-García

Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity Keywords: data mining; predictive analytics; Internet of Things; peasant farming; smart farming system; crop production prediction

Online: 31 January 2022 (10:58:30 CET)

Preprint REVIEW | doi:10.20944/preprints202102.0108.v1

Machine Learning and Deep Learning for Sentiment Analysis Over Students' Reviews: An Overview Study

Ru Yang

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Sentiment Analysis; Students' feedback; Students' reviews; Natural language processing; Data mining; Deep learning; Machine learning

Online: 3 February 2021 (10:11:54 CET)

Working Paper ARTICLE

PigLeg: Prediction of Swine Phenotype Using Machine Learning

Siroj Bakoev, Lyubov Getmantseva, Maria Kolosova, Olga Kostyunina, Duane Chartier, Tatiana V. Tatarinova

Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: artificial intelligence; bioinformatics; computational biology; data mining & machine learning; evolutionary studies; mathematical biology; animal behavior

Online: 6 November 2019 (05:07:24 CET)

Preprint ARTICLE | doi:10.20944/preprints202306.0925.v1

Prediction and Prevention of High Intensity Mining Water Burst under Reservoirs in Western Mining Areas

Tao Yang, Jiayue Dend, Bing Peng, Jie Zhang, Yiming Zhang, Yihui Yan, Jianchen Zhang

Subject: Environmental And Earth Sciences, Environmental Science Keywords: Coal mining under reservoirs; High-intensity mining; Green mining; Physical simulation; Water conducting fracture zone

Online: 13 June 2023 (10:22:10 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.2033.v1

Towards Action + State Process Model Discovery

Alessio Bottrighi, Marco Guazzone, Giorgio Leonardi, Stefania Montani, Manuel Striani, Paolo Terenziani

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Process Mining; Process Model Discovery; Mining action+state evolution

Online: 28 June 2023 (16:19:00 CEST)

Preprint ARTICLE | doi:10.20944/preprints202305.2103.v1

Sustainable Mining in Cameroon

Lucien Antoine Zang, Pablo Higueras

Subject: Environmental And Earth Sciences, Environmental Science Keywords: Cameroon; mining; Small Scale mining; Sustainable development; Betare Oya

Online: 30 May 2023 (10:00:47 CEST)

Preprint ARTICLE | doi:10.20944/preprints202308.1358.v1

Schools Students Performance with Artificial Intelligence Machine Learning: Features Taxonomy, Methods and Evaluation

Alain Hennebelle, Leila Ismail, Tanya Linden

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: artificial intelligence; classification models; educational data mining; educational machine learning; feature selection; student performance prediction; taxonomy

Online: 18 August 2023 (09:52:57 CEST)

Preprint ARTICLE | doi:10.20944/preprints202301.0254.v1

Quickening Data-Aware Conformance Checking through Temporal Algebras

Giacomo Bergami, Samuel Appleby, Graham Morgan

Subject: Computer Science And Mathematics, Computer Science Keywords: Logic Artificial Intelligence; Knowledge Bases; Query Plan; Temporal Logic; Conformance Checking; Temporal Data Mining; Intraquery Parallelism

Online: 13 January 2023 (11:07:20 CET)

Preprint ARTICLE | doi:10.20944/preprints202404.0169.v1

Visualising Daily PM10 Pollution in an Open-Cut Mining Valley of New South Wales, Australia - Part II: Classification of Synoptic Circulation Types and Local Meteorological Patterns and Their Relation to Elevated Air Pollution in Spring and Summer

Ningbo Jiang, Matthew Riley, Merched Azzi, Giovanni Di Virgilio, Hiep Nguyen Duc, Praveen Puppala

Subject: Environmental And Earth Sciences, Atmospheric Science And Meteorology Keywords: PM10 pollution; local meteorological pattern; synoptic circulation type; self-organising map (SOM); air pollution conduciveness; data clustering; data visualisation; open-cut mining valley

Online: 2 April 2024 (07:42:50 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.1691.v1

Influence of Key Strata on the Evolution Law of Mining-induced Stress in the Working Face of Deep and Large-scale Mining

Jianlin Xie, Shan Ning, Weibing Zhu, Xiaozhen Wang, Tao Hou

Subject: Engineering, Mining And Mineral Processing Keywords: key strata; mining-induced stress; DOFS; 3DEC; large-scale mining

Online: 23 June 2023 (14:09:45 CEST)

Preprint ARTICLE | doi:10.20944/preprints201812.0083.v1

Succession and Vegetation-Soil Relationship in Quarries of Southeastern Mexico

Mirna Valdez-Hernández, Rossana Gil-Medina, Jorge Omar López-Martínez, Nuria Torrescano-Valle, Nancy Cabanillas-Terán, Gerald Alexander Islebe

Subject: Biology And Life Sciences, Ecology, Evolution, Behavior And Systematics Keywords: post-mining regeneration; succession; tropical dry forest; post-mining recovery

Online: 6 December 2018 (11:04:06 CET)

Preprint ARTICLE | doi:10.20944/preprints202206.0050.v2

Harvesting Context and Mining Emotions Related to Olfactory Cultural Heritage

M.Besher Massri, Inna Novalija, Dunja Mladenić, Janez Brank, Sara Graça da Silva, Natasza Marrouch, Carla Murteira, Ali Hürriyetoğlu, Beno Šircelj

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Emotions Mining; Context Mining; Sensory Mining; Artificial Intelligence; Information extraction; Text classification; Fairy tales; Olfactory Cultural Heritage

Online: 2 August 2022 (07:57:35 CEST)

Preprint ARTICLE | doi:10.20944/preprints202312.1527.v1

Unveiling IoT Customer Behaviour: Segmentation and Insights for Enhanced IoT-CRM Strategies: A Real Case Study

Elaheh Eslami, Nazila Razi, Mahshid Lonbani, Javad Rezazadeh

Subject: Computer Science And Mathematics, Probability And Statistics Keywords: Internet of Things (IoT); Data mining; Customer Preferences; Customer Satisfaction; Customer Segmentation; Self Organizing Map; Decision Tree

Online: 20 December 2023 (10:24:49 CET)

In today’s competitive landscape, achieving customer-centricity is paramount for the sustainable growth and success of organisations. This research is dedicated to understanding customer preferences in the context of the Internet of Things (IoT) and employs a two-part modeling approach tailored to this digital era. In the first phase, we leverage the Self-Organizing Map (SOM) algorithm to segment IoT customers based on their connected device usage patterns. This segmentation approach reveals three distinct customer clusters, with the second cluster demonstrating the highest propensity for IoT device adoption and usage. In the second phase, we introduce a robust Decision Tree methodology designed to prioritize the various factors influencing customer satisfaction in the IoT ecosystem. We employ the Classification and Regression Tree (CART) technique to analyze 17 key questions that assess the significance of factors impacting IoT device purchase decisions. By aligning these factors with the identified IoT customer clusters, we gain insights into customer behaviour and preferences in the rapidly evolving world of connected devices. This comprehensive analysis examines the factors contributing to customer retention in the IoT space, with a strong emphasis on crafting logical marketing strategies, enhancing customer satisfaction, and fostering customer loyalty in the digital realm. Our research methodology involves surveys and questionnaires distributed to 207 IoT users, categorizing them into three distinct IoT customer groups. Leveraging analytical statistical methods, regression analysis, and IoT-specific tools and software, this study rigorously evaluates the factors influencing IoT device purchases. Importantly, this approach not only effectively clusters the IoT Customer Relationship Management (IoT-CRM) dataset but also provides valuable visualizations that are essential for understanding the complex dynamics of the IoT customer landscape.
Our findings underscore the critical role of logical marketing strategies, customer satisfaction, and customer loyalty in enhancing customer retention in the IoT era. This research makes a significant contribution to businesses seeking to optimize their IoT-CRM strategies and capitalize on the opportunities presented by the IoT ecosystem.
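CART, the decision tree technique named above, grows a tree by repeatedly choosing the split that minimizes the weighted Gini impurity of the resulting groups. The sketch below shows that single splitting step for one numeric feature. It is a generic illustration, not the study's code; the survey scores and cluster labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Find the split 'value <= threshold' with the lowest weighted Gini.

    Returns (threshold, weighted_gini); threshold is None when no split
    improves on the impurity of the unsplit data.
    """
    n = len(values)
    best = (None, gini(labels))
    for t in sorted(set(values))[:-1]:  # the largest value cannot split
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical answers (1-6 scale) to one survey question, with the
# customer cluster each respondent was assigned to.
scores = [1, 2, 3, 4, 5, 6]
clusters = ["low", "low", "low", "high", "high", "high"]
threshold, impurity = best_threshold(scores, clusters)
```

A full CART implementation applies this search to every feature, splits on the best one, and recurses on each side until a stopping rule is met; features chosen near the root are the ones such a tree ranks as most important.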

Preprint ARTICLE | doi:10.20944/preprints202003.0298.v1

A Composite Hybrid Feature Selection Learning-Based Optimization of Genetic Algorithm For Breast Cancer Detection

Ahmed Abdullah Farid, Gamal Selim, Hatem Khater

Subject: Computer Science And Mathematics, Information Systems Keywords: Data Mining; Breast Cancer; Hybrid Feature Selection; Machine learning; Support Vector Machine; Optimize Genetic Algorithm; boosting algorithms

Online: 19 March 2020 (11:13:15 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.1430.v1

The Technogenic Deposits’ Development as a Factor of Overcoming Resource Limitations and Ensuring Sustainability

Ivan Potravny, Andrey Novoselov, Irina Novoselova, Violetta Gassiy, Davaakhuu Nyamdorj

Subject: Business, Economics And Management, Economics Keywords: depletion of natural capital; mining; technogenic deposits; mining dumps; circular economy; environmental protection; Erdenet Mining Corporation SOE; Mongolia

Online: 21 September 2023 (08:52:35 CEST)

Preprint ARTICLE | doi:10.20944/preprints202104.0404.v1

Putting It All Together: Combining Learning Analytics Methods and Data Sources to Understand Students’ Approaches to Learning Programming

Sonsoles López-Pernas, Mohammed Saqr, Olga Viberg

Subject: Business, Economics And Management, Accounting And Taxation Keywords: automated assessment; computer science; learning analytics; process mining; programming; sequence mining

Online: 15 April 2021 (09:40:33 CEST)

Preprint REVIEW | doi:10.20944/preprints201807.0116.v1

Computational Approaches to Identify Natural Products as Inhibitors of DNA Methyltransferases

Fernanda I. Saldívar-González, Alejandro Gómez-García, Norberto Sánchez-Cruz, Javier Ruiz-Rios, B. Angélica Pilón-Jiménez, José L. Medina-Franco

Subject: Chemistry And Materials Science, Medicinal Chemistry Keywords: chemical space; chemoinformatics; data mining; databases; DNMT inhibitors; drug discovery; epi-informatics; molecular modeling; similarity searching; virtual screening

Online: 6 July 2018 (10:04:44 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.0104.v1

Towards a Long-Term UAV Monitoring Framework/Strategy for Post-mining Effects: Prosper-Haniel Case

Marcin Pawlik, Benjamin Haske, Hernan Flores, Bodo Bernsdorf, Tobias Rudolph

Subject: Environmental And Earth Sciences, Remote Sensing Keywords: Geomonitoring; Post-Mining; UAV

Online: 4 March 2024 (10:36:43 CET)

Preprint REVIEW | doi:10.20944/preprints202110.0184.v1

Self-Attention Based Models for the Extraction of Molecular Interactions from Biological Texts

Prashant Srivastava, Saptarshi Bej, Kristina Yordanova, Olaf Wolkenhauer

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: text-mining; self-attention models; biological literature mining; relationship extraction; natural language processing

Online: 12 October 2021 (14:17:46 CEST)

Preprint REVIEW | doi:10.20944/preprints202207.0010.v1

Recycling Strategies of Mine Tailing, with Environmental, Safety, Technical and Materials Considerations

Francisco S. M. Araujo, Isabella Taborda Llano, Hugo Fantucci, Everton Barbosa Nunes, Rafael M. Santos

Subject: Engineering, Architecture, Building And Construction Keywords: mining; tailings; waste; recycling; restoration

Online: 1 July 2022 (09:00:31 CEST)

Preprint ARTICLE | doi:10.20944/preprints202011.0424.v1

A New Method of Regulating the Load on the Shaft Lining in Sections Passing Through the Salt Rock Mass

Paweł Kamiński

Subject: Engineering, Automotive Engineering Keywords: mining shaft; salt rock; leaching

Online: 16 November 2020 (14:20:15 CET)

Preprint ARTICLE | doi:10.20944/preprints202008.0265.v2

Mining Stack Overflow: a Recommender Systems-Based Model

Fouzi Harrag, Mokdad Khamliche

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Recommender system; learning to rank; Mining software repositories; Text Mining; Deep learning; Stack Overflow

Online: 4 September 2020 (11:20:33 CEST)

Preprint ARTICLE | doi:10.20944/preprints202311.0905.v1

Mining Negative Associations From Medical Databases Considering Frequent, Regular, Closed and Maximal Patterns

Sastry Kodanda Rama Jammalamadaka, Raja Rao Budaraju

Subject: Computer Science And Mathematics, Computer Science Keywords: data mining; databases; closed item sets; maximal item sets; regular patterns; frequent patterns; negative associations; maximal patterns; frequent patterns; static

Online: 14 November 2023 (10:12:27 CET)

Preprint ARTICLE | doi:10.20944/preprints202307.1831.v1

Using Machine Learning in Veterinary Medical Education: An Introduction for Veterinary Medicine Educators

Sarah E. Hooper, Kent G. Hecker, Elpida Artemiou

Subject: Medicine And Pharmacology, Veterinary Medicine Keywords: machine learning; veterinary medical education; random forest; medical education; artificial intelligence; Python; R; veterinary educators; educational data mining; learning analytics

Online: 26 July 2023 (14:02:31 CEST)

Preprint ARTICLE | doi:10.20944/preprints202303.0192.v1

The Human Extracellular Matrix Diseasome Reveals Genotype-Phenotype Associations with Clinical Implications for Age-Related Diseases

Cyril Statzer, Karan Luthria, Arastu Sharma, Maricel G. Kann, Collin Y. Ewald

Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: Phenome; Matrisome; Matreotype; Phenotype; Extracellular Matrix; Data Mining; SNP; PheWAS; GWAS; Electronic Health Records; Drug Repurposing; Precision Medicine; Collagen; Human

Online: 10 March 2023 (09:34:15 CET)

Preprint ARTICLE | doi:10.20944/preprints202003.0297.v1

Applying Artificial Intelligence Techniques to Improve Clinical Diagnosis of Alzheimer’s Disease

Ahmed Abdullah Farid, Gamal Selim, Hatem Khater

Subject: Computer Science And Mathematics, Information Systems Keywords: Data Mining; Alzheimer’s Dementia; Composite Hybrid Feature Selection; Machine learning; Stack Hybrid Classification; AI Techniques; Classification; AD Diagnose; Clinical AD Dataset

Online: 19 March 2020 (10:52:31 CET)

Preprint ARTICLE | doi:10.20944/preprints202309.1505.v1

Research on the Closure and Remediation Processes of Mining Areas in Romania and Approaches to the Strategy for Heavy Metal Pollution Remediation

Violeta-Monica Radu, Anca-Marina Vîjdea, Alexandru-Anton Ivanov, Veronica-Elena Alexe, George Dincă, Valentina-Maria Cetean, Andra-Elena Filiuță

Subject: Environmental And Earth Sciences, Environmental Science Keywords: heavy metals; mining activities; pollution; remediation

Online: 22 September 2023 (06:37:03 CEST)

Preprint ARTICLE | doi:10.20944/preprints202110.0033.v1

KOMBAT: Knowledgebase of Microbes’ Battling Agents for Therapeutics

Anasuya Bhargav, Srijanee Gupta, Surabhi Seth, Sweety James, Firdaus Fatima, Pratibha Chaurasia, Srinivasan Ramachandran

Subject: Biology And Life Sciences, Immunology And Microbiology Keywords: Antibiotic resistance; text mining; therapy; database

Online: 4 October 2021 (08:58:52 CEST)

Working Paper ARTICLE

Fraud Audit based on Visual Analysis. A Process Mining Approach.

Jorge-Félix Rodríguez-Quintero, Alexander Sánchez-Díaz, Leonel Iriarte-Navarro, Alejandro Maté, Manuel Marco-Such, Juan Trujillo

Subject: Computer Science And Mathematics, Information Systems Keywords: fraud audit; process mining; visual analytics

Online: 2 March 2021 (09:19:01 CET)

Preprint ARTICLE | doi:10.20944/preprints202003.0299.v1

Applying Artificial Intelligence Techniques for Prediction of Neurodegenerative Disorders: A Comparative Case-Study on Clinical Tests and Neuroimaging Tests with Alzheimer’s Disease

Ahmed Abdullah Farid, Gamal Selim, Hatem Khater

Subject: Computer Science And Mathematics, Information Systems Keywords: Data Mining; Alzheimer’s Dementia; Composite Hybrid Feature Selection; Machine learning; Stack Hybrid Classification; AI; MRI; Neuroimaging; MPEG7 edge histogram feature extraction; CNN

Online: 19 March 2020 (11:25:01 CET)

Detecting Alzheimer's disease (AD) plays an essential role in global health care: AD is often misdiagnosed because it shares many clinical features with other types of dementia, and monitoring its progression over time with magnetic resonance imaging (MRI) is costly and subject to human error in manual reading. This paper presents a comparative study of the performance of data mining techniques on two AD datasets, one of clinical tests and one of neuroimaging tests. In the first stage of the proposed model, the clinical dataset is passed through a composite hybrid feature selection (CHFS) step, which extracts new features and selects the best ones by eliminating obscure features. In parallel, a novel hybrid feature extraction that combines three batch edge detection algorithms with texture features is applied to the MRI image dataset and optimized with a fuzzy 64-bin histogram. In the second stage, the clinical dataset is fed to a stacked hybrid classification (SHC) model that combines JRip and random forest classifiers, with six models each evaluated individually as the meta-classifier, to improve the prediction of clinical diagnosis. In the same stage, a convolutional neural network (CNN) is applied to the neuroimaging (MRI) images to improve classification accuracy and is compared with traditional classifiers running on the features extracted from the images. The authors collected a clinical dataset of 426 subjects (1229 potential patient samples) from oasis.org and a benchmark MRI dataset from kaggle.com with around 5000 images in total, segregated by severity of Alzheimer's. The datasets were evaluated using the Explorer interface of the Weka data mining software.
The experiments show that the proposed CHFS feature extraction effectively reduced the false-negative rate while keeping a relatively high overall accuracy: the stacked hybrid classification with a support vector machine (SVM) as meta-classifier reached 96.50% on the clinical dataset, compared to 68.83% for the previous result, whereas the CNN classification on the MRI image dataset reached 80.21%. The results show that the CHFS model predicts Alzheimer's disease at an early stage more accurately with the clinical dataset than with the neuroimaging (MRI) dataset. The proposed model classified Alzheimer's clinical samples accurately and at a low cost compared with the MRI-CNN image model at the early stage, and gave a good indication of a high classification rate for MRI images when the proposed SHC model is applied.
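
The abstract reports two quantities, overall accuracy and the false-negative rate, without defining them. A minimal sketch of how both are usually computed from binary predictions is given here. The function names and the toy labels are illustrative assumptions, not the authors' code or data, and the numbers bear no relation to the results quoted above.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FN, FP, TN for binary labels (assumed: 1 = AD, 0 = non-AD)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fn": 0, "fp": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1:
            counts["tp" if pred == 1 else "fn"] += 1
        else:
            counts["fp" if pred == 1 else "tn"] += 1
    return counts


def accuracy(c: Dict[str, int]) -> float:
    """Share of all samples classified correctly."""
    total = sum(c.values())
    return (c["tp"] + c["tn"]) / total if total else 0.0


def false_negative_rate(c: Dict[str, int]) -> float:
    """Share of true AD cases that the classifier missed."""
    positives = c["tp"] + c["fn"]
    return c["fn"] / positives if positives else 0.0


# Hypothetical toy labels, made up purely for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(accuracy(counts), false_negative_rate(counts))
```

In a diagnostic setting a missed AD case (a false negative) is typically the costlier error, which is presumably why the abstract highlights the false-negative rate alongside accuracy.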

Search Results

1589 articles found