Search | Preprints.org

Preprint ARTICLE | doi:10.20944/preprints201908.0073.v1

Self-Supervised Contextual Keyword and Keyphrase Retrieval with Self-Labelling

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: contextual keyword extraction; BERT; word embedding; LSTM; transformers; Deep Learning

Online: 6 August 2019 (09:17:36 CEST)

Preprint ARTICLE | doi:10.20944/preprints202210.0238.v1

From Twitter to Aso-Rock: A Natural Language Processing Spotlight for Understanding Nigeria 2023 Presidential Election

Olusola Olabanjo, Ashiribo Wusu, Mauton Asokere, Rebecca Padonu, Olufemi Olabanjo, Oluwafolake Ojo, Oseni Afisi, Olusegun Folorunso, Manuel Mazzara, Benjamin Aribisala

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: NLP; NLU; Twitter; Sentiment Analysis; Opinion Mining; Nigeria; Election; Machine Learning; BERT; LSTM; SVM

Online: 17 October 2022 (12:01:42 CEST)

Introduction: Social media platforms such as Facebook, LinkedIn, and Twitter have been used as tools for staging protests, running opinion polls, shaping campaign strategy, and voicing agitation, and as places to express interests, especially during elections. Past studies have gauged people's opinions on elections using social media posts. The advent of state-of-the-art algorithms for unstructured text processing has brought tremendous progress in natural language processing and understanding. Aim: In this work, a natural language framework is designed to understand the Nigeria 2023 presidential election based on public opinion, using a Twitter dataset. Methods: Raw Twitter datasets concerning discourse around the Nigeria 2023 elections, comprising 2,059,113 entries with 18 dimensions, were collected. Sentiment analysis was performed on the preprocessed dataset using three machine learning models: a Long Short-Term Memory (LSTM) recurrent neural network, Bidirectional Encoder Representations from Transformers (BERT), and a Linear Support Vector Classifier (LSVC). Analysis of the three candidates' personal tweets provided insight into their campaign strategies and personalities, while analysis of public tweets established the public's opinion of them. The performance of the models was also compared using accuracy, recall, false positive rate, precision, and F-measure. Results: The LSTM model gave an accuracy, precision, recall, AUC, and F-measure of 88%, 82.7%, 87.2%, 87.6%, and 82.9% respectively; the BERT model gave 94%, 88.5%, 92.5%, 94.7%, and 91.7% respectively; and the LSVC model gave 73%, 81.4%, 76.4%, 81.2%, and 79.2% respectively. Conclusion: The experimental results show that sentiment analysis and other natural language processing tasks can aid in understanding the social media space. Results also revealed the leverage of each aspirant towards winning the election.
We conclude that sentiment analysis can form a general basis for generating insights for election and modeling election outcomes.
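
The evaluation measures named in the abstract above (accuracy, precision, recall, false positive rate, and F-measure) can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the `BinaryCounts` class, the `classification_metrics` function, and the example counts are hypothetical and chosen only for illustration. AUC is omitted because it needs ranked scores rather than counts.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion-matrix counts for one class treated as 'positive' (hypothetical helper)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def _safe_div(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(c: BinaryCounts) -> dict:
    """Compute the standard measures from confusion-matrix counts."""
    precision = _safe_div(c.tp, c.tp + c.fp)
    recall = _safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": _safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "precision": precision,
        "recall": recall,
        "false_positive_rate": _safe_div(c.fp, c.fp + c.tn),
        # F-measure here is F1: the harmonic mean of precision and recall.
        "f_measure": _safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Made-up counts, not taken from the paper.
    example = BinaryCounts(tp=40, fp=10, fn=20, tn=30)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

For a three-class sentiment task, one common choice is to compute these per class (one class versus the rest) and then average; whether the paper did so is not stated in the abstract.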

Preprint ARTICLE | doi:10.20944/preprints202006.0223.v1

Cooking is All About People: Comment Classification on Cookery Channels Using Bert and Classification Models (Malayalam-English Mix-Code)

Subramaniam Kazhuparambil, Abhishek Kaushik

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: BERT; Classification; Mix-Code; Language Model; Youtube; Parametric and Non-Parametric

Online: 17 June 2020 (13:40:22 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.1075.v1

Use of Large Language Model for Cyberbullying Detection

Bayode Ogunleye, Babitha Dharmaraj

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: BERT; Cyberbullying; RoBERTa; Language Model; Machine learning; Online abuse; Natural language processing; NLP

Online: 15 June 2023 (05:35:25 CEST)

Preprint ARTICLE | doi:10.20944/preprints202303.0335.v1

Twitter Sentiment Analysis of Lagos State 2023 Gubernatorial Election using BERT

Ashiribo Senapon Wusu, Olusola Aanu Olabanjo, Rebecca Maulome Padonu, Manuel Mazzara

Subject: Medicine And Pharmacology, Internal Medicine Keywords: NLP; NLU; Twitter; Sentiment Analysis; Opinion Mining; Nigeria; Election; Machine Learning; BERT; LSTM; SVM

Online: 20 March 2023 (02:52:52 CET)

Preprint ARTICLE | doi:10.20944/preprints202208.0233.v1

Smart Homes and Families to Enable Sustainable Societies: A Data-Driven Approach for Multi-Perspective Parameter Discovery using BERT Modelling

Eman Alqahtani, Nourah Janbi, Sanaa Sharaf, Rashid Mehmood

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Smart Families; Smart Homes; Sustainable Societies; Smart Cities; Deep Learning; Natural Language Processing (NLP); Social Sustainability; Environmental Sustainability; Economic Sustainability; Bidirectional Encoder Representations from Transformers (BERT); Triple Bottom Line (TBL); Internet of Things (IoT)

Online: 12 August 2022 (10:22:17 CEST)

Technological advancements and innovations have profoundly changed people's lives, giving rise to smart environments, cities, and societies. As homes are the building blocks of cities and societies, smart homes are critical to establishing smart living and are expected to play a key role in enabling smart cities and societies. The current academic literature and commercial advancements on smart homes have mainly focused on developing and providing smart functions for homes, such as security management and support for residents' activities (for example, ambiance management). Homes are much more than physical structures, buildings, appliances, operational machines, and systems. Homes are composed of families and are inherently complex phenomena shaped by humans and their relationships with each other, subject to individual, intragroup, intergroup, and intercommunity goals. There is a clear need to understand, define, and actualize the overarching roles of smart homes, consolidating existing research, so that smart homes can serve the needs of future smart cities and societies. This paper introduces our data-driven parameter discovery methodology and uses it to provide, for the first time, an extensive and fairly comprehensive analysis of the families and homes landscape as seen through the eyes of academics and the public, using over a hundred thousand research papers and nearly a million tweets. We develop a methodology using deep learning, natural language processing (NLP), and big data analytics methods and apply it to automatically discover parameters that capture a comprehensive knowledge and design space of smart families and homes comprising social, political, economic, environmental, and other dimensions. The 66 discovered parameters and the knowledge space comprising hundreds of dimensions are explained by reviewing and referencing over 300 articles from the academic literature and tweets.
The knowledge and parameters discovered in this paper can be used to develop a holistic understanding of matters related to families and homes facilitating the development of better, community-specific, policies, technologies, solutions, and industries for families and homes, leading to strengthening families and homes, and in turn, empowering sustainable societies across the globe.

Preprint ARTICLE | doi:10.20944/preprints202101.0081.v1

What All Do Audio Transformer Models Hear? Probing Acoustic Representations for Language Delivery and Its Structure

Yaman Kumar, Jui Shah, Rajiv Ratn Shah, Changyou Chen

Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Transformers; wave2vec; bert; mockingjay; interpretability

Online: 5 January 2021 (11:20:22 CET)

Preprint REVIEW | doi:10.20944/preprints202312.1739.v1

DEEP LEARNING BASED QUESTION ANSWERING SYSTEM (SURVEY)

Nayyab Saeed, Humaira Ashraf, NZ Jhanjhi

Subject: Computer Science And Mathematics, Computer Science Keywords: BERT, RNN, Deep Learning

Online: 22 December 2023 (11:48:43 CET)

Preprint ARTICLE | doi:10.20944/preprints202302.0066.v1

Smarter Sustainable Tourism: Data-Driven Multi-Perspective Parameter Discovery for Autonomous Design and Operations

Raniah Alsahafi, Ahmed Alzahrani, Rashid Mehmood

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Smart Tourism; Sustainable Tourism; Natural language Processing (NLP); Big Data Analytics; Deep Learning; Machine Learning; Unsupervised Learning; Bidirectional Encoder Representations from Transformers (BERT); Literature Review; Smart Societies

Online: 3 February 2023 (09:47:55 CET)

Global natural and man-made events are exposing the fragility of the tourism industry and its impact on the global economy. Prior to the COVID-19 pandemic, tourism contributed 10.3% to global GDP and employed 333 million people, but it saw a significant decline due to the pandemic. Sustainable and smart tourism requires collaboration from all stakeholders and a comprehensive understanding of global and local issues to drive responsible and innovative growth in the sector. This paper presents an approach for leveraging big data and deep learning to discover holistic, multi-perspective (e.g., local, cultural, national, and international) and objective information on a subject. Specifically, we develop a machine learning pipeline to extract parameters from academic literature and public opinions on Twitter, providing a unique and comprehensive view of the industry from both academic and public perspectives. The academic-view dataset was created from the Scopus database and contains 156,759 research articles from 2000 to 2022, which were modelled to identify 33 distinct parameters in 4 categories: Tourism Types, Planning, Challenges, and Media & Technologies. A Twitter dataset of 485,813 tweets was collected over 18 months, from March 2021 to August 2022, to showcase public perception of tourism in Saudi Arabia; it was modelled to reveal 13 parameters categorized into two broader sets: Tourist Attractions and Tourism Services. Discovering system parameters is required to embed autonomous capabilities in systems and to support decision-making and problem-solving during system design and operations. The proposed approach improves AI-based information discovery by extending the use of scientific literature, Twitter, and other sources for autonomous, dynamic optimization of systems, promoting novel research in the tourism sector and contributing to the development of smart and sustainable societies.
The paper also presents a comprehensive knowledge structure and literature review of the tourism sector based on over 250 research articles.

Preprint ARTICLE | doi:10.20944/preprints202302.0077.v1

aeroBERT-Classifier: Classification of Aerospace Requirements using BERT

Archana Tikayat Ray, Bjorn F. Cole, Olivia J. Pinon Fischer, Ryan T. White, Dimitri N. Mavris

Subject: Computer Science And Mathematics, Computer Science Keywords: Requirements Engineering; Natural Language Processing; NLP; BERT; Requirements Classification; Text Classification

Online: 6 February 2023 (02:26:56 CET)

Preprint ARTICLE | doi:10.20944/preprints202201.0061.v1

EmmDocClassifier: Efficient Multimodal Document Image Classifier for Scarce Data

Shrinidhi Kanchi, Alain Pagani, Hamam Mokayed, Marcus Liwicki, Didier Stricker, Muhammad Zeshan Afzal

Subject: Computer Science And Mathematics, Computer Vision And Graphics Keywords: BERT, Document Image Classification, EfficientNet, fine-tuned BERT, Hierarchical Attention Networks, Multimodal, RVL-CDIP, Two-stream, Tobacco-3482

Online: 6 January 2022 (10:08:38 CET)

Preprint ARTICLE | doi:10.20944/preprints202203.0245.v1

Deep Journalism and DeepJournal V1.0: A Data-Driven Deep Learning Approach to Discover Parameters for Transportation (As A Case Study)

Istiak Ahmad, Fahad Alqurashi, Ehab Abozinadah, Rashid Mehmood

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Natural language processing (NLP); topic modelling; BERT; transportation; newspaper; magazine; academic research; journalism; deep learning; smart cities

Online: 17 March 2022 (07:58:15 CET)

Preprint REVIEW | doi:10.20944/preprints202401.1857.v1

A Review on BERT: Language Understanding for Different Types of NLP Task

Md Saiful Islam, Long Zhang

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Natural Language Processing (NLP); BERT; Language Model; Transfer Learning; Transformers

Online: 26 January 2024 (03:39:32 CET)

Preprint ARTICLE | doi:10.20944/preprints202208.0451.v1

Employing a Multilingual Transformer Model for Segmenting Unpunctuated Arabic Text

Abdullah M. Alshanqiti, Sami Albouq, Ahmad B. Alkhodre, Abdallah Namoun, Emad Nabil

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: text splitting; text tokenization; transfer learning; mask-fill prediction; NLP linguistic rules; missing punctuations; cross-lingual BERT model; Masked Language Modeling

Online: 26 August 2022 (05:19:39 CEST)

Preprint ARTICLE | doi:10.20944/preprints202307.0192.v2

Examining the Potential of Generative Language Models for Aviation Safety Analysis: Case Study and Insights using the Aviation Safety Reporting System (ASRS)

Archana Tikayat Ray, Anirudh Prabhakara Bhat, Ryan T White, Van Minh Nguyen, Olivia J Pinon Fischer, Dimitri N Mavris

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Aviation Safety Reporting System; ASRS; Aviation Safety; Human Factors; Large Language Models; LLM; ChatGPT; Generative Language Models; GPT-3.5; aeroBERT; BERT; InstructGPT; Prompt Engineering

Online: 11 July 2023 (07:13:20 CEST)

Preprint ARTICLE | doi:10.20944/preprints202111.0378.v1

An Empirical Comparison of Portuguese and Multilingual BERT Models for Auto-Classification of NCM Codes in International Trade

Roberta Rodrigues de Lima, Anita M. R. Fernandes, James Roberto Bombasar, Bruno Alves da Silva, Paul Crocker, Valderi Reis Quietinho Leithardt

Subject: Engineering, Control And Systems Engineering Keywords: NCM classification; natural language processing; transformers; multilingual BERT; portuguese BERT; NLP; BERT

Online: 22 November 2021 (10:59:43 CET)

Preprint ARTICLE | doi:10.20944/preprints202305.1325.v1

Agile Methodology for the Standardization of Engineering Requirements using Large Language Models

Archana Tikayat Ray, Bjorn F Cole, Olivia J Pinon Fischer, Anirudh Prabhakara Bhat, Ryan T White, Dimitri N Mavris

Subject: Engineering, Aerospace Engineering Keywords: Requirements Engineering; Natural Language Processing; NLP; BERT; Requirements boilerplates; Model-Based Systems Engineering; MBSE; Requirements table; Large Language Models (LLMs); Transformer based language models

Online: 18 May 2023 (10:19:18 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.0111.v1

Resumes Classification Using Neural Network Approaches Combined with Bert and Gensim: CVS of Moroccan Engineering Students

Aniss Qostal, Aniss Moumen, Younes Lakhrissi

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Gated Recurrent Unit (GRU); Long Short-Term Memory (LSTM); Convolutional Neural Networks (CNN); BERT; Gensim; Moroccan engineering students; Ibn Tofail University; Resumes; CVs; ENSAK

Online: 4 March 2024 (09:44:13 CET)

Preprint ARTICLE | doi:10.20944/preprints202312.0158.v1

Analyzing Data Theft Ransomware Traffic Patterns Using BERT

Gabriela Almeida, Felipe Vasconcelos

Subject: Computer Science And Mathematics, Computer Science Keywords: Ransomware Evolution; Data Theft; Network Traffic Analysis; BERT Model; Cybersecurity Adaptation

Online: 4 December 2023 (06:58:59 CET)

Preprint ARTICLE | doi:10.20944/preprints202304.0935.v1

A Study on Generating Webtoons using Multilingual Text-to-Image Models

Kyungho Yu, Hyungho Ju, Jeongin Kim, Chanjun Chun, Pankoo Kim

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Multilingual BERT; Text-to-image; DCGAN; Webtoon; GAN

Online: 26 April 2023 (03:16:07 CEST)

Preprint ARTICLE | doi:10.20944/preprints202402.0957.v1

Symptom Extraction of Internal Medicine Diseases of Traditional Chinese Medicine Based on BERT-BiLSTM-CRF Model

Hanqing ZHAO, Yuehan Li, Shuai Zhang

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Named entity recognition; Corpus; Information extraction; BERT-BiLSTM-CRF

Online: 19 February 2024 (14:50:47 CET)

Preprint ARTICLE | doi:10.20944/preprints202402.1665.v1

Predatory Behaviours in Science

Mario Coccia

Subject: Social Sciences, Library And Information Sciences Keywords: Predatory research field; Science evolution; Science of Science; Social dynamics of science; Transformers; ChatGPT; BERT; Microsoft Copilot; Large Language Model; Emerging technology; Radical Technology; Natural Language Processing Tool; AI Technology; Deep Learning Architecture; Multi-Head Attention Mechanism

Online: 29 February 2024 (09:06:19 CET)

A predatory research field in science arises when an emerging scientific topic destroys current topics and marks a major scientific change. A predatory research field can be a basic driver of scientific and technological change that generates 'creative destruction' in science and society in contexts of knowledge-based competition and rapid change. The proposed theory of predatory research fields predicts that such a field, through its fast growth, destroys other research fields. The theoretical approach is tested here on the research field of large language models (LLM) by analyzing transformers (a deep learning architecture based on the multi-head attention mechanism), which were proposed in 2017 and, from November 2022, found major applications in generative artificial intelligence through innovations such as BERT, ChatGPT, Microsoft Copilot (launched on February 7, 2023), and other natural language processing tools driven by AI technology for engaging in conversations, gaining insights, automating tasks, etc. Statistical evidence suggests that the growth rate of transformer technologies is 80.58%, a high level compared to all other research fields in machine learning (which have a growth rate of 13.83%). Moreover, the predatory research field of transformers has such destructive power that all other domains in LLM show a general reduction in scientific growth from 2021 to 2023. The impact of transformers is much more drastic than that of previous radical technologies such as CNN, which has a temporal growth rate of 0.16%, lower than the 0.38% of transformers, ceteris paribus. This analysis reveals that transformers have the characteristics needed to generate radical scientific and technological change in the not-too-distant future.
Overall, then, the study suggests that predatory research field on emerging topics and technologies can generate path-breaking innovations and the examination here can clarify the essential elements of the science dynamics for a better theory of scientific and technological change, providing also main implications for knowledge policy to support promising research fields and technologies to guide economic and social change.

Preprint ARTICLE | doi:10.20944/preprints202402.1083.v1

PixieGPT: Design and Implementation of a Generative Pre-Trained Transformer for Universities of Bangladesh

Hasan Mahmood Aminul Islam, Mehedi Hasan, Sumiaya Ahmed, Ariful Islam Fardin, Mehedi Hasan Nabil

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: NLP; BeRT; PixieGPT

Online: 20 February 2024 (11:28:38 CET)

Preprint ARTICLE | doi:10.20944/preprints202406.1669.v1

Analyzing Multi-Head Attention on Broken BERT Models

Jingwei Wang

Subject: Computer Science And Mathematics, Computer Science Keywords: multi-head attention; BERT

Online: 24 June 2024 (13:53:36 CEST)

Preprint ARTICLE | doi:10.20944/preprints202403.0316.v1

Advancements in Word Sense Disambiguation: A Poly-Encoder Bert Model Perspective

Linhan Xia, Jiaxin Cai, Enpei Huang, Junbang Liu

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: NLP; Bert model; Semcor dataset; Transformer; Word semantic Disambiguation

Online: 6 March 2024 (11:02:52 CET)

Preprint ARTICLE | doi:10.20944/preprints202407.1804.v1

A New Chinese Named Entity Recognition Method for Pig Disease Domain Based on Lexicon Enhanced BERT and Contrastive Learning

Cheng Peng, Xiajun Wang, Qifeng Li, Qinyang Yu, Ruixiang Jiang, Weihong Ma, Wenbiao Wu, Rui Meng, Haiyan Li, Heju Huai, Shuyan Wang, Longjuan He

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: pig disease; Chinese named entity recognition; lexicon enhanced BERT; contrastive learning; small sample

Online: 23 July 2024 (16:05:26 CEST)

Preprint ARTICLE | doi:10.20944/preprints202306.0364.v2

Enhancing Chinese Address Parsing in Low-Resource Scenarios through In-Context Learning

Guangming Ling, Xiaofeng Mu, Chao Wang, Aiping Xu

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Chinese address parsing; low-resource scenarios; In-context learning; GPT; BERT; k-nearest neighbors

Online: 9 June 2023 (04:28:59 CEST)

Preprint ARTICLE | doi:10.20944/preprints202404.1574.v1

Enhancing Multimodal Emotion Recognition through Attention Mechanisms in BERT and CNN Architectures

Fazliddin Makhmudov, Alpamis Kultimuratov, Young-Im Cho

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Deep learning; CNN; BERT; Emotion recognition

Online: 24 April 2024 (08:29:13 CEST)

Preprint ARTICLE | doi:10.20944/preprints202408.0966.v1

Enhancing the Interpretability of Malaria and Typhoid Diagnosis with Explainable AI and Large Language Models

Kingsley Attai, Moses Ekpenyong, Constance Amannah, Daniel Asuquo, Peterben Ajuga, Okure Obot, Ekemini Johnson, Anietie John, Omosivie Maduka, Christie Akwaowo, Faith-Michael Uzoka

Subject: Public Health And Healthcare, Primary Health Care Keywords: Malaria Diagnosis; Typhoid Diagnosis; Machine Learning; XAI; LIME; GPT; BERT; ChatGPT; Gemini; Perplexity; Explainability; Interpretability

Online: 14 August 2024 (09:43:31 CEST)

Preprint ARTICLE | doi:10.20944/preprints202405.1300.v1

Development Of Sentiment Analysis Model In Kazakh Language To Analyze Reviews

Sanzhar Akhmedov, Aliya Nugumanova

Subject: Computer Science And Mathematics, Computer Science Keywords: Sentiment analysis, Natural Language Processing, Fine Tuning, BERT, Transformers

Online: 20 May 2024 (17:03:21 CEST)

Sentiment analysis has become an important tool for understanding public opinion across languages and domains. Recently, there has been an increase in the number of studies on sentiment analysis in low-resource languages such as Kazakh. This is important to ensure that modern text analysis technologies are accessible to all users, regardless of their language background. The aim of the study is to create a sentiment analysis model for analyzing texts in Kazakh. As part of this work, we fine-tune existing models on our own dataset, thus improving their accuracy and efficiency for analyzing Kazakh-language texts. This paper presents a manually collected dataset, "KazIntTelCom", from the city information service 2GIS, consisting of user reviews manually annotated by the authors for sentiment polarity (i.e., negative, positive, or neutral). This dataset was used to fine-tune two pre-trained multilingual Transformer-based sentiment analysis models taken from the HuggingFace platform: DistilBERT and XLM-RoBERTa. The models were also tested on the "KazSAnDRA" dataset. The results show that fine-tuning, even on a relatively small dataset, gives a significant increase in performance, confirmed by an increase in accuracy of 20%-30%. In addition, false misses and false detections are analyzed, which allows us to identify directions for further improvement of the models. The contribution of this work, in addition to the dataset, is the analysis of model errors, which will help future developers choose training hyperparameters more precisely for sentiment analysis in Kazakh. These results are important for natural language processing and for adapting its methods to low-resource languages, promoting more inclusive and equitable access to modern analytical tools.
Thus, this study demonstrates the effectiveness of the Transformers architecture for sentiment analysis in Kazakh and opens new opportunities for further model improvement.

Preprint ARTICLE | doi:10.20944/preprints202309.1779.v1

An Adaptive Mixup Hard Negative Sampling for Zero-Shot Entity Linking

Shisen Cai, Xi Wu, Maihemuti Maimaiti, Yichang Chen, Zhixiang Wang, Jiong Zheng

Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: zero-shot; BERT; Adaptive_mixup_hard; Biencoder; Zeshel

Online: 26 September 2023 (10:24:34 CEST)
