ARTICLE | doi:10.20944/preprints202303.0070.v1
Subject: Computer Science And Mathematics, Information Systems Keywords: encoder; decoder; seq2seq; LSTM; RNN; chatbots
Online: 3 March 2023 (10:08:32 CET)
Chatbots are widely used in customer service to handle inquiries such as tracking orders or providing information about products and services. One of the most reliable ways to implement a chatbot is the sequence-to-sequence (Seq2Seq) network, a common LSTM-based architecture that pairs an encoder with a decoder. A Seq2Seq chatbot is a type of chat system capable enough to pass the Turing test, which judges a machine by examining whether its responses read like human responses. In this research, we introduce a novel architecture that can pass the Turing test. The accuracy of the Seq2Seq model is improved by training the chatbot incrementally. The proposed approach yields higher accuracy and responses that closely resemble human chat.
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Amharic script; Attention mechanism; OCR; Encoder-decoder; Text-image
Online: 15 October 2020 (13:42:28 CEST)
At present, the growth of digitization and worldwide communication makes OCR for exotic languages a very important task. In this paper, we develop an OCR system for one such language with a unique script, Amharic. Motivated by the recent success of the attention mechanism in Neural Machine Translation (NMT), we extend the attention mechanism to Amharic text-image recognition. The proposed model consists of CNNs and attention-embedded recurrent encoder-decoder networks, integrated following the configuration of the seq2seq framework. The attention network parameters are trained end to end, and the context vector is injected, together with the previously predicted output, at each decoding time step. Unlike the existing OCR model, which minimizes the CTC objective function, the new model minimizes the categorical cross-entropy loss. The attention-based model is evaluated on the test dataset from the ADOCR database, which consists of both printed and synthetically generated Amharic text-line images, and achieves promising results, with character error rates (CER) of 1.54% and 1.17%, respectively.
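The character error rate reported above is conventionally computed as the edit distance between the recognized text and the ground truth, divided by the ground-truth length. The following is a minimal sketch under that assumption; the function names `edit_distance` and `character_error_rate` are illustrative and are not taken from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Total edit distance over total reference length, as a percentage."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    return 100.0 * total_edits / total_chars
```

Because Python strings are sequences of Unicode code points, each Ethiopic syllable counts as one character here; for example, `character_error_rate(["ሰላም"], ["ሰላ"])` counts one deletion against three reference characters.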
ARTICLE | doi:10.20944/preprints202104.0630.v1
Subject: Computer Science And Mathematics, Algebra And Number Theory Keywords: Paraphrase Identification; Paraphrase Generation; Natural Language Generation; Language Model; Encoder Decoder; Transformer
Online: 23 April 2021 (10:35:20 CEST)
Paraphrase Generation is one of the most important and challenging tasks in the field of Natural Language Generation. Paraphrasing techniques help to identify, extract, or generate phrases and sentences that convey a similar meaning. The paraphrasing task can be divided into two sub-tasks, namely Paraphrase Identification (PI) and Paraphrase Generation (PG). Most existing state-of-the-art systems can solve only one of these problems at a time. This paper proposes a light-weight unified model that can both classify whether a given pair of sentences are paraphrases of each other and generate multiple paraphrases for an input sentence. The Paraphrase Generation module aims to generate fluent and semantically similar paraphrases, and the Paraphrase Identification system aims to classify whether a sentence pair are paraphrases of each other or not. The proposed approach combines data sampling (data variety) with a granularly fine-tuned Text-To-Text Transfer Transformer (T5) model. This unified approach aims to solve both Paraphrase Identification and Paraphrase Generation using carefully selected data points and a fine-tuned T5 model. The highlight of this study is that the same light-weight model, trained with the objective of Paraphrase Generation, can also be used to solve the Paraphrase Identification task. Hence, the proposed system is light-weight in terms of both the model's size and the data used to train it, which allows the model to learn quickly without compromising the results.
The proposed system is then evaluated with popular metrics: BLEU (BiLingual Evaluation Understudy), ROUGE (Recall-Oriented Understudy for Gisting Evaluation), METEOR, WER (Word Error Rate), and GLEU (Google-BLEU) for Paraphrase Generation, and classification metrics such as accuracy, precision, recall, and F1-score for the Paraphrase Identification system. The proposed model achieves state-of-the-art results on both Paraphrase Identification and Paraphrase Generation.
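The classification metrics named for Paraphrase Identification can be illustrated with a short sketch. It assumes binary labels where 1 means "is a paraphrase" and 0 means "is not"; the function name `binary_classification_metrics` and the label encoding are illustrative assumptions, not details from the paper.

```python
def binary_classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = paraphrase)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # paraphrases found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # paraphrases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections

    accuracy = (tp + tn) / len(pairs)
    # Ratios with an empty denominator are reported as 0.0 by convention here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

F1 is the harmonic mean of precision and recall, so it only rewards a classifier that does well on both; this matters when paraphrase and non-paraphrase pairs are unevenly represented in the evaluation set.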
ARTICLE | doi:10.20944/preprints202007.0474.v1
Subject: Computer Science And Mathematics, Artificial Intelligence And Machine Learning Keywords: Convolutional Neural Network; Encoder-Decoder Architecture; Semantic Segmentation; Feature Silencing; Crack Detection
Online: 21 July 2020 (13:54:13 CEST)
An autonomous concrete crack inspection system is necessary for preventing hazardous incidents arising from deteriorated concrete surfaces. In this paper, we present a concrete crack detection framework to aid the process of automated inspection. The proposed approach employs a deep convolutional neural network architecture for crack segmentation from concrete images. The proposed network alleviates the vanishing gradient problem present in deep neural network architectures. A feature silencing module is incorporated into the crack detection framework to eliminate unnecessary feature maps from the network, and the overall performance of the network improves significantly as a result. Experimental results support the benefit of incorporating feature silencing within a convolutional neural network architecture for improving the network's robustness, sensitivity, and specificity. An added benefit of the proposed architecture is its ability to accommodate the trade-off between sensitivity (positive class detection accuracy) and specificity (negative class detection accuracy) with respect to the target application. Furthermore, the proposed framework achieves a better precision rate and processing time than crack detection architectures in the literature.
ARTICLE | doi:10.20944/preprints202303.0034.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: unmanned aerial vehicle (UAV); synthetic aperture radar (SAR); automatic target recognition (ATR); deep neural network (DNN); adversarial example; transferability; encoder-decoder; real-time attack
Online: 2 March 2023 (04:43:20 CET)
In recent years, unmanned aerial vehicle (UAV) synthetic aperture radar (SAR) has become a highly sought-after topic because of its wide applications in target recognition, detection, and tracking. However, SAR automatic target recognition (ATR) models based on deep neural networks (DNN) are vulnerable to adversarial examples. Generally, non-cooperators rarely disclose any information about their SAR-ATR models, which makes adversarial attacks challenging. For this situation, we propose the Transferable Adversarial Network (TAN) to attack these models with highly transferable adversarial examples. The proposed method improves transferability via a two-player game in which we simultaneously train two encoder-decoder models: a generator that crafts malicious samples through a one-step forward mapping from the original data, and an attenuator that weakens the effectiveness of malicious samples by capturing the most harmful deformations. In particular, unlike traditional iterative methods, our approach maps original samples to adversarial examples in a single step, thus enabling real-time attacks. Experimental results indicate that the proposed approach achieves state-of-the-art transferability with acceptable adversarial perturbations and minimum time costs compared to existing attack methods; that is, it realizes real-time transferable adversarial attacks.
ARTICLE | doi:10.20944/preprints202305.0975.v1
Subject: Engineering, Electrical And Electronic Engineering Keywords: Load Forecasting; Long Short Term Memory; Temporal Convolution Networks; Multilayer Perceptron; Convolutional Neural Networks; CNN-LSTM; Convolutional LSTM Encoder-Decoder; Evaluation Metrics; Power Sector; Data Analysis
Online: 15 May 2023 (04:39:19 CEST)
The power sector currently attracts great scientific interest because of events such as the increase in electricity prices in the wholesale energy market and new investments driven by technological development in various sectors. These challenges have in turn created new needs, such as the accurate prediction of the electrical load of end users. In response, Artificial Neural Network approaches have become increasingly popular because of their ability to adapt efficiently to time-series prediction. In this paper, we present the development of a model that, through an automated process, provides an accurate prediction of the electrical load for the island of Thira in Greece. Through an automated application, several deep learning load forecasting models have been created: Multilayer Perceptron, Long Short-Term Memory (LSTM), One-Dimensional Convolutional Neural Network (CNN-1D), CNN-LSTM, Temporal Convolutional Network (TCN), and a proposed hybrid model called Convolutional LSTM Encoder-Decoder. In terms of prediction accuracy, all models perform satisfactorily, with the proposed hybrid model achieving the best accuracy.
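Encoder-decoder forecasters are usually trained on pairs of a past window (encoder input) and a future window (decoder target). The sketch below shows one common way to frame a load series that way and to score a forecast. The window lengths, the function names `make_windows` and `mape`, and the choice of mean absolute percentage error as the metric are all assumptions made for illustration; the abstract does not state which metrics or window sizes the authors used.

```python
def make_windows(series, n_in, n_out):
    """Split a 1-D series into (past window, future window) training pairs."""
    if n_in <= 0 or n_out <= 0:
        raise ValueError("n_in and n_out must be positive")
    pairs = []
    # The last valid start leaves room for n_in inputs plus n_out targets.
    for start in range(len(series) - n_in - n_out + 1):
        past = series[start:start + n_in]                   # encoder input
        future = series[start + n_in:start + n_in + n_out]  # decoder target
        pairs.append((past, future))
    return pairs


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)
```

For hourly load data, a call such as `make_windows(load, n_in=168, n_out=24)` would pair each week of history with the following day; these values are hypothetical and would need tuning for a real data set.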